
Re: Exact dot product



As you know, there are other fast methods for accurate
summation and dot products, including ones that give
reproducible results.
If you want to stick to the long accumulator, using
the old idea of overlapping chunks from Malcolm's 1971
Comm. ACM paper may further improve the performance
without losing anything.
"The long accumulator" may be regarded as Malcolm's
approach without overlapping.
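
To make the long-accumulator idea concrete, here is a minimal
software sketch, assuming IEEE 754 binary64 inputs. It models the
accumulator as one arbitrarily wide Python integer, so it shows only
the arithmetic (exact accumulation, one final rounding) and says
nothing about chunk layout, overlapping, or hardware cost. The names
to_fixed and exact_dot are hypothetical, chosen for illustration.

```python
from fractions import Fraction

# Assumption: IEEE 754 binary64. The smallest subnormal is 2**-1074, so
# every finite double is an integer multiple of 2**-1074 and every
# product of two doubles is an integer multiple of 2**-2148.
MIN_EXP = 1074


def to_fixed(x: float) -> int:
    """Return the integer n such that x == n * 2**-1074 exactly."""
    p, q = x.as_integer_ratio()        # x == p / q, q a power of two
    return p * ((1 << MIN_EXP) // q)   # exact because q divides 2**1074


def exact_dot(xs, ys) -> float:
    """Dot product accumulated without error, rounded once at the end."""
    acc = 0                            # the "long accumulator" (fixed point)
    for x, y in zip(xs, ys):
        acc += to_fixed(x) * to_fixed(y)   # exact integer product and sum
    # Single rounding: integer true division is correctly rounded.
    return float(Fraction(acc, 1 << (2 * MIN_EXP)))


xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(sum(x * y for x, y in zip(xs, ys)))  # naive: 0.0
print(exact_dot(xs, ys))                   # exact: 1.0
```

Inputs must be finite; as_integer_ratio raises an exception for
infinities and NaNs.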

Siegfried M. Rump



On 19.06.2015 at 23:02, James Demmel <demmel@xxxxxxxxxxxxxxxxx> wrote:

You might find the attached class project report interesting.
Two grad students here, David Biancolin and Jack Koenig (cc-ed)
used the hardware design tools being designed at Berkeley to
design a long accumulator, and measure its chip area and
power requirements at a fairly deep level of detail. Please contact
them if you have further questions.

Jim Demmel



--
=====================================================
Prof. Dr. Siegfried M. Rump
Institute for Reliable Computing
Hamburg University of Technology
Schwarzenbergstr. 95
21071 Hamburg
Germany
phone +49 40 42878 3027
fax +49 40 42878 2489
http://www.ti3.tuhh.de

and

Guest Professor at Waseda University
Faculty of Science and Engineering
3-4-1 Okubo, Shinjuku-ku
Tokyo 169-8555
Japan
phone/fax in Japan +81 3 5286 3330
