Re: ExactDotProduct

To: Ulrich Kulisch <Ulrich.Kulisch@xxxxxxxxxxx>
Subject: Re: ExactDotProduct
From: James Demmel <demmel@xxxxxxxxxxxxxxx>
Date: Tue, 03 Nov 2009 11:40:05 -0800
Cc: stds-1788@xxxxxxxx
In-reply-to: <4AEB390F.7050703@xxxxxxxxxxx>
References: <4AEB390F.7050703@xxxxxxxxxxx>

My point was that even a single hardware register is implicitly sorting,
bucket sorting by exponents, because it needs to be able to add and
possibly cancel operands with overlapping mantissas.

Imagine a list of summands
    (x(1), -x(1), x(2), -x(2), ..., x(n), -x(n), x(n+1))
whose sum is x(n+1), and where no two summands +-x(i) and +-x(j) overlap in
their mantissas unless i=j (for a given floating point format, this limits n,
but in my email I said the algorithm had to be correct for any format).
Suppose they are supplied in some scrambled order. Then to get the correct
sum you need to find the summands exceeding x(n+1) in magnitude, and realize
that they cancel, and find the summands smaller than x(n+1) in magnitude,
and realize that they do not matter. This seems tantamount to sorting, at
least from a theoretical point of view.
Jim Demmel
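
The scrambled-order example above can be made concrete with a small Python
sketch. It is an illustration under assumptions, not code from this thread:
the x(i) are taken to be powers of two spaced 100 binades apart so that no
two mantissas overlap in IEEE double precision, x(n+1) is 1.0, and the helper
names (cancelling_summands, naive_sum, exact_sum) are hypothetical. Exact
rational arithmetic stands in here for any exact summation method.

```python
from fractions import Fraction

def cancelling_summands(exponents, last):
    """Build (x(1), -x(1), ..., x(n), -x(n), x(n+1)) with x(i) = 2**e (assumed form)."""
    terms = []
    for e in exponents:
        x = 2.0 ** e
        terms += [x, -x]
    terms.append(last)
    return terms

def naive_sum(terms):
    """Left-to-right summation in double precision, rounding after every add."""
    total = 0.0
    for t in terms:
        total += t
    return total

def exact_sum(terms):
    """Sum exactly as rationals, then round once to the nearest double."""
    return float(sum(Fraction(t) for t in terms))

terms = cancelling_summands([-200, -100, 100, 200], last=1.0)

# One particular scrambled order of the same summands (hand-picked, not random).
bad_order = [2.0**100, 1.0, -(2.0**100),
             2.0**200, -(2.0**200),
             2.0**-100, -(2.0**-100),
             2.0**-200, -(2.0**-200)]

print(naive_sum(bad_order), exact_sum(bad_order))
```

In bad_order the summand 1.0 is absorbed by 2**100 before the cancelling
-2**100 arrives, so the running double-precision sum ends at 0.0, while the
exact sum is 1.0 for any ordering of the same summands.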

Ulrich Kulisch wrote:

Jim Demmel wrote:
> Though I have not written this down formally, I think that an algorithm
> that is correct for any underlying number of mantissa and exponent bits
> must in effect do as much work as sorting (by the exponents).
There are plenty of sketches of hardware circuits for the exact dot product in my book ([9] in the proposal). None of them uses sorting. The result is independent of the sequence in which the summands are added. All these circuits are simple and extremely fast (like vector operations on conventional vector processors). Of course, these techniques can also be implemented in software.
Best regards
Ulrich Kulisch
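
As a rough software illustration of an order-independent exact dot product,
here is a minimal Python sketch. It is not taken from the book cited as [9];
it assumes IEEE double inputs and uses a Python integer as a stand-in for a
wide fixed-point register. The names FRAC_BITS and exact_dot are hypothetical.

```python
from fractions import Fraction

# The smallest positive double is 2**-1074, so the product of two doubles
# needs at most 2 * 1074 fractional bits in a fixed-point register.
FRAC_BITS = 2 * 1074

def exact_dot(xs, ys):
    """Accumulate sum(x*y) without rounding, then round once at the end."""
    acc = 0  # arbitrary-precision integer, scaled by 2**FRAC_BITS
    for x, y in zip(xs, ys):
        nx, dx = x.as_integer_ratio()  # x == nx / dx, dx a power of two
        ny, dy = y.as_integer_ratio()
        # dx * dy is a power of two that divides 2**FRAC_BITS, so this
        # scaling is exact; it places the product at its exponent position.
        acc += nx * ny * ((1 << FRAC_BITS) // (dx * dy))
    return float(Fraction(acc, 1 << FRAC_BITS))

pairs = [(2.0**100, 3.0), (1.0, 1.0), (-(2.0**100), 3.0),
         (2.0**-100, 0.5), (-(2.0**-100), 0.5)]
xs, ys = zip(*pairs)
print(exact_dot(xs, ys))
```

Integer addition is associative and commutative, so the result does not depend
on the order of the pairs. The sketch shows order independence only; it says
nothing about hardware cost. The scaling step positions each product according
to its exponent, which is the step the reply above describes as implicit
bucket sorting.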
