Re: interval arithmetic
Since a motivation of this work is making floating point addition
effectively associative, i.e. making the computed sum bit-wise
reproducible no matter what the order of summation is, I'd like to
point out some related work that accomplishes this, to a precision
chosen by the user (including close to the usual error bound). The
algorithm does just one pass over
the data (in any order), uses only conventional IEEE 754 operations,
and uses only a limited number of registers, depending on the chosen
precision. Our base implementation can get an absolute error bound of
roughly
(*) n*macheps*2^(-28)*max_i |s_i| + 7*macheps*|S|
where s_1,...,s_n are the summands and S is the exact sum; this is up
to about 2^(-28) times smaller than the error bound for conventional
summation, using just 6 double precision floating point registers and
costing about 7*n flops.
The slowdown compared to conventional (highly tuned) summation ranges
from 3x (n=1M on 1 processor) to 1.2x (n=1M on p=1024 processors, each
processor getting 1024 summands). Increasing the number of registers
from 6 to 2*K (for K>3) lowers the error bound to roughly
(**) n*2^(40*(1-K))*max_i |s_i| + 7*macheps*|S|
and raises the cost to roughly (3*K-2)*n flops; for example, K=4
shrinks the first term to n*2^(-120)*max_i |s_i| at a cost of about
10*n flops. It is possible to imagine
additional specialized floating point operations to accelerate this; we have
spoken to one hardware vendor about this possibility (at their request).
For details, and a preliminary release of the ReproBLAS (Reproducible BLAS)
based on this approach, see
http://bebop.cs.berkeley.edu/reproblas/index.php
We are close to a new release, along with a more comprehensive
description of the algorithms, error bounds and software.
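
To give a feel for how a one-pass, order-independent summation can be
built from ordinary IEEE 754 operations, here is a small C sketch of
the underlying "extraction" idea. It is deliberately simplified and
hypothetical: the bin boundaries are fixed in advance (assuming
|x_i| < 2^30 and n <= 2^20, with 32-bit bins rather than 40-bit ones),
whereas the actual algorithm described above chooses them based on the
data, which is part of what makes the one-pass, any-order property
nontrivial. The point is only that each bin accumulator never incurs a
rounding error, so the bin contents, and hence the final sum, are
identical for every summation order:

/* reprosum_sketch.c: a simplified, hypothetical illustration of
 * reproducible summation by extraction against fixed "shifts".
 * Assumes |x[i]| < 2^30 and n <= 2^20; compile without -ffast-math
 * so the compiler keeps the floating point operations as written. */
#include <math.h>

#define K 3   /* number of bins: 2*K doubles of state (shift[], bin[]) */
#define W 32  /* bits per bin; needs log2(n) + W < 54 for exactness    */

double repro_sum_sketch(const double *x, int n)
{
    double shift[K], bin[K] = {0.0};
    /* shift[k] = 1.5 * 2^(m_k): adding and then subtracting it rounds
     * a small value to a multiple of 2^(m_k - 52), with no error in
     * either operation.  m_0 = 50 leaves headroom so bin[0] is always
     * an exact multiple of 2^(-2) fitting in 53 bits for n <= 2^20.  */
    for (int k = 0; k < K; k++)
        shift[k] = 1.5 * ldexp(1.0, 50 - k * W);

    for (int i = 0; i < n; i++) {
        double r = x[i];
        for (int k = 0; k < K; k++) {
            double q = (shift[k] + r) - shift[k]; /* exact extraction  */
            bin[k] += q; /* exact: multiples of one fixed ulp, never
                            carrying out of 53 bits for n <= 2^20      */
            r -= q;      /* exact remainder, handed to the next bin    */
        }
        /* bits of r below the last bin are dropped: they are what the
         * n*2^(...)*max_i |s_i| terms in (*) and (**) account for     */
    }
    /* the ONLY roundings happen here, in a fixed order, so the result
     * is bit-identical for every ordering of the x[i]                 */
    double s = 0.0;
    for (int k = 0; k < K; k++) s += bin[k];
    return s;
}

This naive sketch costs about 4*K*n flops; the (3*K-2)*n figure quoted
above comes from a more careful formulation, and the dynamic choice of
bins is what removes the a priori restrictions on |x_i| and n.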
This email exchange raises the natural question of whether this
algorithm (and related ones) could be modified to provide interval
bounds. It seems straightforward to use the error bound (*) or (**)
to create guaranteed and reproducible upper and lower bounds, and I
expect that they could be tightened.
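
For instance (a hedged sketch, not part of ReproBLAS): since
max_i |s_i| is itself order-independent, one could evaluate the bound
(*) from computed quantities and widen the computed sum outward, along
these lines. Here repro_sum_sketch is the routine sketched above,
DBL_EPSILON (2^-52) stands in for macheps, and the computed sum s
stands in for the exact |S| after a crude inflation whose constant I
have not verified:

#include <float.h>
#include <math.h>

double repro_sum_sketch(const double *x, int n);  /* sketched above */

/* Reproducible enclosure [lo, hi] of the exact sum, built from the
 * bound (*).  Every quantity below is computed by a fixed expression
 * from order-independent inputs, so the interval itself is
 * reproducible.  (A one-pass version would track maxabs inside the
 * summation loop.) */
void repro_sum_interval(const double *x, int n, double *lo, double *hi)
{
    double s = repro_sum_sketch(x, n);   /* reproducible point sum    */
    double maxabs = 0.0;                 /* max is exact and          */
    for (int i = 0; i < n; i++)          /* order-independent         */
        if (fabs(x[i]) > maxabs) maxabs = fabs(x[i]);

    const double eps = DBL_EPSILON;      /* 2^-52, as "macheps" here  */
    double err = (double)n * eps * ldexp(maxabs, -28) /* 1st term of (*) */
               + 7.0 * eps * fabs(s);                 /* 2nd term of (*) */
    err *= 1.0 + 16.0 * eps; /* unverified inflation: covers |s| vs |S|
                                and the roundings in computing err    */
    *lo = nextafter(s - err, -DBL_MAX);  /* widen outward one ulp     */
    *hi = nextafter(s + err,  DBL_MAX);
}

Deriving the enclosure directly from the bin contents, rather than
from the a priori bound (*), is presumably where the tightening
mentioned above would come from.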
This is joint work with Diep Nguyen and Peter Ahrens. There is a
growing amount of other related work (see our references), motivated
by the fact that even your multicore laptop no longer guarantees
reproducible results, because of nonassociative floating point
addition combined with nondeterministic execution order.
In fact Intel recently released a new version of MKL with "CNR"
(Conditional Numerical Reproducibility), which guarantees reproducible
results provided the user runs with the same number of cores each
time. This was (obviously) motivated by customer demand, a primary
reason being debugging. MKL CNR works by fixing the order of
execution, but this approach does not scale up to large numbers of
processors.
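
For concreteness, CNR is switched on either through the environment
(e.g. MKL_CBWR=AUTO) or programmatically before any other MKL call,
roughly as below; the exact constant names should be checked against
the MKL version in use:

#include <mkl.h>

int main(void)
{
    /* strictest CNR mode: reproducible across Intel CPU generations */
    mkl_cbwr_set(MKL_CBWR_COMPATIBLE);
    /* results are only guaranteed for a fixed number of threads     */
    mkl_set_num_threads(8);
    /* ... subsequent MKL calls (dgemm, cblas_ddot, ...) now return
     *     bit-identical results from run to run on this machine ... */
    return 0;
}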
Jim Demmel
On 1/1/16 12:22 AM, Ulrich Kulisch wrote:
Dear friends,
I wish you all a happy and successful year 2016!
For my more recent mails to the group I did not get a single reply.
Nevertheless I attach a not yet published paper:
High Speed Associative Accumulation of Floating-point Numbers and
Floating-point Intervals
It may help one reader or another to understand a secret of the
success of interval arithmetic.
Best regards
Ulrich