
Re: [STDS-754] reproducibility of tininess detection for binary formats



Guillaume Melquiond <guillaume.melquiond@xxxxxxxxxxx> wrote on 15/10/2007 07:29:20:

On Sunday 14 October 2007 at 15:51 -0700, David Hough 754R work wrote:

One extreme definition of exponent spill is:
the magnitude of a non-zero finite unrounded result is larger than
the largest normalized number in the destination format
or smaller than the smallest normalized number in the destination format.

The opposite extreme definition is:
the numerical result rounded to the destination format differs from the
numerical result rounded to a hypothetical format with the same precision
as the destination format but unbounded exponent range.

But the first one depends only on the magnitude of the unrounded exact
result. The second one also depends on the current rounding mode.
Neither one will satisfy everybody all the time, but I prefer the first
because it's easier to undo an unwanted exception when it is signaled than
to test for an unsignaled exception on every operation where it might have
been signaled.

Could you detail a situation where the second definition will not
satisfy everybody? It seems like this is the definition that any user
actually expects. For example, if a developer checks the sticky flags
(or trap exceptions) in order to ensure that a sequence of operations
did not go astray (bounded relative error as predicted by the model),
then both definitions work. But only the second one has the advantage of
never raising a false positive.

I can.  Consider an implementation that provides a Subnormal flag (meaning: 
the magnitude of the result of the operation was less than that of the 
smallest normal number, Nmin = b**emin).  In such an implementation, 
the first definition trivially defines Underflow as Subnormal & Inexact. 

This is easy to document and to explain to users: "Underflow is when the 
magnitude of a result is less than Nmin and cannot be represented 
exactly".  To me, 'this is the definition that any user actually expects'.

The second definition is much more complicated; it is no longer a simple 
conjunction of the two concepts -- it now has a special case which depends 
on the rounding mode.  "Underflow is when the magnitude of the result is 
less than Nmin and cannot be represented exactly, except in the case when 
the inexactness causes the result to be rounded to Nmin -- in which case 
we pretend the result was not subnormal and we do not underflow after 
all". 
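
A minimal sketch of the two wordings above, using exact rational 
arithmetic in Python.  Everything named here is an assumption made for 
illustration: the toy binary format (P = 4 bits of precision, EMIN = -6), 
the function names, and the rounding-mode labels "nearest", "up" and 
"down".  Only positive values are handled, and the second predicate 
follows the quoted wording literally rather than any text of the standard.

```python
from fractions import Fraction
import math

# Hypothetical toy binary format, chosen only to keep the numbers small.
P = 4                                     # precision in bits
EMIN = -6                                 # minimum normal exponent
NMIN = Fraction(2) ** EMIN                # smallest normal number, b**emin with b = 2
QUANTUM = Fraction(2) ** (EMIN - P + 1)   # spacing between subnormal values


def round_below_nmin(x, mode):
    """Round an exact positive x <= NMIN to a multiple of QUANTUM."""
    n = x / QUANTUM
    lo = math.floor(n)
    if n == lo:                           # already representable
        return x
    if mode == "down":
        k = lo
    elif mode == "up":
        k = lo + 1
    else:                                 # "nearest", ties to even
        frac = n - lo
        half = Fraction(1, 2)
        k = lo + 1 if frac > half or (frac == half and lo % 2 == 1) else lo
    return k * QUANTUM


def is_subnormal(x):
    """Subnormal flag: the exact result is non-zero and smaller than NMIN."""
    return 0 < x < NMIN


def is_inexact(x):
    """Inexact flag: the exact result is not a multiple of QUANTUM."""
    return (x / QUANTUM).denominator != 1


def underflow_first(x):
    """First wording: Underflow = Subnormal & Inexact (no rounding mode needed)."""
    return is_subnormal(x) and is_inexact(x)


def underflow_second(x, mode):
    """Second wording: as above, except when rounding delivers NMIN."""
    return underflow_first(x) and round_below_nmin(x, mode) != NMIN


if __name__ == "__main__":
    x = NMIN - QUANTUM / 4                # just below NMIN, not representable
    print("first wording:", underflow_first(x))
    for mode in ("nearest", "up", "down"):
        print("second wording,", mode, ":", underflow_second(x, mode))
```

Under these assumptions, the value NMIN - QUANTUM/4 underflows under the 
first wording regardless of rounding mode, but under the second wording 
only in the "down" mode, because the other two modes round it to Nmin.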

[Overflow already has the similar unpleasantness, but perhaps that's the 
least of its problems :-).]

mfc





