
reproducibility of tininess detection for binary formats



It turns out that the other way -- detect tininess before rounding -- is
actually easier to provide when the native behaviour is "after rounding",
as far as I can tell.  This suggests that it should be what is required in
reproducible mode -- and it would match the decimal rule, which is nice.

In this mode, every Underflow exception can be handled normally; none has
to be suppressed or covered up.  It is necessary however to check whether
additional underflow cases need to be simulated: those where the result is
tiny before rounding but not after.  If underflow did not occur natively,
such a result must have been rounded up in magnitude to Nmin -- so it is
sufficient to detect results of +Nmin or -Nmin, and to check that these
would have underflowed when rounding towards zero!
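
To make the check concrete, here is a rough sketch in Python for a binary64
product computed in round-to-nearest.  Python does not expose the hardware
rounding mode, so exact rational arithmetic stands in for "would have
underflowed when rounding towards zero" (equivalent to the exact result
being smaller in magnitude than Nmin).  The function name missed_underflow
is hypothetical, chosen only for illustration.

```python
import sys
from fractions import Fraction

# Smallest positive normal binary64 number (2**-1022), "Nmin" in the text.
NMIN = sys.float_info.min


def missed_underflow(a: float, b: float) -> bool:
    """Return True if a*b was rounded up in magnitude to Nmin from a tiny value.

    Assumes the native product is correctly rounded to nearest and that the
    native tininess detection is "after rounding", so a result of +/-Nmin
    raised no Underflow by itself.
    """
    result = a * b
    if abs(result) != NMIN:
        # The common case: this cheap test fails and nothing more is done.
        return False
    # Exact, unrounded product; Fraction(float) converts without error.
    exact = Fraction(a) * Fraction(b)
    # Tiny before rounding: the exact magnitude is below Nmin.
    return abs(exact) < Fraction(NMIN)
```

For instance, missed_underflow(NMIN, 1 - 2**-53) should report a missed
case, since the exact product lies just below Nmin and rounds up to it,
whereas missed_underflow(NMIN, 1.0) should not.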

If rounding modes are static, this is really easy.  If rounding is towards
zero, nothing need be done.  For directed rounding, only one of +Nmin and
-Nmin needs to be checked; for round-to-nearest, both possible results
must be checked -- but those branches will be predicted not-taken and
cost very little.  When the branch *is* taken, repeat the operation with
rounding towards zero and alternate exceptions disabled (because we don't
want to reach the exception handler with the wrong result and the wrong
current rounding mode, on machines that use a global FP control register).
Actually, if default underflow handling was in effect and the Underflow
flag was already raised, there was no need to do anything.
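
The per-mode case analysis above can be sketched as a small lookup table.
This is only an illustration: the mode names and the needs_recheck
function are hypothetical, and the flag argument models the remark that
nothing need be done when default handling is in effect and Underflow has
already been raised.

```python
import sys

NMIN = sys.float_info.min  # smallest positive normal binary64 number

# Results that can hide an unrounded-tiny value, per static rounding mode.
# Rounding towards zero never rounds a tiny value up in magnitude to Nmin.
# (Mode names are made up for this sketch.)
CANDIDATES = {
    "toward_zero": (),
    "upward": (NMIN,),       # only positive tiny values round up to +Nmin
    "downward": (-NMIN,),    # only negative tiny values round down to -Nmin
    "nearest": (NMIN, -NMIN),
}


def needs_recheck(result: float, mode: str,
                  underflow_flag_set: bool = False) -> bool:
    """Return True if the operation should be repeated rounding towards zero."""
    if underflow_flag_set:
        # Default handling with the flag already raised: nothing to add.
        return False
    return result in CANDIDATES[mode]
```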

If the repeated operation in round-towards-zero mode yields a different
result, we have an inexact tininess-before-rounding case, and in fact the
Underflow and Inexact flags should have been raised.  The original result
and rounding-mode setting should be restored, and for default exception
handling we're done.  For alternate exception handling the exception will
have to be simulated (after having restored result and rounding mode).
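
The comparison step might look like the following sketch.  Since Python
cannot switch the hardware rounding mode, the repeated operation is
simulated by truncating the exact product towards zero; on real hardware
one would instead re-execute the operation with the rounding mode changed.
The names round_toward_zero and missed_flags are hypothetical, and
math.nextafter needs Python 3.9 or later.

```python
import math
import sys
from fractions import Fraction

NMIN = sys.float_info.min  # smallest positive normal binary64 number


def round_toward_zero(exact: Fraction) -> float:
    """Round an exact rational to binary64, truncating towards zero."""
    value = float(exact)  # correctly rounded to nearest
    if abs(Fraction(value)) > abs(exact):
        # Nearest rounding went away from zero: step one ulp back.
        value = math.nextafter(value, 0.0)
    return value


def missed_flags(a: float, b: float) -> bool:
    """For a product whose native result is +/-Nmin, return True if
    Underflow and Inexact should have been raised as well.

    The native result itself is kept; only the flags are affected.
    """
    original = a * b
    truncated = round_toward_zero(Fraction(a) * Fraction(b))
    # A different result under truncation means the exact value was tiny
    # and inexact, i.e. a tininess-before-rounding case.
    return truncated != original
```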

If rounding modes are dynamic, the fully general case has to be handled
in each instance, so there may be a bit more code involved, but not much
more action -- and the common execution penalty is still just going to be
those non-taken (and correctly predicted) branches.

Proper implementation of after-rounding detection is more expensive because
state has to be captured before the operation, so that unwanted underflow
can be covered up without side effects.  On IBM's System z the whole thing
can be hidden in an always-physically-enabled Underflow trap handler, which
can resume in-line after fixup if necessary, but on System p this would not
work because even a taken trap sets the Underflow exception flag -- which
could not be undone without having captured the prior state of the flag.

Others would have to evaluate the mechanisms for other platforms.

Michel.

P.S.  I will revise my Ballot comments (which have not been submitted
      officially -- I wanted to get reactions anyway) to recommend that
      reproducible tininess detection be required after all -- but in
      the same way as for decimal formats, namely before rounding.

      I think I will also remove my objection to standardising fma()
      behaviour in reproducible mode because I have convinced myself
      that the cost is acceptable -- and if we deal with tininess, we
      might as well deal with fma() as well.
Sent: 2007-10-14 04:03:55 UTC
