Re: differences between implementations of IEEE-754 basic operators



Now that we know that the real question was about emulating one IEEE-754
implementation (PowerPC) on another (x86, in assembler), and that the
emulated platform traps and aborts on Invalid Operation and Overflow and
never looks at the IEEE Underflow and Inexact flags (nor uses Underflow or
Inexact traps), the situation is much simpler:  only a few tricky cases
remain.

FMA can indeed be emulated, in which case the variant being emulated can
be handled appropriately.  The emulator's use of flags is private and would
not affect the emulated flags except as required.

Later we find out, however, that the platform to be emulated is NOT fully
754-compliant:  it has abrupt underflow to zero, and no infinities or NaNs.

At issue here are the precise thresholds at which overflow and underflow
occur.

NaNs are not an issue as we expect Invalid Operation always to trap.

Overflow may not be an issue either, provided the overflow trap happens
at the same threshold on both platforms.  The emulator would simply
trap also, and emulate the parameter-less trap, so infinities won't
be generated.  (Infinities and NaNs may have to be detected in inputs,
and presumably made to signal the Invalid Operation trap.)

Underflow is tricky, however.  At issue is the underflow threshold, in
particular whether tininess is detected before or after rounding.  If we
can assume that the PowerPC E500 uses the same rule as other PowerPCs, it
will presumably detect tininess before rounding and flush to zero then.
I don't know about x86 SSE, but x87 detects tininess after rounding, so
an exact result just below Nmin (the smallest normal number) that rounds
up to Nmin would have been flushed to zero by the emulated platform,
while the emulator would not see an underflow.

One way to deal with this is to check every result that could have
underflowed to see if it is Nmin, and if it is, to repeat the operation
with rounding towards zero.  If the x86 flags include Inexact and
Incremented indicators, an Nmin with both of those set would also be
a case where the result would have to be flushed to zero.  All actual
underflow indications would of course also be made to flush to zero.
Since we have to watch every result in-line anyway, we (the emulator)
could actually avoid the need for underflow traps -- it would however
have to clear the sticky IEEE flags before the operation.

Michel.
Sent: 2008-07-05 00:39:42 UTC

