
bit-for-bit vs bug-for-bug reproducibility



David Hough brought this up:

Customers have rejected AMD-based systems because they haven't delivered
identical results to Intel-based systems.  They might be convinced that
the results are just as good, but they aren't the same, and nobody enjoys
tracking down these kinds of discrepancies.

The only way to avoid this kind of trouble is for encapsulated primitives
to have the means to produce correctly-rounded results.  Indeed, as David
also said, Intel themselves would be hobbled.

When IBM was considering IEEE FP support for their mainframes in the late
1990s, a binary/decimal conversion instruction was under consideration to
provide a complete hardware (well, microcode in this case) implementation
of the 1985 standard.  I pointed out that a hardware instruction had (in
my opinion) no choice but to go beyond the 1985 standard, and had to round
correctly across the entire exponent range, precisely to avoid the need to
be bug-for-bug compatible in the future.  We ended up with a software
solution instead, but treated it with the same respect as hardware.  (Ten
years later I knew enough to actually do it in "hardware", with PFPO.)
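
As a rough illustration of what "round correctly across the entire
exponent range" means for decimal-to-binary conversion, here is a small
Python sketch.  It is not the IBM implementation; it only cross-checks a
platform's string-to-binary64 conversion against an exact rational
reference.  The names bits, reference_convert and mismatches are made up
for this example, and it assumes CPython, where float() of a string and
true division of two ints are each, to my knowledge, correctly rounded.

```python
from fractions import Fraction
import struct


def bits(x: float) -> str:
    """Hex dump of the IEEE 754 binary64 encoding, for bit-for-bit comparison."""
    return struct.pack(">d", x).hex()


def reference_convert(text: str) -> float:
    """Decimal string -> binary64 through exact rational arithmetic.

    Fraction(text) holds the decimal value exactly; the single int / int
    division below is then the only rounding step (assumption: CPython
    rounds that division correctly, including in the subnormal range).
    """
    exact = Fraction(text)
    return exact.numerator / exact.denominator


# Hypothetical sample inputs: an ordinary value, a well-known hard case,
# a halfway case (2**53 + 1), the largest finite value, and a subnormal.
SAMPLES = ["0.1", "1e23", "9007199254740993",
           "1.7976931348623157e308", "1e-320"]


def mismatches(samples=SAMPLES):
    """Return the inputs where float() and the reference differ in any bit."""
    return [s for s in samples if bits(float(s)) != bits(reference_convert(s))]


if __name__ == "__main__":
    for s in SAMPLES:
        print(s, bits(float(s)), bits(reference_convert(s)))
    print("mismatches:", mismatches())
```

If two independent conversions are both correctly rounded, they have to
agree in every bit on every input, so neither one ever needs to copy the
other's bugs.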

Again, this falls within the scope of an acceptable approach to defined
reproducibility: define the primitives and constructs that DO provide
reproducibility, and provide control over their availability.  As Florent
also mentioned, this means that correctly-rounding library functions must
be available -- but it would be acceptable if correct rounding is limited
to a reasonable subdomain, shifting part of the burden onto the programmer.
I realise that, for this approach to be portable, there will have to be
an agreed-upon lower bound on correctly-rounded domains.  (Luckily, by the
time 754R is ready, most functions can already support the full domain,
at least for a subset of formats.  Decimal128 may have to wait a bit...)
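
A minimal sketch of the "correct rounding on an agreed subdomain" idea,
in Python with the standard decimal module, whose documentation states
that exp() is correctly rounded using ROUND_HALF_EVEN.  The 34-digit
context stands in for Decimal128 precision only (it does not model the
Decimal128 exponent range), and GUARANTEED_BOUND and reproducible_exp
are invented for this illustration; no such bound has been agreed on.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# 34 significant digits, as in Decimal128 (precision only; the exponent
# range of this context is Python's default, not Decimal128's).
DECIMAL128_PREC = Context(prec=34, rounding=ROUND_HALF_EVEN)

# Hypothetical agreed-upon bound on the correctly-rounded domain.
GUARANTEED_BOUND = Decimal(100)


def reproducible_exp(x: Decimal) -> Decimal:
    """exp(x), correctly rounded to 34 digits, for |x| <= GUARANTEED_BOUND.

    Outside the guaranteed subdomain the function refuses to answer, so
    the burden of staying inside it falls on the caller.
    """
    if abs(x) > GUARANTEED_BOUND:
        raise ValueError("argument outside the correctly-rounded subdomain")
    return DECIMAL128_PREC.exp(x)


if __name__ == "__main__":
    print(reproducible_exp(Decimal(1)))
```

A program that only uses such primitives, and only inside their
guaranteed domains, can expect identical results from any conforming
implementation; control over their availability is then a matter of
which entry points the environment exposes.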


There is in fact a lot of technical work to be done to ensure that we have
the proper mechanisms for controlled reproducibility.  It would be unwise
to rush this into the standard at this point, which is why we cannot do
much better than stating a sense of direction, and perhaps take up the
issue in a future standard.  It would be a shame to scuttle 754R at this
point by burdening it with infeasible (not just unrealistic) requirements,
or incompletely-thought-out partial solutions.

Michel.
Sent: 2007-07-15 02:21:11 UTC
