Re: Meeting the Scope and Purpose of P754
Bob Davis wrote:
(...)
Begin below with your proposed solution.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
1.x.y ReproducibleResults Clause
I'm not sure I understand the question but I'll try to answer.
Background: we are designing portable, correctly-rounded elementary
functions, for which reproducibility is simply a must.
Nobody wants to parallelize such computations (except in the limited
sense of using the SIMD, a.k.a. multimedia, instructions in processors,
which have deterministic semantics). Therefore the objection that
parallelism brings in fundamental non-determinism is irrelevant here.
It is like saying "since you cannot ride your bicycle to travel to the
States, you should abandon the idea of using it for commuting" (I did
hear that, too).
We use C, and we are almost happy with it: C has a mostly deterministic
semantics concerning the order of evaluation. Unfortunately it has no
determinism concerning the precision of the evaluation: you declare a
float but the compiler is free to replace it with a double if it is no
slower (that's an "executive summary"). Funnily enough, on recent IA32
systems with SSE instructions, single precision becomes faster again,
and the accuracy of your program may be degraded on a newer system
compared to an older system. Same situation for the doubles with the
advent of SSE2.
It makes things quite messy: in our case we declare only doubles (for
portability), but registers are double-extended and memory locations are
double-precision, so in effect the result of the computation depends on
register allocation, which can itself be changed by inserting a printf()
(see Monniaux).
Why are registers double-extended? Because the FPU precision setting
lives in a global control register that is, from the C point of view,
the responsibility of the OS. Note that other languages take
responsibility for it themselves.
So to answer Bob's question from my limited point of view: the
requirements should be those copied from the C99 standard to cover
order of evaluation, operator precedence, etc.
Plus the missing bit: unambiguous rules on the precision of the
evaluation (which should cover implicit casts). I think all the
necessary building blocks are more or less already in the draft.
I believe it could be done in a way that would not go against the C99
standard. C99 allows the compiler to use any precision larger than the
declared one, but does not require it to do so AFAIK (Guillaume or
Vincent will correct me on this one). In other words, we could get a new
flag to gcc that would be compatible with the -std=c99 flag, just
preventing some optimisations on some systems.
Finally, and this may come as a shock to some, we could also use
Fortran: the Fortran standard states that (executive summary again) "the
compiler may do whatever expression rewriting it considers
'mathematically equivalent', but if the user has written parentheses it
shall respect them". So when we need determinism (it was the case for
the LHC@Home project) we just add parentheses everywhere. Then there is
the issue of precision: I am not sure what Fortran does, because with
the compiler we used the status flag was set to double precision, and we
had no problem. Again, asking Fortran for a new layer of conformance
imposing an evaluation order would not go against compliance with the
Fortran standard.
I'm not sure I have added anything new to the discussion. But it
definitely is an important issue to us.
Sorry again for considering the issue from my very limited point of
view. No idea about exceptions for instance.
Florent