
Re: Meeting the Scope and Purpose of P754



Bob Davis wrote:
(...)
Begin below with your proposed solution.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
1.x.y ReproducibleResults Clause


I'm not sure I understand the question but I'll try to answer.

Background: we are designing portable, correctly-rounded elementary functions, for which reproducibility is simply a must. Nobody wants to parallelize such computations (except in the limited sense of using the SIMD, aka multimedia, instructions in processors, which have deterministic semantics). Therefore the argument that parallelism brings in fundamental non-determinism is irrelevant to us. It is like saying "since you cannot ride your bicycle to travel to the States, you should abandon the idea of using it for commuting" (I did hear that, too).

We use C, and we are almost happy with it: C has mostly deterministic semantics concerning the order of evaluation. Unfortunately it has no determinism concerning the precision of the evaluation: you declare a float but the compiler is free to evaluate it as a double if that is no slower (that's an "executive summary"). Funnily enough, on recent IA32 systems with SSE instructions, single precision becomes faster again, so the accuracy of your program may be degraded on a newer system compared to an older one. Same situation for doubles with the advent of SSE2.
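A minimal sketch of how a program can observe the evaluation precision described above. FLT_EVAL_METHOD is the macro C99 defines in <float.h>; the function names eval_method and excess_precision_probe are hypothetical, chosen for illustration, and the probe assumes the default round-to-nearest mode.

```c
#include <float.h>

/* Report the evaluation method the compiler advertises (C99, <float.h>):
 *   0  operations are evaluated in the precision of their type
 *   1  float and double operations are evaluated in double
 *   2  all operations are evaluated in long double
 *  -1  indeterminable (other negative values are implementation-defined)
 */
int eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return (int)FLT_EVAL_METHOD;
#else
    return -1;   /* pre-C99 compiler: nothing is promised */
#endif
}

/* Hypothetical probe: 2^24 + 1 is not representable as a float.
 * If the sum is rounded to float, it ties to 2^24 and the result is 0.
 * If the sum is kept in a wider format, the result is 1. */
float excess_precision_probe(void)
{
    volatile float big = 16777216.0f;   /* 2^24; volatile defeats constant folding */
    volatile float one = 1.0f;
    return (big + one) - big;
}
```

The same source can therefore return 0 on one system and 1 on another, depending only on the precision the compiler picks for the intermediate sum.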

It makes things quite messy: in our case we declare only doubles (for portability), but registers are double-extended and memory locations are double-precision, so in effect the result of the computation is affected by register allocation, which is itself modified by inserting a printf() (see Monniaux). Why are registers double-extended? Because the FPU precision setting lives in a global control register that is, from the C point of view, the responsibility of the OS. Note that other languages take responsibility for it themselves.
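A sketch of the register-versus-memory effect, under the assumption of round-to-nearest arithmetic. The function names are hypothetical. Storing the intermediate through a volatile double forces it to be rounded to double precision, whereas the plain expression may keep the intermediate in a wider register.

```c
/* The store to a volatile double rounds a + b to double precision,
 * whatever format the register holding the sum had. */
double rounded_through_memory(double a, double b)
{
    volatile double sum = a + b;
    return sum - a;
}

/* Here the intermediate a + b may stay in a double-extended register,
 * so the result can depend on the compiler and on register allocation. */
double possibly_in_register(double a, double b)
{
    return (a + b) - a;
}
```

With a = 2^53 and b = 1, the first function returns 0, because 2^53 + 1 is not representable as a double and rounds back to 2^53. The second may return 0 or 1, depending on whether the sum was kept in extended precision.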

So to answer Bob's question from my limited point of view: the requirements should be those copied from the C99 standard to cover order of evaluation, operator precedence, etc. Plus the missing bit: unambiguous rules on the precision of the evaluation (which should cover implicit casts). I think all the necessary building blocks are more or less already in the draft.

I believe it could be done in a way that would not go against the C99 standard. C99 allows the compiler to use any precision larger than the declared one, but does not mandate it AFAIK (Guillaume or Vincent will correct me on this one). In other words, we could get a new flag to gcc that would be compatible with the -std=c99 flag, just preventing some optimisations on some systems.

Finally, and this may come as a shock to some, we could also use Fortran: the Fortran standard states that (executive summary again) "the compiler may do whatever expression rewriting it considers 'mathematically equivalent', but if the user has written parentheses it shall respect them". So when we need determinism (it was the case for the LHC@Home project) we just add parentheses everywhere. Then there is the issue of precision: I am not sure what Fortran does, because with the compiler we used the FPU precision was set to double precision, and we had no problem. Again, asking Fortran for a new layer of conformance imposing an evaluation order would not go against compliance with the Fortran standard.
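To illustrate why the parentheses matter, here is a small sketch written in C rather than Fortran (the function names are hypothetical). Floating-point addition is not associative, so two groupings that are "mathematically equivalent" can give different results.

```c
/* Two groupings of the same three-term sum. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e30, b = -1e30 and c = 1.0, sum_left returns 1.0, while sum_right returns 0.0 because c is absorbed when added to -1e30. A compiler that is free to regroup may pick either, which is why honouring the written parentheses is what makes the result reproducible.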


I'm not sure I have added anything new to the discussion. But it definitely is an important issue to us.

Sorry again for considering the issue from my very limited point of view. No idea about exceptions for instance.

   Florent
