Re: Mixing reproducible and non-reproducible code
On 2012-12-05 10:26:08 +0000, N.M. Maclaren wrote:
> On Dec 5 2012, Vincent Lefevre wrote:
> >Not necessarily, if the reproducibility is based on the behavior,
> >not how it is obtained. For instance, in tightest accuracy mode,
> >results will be reproducible at the individual function level, but
> >how it is computed will differ from one implementation to another
> >one.
> >
> >Perhaps an additional requirement would be that the behavior be
                         ^^^^^^^^^^^
In case this was not clear, I meant here a requirement on the
"reproducibility mode", *not* that an implementation is required
to implement this mode.
> >sufficiently documented so that anyone could build an implementation
> >on his system, where the same results will be obtained.
>
> We are clearly at cross-purposes. There are two aspects, of which the
> first is:
>
> If we specify that the standard is defined only for programs with
> known, straightforward ways of achieving reproducible results
> (i.e. those that use only trivial functions in trivial ways), the
> standard becomes useless.
>
> If we demand absolute reproducibility for some operations and
> not others, we are (a) conflicting with many languages' numeric
> models and (b) imposing a burden on the implementation difficulty
> that delivers nothing useful for the vast majority of programs.
Well, it is up to the user to choose whether to use the reproducibility
mode, depending on his needs. As a reminder, reproducibility must
remain optional. And if the user wants the reproducibility mode,
it is up to him to write code that achieves the efficiency he
wants, which may depend on what the implementation specifies.
For instance, consider an implementation based on floating-point
numbers with a huge exponent range and providing reproducibility
with correct rounding (tightest accuracy mode) on all functions
and on the full domain (like MPFI). If a user writes:
    C = 2^1000000000   /* assume it to be exact */
    xx = [C,C]
    yy = sin(xx)
then the computation will take a lot of time and possibly too much memory.
So, the user would have to change his code.
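To give a rough idea of the cost, here is a back-of-the-envelope sketch
in Python. It assumes that reducing 2^e modulo pi/2 needs about
e + p + g bits of pi, where p is the target precision and g a few guard
bits. The name reduction_bits and the default values of p and g are
hypothetical, chosen for illustration only; the real figure depends on
the implementation, and correct rounding may need more.

```python
def reduction_bits(exponent: int, precision: int = 53, guard: int = 64) -> int:
    """Rough estimate (an assumption, not a bound taken from any library) of
    the number of bits of pi needed to reduce 2**exponent modulo pi/2 while
    keeping about `precision` significant bits in the reduced argument."""
    return exponent + precision + guard


bits = reduction_bits(1_000_000_000)
megabytes = bits / 8 / 1e6  # size of the stored approximation of pi
print(f"about {bits} bits of pi, i.e. roughly {megabytes:.0f} MB")
```

Under these assumptions the approximation of pi alone takes on the
order of a hundred megabytes, before any work on the function itself.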
Note: similarly, if the user has code based on random numbers (e.g.
for a Monte Carlo method), he will have to choose a reproducible
PRNG.
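As an illustration of what "choosing a reproducible PRNG" can look like
in practice, here is a minimal Python sketch. It uses a private
generator from the standard random module with an explicit seed; the
function name estimate_pi, the seed and the sample count are
hypothetical. Whether the same sequence is obtained on another language
or library is a separate question that depends on what that PRNG
documents.

```python
import random


def estimate_pi(samples: int, seed: int) -> float:
    """Monte Carlo estimate of pi using a private, explicitly seeded PRNG."""
    rng = random.Random(seed)  # generator state is fully determined by the seed
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter disc
            hits += 1
    return 4.0 * hits / samples


first = estimate_pi(100_000, seed=12345)
second = estimate_pi(100_000, seed=12345)
print(first == second)  # same seed, same sequence, same estimate
```

The points are drawn sequentially from a single generator, so there is
no dependence on scheduling or on any global PRNG state.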
> The second builds on that, and is based on the well-known fact that
> there are lots of commonly used procedures that have no known
> accuracy properties and no known, feasible way of achieving a
> pre-required accuracy. In some cases, it is known that the time
> (and even space!) complexity is super-exponential in the inverse
> of the accuracy.
>
> If we specify the accuracy of the best known and published feasible
> method, we are preventing any future upgrade or other system from
> using a better algorithm (whether 'better' means faster, more
> accurate or whatever).
I completely agree. That's why I've suggested something based on the
behavior (in the sense of something like correct rounding) instead of
on a particular implementation/algorithm. The advantage of correct
rounding (or tightest accuracy) is that one cannot get better accuracy
for the individual function (with a fixed number format / interval type).
It has some drawbacks (as mentioned above[*]), but there are
workarounds.
[*] The problem of sin/cos/tan on huge arguments is not primarily
due to correct rounding, though. Even faithful rounding in small
precision is affected.
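To illustrate what a behavior-based specification buys, here is a small
Python sketch with two unrelated ways of computing a binary64 quotient:
the platform's division, and an exact rational quotient rounded once at
the end. It assumes IEEE 754 binary64 arithmetic in round-to-nearest
and that converting a Fraction to float rounds correctly (which is my
understanding of CPython); the function names are hypothetical.

```python
import random
from fractions import Fraction


def divide_hardware(x: float, y: float) -> float:
    """Binary64 division as performed by the platform."""
    return x / y


def divide_exact_then_round(x: float, y: float) -> float:
    """Exact rational quotient, rounded once to the nearest binary64 number."""
    return float(Fraction(x) / Fraction(y))


rng = random.Random(0)
pairs = [(rng.uniform(1.0, 1000.0), rng.uniform(1.0, 1000.0)) for _ in range(1000)]
same = all(divide_hardware(x, y) == divide_exact_then_round(x, y) for x, y in pairs)
print(same)
```

Since both functions are meant to return the correctly rounded
quotient, they should agree on every input even though the algorithms
have nothing in common.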
> We are also requiring languages to locate such a method, and
> specify it as part of their standard. No way, Jose! I can tell
> you what several ISO working groups would say, and it would be
> best not put in Email.
There is nothing required. Reproducibility is optional. And I don't
think any language standard forbids an implementation from generating
reproducible code if the user cooperates (e.g. he must not read from
/dev/random under Linux or call an equivalent library function).
> We are also forbidding some classes of algorithm entirely, such
> as parallel Monte-Carlo integration, which is actually the best
> way of calculating some functions!
Then do not use them if you need reproducibility.
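One reason such parallel algorithms are hard to make reproducible is
that the order in which partial results are combined may change from
run to run, and floating-point addition is not associative. The Python
sketch below shows the effect on three numbers; naive_sum is a
hypothetical helper standing in for a reduction done in some order, and
math.fsum is used as an example of a sum that is rounded only once (on
typical IEEE 754 platforms).

```python
import math


def naive_sum(values):
    """Left-to-right floating-point summation, rounding after each addition."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

# The result depends on the order of the terms, as it could if a parallel
# reduction combined its partial sums differently from one run to the next.
forward = naive_sum(values)
backward = naive_sum(reversed(values))

# math.fsum rounds the exact sum once, so the order of the terms is irrelevant.
forward_exact = math.fsum(values)
backward_exact = math.fsum(reversed(values))

print(forward == backward, forward_exact == backward_exact)
```

So a user who needs reproducibility has to either fix the order of the
operations or use an operation whose result does not depend on it.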
> If you want reproducibility as an option that consenting languages
> can specify, fine. Only Java and a few others will adopt it.
If a language standard doesn't say anything about reproducibility,
it isn't a major problem. Implementations can still provide
reproducibility as an option.
> Making it mandatory as an option is a sure way of getting 1788 rejected
> or grossly abused by most of the important ones.
I wonder what you mean by "mandatory as an option". There should be
nothing mandatory for either the implementer or the user.
But reproducibility can be specified as an option, and if an
implementation claims to support it (e.g. via a special mode),
then it shall not lie. That's all.
Actually I'd rather see something informative. It may be too early
to have something normative.
> That is the mistake that 754 has made, twice, and the results are
> clear.
I don't understand. Correct rounding was specified for individual
arithmetic operations, and it works quite well for what it is
specified for. If one needs something completely reproducible, then
one needs clear bindings and so on, but this is out of the scope
of IEEE 754. And of course, it also needs some cooperation from
the user.
--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)