
Re: Question on performance



On Mon, Oct 11, 2010 at 8:09 PM, Arnold Neumaier
<Arnold.Neumaier@xxxxxxxxxxxx> wrote:
> Lee Winter wrote:
>
>> Also users have to understand that aggressive optimization may replace
>> the expression (x!=x) with (false) because the optimization process
>> uses a more primitive FP algebra than '754 requires.  Such optimizers
>> also tend to ignore the sign of an underflow because they treat all
>> underflows as zeros and the sign of zero is not useful in the
>> primitive FP algebra they use.
>
> So an optimizing compiler may also change ~(x<y) into x>=y, thereby altering
> without notice the result when x or y is NaN?

Yes, that is a common symptom of an optimizer with an inadequate FP algebra.

The term of art for this issue is "nan-safe" computing.  Many
compilers are not nan-safe, and even more are not in their default
configuration.
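
As a concrete illustration, here is a minimal C++ sketch (the program
is mine, not taken from any compiler's documentation) of the two
rewrites discussed above.  Building it with and without an aggressive
FP mode such as GCC's or Clang's -ffast-math may change the output
from 1/1 to 0/0, depending on the compiler and version:

    #include <cstdio>
    #include <limits>

    // Under strict '754 semantics both lines must print 1 when x is NaN.
    // An optimizer using a more primitive FP algebra may fold (x != x)
    // to false and rewrite !(x < y) as (x >= y), changing both results.
    int main() {
        double x = std::numeric_limits<double>::quiet_NaN();
        double y = 1.0;
        std::printf("x != x   : %d\n", (int)(x != x));   // NaN is unordered: true per '754
        std::printf("!(x < y) : %d\n", (int)!(x < y));   // true per '754; (x >= y) would be false
        return 0;
    }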

<rant>
This is one of the key reasons that I have consistently advocated four
distinct parts of IA:

1.  Interval Arithmetic: here "arithmetic" means the operations "above"
naming, listing, and counting theory and "below" algebra, i.e. the
primitives (a sketch of one such primitive follows this list).

2.  Interval Algorithms: these include all functions commonly
implemented in a math library as well as algorithms that are
specifically designed to utilize the properties that are specific to
intervals as opposed to other numeric data types.

3.  Interval Applications: these include all systems that produce
and/or consume interval data whether human or machine readable.

4.  Interval Algebra: this is the weakest part of any IA system. I
know of only two reasonably comprehensive systems, one by Pryce and
one by Hansen & Walster.  Every implementation of IA has some kind of
interval algebra, but few provide a rigorous definition of their
algebra.
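
To make item 1 above concrete, here is a hedged sketch (the type and
function names are mine) of one such primitive: interval addition with
outward rounding via the C++ <cfenv> facilities.  A real IA library
must also handle empty intervals, NaN endpoints, and overflow, and, in
the same spirit as the nan-safety problem above, an optimizer that
ignores rounding-mode changes can silently break even this:

    #include <cfenv>

    // Interval addition with outward (directed) rounding: the kind of
    // operation "below algebra" that item 1 refers to.  A compiler that
    // reorders arithmetic across fesetround() calls will break it.
    struct Interval { double lo, hi; };

    Interval add(Interval a, Interval b) {
        const int saved = std::fegetround();
        std::fesetround(FE_DOWNWARD);
        double lo = a.lo + b.lo;          // lower bound rounded toward -infinity
        std::fesetround(FE_UPWARD);
        double hi = a.hi + b.hi;          // upper bound rounded toward +infinity
        std::fesetround(saved);
        return Interval{lo, hi};
    }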

My analyses of the two systems mentioned above are limited to the
perspective of IEEE-754 et seq., and they are stale by about three years.
In those analyses I found the Pryce proposal quite flexible, but not
quite sufficient to handle the more subtle aspects of '754.  It
appeared to me that the Pryce algebras are intended to handle a wide
variety of underlying FP hardware that implement various subsets of
the full capability of '754.

The Hansen & Walster proposal is more than sufficient to handle all of
'754, but, being aimed at a hardware implementation of IA, it requires
a number of features that '754 lacks, and so would need software
emulation of those features, which would be a serious performance
cost.

As a last resort my implementation of IA uses an algebra directly
extended from '754, but I cannot claim that there is a definition of
that algebra that is both concise and mathematically rigorous.  It was
merely necessary and minimally adequate.

Thus it is my belief that an interval standard lacking a rigorous
definition of the underlying interval algebra would be worse than no
standard.
</rant>

>
> This would make it dangerous to rely on properties of the IEEE standard
> regarding NaN.

In many compilers it is beyond dangerous: hopeless is the term I would
use.  The situation is effectively binary: either your
hardware+software platform is nan-safe or it is not.
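
Because the property is binary, it can at least be probed.  The sketch
below (the function name is mine) is the kind of rough self-test I have
in mind.  It only exercises the compiler's handling of these particular
expressions, so a passing result is necessary evidence, not a proof,
and it says nothing about the run-time library:

    #include <limits>

    // Rough probe of nan-safety for the current compiler+flags.
    // All three comparisons must come out true under strict '754
    // semantics; a compiler that folds them away may return false.
    bool nan_safe_probe() {
        double nan = std::numeric_limits<double>::quiet_NaN();
        double one = 1.0;
        bool a = (nan != nan);     // must be true: NaN compares unequal to itself
        bool b = !(nan < one);     // must be true: NaN is unordered
        bool c = !(nan >= one);    // must be true: the "optimized" form is not equivalent
        return a && b && c;
    }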

>
> Can one enforce different levels of optimization on different parts of code,
> and hence force a compiler to respect these issues?

A reasonable question, but one that lacks a reasonable answer.  It
depends on the compiler.  Some compilers are nan-safe, and thus can be
optimized without risk.  Some that are or can be made to be nan-safe
impose unreasonably large performance costs.  Some generate code that
is nan-safe, but have run-time libraries that are not nan-safe.

Most compilers do not have an optimization level that turns off _all_
mathematical transformations.  In fact I know of none.  It may be that
no such compiler is possible, because the translation from source to
object code often requires some minimal level of transformative
freedom.

As for the ability to apply different optimization levels, all
sensible compilers allow separate compilation and linkage of the
resulting object modules.  But not all compilers support inline
optimization control over sections of a single translation unit.  And
some of those that do exhibit subtle differences between the effect of
optimization switches applied to the entire translation unit and the
effect of the same switches applied inline.  It is my
understanding that many of those subtleties are related to the
multi-pass nature of modern compilers.  Keeping track of each section
of code at the source code level becomes difficult once those sections
have been transformed into the internal representations for which
source code position is just a tag used in diagnostic messages.
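
To give one concrete, and deliberately compiler-specific, illustration
of the inline controls I mean: GCC accepts #pragma GCC push_options /
optimize / pop_options and a per-function optimize attribute.  Treat
the sketch below as an example of the mechanism, not as a guarantee;
the subtleties described above are exactly why such regions have to be
verified rather than trusted:

    // GCC-specific sketch: request strict FP options for one region
    // even if the rest of the translation unit is built with -ffast-math.
    // Whether the generated code, and the libraries it calls, is actually
    // nan-safe still has to be checked per compiler and per version.
    #pragma GCC push_options
    #pragma GCC optimize ("no-fast-math")

    bool is_nan_strict(double x) {
        return x != x;             // must not be folded to false in this region
    }

    #pragma GCC pop_options

    // Per-function form of the same request:
    __attribute__((optimize("no-fast-math")))
    bool is_nan_strict_fn(double x) {
        return x != x;
    }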

A similar problem affects other kinds of inline optimizations.  For
example C++ templates are often compiled after the completion of the
translation units that utilize the templated code.  When the template
code is finally compiled and object code is generated, it might use the
optimization settings that happened to be in effect when the template
was defined, when the template was used, when a particular translation
unit began, or even when that translation unit ended.
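
A one-file sketch of that ambiguity, again using GCC-style pragmas
purely for illustration:

    #pragma GCC optimize ("fast-math")      // in effect where the template is defined

    template <typename T>
    bool is_nan_tmpl(T x) { return x != x; }

    #pragma GCC optimize ("no-fast-math")   // in effect where it is instantiated

    bool check(double x) {
        // Which of the two settings governs the code generated for the
        // instantiation of is_nan_tmpl<double> is exactly the kind of
        // compiler-specific subtlety described above.
        return is_nan_tmpl(x);
    }

With explicit instantiation, or with toolchains that generate template
code at link time, the question becomes even murkier.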

For the above reasons I believe that nan-safety is a mandatory aspect
of verified computing and thus an implicit assumption of any possible
'1788 standard.  It is my preference that nan-safety be an explicit
assumption/requirement.

Lee Winter
Nashua, New Hampshire
United States of America