Re: (long) sNaNs not what they could be...



On Fri, Oct 15, 2010 at 6:01 PM, Dan Zuras Intervals
<intervals08@xxxxxxxxxxxxxx> wrote:

[...]

>        I will outline it.
>
>        But even that outline will take some time to
>        explain.
>
>        What we were after was a 'touch it & die NaN'.
>        Something which even dereferencing would cause
>        an invalid trap.  Presumably to a debugger or
>        to signal the use of an uninitialized variable.
>
>        The method would be to fill memory with this
>        fatal signalling NaN so that any read access
>        would explode the mine.  If your first use of
>        memory was to write to it, you were safe.  But
>        if you read from it you would die the death of
>        the uninitialized memory signalling NaN.
>
>        Sounds simple enough.
>
>        Why was that hard to do?
>
>        Well, on most systems the load instruction is
>        not typed.  It has a width but not a type.
>        Thus, if I am reading, say, a 32-bit quantity
>        out of memory I generally don't know whether
>        it is an integer, 4 characters, a single
>        precision floating-point number, or part of
>        a larger structure (either a larger floating-
>        point number or some non-floating-point
>        structure).
>
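
To make the untyped-load point concrete, here is a minimal sketch in Python. The byte values, the little-endian layout, and the helper name `views` are all assumptions for illustration; nothing in the bytes themselves says which reading is the right one.

```python
import math
import struct

def views(raw):
    """Interpret the same four bytes several ways, as an untyped load allows."""
    return {
        "uint32": struct.unpack("<I", raw)[0],    # one 32-bit integer
        "bytes": list(raw),                       # four 8-bit characters
        "float32": struct.unpack("<f", raw)[0],   # one single-precision value
        "half_of_64": raw + bytes(4),             # low or high half of something wider
    }

# Arbitrary example bytes: read as binary32 they happen to encode a NaN.
raw = bytes([0x00, 0x00, 0xA0, 0x7F])
seen = views(raw)

print(hex(seen["uint32"]))
print(seen["bytes"])
print(math.isnan(seen["float32"]))
print(len(seen["half_of_64"]))
```

The load only knows the width (4 bytes here); whether the result should be checked as a floating-point NaN is information the hardware does not have at that point.
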
>        So, die on load was not really feasible.
>
>        No matter, it would be sufficient to die on
>        first floating-point touch.
>
>        Which would be fine if all floating-point
>        touches went through the floating-point ALU.
>        Alas, on many systems (Intel included) there
>        are 3 floating-point operations that do not.
>        They are copy, negate, & absolute value.
>        They can avoid the ALU because they are just
>        bit copies with a possible modification of
>        the sign bit.  And they are singled out in
>        the standard for this reason.
>
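
A small sketch of why these three operations can bypass the ALU: on a raw 32-bit pattern they reduce to a bit copy plus, at most, a change to the sign bit. The function names are hypothetical, and `is_signalling` assumes the recommended convention in which the top fraction bit is 0 for a signalling NaN.

```python
SIGN_BIT = 0x80000000    # sign bit of a binary32 pattern
EXP_MASK = 0x7F800000    # exponent field
FRAC_MASK = 0x007FFFFF   # fraction (trailing significand) field
QUIET_BIT = 0x00400000   # top fraction bit: assumed 1 = quiet, 0 = signalling

def copy_bits(bits):
    return bits                          # pure bit copy

def negate_bits(bits):
    return bits ^ SIGN_BIT               # flip the sign bit only

def abs_bits(bits):
    return bits & ~SIGN_BIT & 0xFFFFFFFF # clear the sign bit only

def is_signalling(bits):
    is_nan = (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0
    return is_nan and (bits & QUIET_BIT) == 0

snan = 0x7FA00000  # an example signalling NaN pattern
for op in (copy_bits, negate_bits, abs_bits):
    result = op(snan)
    print(op.__name__, hex(result), is_signalling(result))
```

None of the three ever looks at the exponent or fraction, so the signalling NaN passes through still signalling and nothing has had a chance to notice it.
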
>        Of these the most important is copy.  It is
>        used on assignment.
>
>        Or not.  You see, modern optimizers are such
>        that copies are generally eliminated except
>        in rare cases.  So, even if we were to 'arm'
>        copies to trigger invalid we can't count on
>        them actually being there.
>
>        It is a bit more complex, but much the same
>        can be said for negate.  Unary negates are
>        mostly eliminated by manipulation of prior
>        or subsequent add-like operations (changing
>        add to subtract or one kind of FMA to
>        another).
>
>        Absolute value is generally safe but also
>        not often used.
>
>        So these operations provide a hole through
>        which an uninitialized value can slip
>        unnoticed.
>
>        No matter, we'll get them on the first
>        arithmetic operation.
>
>        But wait a minute, just what was the value
>        of that uninitialized NaN?
>
>        The one we were seeking was the all 1's NaN.
>        The reason for this was that, for most
>        computers, it is easy to fill memory with
>        all zeros or all ones.  Or even all copies
>        of some particular byte.  But filling memory
>        with copies of anything more complex than
>        that involves copying from a register or
>        from one place in memory to another.  And
>        that is a much slower operation.
>
>        So, we wanted all 1's.  That would make it
>        easy & fast.
>
>        But, as it happened, we recommended (in the
>        sense of 'should') that the all 1's NaN be
>        a quiet NaN.  The reason for this is that
>        the most common thing one does with a
>        signalling NaN is to quiet it.  If we had
>        to do that by turning an "I'm a signalling
>        NaN" bit from a 1 to a 0 there was a
>        danger of turning a NaN (with only that
>        bit set) into an infinity.
>
>        The technical term for this was "It would
>        be bad".
>
>        So the (strong) recommendation was that the
>        bit that distinguished signalling NaNs from
>        quiet ones take on the values 0 for signalling
>        & 1 for quiet.  That way there would always
>        be a quiet NaN to 'land on' when one quiets
>        some valid signalling NaN.
>
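
The hazard described above can be sketched on binary32 bit patterns. This is only an illustration: the helper names are made up, and the two example patterns are chosen by me to show each convention at its worst case.

```python
EXP_MASK = 0x7F800000      # binary32 exponent field (all ones for NaN and infinity)
FRAC_MASK = 0x007FFFFF     # binary32 fraction field
TOP_FRAC_BIT = 0x00400000  # the bit that distinguishes signalling from quiet

def kind(bits):
    """Classify a binary32 pattern as 'finite', 'infinity' or 'nan'."""
    if (bits & EXP_MASK) != EXP_MASK:
        return "finite"
    return "nan" if (bits & FRAC_MASK) != 0 else "infinity"

def quiet_by_setting(bits):
    # Recommended convention: 0 = signalling, 1 = quiet, so quieting sets the bit.
    return bits | TOP_FRAC_BIT

def quiet_by_clearing(bits):
    # Rejected convention: 1 = signalling, 0 = quiet, so quieting clears the bit.
    return bits & ~TOP_FRAC_BIT & 0xFFFFFFFF

# Under the recommended convention any signalling NaN stays a NaN.
print(kind(quiet_by_setting(0x7F800001)))

# Under the rejected convention a NaN whose only fraction bit is the
# signalling bit collapses to infinity when that bit is cleared.
print(kind(quiet_by_clearing(0x7FC00000)))
```

Setting a bit can never empty the fraction field, which is why there is always a quiet NaN to land on; clearing one can.
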
>        So the all 1's NaN would not do.  It had to
>        be something else.  It had to be something
>        that had ones in some places (where the
>        exponent was) & at least one zero elsewhere
>        (the signalling bit).
>
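
As a rough check on why a cheap fill cannot produce the pattern wanted here, the sketch below tries every possible single-byte fill and classifies the resulting word. It assumes the IEEE 754-2008 binary32/binary64/binary128 layouts (8, 11 and 15 exponent bits), the recommended quiet-bit convention, and helper names of my own choosing.

```python
def classify(bits, exp_bits, total_bits):
    """Classify a bit pattern as 'number', 'infinity', 'qnan' or 'snan'."""
    frac_bits = total_bits - 1 - exp_bits
    exp_all_ones = (1 << exp_bits) - 1
    exponent = (bits >> frac_bits) & exp_all_ones
    fraction = bits & ((1 << frac_bits) - 1)
    if exponent != exp_all_ones:
        return "number"
    if fraction == 0:
        return "infinity"
    quiet_bit = 1 << (frac_bits - 1)   # assumed: 1 = quiet, 0 = signalling
    return "qnan" if fraction & quiet_bit else "snan"

def byte_fill(byte, total_bits):
    """The word you get when memory is filled with copies of one byte."""
    return int.from_bytes(bytes([byte]) * (total_bits // 8), "little")

FORMATS = (("binary32", 8, 32), ("binary64", 11, 64), ("binary128", 15, 128))

snan_fills = {}
for name, exp_bits, total_bits in FORMATS:
    snan_fills[name] = [
        b for b in range(256)
        if classify(byte_fill(b, total_bits), exp_bits, total_bits) == "snan"
    ]
    all_ones = classify(byte_fill(0xFF, total_bits), exp_bits, total_bits)
    print(name, snan_fills[name], all_ones)
```

Only the 0xFF fill gets the exponent field to all ones, and that same fill also sets the quiet bit, so under these assumptions no single-byte fill yields a signalling NaN in any of the three formats.
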
>        But, single precision floating-point numbers
>        have 8-bit exponents.  Doubles have 11.  And
>        quads have 15.  In each case the position of
>        the signalling bit is (recommended to be) the
>        first bit to the right of the rightmost
>        exponent bit.  Counting the sign bit (just for
>        byte alignment) that means that 10, 13, or 17
>        bits matter.  The rest don't.
>
>        But that means that we have to fill memory
>        with a value that presumes we know which type
>        will be incorrectly referenced there.
>
>        How can we know that?
>
>        Further, some systems align 16, 32, & 64 bit
>        memory references on 16, 32, or 64 bit aligned
>        memory locations.  Some don't.
>
>        So not only do we have to know which type will
>        be incorrectly referenced, we have to know its
>        memory alignment.
>
>        If we get either one wrong, the bit pattern
>        will just look like some otherwise innocent
>        floating-point number.
>
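
Here is a hedged sketch of what goes wrong when the type or alignment guess is off. It fills a small buffer with one particular double-precision signalling NaN pattern (my choice, not one named above), assumes little-endian storage and the recommended quiet-bit convention, and then reads the buffer back in ways the filler did not anticipate.

```python
import struct

def classify(bits, exp_bits, total_bits):
    """Classify a bit pattern as 'number', 'infinity', 'qnan' or 'snan'."""
    frac_bits = total_bits - 1 - exp_bits
    exp_all_ones = (1 << exp_bits) - 1
    exponent = (bits >> frac_bits) & exp_all_ones
    fraction = bits & ((1 << frac_bits) - 1)
    if exponent != exp_all_ones:
        return "number"
    if fraction == 0:
        return "infinity"
    quiet_bit = 1 << (frac_bits - 1)   # assumed: 1 = quiet, 0 = signalling
    return "qnan" if fraction & quiet_bit else "snan"

DOUBLE_SNAN = 0x7FF4000000000000               # example binary64 signalling NaN
memory = struct.pack("<Q", DOUBLE_SNAN) * 4    # 32 bytes of "armed" memory

def read(offset, exp_bits, total_bits):
    """Read total_bits at a byte offset and classify what the reader would see."""
    size = total_bits // 8
    bits = int.from_bytes(memory[offset:offset + size], "little")
    return classify(bits, exp_bits, total_bits)

print("aligned double:   ", read(0, 11, 64))
print("misaligned double:", read(4, 11, 64))
print("single, low half: ", read(0, 8, 32))
print("single, high half:", read(4, 8, 32))
```

Only the read that matches the filler's guess about type and alignment sees a signalling NaN. The others see an ordinary number or a quiet NaN, neither of which traps.
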
>        As the reference is presumed to be incorrect in
>        the first place, how can we know how or why it
>        is incorrect?
>
>
>        Let's see,  I may have missed something but I
>        think that's most of it.
>
>        Some of them may not apply to your computer.
>
>        But I guarantee some of them do.
>
>        So...
>
>                -- We can't count on systems triggering
>                an invalid trap if they encounter a
>                signalling NaN because loads are not
>                required to be typed.
>
>                -- We can't count on knowing when we are
>                touching a NaN because copies (& negates)
>                are not required to go through the ALU.
>
>                -- Even if we only count on trapping on
>                arithmetic operations, some (like negate)
>                are optimized out.
>
>                -- We can't fill memory with all 1's
>                because that is a quiet NaN.
>
>                -- The kernel people won't fill it with
>                any more complex pattern because it is
>                noticeably slower to do so.
>
>                -- Even if they would, we could only
>                catch invalid references to a NaN of a
>                known type & alignment.  All others
>                would slip through as some other number.
>
>        When all was said & done, the remaining diagnostic
>        value of what could be done if you met all these
>        limitations was considered to be of far less value
>        than the limitations themselves.
>
>        So we had to give up on it.
>
>        Still, some enterprising compiler writer or debugger
>        writer out there COULD do something along these lines.
>        It wouldn't buy them much but it would be interesting.

What you have described is why SNaNs are not particularly useful as a
memory initializer.  But AFAICT none of the above reasons indicate
that SNaNs are inappropriate or ineffective for constructing FP
variables.  I intend to continue offering them as a non-default
option.  Did I miss something?

Lee Winter
Nashua, New Hampshire
United States of America