Re: (long) sNaNs not what they could be...
On Fri, Oct 15, 2010 at 6:01 PM, Dan Zuras Intervals
<intervals08@xxxxxxxxxxxxxx> wrote:
[...]
> I will outline it.
>
> But even that outline will take some time to
> explain.
>
> What we were after was a 'touch it & die NaN'.
> Something which even dereferencing would cause
> an invalid trap. Presumably to a debugger or
> to signal the use of an uninitialized variable.
>
> The method would be to fill memory with this
> fatal signalling NaN so that any read access
> would explode the mine. If your first use of
> memory was to write to it, you were safe. But
> if you read from it you would die the death of
> the uninitialized memory signalling NaN.
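>
> Roughly, in C terms, the scheme amounts to
> something like the sketch below (assuming
> the usual binary32 layout, the quiet-means-1
> convention, a compiler that honours <fenv.h>,
> & taking 0x7FA00000 as one possible
> signalling-NaN encoding):
>
>     #include <fenv.h>
>     #include <stdint.h>
>     #include <stdio.h>
>     #include <string.h>
>
>     int main(void)
>     {
>         /* One possible binary32 signalling-NaN encoding
>            (volatile only so the optimizer cannot fold
>            the whole example away at compile time).     */
>         volatile uint32_t pattern = 0x7FA00000;
>
>         /* "Arm the mine": fill the storage with the
>            pattern before anything is written to it.    */
>         uint32_t mem[4];
>         for (int i = 0; i < 4; i++)
>             mem[i] = pattern;
>
>         /* A read before any write...                   */
>         float x;
>         memcpy(&x, &mem[0], sizeof x);
>
>         feclearexcept(FE_ALL_EXCEPT);
>         float y = x + 1.0f;   /* ...and its first FP use */
>         printf("invalid raised: %d, y = %f\n",
>                fetestexcept(FE_INVALID) != 0, y);
>         return 0;
>     }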
>
> Sounds simple enough.
>
> Why was that hard to do?
>
> Well, on most systems the load instruction is
> not typed. It has a width but not a type.
> Thus, if I am reading, say, a 32-bit quantity
> out of memory I generally don't know whether
> it is an integer, 4 characters, a single
> precision floating-point number, or part of
> a larger structure (either a larger floating-
> point number or some non-floating-point
> structure).
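>
> (To make "untyped" concrete: the load below
> is just a bit move. Nothing about it says
> floating point, so there is nothing for the
> hardware to object to. The bytes assume a
> little-endian binary32 layout.)
>
>     #include <inttypes.h>
>     #include <stdio.h>
>     #include <string.h>
>
>     int main(void)
>     {
>         /* Four bytes that happen to encode the binary32
>            signalling NaN 0x7FA00000 (little-endian).    */
>         unsigned char mem[4] = { 0x00, 0x00, 0xA0, 0x7F };
>
>         /* A 32-bit load has a width but no type: these
>            bits could be an int, four chars, a float, or
>            half of a double.  The load just moves them,
>            so there is nothing for it to trap on.        */
>         uint32_t word;
>         memcpy(&word, mem, sizeof word);
>         printf("loaded 0x%08" PRIX32 "\n", word);
>         return 0;
>     }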
>
> So, die on load was not really feasible.
>
> No matter, it would be sufficient to die on
> first floating-point touch.
>
> Which would be fine if all floating-point
> touches went through the floating-point ALU.
> Alas, on many systems (Intel included) there
> are 3 floating-point operations that do not.
> They are copy, negate, & absolute value.
> They can avoid the ALU because they are just
> bit copies with a possible modification of
> the sign bit. And they are singled out in
> the standard for this reason.
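>
> (A small illustration of the hole, again
> assuming the usual binary32 layout. Whether
> negate & fabsf really stay silent depends on
> the compiler & target, which is exactly the
> point: nothing obliges them to go through
> the FP unit.)
>
>     #include <fenv.h>
>     #include <math.h>
>     #include <stdint.h>
>     #include <stdio.h>
>     #include <string.h>
>
>     int main(void)
>     {
>         volatile uint32_t bits = 0x7FA00000; /* binary32 sNaN */
>         uint32_t tmp = bits;        /* defeat constant folding */
>         float s;
>         memcpy(&s, &tmp, sizeof s);
>
>         feclearexcept(FE_ALL_EXCEPT);
>         float neg = -s;             /* usually a sign-bit flip  */
>         float mag = fabsf(s);       /* usually a sign-bit clear */
>         printf("after negate/abs: invalid = %d\n",
>                fetestexcept(FE_INVALID) != 0);
>
>         feclearexcept(FE_ALL_EXCEPT);
>         float sum = s + 0.0f;       /* real arithmetic          */
>         printf("after an add:     invalid = %d\n",
>                fetestexcept(FE_INVALID) != 0);
>
>         (void)neg; (void)mag; (void)sum;
>         return 0;
>     }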
>
> Of these the most important is copy. It is
> used on assignment.
>
> Or not. You see, modern optimizers are such
> that copies are generally eliminated except
> in rare cases. So, even if we were to 'arm'
> copies to trigger invalid we can't count on
> them actually being there.
>
> It is a bit more complex, but much the same
> can be said for negate. Unary negates are
> mostly eliminated by manipulation of prior
> or subsequent add-like operations (changing
> add to subtract or one kind of FMA to
> another).
>
> Absolute value is generally safe but also
> not often used.
>
> So these operations provide a hole through
> which an uninitialized value can slip
> unnoticed.
>
> No matter, we'll get them on the first
> arithmetic operation.
>
> But wait a minute, just what was the value
> of that uninitialized NaN?
>
> The one we were seeking was the all 1's NaN.
> The reason for this was that, for most
> computers, it is easy to fill memory with
> all zeros or all ones. Or even all copies
> of some particular byte. But filling memory
> with repeated copies of any pattern more
> complex than that involves copying from a
> register or from one place in memory to
> another. And that is a much slower operation.
>
> So, we wanted all 1's. That would make it
> easy & fast.
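>
> (That fast fill is literally a one-byte
> memset; what the resulting bits mean as a
> float assumes the usual binary32 layout.)
>
>     #include <stdint.h>
>     #include <stdio.h>
>     #include <string.h>
>
>     int main(void)
>     {
>         unsigned char mem[64];
>         memset(mem, 0xFF, sizeof mem); /* the cheap one-byte fill */
>
>         uint32_t word;
>         memcpy(&word, mem, sizeof word);
>         printf("each word reads back as 0x%08X\n",
>                (unsigned)word);
>         /* 0xFFFFFFFF: sign 1, exponent all ones, significand
>            nonzero -- a NaN.  Whether it is quiet or
>            signalling is the question taken up next.          */
>         return 0;
>     }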
>
> But, as it happened, we recommended (in the
> sense of 'should') that the all 1's NaN be
> a quiet NaN. The reason for this is that
> the most common thing one does with a
> signalling NaN is to quiet it. If we had
> to do that by turning an "I'm a signalling
> NaN" bit from 1 to 0, there was a danger
> of turning a NaN (with only that bit set)
> into an infinity.
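>
> (In binary32 terms, roughly: under that
> rejected convention a NaN whose only
> significand bit is the signalling flag would
> be 0x7FC00000, & clearing the flag to quiet
> it leaves 0x7F800000, which is +infinity.)
>
>     #include <stdint.h>
>     #include <stdio.h>
>
>     #define EXP32  UINT32_C(0x7F800000) /* exponent field      */
>     #define FLAG   UINT32_C(0x00400000) /* top significand bit */
>     #define SIG32  UINT32_C(0x007FFFFF) /* whole significand   */
>
>     int main(void)
>     {
>         /* Under the rejected "1 means signalling" rule, a
>            NaN whose only significand bit is the flag:      */
>         uint32_t snan = EXP32 | FLAG;        /* 0x7FC00000 */
>
>         /* Quieting it by clearing the flag empties the
>            significand, and an all-ones exponent over a
>            zero significand is an infinity, not a NaN.      */
>         uint32_t quieted = snan & ~FLAG;     /* 0x7F800000 */
>
>         printf("0x%08X -> 0x%08X : %s\n",
>                (unsigned)snan, (unsigned)quieted,
>                (quieted & SIG32) ? "still a NaN"
>                                  : "now +infinity");
>         return 0;
>     }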
>
> The technical term for this was "It would
> be bad".
>
> So the (strong) recommendation was that the
> bit that distinguished signalling NaNs from
> quiet ones take on the values 0 for signalling
> & 1 for quiet. That way there would always
> be a quiet NaN to 'land on' when one quiets
> some valid signalling NaN.
>
> So the all 1's NaN would not do. It had to
> be something else. It had to be something
> that had ones in some places (where the
> exponent was) & at least one zero elsewhere
> (the signalling bit).
>
> But, single precision floating-point numbers
> have 8-bit exponents. Doubles have 11. And
> quads have 15. In each case the signalling
> bit is (recommended to be) the bit just to
> the right of the right most exponent bit,
> with at least one more significand bit set
> so the pattern is not an infinity. Counting
> the sign bit (just for byte alignment) that
> means that 11, 14, or 18 bits matter. The
> rest don't.
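>
> (Concretely, under the quiet-means-1
> convention the per-format fill patterns
> would be along these lines, assuming the
> usual layouts:)
>
>     #include <inttypes.h>
>     #include <stdio.h>
>
>     int main(void)
>     {
>         /* Sign, exponent, quiet bit, & one payload bit,
>            assuming the usual layouts & quiet-means-1:   */
>         uint32_t snan32 = UINT32_C(0x7FA00000);
>         uint64_t snan64 = UINT64_C(0x7FF4000000000000);
>
>         printf("binary32 sNaN fill: 0x%08"  PRIX32 "\n", snan32);
>         printf("binary64 sNaN fill: 0x%016" PRIX64 "\n", snan64);
>         return 0;
>     }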
>
> But that means that we have to fill memory
> with a value that presumes we know which type
> will be incorrectly referenced there.
>
> How can we know that?
>
> Further, some systems align 16, 32, & 64 bit
> memory references on 16, 32, or 64 bit aligned
> memory locations. Some don't.
>
> So not only do we have to know which type will
> be incorrectly referenced, we have to know its
> memory alignment.
>
> If we get either one wrong, the bit pattern
> will just look like some otherwise innocent
> floating-point number.
>
> As the reference is presumed to be incorrect in
> the first place, how can we know how or why it
> is incorrect?
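>
> (For instance, filling memory with the
> binary32 pattern above & then reading a
> double out of it yields nothing alarming at
> all, just a huge but perfectly ordinary
> number. A rough sketch, assuming the usual
> layouts:)
>
>     #include <stdint.h>
>     #include <stdio.h>
>     #include <string.h>
>
>     int main(void)
>     {
>         /* Fill "memory" with the repeating binary32
>            signalling-NaN pattern...                     */
>         uint32_t fill = UINT32_C(0x7FA00000);
>         unsigned char mem[16];
>         for (size_t i = 0; i < sizeof mem; i += sizeof fill)
>             memcpy(mem + i, &fill, sizeof fill);
>
>         /* ...then make the wrong-width (or misaligned)
>            read.  The bits land under a double's exponent
>            field in a way that is not a NaN at all.       */
>         double d;
>         memcpy(&d, mem, sizeof d);
>         printf("read as a double: %g (is a NaN: %d)\n",
>                d, d != d);
>         return 0;
>     }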
>
>
> Let's see, I may have missed something but I
> think that's most of it.
>
> Some of them may not apply to your computer.
>
> But I guarantee some of them do.
>
> So...
>
> -- We can't count on systems triggering
> an invalid trap if they encounter a
> signalling NaN because loads are not
> required to be typed.
>
> -- We can't count on knowing when we are
> touching a NaN because copies (& negates)
> are not required to go through the ALU.
>
> -- Even if we only count on trapping on
> arithmetic operations, some (like negate)
> are optimized out.
>
> -- We can't fill memory with all 1's
> because that is a quiet NaN.
>
> -- The kernel people won't fill it with
> any more complex pattern because it is
> noticeably slower to do so.
>
> -- Even if they would, we could only
> catch invalid references to a NaN of a
> known type & alignment. All others
> would slip through as some other number.
>
> When all was said & done, the remaining diagnostic
> value of what could be done if you met all these
> limitations was considered to be of far less value
> than the limitations themselves.
>
> So we had to give up on it.
>
> Still, some enterprising compiler writer or debugger
> writer out there COULD do something along these lines.
> It wouldn't buy them much but it would be interesting.

What you have described is why sNaNs are not particularly useful as a
memory initializer. But AFAICT none of the above reasons indicates
that sNaNs are inappropriate or ineffective for initializing declared
FP variables, whose type and alignment are known. I intend to continue
offering them as a non-default option. Did I miss something?
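
As a rough sketch of what such an option amounts to for a double (the
bit pattern assumes the usual binary64 layout and the quiet-means-1
convention; a compiler builtin such as GCC's __builtin_nans("") would
serve equally well):

    #include <fenv.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hand-rolled stand-in for what such a compiler option
       might emit for an "uninitialized" double.            */
    static double snan64(void)
    {
        uint64_t bits = UINT64_C(0x7FF4000000000000);
        double d;
        memcpy(&d, &bits, sizeof d);
        return d;
    }

    int main(void)
    {
        double x = snan64();   /* the "uninitialized" variable */

        feclearexcept(FE_ALL_EXCEPT);
        double y = x * 2.0;    /* first arithmetic use of x    */
        printf("invalid raised: %d, y = %f\n",
               fetestexcept(FE_INVALID) != 0, y);
        return 0;
    }
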
Lee Winter
Nashua, New Hampshire
United States of America