Re: Empty interval representations & Motion 13...
> Subject: Re: Empty interval representations & Motion 13...
> From: John Pryce <j.d.pryce@xxxxxxxxxxxx>
> Date: Mon, 3 May 2010 16:57:56 +0100
> To: P1788 <stds-1788@xxxxxxxxxxxxxxxxx>
>
> <I noticed that I had used "(in)valid" in two senses: (a) of
> a level 2 datum, as meaning "decorated interval whose 'valid'
> property has a certain value", (b) of a level 3 object, meaning
> "representing a level 2 datum, or not doing so". I have changed
> it to use "well-formed" and "ill-formed" for the level 3 meaning.
> Please discard the previous version.>
>
> P1788, Dan, Jürgen
>
> On 22 Apr 2010, at 17:17, Jürgen Wolff von Gudenberg wrote:
> > Dan
> > your representation may simplify comparisons but on the other hand
> > it complicates arithmetic.
> > Juergen
Juergen, this may be so. Or it may merely require
you to test for things you currently let slide due
to the 'stickiness' of NaNs. I'm not sure.
In any event, it is just a suggestion. It was never
so well thought out as to qualify as a panacea.
But I will explain why I made the suggestion below.
I hope all will forgive me if I wax verbose on the
subject.
> >
> > Dan Zuras Intervals schrieb:
> >> I have been thinking about something for some time now
> >> & Ulrich's recent revision of Motion 13 has brought it
> >> up again.
> >> It is the issue of the representation of the empty
> >> interval.
> >> Many of you out there seem to presume that the proper
> >> representation for the empty interval is [NaN,NaN].
> >> And this may turn out to be the best thing for us.
> >> But let me point out that we have scrupulously avoided
> >> NaNs in any form at all so far.
> >> And I believe this is a good thing.
> >> NaNs cause trouble wherever they appear. They even
> >> cause trouble when they MIGHT appear. At the very least,
> >> the diligent function writer must take into account the
> >> possibility that it might happen somewhere in his code.
>
> This makes me realise there is a defect in the statement
> in 6.1 of the text:
> "An implementation may choose any means to represent a level 2
> interval datum x, provided that it shall be possible to retrieve
> the bounds of x exactly."
>
> It should say, I think,
> "An implementation may choose any means to represent a level 2
> interval datum xx, provided that it shall be possible to retrieve
> the bounds of *each nonempty* xx exactly."
>
> The bounds inf(), sup() of any other datum are neither here nor there:
> - Mathematical convention, at least in analysis, usually takes those
> of xx=Empty to be inf(xx)=+oo, sup(xx)=-oo.
> - The bounds of any invalid interval datum (they are all indistinguishable,
> so I increasingly think they/it is just NaI under another name) must
> be returned as some floating point datum; the only sensible value is NaN.
>
> Dan, what is worrying you?
Oh, many things worry me, John. That Juergen
counts on any NaN that appears in an interval
to be part of a 'valid empty' & not due to
some undiagnosed problem. And that you count
on the following to be true. There is more
but let's start with those two.
> NaN doesn't exist at level 2. Here's a diagram, for the case
> of binary operations.
>
> level 2 datums    xx1, xx2      xx1 op xx2 = yy +     ???
>                      |                       ^  +      ^
>                      v                       |  +      |
> level 3 objects   XX1, XX2      XX1 OP XX2 = YY +      ZZ
> -------------Well-formed objects----------------+--Ill-formed objects--
>
> where the downward map is
> XX = repOf(xx) = object that represents xx (one to many,
> and not onto) and the upward map is
> yy = repdBy(YY) = datum represented by YY (many to one,
> i.e. a function, and onto).
> There is also "OP = implementation of op".
>
> Some objects don't represent any datum -- the ill-formed ones
> (ZZ in diagram).
> Now, I trust implementers to meet the following contract.
> - Ensure that every constructed object is well-formed,
> i.e. is a representation of some datum.
> - Make each OP a correct implementation of its op,
> i.e. ensure that the above diagram commutes.
> (Questions of accuracy mode come in, but do not
> affect well-formed-ness.)
>
> Since yy is a datum, for any input datums xx1 and xx2, it then
> follows that YY must be a representation of yy, hence YY is
> well-formed.
>
> If any ZZ occurs, it can *only* come from "beyond the model"
> events, such as storage corruption or use of illicit machine-code
> routines, etc.
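
The diagram and contract above can be sketched in a few lines of Python.
Everything in the sketch is an illustrative assumption rather than text
from the draft: the names rep_of, repd_by, op_add and OP_add, the choice
of [+oo,-oo] as the level 3 encoding of Empty, and the omission of
outward rounding (the sample values are exactly representable, so
rounding plays no part here).

```python
import math

EMPTY = "Empty"  # hypothetical level 2 datum standing for the empty set

def rep_of(xx):
    """Level 2 -> level 3: choose one object that represents datum xx."""
    if xx == EMPTY:
        return (math.inf, -math.inf)  # assumed encoding of Empty
    return (float(xx[0]), float(xx[1]))

def is_well_formed(XX):
    """A level 3 object is well-formed if it represents some datum."""
    lo, hi = XX
    if math.isnan(lo) or math.isnan(hi):
        return False
    nonempty = lo <= hi and lo != math.inf and hi != -math.inf
    return nonempty or (lo == math.inf and hi == -math.inf)

def repd_by(XX):
    """Level 3 -> level 2: the datum represented by a well-formed XX."""
    if not is_well_formed(XX):
        raise ValueError(f"ill-formed object {XX!r} represents no datum")
    lo, hi = XX
    return EMPTY if lo > hi else (lo, hi)

def op_add(xx1, xx2):
    """Level 2 addition (the 'op' of the diagram)."""
    if EMPTY in (xx1, xx2):
        return EMPTY
    return (xx1[0] + xx2[0], xx1[1] + xx2[1])

def OP_add(XX1, XX2):
    """Level 3 addition on (lo, hi) pairs (the 'OP' of the diagram)."""
    if XX1[0] > XX1[1] or XX2[0] > XX2[1]:  # either operand encodes Empty
        return (math.inf, -math.inf)
    return (XX1[0] + XX2[0], XX1[1] + XX2[1])

def diagram_commutes(xx1, xx2):
    """Check that YY is well-formed and represents xx1 op xx2."""
    YY = OP_add(rep_of(xx1), rep_of(xx2))
    return is_well_formed(YY) and repd_by(YY) == op_add(xx1, xx2)

print(diagram_commutes((1.0, 2.0), (0.5, 4.0)))  # True
print(diagram_commutes(EMPTY, (0.0, 1.0)))       # True
```

A check of this kind can only sample inputs; it exercises the contract
but does not prove it.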
>
> Whether NaN may be a possible value of a field in a level 3
> object neither increases nor decreases the risk of what Dan
> calls "trouble". I don't think we should impose on implementers
> any particular representation of things like the empty set.
>
> John
John,
I agree with all of this except for one statement.
"I trust implementers to meet the following contract."
I am not so trusting.
I trust implementers to TRY to meet that contract.
I even trust implementers to BELIEVE they have met that
contract.
I will even go so far as to say I trust that there are
implementers out there who will do their very best to
TEST that they have met that contract.
But none of that assures me that the contract has been
met.
And as we are in the business of assuring OUR customers
that the contract has been met, we have to do something
to assure ourselves that we have done everything in our
power to make it so. And know when it is NOT so.
It is kind of like a former president of ours once said:
"Trust, but verify." It is a very silly thing for a
diplomat to say. And this particular president said many
very silly things in his time. But this has merit for us
all the same.
In this case, the fault, dear friends, lies not in our
empties but in our NaNs.
You see the thing that worries me is that NaNs can creep
into our calculations from the unlikeliest of places &
for the unlikeliest of reasons. I won't enumerate them
here but I trust all who are experienced floating-point
programmers have their stories to tell. I have mine.
We can demand that our customers create only valid intervals
using only validated means but we cannot assure ourselves
that this has been done. Even without any sense of
carelessness or malice, mistakes happen. Mistakes that
should not be ignored.
We can demand that our implementers follow your contract
that assures that from valid intervals sprout only valid
intervals. But mistakes happen & it might not be so. And
when it is not so we should know about it, not ignore it.
We can represent empties as [NaN,NaN] as I believe Juergen
& others would have us do, but it does not mean that the
appearance of [NaN,NaN] means empty. And if we allow such
things to be propagated without note, they can disappear
from a calculation which permits empty as a valid
intermediate result. All without anyone knowing that it
ever happened. All without knowing that the rocket is
about to blow up.
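
A minimal sketch of how that can happen, assuming a hypothetical
implementation that encodes empty as [NaN,NaN] and a hull operation
that treats empty as its identity element. The names is_empty_nan_style
and hull_nan_style are made up for illustration.

```python
import math

def is_empty_nan_style(X):
    """Under the assumed [NaN,NaN] convention, two NaN bounds mean empty."""
    return math.isnan(X[0]) and math.isnan(X[1])

def hull_nan_style(X, Y):
    """Interval hull of X and Y; empty is the identity element."""
    if is_empty_nan_style(X):
        return Y
    if is_empty_nan_style(Y):
        return X
    return (min(X[0], Y[0]), max(X[1], Y[1]))

# A genuine fault: both bounds computed as oo - oo, which is NaN
# under IEEE 754 arithmetic.
bad = (math.inf - math.inf, math.inf - math.inf)

# The faulty value is read as empty and leaves no trace in the result.
result = hull_nan_style(bad, (1.0, 2.0))
print(result)  # (1.0, 2.0)
```

Nothing in the final result distinguishes this run from one in which
the first operand really was empty.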
If we are to assure our customers of the validity of their
results we must at least be able to detect when they are
NOT valid.
It is for this reason I suggested [+oo,-oo] as a
representation for the empty interval. Not because it
worked out for the comparisons. That was a (largely)
lucky accident. But because it has no NaNs in it.
And if we can define our standard such that every valid
interval has no NaNs in it, then the appearance of a NaN
can be detected as an error that won't be ignored.
THAT was why I made the suggestion.
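
As a rough sketch of the idea, assuming pairs of Python floats as the
representation and the hypothetical names EMPTY, check and hull: empty
is encoded as [+oo,-oo], so no valid interval contains a NaN and any
NaN bound can be rejected on sight.

```python
import math

EMPTY = (math.inf, -math.inf)  # assumed NaN-free encoding of empty

def check(X):
    """Return X unchanged if it is a valid interval, otherwise raise."""
    lo, hi = X
    if math.isnan(lo) or math.isnan(hi):
        raise ArithmeticError(f"NaN bound in {X!r}: not a valid interval")
    if X != EMPTY and (lo > hi or lo == math.inf or hi == -math.inf):
        raise ArithmeticError(f"ill-formed bounds in {X!r}")
    return X

def hull(X, Y):
    """Interval hull; with this encoding empty needs no special case."""
    X, Y = check(X), check(Y)
    return check((min(X[0], Y[0]), max(X[1], Y[1])))

print(hull(EMPTY, (1.0, 2.0)))  # (1.0, 2.0)

try:
    hull((math.nan, math.nan), (1.0, 2.0))
except ArithmeticError as err:
    print("detected:", err)
```

Here a genuinely empty operand still drops out of the hull through
plain min and max, while a NaN operand stops the calculation instead of
passing for empty.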
It is ignoring NaNs when they appear that scares me.
Does it not scare you?
Dan