Re: (long) A new proposal for midpoint et al...
> Date: Tue, 21 Feb 2012 14:17:02 +0100
> From: Vincent Lefevre <vincent@xxxxxxxxxx>
> To: stds-1788@xxxxxxxxxxxxxxxxx
> Subject: Re: (long) A new proposal for midpoint et al...
>
> On 2012-02-18 16:37:37 -0800, Dan Zuras Intervals wrote:
> > Level 1: mid(X) = for non-empty X only: if (X==Entire) then 0
> > else (inf(X) + sup(X))/2
>
> For X == Entire, 0 is a good choice at Level 2, in particular because
> 0 is the only center of symmetry for most number formats and interval
> types. I'm not sure whether the midpoint of Entire should be defined
> at Level 1 because some properties are no longer true for Entire[*],
> but since there isn't a Level 1 implementation, this probably doesn't
> matter.
>
> [*] For instance, for any interval X different from Empty and Entire,
> and any real number k, midpoint(X + k) = midpoint(X) + k. (This is
> true for the semi-unbounded intervals, with the usual rules on the
> extended real numbers.)
As you mention below, I didn't say that 0 is the middle of
Entire. I said 0 is as good a choice as any.
For any Real number there are as many to the left of it as
to the right. So, in a very real sense (forgive the pun)
the Reals, although ordered, have no middle.
The choice of 0 is arbitrary, no better nor worse than any
other Real number. But far better than NaN.
I find it interesting that you exclude both Empty & Entire
from your translation rule above. If we can settle on an
arbitrary non-NaN midpoint for Empty, something like that
may be needed.
>
> > Property: X \subset Y ==> if (inf(X)==inf(Y)) then mid(X) <= mid(Y) &
> > if (sup(X)==sup(Y)) then mid(X) >= mid(Y)
> >
> > Coercion: mid_F(X) = min(max(round2Nearest_F(mid(X)),-Fmax_F),Fmax_F)
> > Property still holds so long as weak monotonicity
> > holds for your arithmetic. That is true for 754
> > & all other arithmetics extant (that I am aware of).
> > But it was false for some sloppy arithmetics in the
> > past. One must take care that the arithmetic is
> > carried out in a fashion that does not overflow.
> > This includes making sure that (-inf + +inf)/2 does
> > not happen as an intermediate result. But both can
> > be accomplished in more than one way. Property is
> > trivially true for semi-infinite intervals in that
> > all have the same midpoint which is to the correct
> > "side" of all finite subsets. Further, the midpoint
> > is an interior point so long as X HAS points in its
> > interior that are representable in F. Otherwise,
> > the midpoint might be an endpoint or even exterior
> > to X altogether. Note that this is a problem with
> > F not a problem with the definition of midpoint.
> > Applying the Principle of Least Astonishment, it is
> > my firm belief that a finite element of F should
> > always be returned rather than a NaN on the grounds
> > that "interiorness" is a derived rather than
> > fundamental property of midpoint & is violated only
> > when the user insists on it. That is, either by
> > trying to split an unsplitable interval or by
> > insisting on representing the midpoint in a
> > floating-point type inadequate to the task.
> > (Aside: this is mostly equivalent to Michel's
> > midpoint.) The midpoint of Entire is arbitrary.
> > Any finite value would do. But 0 is as good as
> > any other & splits Entire right down the middle.
>
> splits Entire right down the middle... *at Level 2*.
As far as any narrowing or B&B procedure goes, yes.
>
> > Someday we may need to make a similar argument
> > concerning midpoint(Empty). Can we choose 0?
>
> 0 doesn't make sense at Level 1. And Level 2, NaN should be available.
> So, why not choosing NaN?
My argument is going to be a philosophical one more than
mathematical or numerical. (It will also be long. :-)
In 754, we created the concept of NaN to give an answer
when no floating-point number stood for a better answer.
(There were similar concepts before but not widely known.)
Such a thing was needed because computers had gotten to
the point where PhDs no longer sat in front of them to
know what to do when a calculation halted on an error.
If proper error analysis was to be done the programmer
had to account for it in the program itself. Tests had
to be made. Branches taken. If possible, repairs to
intermediate calculations had to be done. And on & on.
It was an innovative idea that, alas, was never properly
implemented. Partially due to there not being any
portable/standardized way of dealing with the various
errors. And partially due to (varying degrees of) lack
of understanding of what was needed in each case.
It made sense at the time & still does, really. But it
has caused no end of trouble. In the years since, both
users & programmers have (on the average) become less
familiar with both mathematical analysis & numerical
analysis. We had hoped that the existence of a standard
floating-point would permit MORE familiarity with such
things. We were wrong.
And things show every indication of getting worse in
the future for standards like 1788. Not only will the
programmers know less math & less floating-point, they
will likely know almost nothing about interval analysis.
Hell, we'll be lucky if future interval programmers
know anything about PROGRAMMING.
I have said from the start that part of our job in
writing this standard will be pedagogical. It is not
enough to get the standard right. We will also need to
teach people how it works. Or, more to the point, how
they should use it properly & how they should properly
interpret the results.
All that having been said, we have a chance to make a
standard which is error free. It is the existence of
the empty set that is partly responsible for this.
Since our universe of discourse consists of contiguous
sets of extended Real numbers, the empty set permits
us to do something that we could not do in floating-
point. Namely, not return an answer when there is no
sensible answer. Empty says to the user, "There are
no Real numbers that come out of this calculation."
Like NaN in floating-point, our arithmetic is (for
the most part) Empty preserving. So if one comes up,
chances are the user will find out about it.
Now, this does not eliminate error analysis. After
all, the user is probably going to want to know WHY
there is no answer to the problem. But tracking
that back to its source is no harder nor easier than
before.
In this context, there is no need for the interval
equivalent of NaN: the NaI. Not at level 1. And,
if we do our jobs right, not at level 2 either.
But the table 4 functions are something of a problem.
As they return floating-point results they throw us
back into the problem of what to return when there
is no sensible answer to return.
Now, we COULD punt & say to the programmer, "By
executing a table 4 function you left the interval
world of assured computing & re-entered the world
of error-riddled floating-point. You have voided
the warranty. You cannot go back to the interval
world & count on anything to be correct. Anything
bad that happens to you from now on is your own
damn fault."
As prejudicial as I make this sound, there is some
merit to taking that position. This is what having
mid(Empty) = NaN would mean to the user.
But, we could also return a harmless but arbitrary
answer. It might be that ANY numeric answer would
do as well as any other. So, for example, splitting
Entire "down the middle" & return 0 as its midpoint
does no harm. And, since the intended purpose of
finding a midpoint is (generally) to split an
interval, splitting Entire into [-inf,0] & [0,+inf]
is as good as splitting it anywhere else. And it
is better than NaN.
For 2 reasons.
First, as you pointed out, although there is no
natural "center" to the Real number line, even the
untrained programmer would find 0 to be both
natural & reasonable. Perhaps even expected.
And, second, if we enforce the notion that asking
for the center of the Real number line is an
UNreasonable thing to do by returning a NaN, the
programmer would rightly be astonished at that.
Indeed, the programmer might be astonished at
anything OTHER than 0 as that answer.
So, picking 0 as an answer entirely arbitrarily
is a reasonable & natural thing to do. In SPITE
of not making any mathematical sense.
Now, Vincent, I realize that you are not proposing
that we do anything else than return mid(Entire)=0.
But I make this (admittedly weak) argument in the
hope that you might consider the possibility that
ANY arbitrary answer for mid(Empty) might be better
for the user than NaN.
And better for us too. For if we are careful & see
to it that there is no way of introducing NaNs into
the structure of intervals, we free ourselves as well
as the user from having to deal with all the problems
they bring with them.
So, my argument for a numeric mid(Empty) is
mathematically even weaker than the one for Entire.
But it is no more nor less arbitrary. And I believe
that any arbitrary numeric answer is better, for us
all, than NaN.
If you like, we could have mid(Entire)=+0 &
mid(Empty)=-0. Or 37. Or 42. I don't really care.
But I believe remaining an error free standard is
more than worth the occasional arbitrary answer.
Especially if we carefully document it as such.
That's my argument.
I'll refrain from further comment on the subject
further down in this note.
>
> > ----------
> >
> > Level 1: mag(X) = { least m>=0 such that m >= |x| for all x in X }
> >
> > Property: X \subset Y ==> mag(X) <= mag(Y)
> >
> > Coercion: mag_F(X) = roundUp_F(mag(X))
> > Property still holds although intervals of sufficient
> > finite magnitude may have +inf reported if F has
> > insufficient dynamic range to express that magnitude.
> > Other than Empty, [0,0] is the only interval with a
> > 0 magnitude. All others are strictly positive. John
> > has argued that the m>=0 should be removed which would
> > render mag(Empty) = -inf. A case can be made for this
> > but I, personally, prefer mag(Empty) = 0 on Least
> > Astonishment grounds. Still, I would have no great
> > objection to the change.
>
> 1. For sup_Rbar { |x| | x in xx }, that would be mag(Empty) = -oo.
> 2. For sup_Rbar+ { |x| | x in xx }, that would be mag(Empty) = 0,
> where Rbar+ = { x in Rbar | x >= 0 }.
>
> Since Rbar is the general reference set, I would say that (1) is
> better, but I don't have a strong opinion on the subject.
Well, since the user would be expecting a function called
mag to be returning answers in Rbar+, I would vote for 0.
But I also don't have a strong opinion on the subject.
>
> > ----------
> >
> > Level 1: wid(X) = if (X==Empty) then 0
> > else if (X==Entire) then +inf
> > else sup(X) - inf(X)
>
> Perhaps wid(Empty) could be undefined. 0 doesn't make much sense,
> and note that sup(X) - inf(X) = -oo for X = Empty. So, wid(Empty)
> could also be -oo (better than 0, IMHO).
Since both points & Empty have zero measure among
the Reals, 0 make as much sense to me as anything.
But, if not, I would rather see -inf than NaN.
At least you can justify -inf.
>
> > Property: X \subset Y ==> wid(X) <= wid(Y)
> >
> > Coercion: wid_F(X) = roundUp_F(wid(X))
> > Property still holds if weak monotonicity holds
> > in your arithmetic. Sufficiently wide finite
> > intervals may have an infinite width returned
> > if F has insufficient dynamic range to express
> > that width. And care should be taken that the
> > expressions +/-inf - +/-inf either never arise
> > or are dealt with properly (easy to do). The
> > "else if" clause is only needed for systems
> > that choke on +inf - -inf & return a NaN (there
>
> In the standard, this should be mentioned in a footnote, for instance.
It will. This is a proposal to you (plural) not
proposed text. John will write it as he sees fit.
>
> > are still such systems around). After that,
> > there is guaranteed to be one finite term in
> > the "else" clause. At least at level 1. John
> > also argues that wid(Empty) = -inf on much the
> > same grounds as mag(Empty) = -inf. I prefer 0
> > on Least Astonishment grounds but would have no
> > great objection to the change.
>
> See my opinion above. But if you choose 0, you should use the
> following definition:
>
> wid(X) = sup_Rbar+ { a - b | a in X, b in X }.
How is that different from sup(X) - inf(X)?
The same argument must be made (or special
cased) for Empty.
>
> > ----------
> >
> > Level 1: rad(X) = if (X==Empty) then 0
> > else mag(X - mid(X))
>
> I would say that the definition should be equivalent to wid(X)/2
> at Level 1, for Empty in particular.
Alas, Nate & I tried something along these lines &
it didn't work. See the discussion of equation (22)
either below or in our old proposal.
Same argument for Empty.
>
> Your definition is invalid for semi-unbounded intervals because
> X - mid(X) is undefined for such intervals at Level 1. But it
> seems to be OK at Level 2.
That is so. It is part of why I describe the coercion
to level 2 as "a bit tricky".
>
> > Property: X \subset Y ==> rad(X) <= rad(Y)
> >
> > Coercion: rad_F(X) = if (X==Empty) then 0
> > else mag_F(X - mid_F(X))
> > This one is a bit tricky. Note that the coercion
> > is subtly different from all the others & is needed
> > for both weak monotonicity & (22) to hold. Given
> > that, property still holds if (1) mid_F returns the
> > nearest representable finite number in F rather than
> > a NaN & (2) weak monononicity holds in your arithmetic.
> > Some finite intervals will return an infinite result
> > if F is unable to represent the finite value. Care
> > should be taken that +/-inf - +/-inf either never
> > arises or is dealt with properly (easy to do).
> > Further, Nate's (22)
> >
> > X \subset [mid_F(X) - rad_F(X), mid_F(X) + rad_F(X)]
> >
> > holds so long as not both mid_F(X) & rad_F(X) are
> > infinite. It would also hold for Empty if mid(Empty)
> > is not a NaN. But chances are, that's too much to
> > ask for. Equation (22) is going to be needed if we
> > want to use X -> <mid_F(X),rad_F(X)> to preserve
> > containment on conversion to mid-rad. We already
> > have X -> [inf_F(X),sup_F(X)] for inf-sups. We will
> > need both. Note that we can have rad_F(X) > wid_F(X)/2
> > in some cases. Indeed, we can have rad_F(X) = wid_F(X).
> > I could make a good mathematical argument for why this
> > is so but only another mathematician would fall for it.
> > In the end, it is because wid() & rad() are defined
> > very differently. For the moment, we cannot have
> > rad(X) > wid(X) but if John's arguments about Empty
> > hold sway we will have rad(Empty)=0 > wid(Empty)=-inf.
>
> If wid(Empty) = -oo, then one should define rad(Empty) = -oo.
>
> > Perhaps an argument against it. Perhaps not. Note
> > that we cannot have rad(Empty)=-inf & still preserve
> > containment on conversion to mid-rad.
>
> I don't understand. Empty is a subset of everything. So,
> the containment property will necessarily be satisfied.
> In particular, for X == Empty,
>
> X \subset [mid_F(X) - rad_F(X), mid_F(X) + rad_F(X)]
>
> would give:
>
> Empty \subset [+oo,-oo] = Empty
Actually, your line of reasoning would lead to
Empty \subset [NaN - -inf, NaN + -inf] = [NaN,NaN]
Thus the special cases involving Empty.
>
> which is better (more accurate) than Empty \subset [0,0].
Well, Empty \subset [0,0] is at least true.
Arbitrary but true.
>
> --
> Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
> 100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Let me think about what you've said here.
I don't agree with the introduction of NaNs nor
the use of negative wid, mag, & rad for Empty.
But perhaps I can find some way of living with
the latter.
Let me see what I can do.
Dan