Thread Links Date Links
Thread Prev Thread Next Thread Index Date Prev Date Next Date Index

Re: Midpoint paper (2012-02-08 version)



> Date: Fri, 10 Feb 2012 19:09:23 -0800 (PST)
> From: Dmitry Nadezhin <dmitry.nadezhin@xxxxxxxxxx>
> To: <intervals08@xxxxxxxxxxxxxx>
> Cc: <stds-1788@xxxxxxxxxxxxxxxxx>
> Subject: Re: Midpoint paper (2012-02-08 version)
> 
> > > I desired such a property:
> > > (*) midrad([u,v]) is finite floating-point number for finite [u,v] inte=
> rval=
> > > s (explicit or implicit).
> > > The definition (15) may return infinite midpoint for some finite implic=
> it i=
> > > ntervals.
> >
> > Alas, while this may be true in one direction for
> > some specific pair of explicit & implicit types,
> > I believe it is true that finite intervals of one
> > flavor may be required to become infinite or
> > semi-infinite in another if only to preserve
> > containment.
> 
> I had a mistake in formulation (*).
> I want to say
> (*) midpoint([u,v]) is finite floating-point number for any finite [u,v] in=
> terval from IR .

	Ah, I see.

	This time it is *I* who should apologise for going off
	on a long tirade due to an incorrect interpretation.

	You will notice mention of the fact that, for finite u
	& v, midpoint([u,v]) = (u+v)/2 is computed in a manner
	as to avoid overflow (in the sum).  So long as u, v, &
	the result are all in the same floating-point format
	this is always possible.  There are several methods.

> 
> The midpoint([u,v]) is not mapping from one interval type to other interval=
>  type.
> It's a mapping from interval type T to number format F.

	I see that now.

	Well, if T uses the same number format as F then it is
	covered above.  If T uses a number format subsumed by
	F (that is, in either [u,v] or <m,r> all of u, v, m, &
	r can be exactly represented in F) then we're OK still.

	It is only when the elements of T can represent values
	of a greater dynamic range or precision that problems
	arise.  Then all of overflow, underflow, & inexact are
	possible.

	But this is a floating-point property between F & the
	formats used within T.  It has nothing to do with the
	definition of midpoint.  And I don't see how defining
	a particular formatOf midpoint_F changes that.

> I will put the name of result number format F into subscribe: midpoint_F([u=
> ,v]).
> And I want that this operation considers only bounds of the argument [u,v],=
>  not its format T.
> 
> It is similar to the way how interval arithmetic operation are defined:
> add_T([u1,v1],[u2,v2])=hull_T([u1+u2,v1_v2])
> add_T may map any pair of intervals (T',T'') to the interval format T .
> It considers only bounds of arguments, not their formats.
> 
> I would like to tell a little about my background.
> 
> <<<<<<<<<<
> Our JInteral project develops a JVM (Java/Scala/Jython etc) interval librar=
> y.
> The bounds of our intervals are represented by values of our class Extended=
> Rational .
> 
> ExtendedRational = Rational + {NegativeInfinity, PositiveInfinity} .
> The mathematical value of our Rational objects are rational numbers.
> However, it is adaptive. It's internal representation depends on value.
> There are a few internal subclasses of Rational:
> * RationalImpl(n: BigInt, d: BigInt, e: Int) . Its value is (n/d) * 2^e. d =
> > 1 .
> * BinaryImpl(n: BigInt, e: Int). Its value is n * 2^e . n != 0 and (n * 2=
> ^e) is not a normal value of Binary64 .
> * NormalDoubleImpl(d: Double). Its value is d . d is a normal value of Bina=
> ry64 .
> * Zero. Its value is 0 .
> There are exact operations like add(x: Rational, y: Rational): Rational
> And there are operations like add(F: BinaryFormat, rm: RoundingMode)(x: Rat=
> ional, y: Rational): Rational = F.round(rm, add(x,y))
> They are indexed by IEEE 754-2008 binary number format F=(precision,emax,=
> emin) and by IEEE 754-2008 rounding direction.
> Each call of these operations adaptively uses algorithm according to actual=
>  subclasses of x and y values.

	In the context of scalar values only, similar number
	systems have been defined before.  One must exercise
	a little care in defining things.  Expecially when
	computing non-rational functions like sqrt() & exp().
	But generally, it works as you intend.

> 
> The class Interval of our library has two public subclasses
> * Nonempty(inf: ExtendedRational, sup: ExtendedRational) . inf <= sup, in=
> f < PositiveInfinity, NegativeInfinity < sup .
> * Empty.
> There are exact operations
> add(x: Interval, y: Interval): Interval = {
>   (x, y) match {
>     case (Nonempty(xi,xs), Nonempty(yi,ys) => Nonempty(xi + yi, xs + ys)
>     case (Nomempty, Empty) => Empty
>     case (Empty, Nonempty) => Empty
>     case (Empty, Empty) => Empty
>   }
> }
> And there are operations indexed by IEEE 754-2008 binary number format F .
> add(F: BinaryFormat)(x: Interval, y: Interval): Interval = {
>   add(x,y) match {
>      case Nonempty(i,s) => Nonempty(F.roundFloor(i), F.roundCeiling(s))
>      case Nonempty => Nonempty
>   }
> }

	I get the previous case but I'm not sure I understand
	this one.  Rather than go off on another tirade, let
	me ask: is it the case that floating-point based
	intervals remain within the same class of floating-
	point intervals when you add them together or do
	you switch to one of the subsuming ExtendedRational
	subclasses on things like inexact or over/underflow?

	Both policies can be justified.  The former on speed.
	The latter on correctness.

> 
> There are exact number operations
> exactMid(x: Interval): Rational = {
>   x case {
>     case Nonempty(i,s) => (x + y)/2
>     case Empty => throw new DomainException
>   }
> }
> And there are operations
> doubleMid(x: Interval): Double
> floatMid(x: Interval): Float
> There specification is stil uncertain for us.

	One can define these formatOf operations as simply
	convert2Double(midpoint(X)) or convert2Float(midpoint(X)).

	One can do SLIGHTLY better in some contexts if the sum
	(u+v) is rounded to the desired result type directly
	rather than converted after the sum.  It avoids a small
	double round that may or may not occur depending on
	your input type.

> >>>>>>>>>>
> 
> I know that our library could be out of scope of P1788
> because its Intervals are not tagged by any interval datatype.
> However, I search opportunity to be as close as possible to the coming P178=
> 8 standard.
> And sometimes I try to influence the P1788 development, so that our library=
>  becomes closer to it.

	From your description of it I'd say, yes, it will
	likely be out of scope of 1788.  But it is not an
	unreasonable extension.  And it is an extension
	similar to some that are used by some interval
	people.  So you may find customers among them.

	As for how you define things outside 1788, I think
	you are doing well so far.  It kind of depends on
	the specifics of how you manage arithmetic among
	the subtypes of ExtendedRational.  But a little
	care taken there would go a long way towards a
	very useful implementation.

> 
> > So, in its full generality, I believe seeking (*)
> > is a forlorn hope.  I'll have to think about
> > whether or not there is less restrictive but still
> > useful form of (*) that is always true.
> 
> It's a pity that (*) can't be accepted.
> I will be glad to explore your suggestions about alternative of (*). 

	Forgive me.  I didn't understand (*) at the time.

	If (*) is a formatOf form of midpoint, I think it
	can be implemented in a manner that would behave
	at least as well as convert2F(midpoint(X)).  That
	is, it can be made to return an answer within a
	half ULP of the exact midpoint when that value
	does not overflow or underflow.

	I am a bit concerned about overflow though.

	At present, we have defined midpoint in such a
	manner as to always return a finite value.  When
	the interval format & the result format are the
	same, that is always possible & required if the
	midpoint is to be an interior point of the input.

	But a formatOf midpoint might overflow.  You
	could fix that by clamping its value as follows

		midpoint_F(X) = min(max(convert2F(midpoint(X)),-Fmax),Fmax)

	where Fmax is WRT F.

	But if you do that the midpoint might not be an
	element of X at all.

	For example, if I can represent X = [Fmax^2,+inf]
	in the original format then midpoint_F(X) = Fmax
	which is NOT in X.

	(BTW, the same thing can happen on underflow.  One
	might return 0 for the midpoint of [eps^3,eps^2]
	when eps is the underflow threshold of F.)

	It looks like a problem for formatOf midpoints to me.

	Any thoughts?  Nate?  John?  Anyone?

> 
> > > Then I read the Michel's post. It was about semi-finite interval of wid=
> er f=
> > > ormat,
> > > but it came to me the same may happen
> > > with midpoint_F([u,v]) Level 2 operation that maps infsup intervals [u,=
> v] o=
> > > f wider IEEE-754 format Fw to
> > > narrower IEEE-754 format F.
> > >
> > > So the question that bothers me is:
> > > What properties do we expect from width(X)/midpoint(X)/radius(X) when
> > > X is of implicit interval type or X is of explicit interval type of wid=
> er n=
> > > umber format ?
> >
> > I presume the question is asking: if X is in format F
> > & X' = convertFromF2F'(X) is in format F', what can be
> > said about width(X)/midpoint(X)/radius(X) in relation
> > to width(X')/etc.?
> 
> No, this is not my question. My question is about properties of operation
> midpoint_F(X) where F is some number format, and X belongs
> to some explicit or implicit interval datatype T .

	Yes, I was wrong.

	Well, when the correct midpoint can be represented
	in the result format, it can be made to be the
	nearest represeable number in that format.  But I
	am concerned that any formatOf midpoint of a type
	narrower in dynamic range than the input format
	might return a value NOT in the original interval.

	I can't see any reasonable way around that.  Either
	the formatOf midpoint is clamped at Fmax or we
	allow it to be infinite.  Either way, it need not
	be in the interior of the interval in question.

	The more I think about it, the more I think its
	just something we have to live with.  That is,
	we can define formatOf functions in the obvious
	manner but we need to put a caveat within the text
	of the standard to the effect that they may not
	perform their intended function when expressed in
	a format of less dynamic range or precision than
	is available in the input format.

	Again, it is a problem between the 2 formats in
	question & not really a problem of the specific
	function that exposes it.

> I don't compare midpoint_F and midpoint_F' operations now
> and I don't convert X from T to other interval datatype T' .
> 
> > As for the infinite & semi-infinite cases, I believe
> > that the fact that explicit & implicit types actually
> > represent quite different classes of intervals will
> > mess thing up.  As will Michel's observation about
> > different precisions.  Other than weak monotonicity
> > within a format & Nate's equation (22), I'm not sure
> > what other properties remain that might contribute
> > to some formal proof.
> >
> > Is it properties needed for formal proofs you seek
> > or something else?
> 
> I would like to make specification of number functions simpler.
> And then formal proofs will be easier. One way to do it is to define
> them in such a way that they are indexed by number format, but they are
> independent ot interval datatype of argument.
> 
> > Nate & I are proposing width/midpoint/radius that
> > return floating-point answers in the same format
> > as the interval inputs.  (If that is not in the
> > text somewhere it should be.)
> 
> Yes. And I like how your definition works in this case.
> 
> > One could have a format specific (like 754's
> > formatOf) form of midpoint, for example, that
> > returns a specific format no matter what the
> > format of its input.  Indeed, some ways that John
> > might write up this proposal would have them
> > lying around.
> 
> Yes. I'd like such definition of midpoint.

	As you can see by my comments above, I'm not sure
	one can be made that has all the properties we
	desire.  Unless someone can think of something
	better, I think we can define formatOf functions
	in the obvious manner but we will have to mention
	that there are caveats with their use that are
	problems with the formats not the functions.

	Is that sufficient, do you think?

> 
> > But how (beyond a single extra rounding error)
> > would such a function differ from
> > convert2F(midpoint(X'))?  All the finite &
> > semi-infinite inputs would convert with or
> > without over/underflow as a function of the
> > relative exponent ranges of the two formats WRT
> > their floating-point behavior not WRT their
> > interval behavior.  At least I think so.
> 
> I understand this in such a way. Can Level 2 definition of midpoint_F(X) be=
>  decomposed
> in midpoint_F(X)=convert2F(midpoint(X)) where convert2F is F-dependent an=
> d midpoint is F-independent.
> I'd say that yes.
> 
> midpoint([u,v]) = if (u == -oo && v == +oo) then 0
> else if (u == -oo) then -oo
> else uf (v == +oo) then +oo
> else (u + v)/2

	Nate & I looked at this definition but rejected it on
	the grounds that +/-oo is not in the interior of any
	interval.  Further, trying to split an interval about
	infinity would create 2 pieces: the original interval
	& [+inf,+inf] which is NOT in our system.

	Therefore, we require that midpoint alway return a
	finite number.

	That this may become a problem for formatOf midpoint
	is outside our proposal & something we never considered.

> 
> convert2F(t) = let t' = roundNear_F(t) in
>   if (t' == -oo) then -Fmax
>   else if (t' == +oo) then +Fmax
>   else t'

	While I wouldn't use this convert in general,
	it correctly clamps the result to a finite
	number for the case of formatOf midpoints.

	It suffers from the midpoint not necessarily
	being an element of the input interval but
	that is another issue.

> 
> > Indeed, while the issues you raise are valid
> > & may be important in some contexts, they are
> > really issues of the incompatibility of the
> > various formats more than issues of the nature
> > of these particular functions.
> >
> > In other words, how do they differ from the
> > same issues WRT add, subtract, multiply, &
> > divide?
> 
> It seems that the issues don't differ from the same issues of arithmetic in=
> terval operations.
> And I suggest the approach that is similar to hull_T(x op y) approach.

	I agree.

> 
> > Let me know if you can think of some explicit
> > change to width/midpoint/radius that would
> > help with any of these concerns.
> 
> I suggested the change to midpoint_F .
> I would like to explore your variants of midpoint too.
> 
> For now, I don't see any need to change width/radius.
> 
>   -Dima
> 

	OK, I'm going to resist your suggestion that we
	include formatOf varients of midpoint in our
	proposal on the grounds that I don't understand
	the implications.

	Indeed, let me mention that this entire discussion
	is outside the scope of the proposal that Nate & I
	are making.

	But, that having been said, you raise valid issues
	that apply to ALL formatOf functions.  John, I
	think some caveats are going to be needed to cover
	the sorts of problems that can happen to any
	formatOf function that is strictly narrower in
	range & precision than its inputs.

	It is something that we did not mention in 754.
	Our attitude at the time was "If it hurts, don't
	do that".  But I think it would be prudent of 1788
	to put out warning signs to the naive user rather
	than let them fall off the cliff.

	Floating-point guys are willing to see evolution
	in action.  We should not be so cruel.


			    Dan