
mid() and rad() functions (was: Comments on P1788_Level2RoughDraft.pdf)



On 2012-01-19 14:30:31 -0500, Michel Hack/Notes wrote:
> Subtopic:  mid() and rad() functions for "implicit" interval types.

Actually the last part of the discussion applies to any interval types.

> An implementation that supports an implicit type with interesting
> properties could have its own set of functions (not subject to 1788)
> to exploit its features, e.g. exact midrat() and radrat() functions
> for intervals with exact rational endpoints, using an appropriate
> result type (e.g. a struct).

If they correspond to the mid() and rad() functions as specified by
P1788, they could be subject to P1788. After all, rational numbers
are numeric types (possibly parameterized by their size) just like
floating-point. A struct would just be the (invisible) encoding
(MPFR floating-point numbers are also represented in memory via a
struct). You can have +Inf (1/+0), -Inf (1/-0) and NaN (0/0).
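
As a rough illustration of such an encoding, here is a minimal C sketch.
The rat_t struct, rat_classify() and the RAT_* names are hypothetical
(nothing here comes from P1788 or from any library). It assumes
fixed-size integers and lets the numerator carry the sign of an
infinity, since an integer denominator has no signed zero.

```c
/* Hypothetical encoding of an extended rational num/den.
   Assumption: the sign of an infinity is carried by num. */
typedef struct {
    long num;
    long den;
} rat_t;

typedef enum { RAT_FINITE, RAT_POS_INF, RAT_NEG_INF, RAT_NAN } rat_class_t;

/* Classify a value: x/0 is +Inf for x > 0, -Inf for x < 0, and 0/0 is NaN. */
static rat_class_t rat_classify(rat_t q)
{
    if (q.den != 0)
        return RAT_FINITE;
    if (q.num > 0)
        return RAT_POS_INF;
    if (q.num < 0)
        return RAT_NEG_INF;
    return RAT_NAN;
}
```

The user of such a type would only see the numeric value and its
class; the struct layout stays an encoding detail.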

> > But you didn't answer my real question: why is this restriction
> > *artificial*?
> 
> I figured that if midpoint and (absolute, not relative) radius used
> different precisions, the higher-precision type should be treated
> as if it had the same exponent range as the lower-precision type.
> I call that an "artificial" restriction because it is not natural.

Well, this depends on the types. In MPFR, different variables with
different precisions share the same exponent range, like double-double:

> Use of double+double for midpoint and plain double for radius does
> not raise this issue because the exponent range is the same.  This
> would also be the case for S/360 HFP.
> 
> > > >>  My other concern is with the last sentence.  I suppose it is
> > > >>  possible (at Level 3) to encode Empty, Entire and semi-unbounded
> >                                ??????
> > > >>  intervals, e.g. by using selected negative values for the radius.
> > > >
> > > > That would be Level 4 (Level 3 is still quite abstract).  And it
> > > > depends on whether the radius field can hold negative numbers.
> > >
> > > If a midrad type encodes semi-unbounded intervals with a defined finite
> > > endpoint, I don't care HOW it encodes this.
> >
> > You were talking about the encoding above.
> 
> Different uses of "encode".  The one at level 4 is the actual encoding.
> What I was talking about later was essentially a level-2 way of returning
> more information than simply NaN -- no applicable radius:
> >> What I'm talking about is what the rad() function should return for
> >> semi-unbounded intervals and for Empty -- irrespective of encoding.
> 
> Logically, rad() could return an ENUM to explain why there is no
> applicable radius, or the radius (a nonnegative value) if there is one.
> But those are two different types in most languages (the exception is
> languages where everything is a string of characters, e.g. Rexx).

There are other exceptions, such as languages where variables
(objects) are tagged with their actual type.
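
Even in C, a tagged result can be emulated with a struct that pairs a
tag with the value. The sketch below is only an illustration: the
names rad_tagged(), rad_result_t, RADTAG_* and interval_t are
hypothetical, the interval is assumed to be stored inf-sup in doubles
with lo > hi meaning Empty, and the radius is not rounded upward as a
real implementation would have to do.

```c
#include <math.h>

/* Hypothetical tag: says whether a radius exists, and if not, why. */
typedef enum {
    RADTAG_VALUE,      /* value holds a nonnegative radius */
    RADTAG_EMPTY,      /* empty interval: no radius        */
    RADTAG_UNBOUNDED   /* infinite endpoint: no radius     */
} rad_tag_t;

typedef struct {
    rad_tag_t tag;
    double value;      /* meaningful only if tag == RADTAG_VALUE */
} rad_result_t;

/* Assumed inf-sup representation; lo > hi encodes the empty interval. */
typedef struct { double lo, hi; } interval_t;

static rad_result_t rad_tagged(interval_t x)
{
    rad_result_t r = { RADTAG_VALUE, 0.0 };

    if (x.lo > x.hi)
        r.tag = RADTAG_EMPTY;
    else if (isinf(x.lo) || isinf(x.hi))
        r.tag = RADTAG_UNBOUNDED;
    else
        r.value = x.hi / 2 - x.lo / 2;  /* halve first to avoid overflow */
    return r;
}
```

The caller is forced to look at the tag before using the value, which
is the point of a tagged type.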

> I was just pointing out that since a radius is non-negative, if it
> is returned as a floating-point number, its interpretation can
> easily be overloaded (without violating most languages' type) as I
> suggested.

This is a possibility, but I see it rather as a hack, which could
make code less readable and increase the risk of bugs. I don't
think that standards should encourage that, unless there is a
clear benefit (which is not the case here, IMHO).

> > Or it could be undefined (at Level 1), which would mean NaN (at Level 2).
> 
> Right -- but we could also avoid a loss of information.  It would be very
> easy to write code such as: (in C)
> 
>    r = rad(xx);
>    m = mid(xx);
>    if (r < 0) switch ((int) r) {     /* Unusual case */
>      case -1:    /* Handle empty                     */
>      case -2:    /* Semi-bounded; m = upper endpoint */
>      case -3:    /* Semi-bounded; m = lower endpoint */
>    }

Something like the following would be better:

   r = rad(xx);
   m = mid(xx);
   if (r < 0) switch ((int) r) {  /* Unusual case; switch needs an integer */
     case RAD_EMPTY:          /* Handle empty                     */
     case RAD_UNBOUNDED_L:    /* Semi-bounded; m = upper endpoint */
     case RAD_UNBOUNDED_U:    /* Semi-bounded; m = lower endpoint */
     case RAD_UNBOUNDED:      /* Handle entire                    */
   }

where RAD_* are negative constants.
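
For concreteness, the constants and a rad() that returns them could
look like the sketch below. The numeric values, the rad_sentinel()
name and the double-based inf-sup interval_t (with lo > hi meaning
Empty) are assumptions made for the example, and the radius is not
rounded upward here.

```c
#include <math.h>

/* Hypothetical negative sentinels; the actual values are arbitrary. */
enum {
    RAD_EMPTY       = -1,  /* empty interval                */
    RAD_UNBOUNDED_L = -2,  /* lower endpoint is -Inf        */
    RAD_UNBOUNDED_U = -3,  /* upper endpoint is +Inf        */
    RAD_UNBOUNDED   = -4   /* both endpoints are infinite   */
};

/* Assumed inf-sup representation; lo > hi encodes the empty interval. */
typedef struct { double lo, hi; } interval_t;

/* Return the radius, or one of the negative RAD_* codes if there is none. */
static double rad_sentinel(interval_t x)
{
    if (x.lo > x.hi)
        return RAD_EMPTY;
    if (isinf(x.lo) && isinf(x.hi))
        return RAD_UNBOUNDED;
    if (isinf(x.lo))
        return RAD_UNBOUNDED_L;
    if (isinf(x.hi))
        return RAD_UNBOUNDED_U;
    return x.hi / 2 - x.lo / 2;  /* halve first to avoid overflow */
}
```

Note that a caller who forgets the r < 0 test silently computes with
-1, -2, etc. as if they were radii.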

Still, I think that NaN in all these cases would be better: if you
forget to check r, you will detect the error more easily.

   r = rad(xx);
   m = mid(xx);
   if (r != r) switch (interval_class(xx)) {  /* Unusual case */
     ...
   }
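
A fuller sketch of this variant, under the same kind of assumptions:
interval_class() is given a hypothetical signature and enum (IC_*),
the interval is a double-based inf-sup struct with lo > hi meaning
Empty, and rad() is not rounded upward. None of these names or
signatures come from P1788.

```c
#include <math.h>

/* Assumed inf-sup representation; lo > hi encodes the empty interval. */
typedef struct { double lo, hi; } interval_t;

typedef enum {
    IC_BOUNDED, IC_EMPTY, IC_UNBOUNDED_L, IC_UNBOUNDED_U, IC_ENTIRE
} interval_class_t;

/* Classify an interval using native comparisons only. */
static interval_class_t interval_class(interval_t x)
{
    if (x.lo > x.hi)
        return IC_EMPTY;
    if (isinf(x.lo) && isinf(x.hi))
        return IC_ENTIRE;
    if (isinf(x.lo))
        return IC_UNBOUNDED_L;
    if (isinf(x.hi))
        return IC_UNBOUNDED_U;
    return IC_BOUNDED;
}

/* Radius of a nonempty bounded interval, NaN in every other case. */
static double rad(interval_t x)
{
    if (interval_class(x) != IC_BOUNDED)
        return NAN;
    return x.hi / 2 - x.lo / 2;  /* halve first to avoid overflow */
}

/* Example caller: 1 if a usable radius was stored in *r, 0 otherwise. */
static int get_radius(interval_t x, double *r)
{
    *r = rad(x);
    if (isnan(*r)) switch (interval_class(x)) {  /* Unusual case */
    case IC_EMPTY:        /* Handle empty                     */
    case IC_UNBOUNDED_L:  /* Semi-bounded; upper endpoint set */
    case IC_UNBOUNDED_U:  /* Semi-bounded; lower endpoint set */
    case IC_ENTIRE:       /* Handle entire                    */
    default:
        return 0;
    }
    return 1;
}
```

If the caller forgets the NaN test, the NaN propagates through any
later arithmetic instead of being mistaken for a small radius.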

BTW, this would probably even be faster if r is in multiple precision
(e.g. with a multiple-precision inf-sup type, as in MPFI): NaN
can be tested very quickly, and the case tests would be done on
native integers instead of multiple-precision values.

-- 
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)