Thread Links Date Links
Thread Prev Thread Next Thread Index Date Prev Date Next Date Index

Re: A proposal for the next motion



John Pryce wrote:

I propose the following motion. Like Motion 3, it is not about structure and procedure of the standard, but about content. I aim in the next few weeks to turn the text below into a LaTeX document that can be submitted for inclusion in the standard text.

Since, like in all previous proposals of motions, there is a definite
lack of clarity in this proposal I want to propose having before the formal voting on a motion (either before or after the discussion period) an official wordsmithing phase (at least in a wordsmithing subgroup of which I'd be happy to be a member), where a preliminary proposal is welded into a form suitable for voting.

It is not a good style (nad mifght be cause for criticism by IEEE,
I supposse) to relax during the discussion periond the meaning
of a motion without having precisely formulated what it is
that is voted upon. And it makes a very bad impression if further
motions have the fate of Motion 4....

Thus a proposal should come first in an obligatory tentative form,
and voted should be on an official, polished version of the
tentative proposal. If this procedure needs formal voting to be
agreed upon, this should be fddone as soon as possible.



Note to the officers: this is significantly changed from the previous draft. I have responded to various comments, and have tightened up various definitions.

John Pryce
====================================

===Motion P1788/M0005.01_Level_2_Mixed_format===
Proposer: John Pryce
Seconder: Dan Zuras

===Motion text===
P1788 shall support mixed-format interval arithmetic (Definition D2) as specified in 1 to 11 of the Rationale below. Specifically:

(1) An implementation (of this standard) shall specify which interval formats (Definition D1) it supports, and shall provide mixed-format arithmetic for all supported interval formats. (2) An implementation shall be called 754-conforming if: (i) its underlying system is 754-conforming (Definition D3) and if the implementation supports intervals of each of the five basic 754 floating point formats: binary32, binary64, binary128, decimal64, and decimal128; or (ii) it is functionally indistinguishable from case (i).

This many should be _required_??? Much more slender implementations are
sufficient for essentially all applications.

I propose to require for 754-conformance only that at least one of these interval formats is provided, and that hull operations for point values
in the other formats are provided.


(3) The relation between a floating point format and its interval format shall be as defined in 7 and 8 below. (4) There shall be conversions between any two supported interval formats, as defined in 9a below. (5) An implementation shall specify a set of "interval-supported elementary functions" (Definition D7). For any such function and any supported format F there shall be an interval version of that function in that format, which shall enclose the sharp interval F-extension of that function as defined in 9b below, but is otherwise not specified by this motion. This motion also leaves it open whether every point elementary function (Definition D3) is interval-supported. (6) A 754-conforming implementation may provide other interval formats beside the basic five, and appropriate conversions and operations: for instance, in variable-precision arithmetic. (7) An implementation may provide more than one interval elementary function in a given format: for instance, to offer different trade-offs between speed and tightness.

Constraint: Items (3,4,5) above ensure that the Basic Theorem of Interval Arithmetic (BTIA) holds for mixed-format calculation in any conforming implementation, as outlined in 10, 11 below. The BTIA is fundamental to interval computation; therefore, an amendment to this motion is only acceptable if it preserves the BTIA.

This motion does not cover:
- What elementary functions shall be, or are recommended to be, provided by
  an implementation.
- Issues of the construction of intervals, from real numbers, text strings, etc. - Issues of definition of real-valued functions of intervals, such as whether
  finding the midpoint of the whole line should return NaN, or generate an
  exception, etc.
- Issues of tightness of enclosure, as in the 'accuracy modes' of Vienna§3.2. - Flags, exceptions or other means of signalling that an elementary function
  has been evaluated outside its domain.

===Rationale===
===============
This divides into Foundations; Terminology; Definitions; Introduction; Level 1 description; and Level 2 description.

Foundations.
------------
Motion 2 and the associated position paper, defining a levels structure as a guideline for the standard.

Motion 3, which defines P1788's number system and set of intervals.

Terminology.
------------
754§X denotes section X in the standard IEEE754-2008. Vienna§X stands for section X in the final version of the Vienna proposal, 19 December 2008.

Definitions.
------------
[Note. These are intended to be part of a standard set of definitions, abbreviations and acronyms of this standard, similar to 754§2. I have not yet put them in any systematic order.]

D1. An "infsup interval format" means the set IR_F of F-intervals for some floating point (abstract) format F, as defined in 7 below. When used without qualification, "interval format" means an infsup format -- by contrast with, say, a midrad (midpoint-radius) format.

D2. "Mixed-format" interval arithmetic refers to operations on intervals where the formats of the operand interval(s) and the destination interval may not all be the same. Note. It is implied by the theory below that the formats of the lower and upper bounds of an individual interval are the same.

D3. A "754-conforming system" is a programming environment in which the default floating point arithmetic (whether implemented in hardware, software or a mixture) conforms to IEEE754-2008. [Note. A hardware processor (e.g. the Cell processor) may not be 754-conforming in itself, but a 754-conforming system may be built on top of it with software assistance.]

D4. A "point elementary function" is a point function, in the sense of 4 below, that is provided by an implementation. The set of these is the implementation's "point elementary function library" ("point library" for short). These terms may be qualified by a format, e.g. "binary64 point library". [Note. The four basic arithmetic operations count as elementary functions.]

D5. An "interval elementary function" and the "interval elementary function library" ("interval library" for short), possibly qualified by a format, are defined similarly.

D6. An interval elementary function is "algebraic" if it is an interval extension of a point function. See 5 below.

This is poor terminology. Usually, exp is not considered to be an algebraic elementary function. Why not simply define as follows?

D6': An interval elementary function is an interval extension of a
point function. See 5 below.


D7. A point function is "interval-supported" if an interval extension of it exists in the interval library. [Note. An interval-supported function need not be in the point library. A function in the point library need not be interval-supported. It is left open whether the standard makes requirements on these issues.]

Introduction.
-------------
The theory below aims to provide a simple foundation for interval computation in mixed formats, for instance mixing binary and decimal intervals in arithmetic expressions, as well as potential compile-time "exact denotations" of intervals such as pi + [-0.1,0.1].

We believe the concepts given here, though general, are easily grasped and easily taught. They show that the theoretical difficulties of mixed-format interval computation are none, and the practical difficulties -- illustrated by the example in 10 below -- are minor.

The notation used is similar to that of the Vienna proposal, which is suitable for a plain-text document such as this. I believe the functionality defined below is essentially what the Vienna proposal defines, or rather what it would define if it supported mixed-format computation explicitly: see the end of Vienna§1.1.

I have avoided the Vienna terms "L-numeral" and "B-numeral", which are valuable notions but place an emphasis on representations that I think is unnecessary in this context.

I believe there is nothing in this motion and the theory below that hinders the implementation of various forms of non-standard intervals -- Kahan, modal, etc. -- as discussed at the end of Vienna§1.2.

What about mid-rad and D1?



Level 1 description.
--------------------
This section recapitulates what has already been decided.

1. R is the set of reals. R* is the set of extended reals
    R* = R union {-oo, +oo},
  where we use -oo and +oo to denote negative and positive infinity.
Following 754, any member of R* is called a number: it is a "finite number"
  if it is in R, else an "infinite number".

Thus real number = finite number, nonreal number = infinite number.


2. Following Motion 3, the set of "textbook" intervals (Vienna§1.2) in R,
  denoted IR, comprises the empty set together with all intervals
    x = { x in R | xlo <= x <= xhi }               (*)
  where xlo <= xhi, -oo <= xlo < +oo, -oo < xhi <= +oo.

  Following standard mathematical notation, the interval (*) is denoted
[xlo, xhi] if xlo and xhi are finite, but denoted (-oo, xhi] or [xlo, +oo)
  or (-oo, +oo) in the case of infinite endpoints, to indicate that -oo
  and/or +oo is not a member of the set.

This is an incompletely specified, quite dubious convention.

What is the notation if one does not know whether xhi is finite?

Why should it be required to change the notation once it becomes
known that xhi is infinite?

The usage of square brackets for all intervals, as in Intlab,
    [xlo, xhi] := { x in R | xlo <= x <= xhi }
is uniform and much prefereable.



  The empty set is denoted by Empty. The whole of R is denoted by Entire.

3. The (interval) hull of an arbitrary subset ss of R, written hull(ss), is the
  smallest member of IR that contains ss.

I'd call it the tightest member, since small is ambiguous and may mean numerically small.


4. A "point function" is a (possibly partial) multivariate real function: that is, a mapping f from a subset D of R^n to R^m for some integers n,m >= 0. (When n=0
  and m=1, we have a "named real constant".)

When not otherwise specified, a scalar function is assumed, i.e. m=1. If m>1,
  the function is called a vector function.

  D = D_f is the domain of f.

  To specify n, call f an "n-variable point function", or denote f as
    f(x_1, ..., x_n).

The "range" (or "exact range") of f over an arbitrary subset ss of R^n

Will there also be an inexact range? Or why this double definition?


is the
  set
      range(f, ss) = {f(x) | x in ss and x in D_f}.

It seems intended that this implies range(sqrt, [-1,1]) = [0,1].


Note. Here, f is a mapping, not an expression. Expressions are dealt with in
  11 below.

5. Unless otherwise specified, an "interval function" is a mapping ff from
  (IR)^n to (IR)^m for some n >= 0. To specify n, call ff an "n-variable
  interval function", or denote ff as
    ff(xx_1, ..., xx_n).
  As with point functions, m=1 is assumed unless specified otherwise.

An interval function is "algebraic" if it is an interval extension of some
  point function, as defined next. Examples of non-algebraic functions are
  the interval hull and intersection functions.

I object to calling the trancendental function exp(xx) an algebraic function.

I propose to differentiate between an interval function, which is
an interval extension of a real function, and an interval mapping,
which is a mapping between interval spaces. Thus exp(xx) would be an
interval function, whereas hull(xx_1,xx_2) is an interval mapping.


  Given an n-variable point function f, an "interval extension" of f is any
  interval function ff such that
    ff(ss) contains range(f,ss)  for any subset ss of R^n.
  The "sharp interval extension" of f is defined by
    ff(ss) = hull(range(f,ss))   for any subset ss of R^n.

Equivalently, for the case where ff has separate arguments ss_1, ..., ss_n,
  each being a subset of R, an interval extension satisfies
    ff(ss) contains range(f,ss_1, ..., ss_n),
  and the sharp interval extension satisfies
    ff(ss_1, ..., ss_n) = hull(range(f,ss_1, ..., ss_n)),
  for any ss_1, ..., ss_n.
  This is an alternative notation for the case where ss is the cartesian
  product of the ss_i.

In particular, when f is a binary operator "op" written in infix notation,
  this gives the familiar definition of its (sharp) interval extension as
    xx op yy = hull({x op y | x in xx, y in yy, and x op y is defined}).

  Notes.
  - Example. With these definitions
      xx*{0} = {0} and xx/{0} = Empty for any nonempty interval xx.
  - All interval functions used here are automatically defined for all
arguments (e.g. sqrt([-1,4]) = [0,2], sqrt([-2,-1]) = Empty). It is left
    open what exceptional action may be taken by an implementation on
    evaluating ff(ss), when ff is an extension of a point function f, and
    ss is not a subset of D_f.

But according to the above, an extension of sqrt must produce for
sqrt([-1,4]) an answer containing [0,2]; otherwise - according to
the above definitions - it does not deserve its name extension.

Thus the wording implies that exceptional action can only be
_in_addition_ to returning such a value.



Level 2 description.
--------------------
6. A floating point "abstract format" ("aformat") F is a finite subset of R*
  such that at least -oo and +oo are elements of F.

Note what this implies:

Calling it a subset precludes
1. the possibility of differentiating between +0 and -0.
2. having NaN's.
The Vienna Proposal takes a more liberal view,allowing F to be a
finite set, together with a partial value map into R^*.
(The wording in the Vienna Proposal is optimized in many places for maximal conceptual clarity. The future standard should be an
improvement over the Vienna Proposal rather than a fuzzy version of it!)

One should also require 0 to be in F; otherwise one gets into
trouble with interval extensions of abs(x)...


<754>A format in the 754 sense shall be identified with the aformat comprising those extended-real numbers that are exactly representable in that format,
  where -0 and +0 both represent 0.

The 754 format does not conform to your definition of an abstract format a few lines above, since the set of IEEE floats are not a subset of R^*.


  Notes.
- In view of this definition, 754's -0 and +0 are considered identical. Also, in 754 decimal formats, numbers in the same cohort are considered identical.
  - For examples, we use the abbreviations b64 to mean 754's 64-bit binary
format (C's "double"), and d64 for 64-bit decimal. Similarly b32 is 32-bit
    binary (C's "float").
  </754>

7. Given an aformat F, the set of F-intervals in R, denoted IR_F, comprises the
  empty set together with all textbook intervals whose endpoints are in F.

This makes sense only if F contains real numbers. I might be wrong,
but in my understanding, real numbers are different from their representation in an abstract format.


  Notes.
  - This definition implies some form of infsup, that is (lower bound,
  upper bound) representation of intervals. Variants of this exist, see for
  instance Vienna§1.6.
  - It also implies that it is not just wrong, but actually meaningless at
this level, to speak of an interval that has NaN as an endpoint or has its
  lower bound greater than its upper bound.

An "F-interval datum", by analogy with the definition of floating-point datum
  in 754§3.2, is either an F-interval or possibly, should P1788 support it,
  an abstract object NaI, "Not an Interval".

  An "interval datum" is an F-interval datum of any relevant aformat F.

  At this level we consider there is just one NaI for all aformats. (At the
  representation levels this would probably not be the case.)

Actually, given what was said above, one considers that there is at most
one NaI for all aformats. But for the sake of consistency,
it should be allowed to have several zeros at this level if and only if
one also allows possibly several NaI's at this level.


8. The (interval) F-hull of an arbitrary subset ss of R, written hull_F(ss), is
  the smallest F-interval that contains ss.

again ''smallest'' --> ''tightest''


  Notes.
  - The set hull_F(ss) always exists, because F is finite and contains -oo
    and +oo.
  - Always, hull_F(ss) contains hull(ss). If aformat G contains aformat F
in the sense of subsets of R* -- equivalently in the 754 case, if format G is wider than F in the sense of 754§2.1.36 -- then hull_F(ss) contains
    hull_G(ss).

9. We now give a mathematical characterisation of an ideal "mixed format
interval arithmetic".

In place of ''ideal'' it should say ''sharp'', or the other way around. there is no need to introduce two words for the same concept.


To illustrate how such mixed format arithmetic may
  be implemented in practice, we present examples assuming a 754-conforming
  system, in which "format" shall mean a 754 supported format (754§2.1.52).
  Text that assumes this is enclosed between the tags
    <754>  and  </754>.
  It is assumed that the compiler (or other relevant processor) knows the
  radix of any format.

  Relevant features of 754 are that floating point computational operations
(754§2.1.11) are regarded as having a specified destination format. Further, many operations are "formatOf" operations (754§5.1), for which a correctly rounded result (in a chosen rounding direction) is produced for any operands
  _of the same radix_ as the destination format.

9a. Interval format conversion.
  If F is an aformat, the "conversion" to format F shall mean the operation
  that maps an interval xx of any other supported format to its F-hull,
      yy = hull_F(xx).

<754>This interval operation can in all cases (whether xx has the same radix as F or not) be implemented in terms of one of the floating point operations
      formatOf-convertFormat
  defined in 754§5.4.2, with the appropriate outward rounding.
  </754>

9b. Mixed-format interval elementary operations.
  Given an aformat F and an n-variable point function f, an "interval
  F-extension" of f is any interval extension of f whose values are always
  F-intervals. The "sharp interval F-extension" ff_F of f is defined by
      ff_F(ss) = hull_F(range(f,ss))    for any subset ss of R^n.
  For the case where ss is the cartesian product of ss_1, ..., ss_n,
  each ss_i being a subset of R, this is written as
      ff_F(ss_1, ..., ss_n) = hull_F(range(f,ss_1, ..., ss_n)).

  If xx_1, ..., xx_n are interval datums, of arbitrary formats, then

I'd add write for clarity: ''then the sharp interval F-extension''


      ff_F(xx_1, ..., xx_n)             (1)

(1) has already a meaning in this document (towards the top of it).
A more descriptive label should be used and referenced further below.


is given by the previous equation with the xx_i substituted for the ss_i, if
  all the xx_i are intervals (and therefore are subsets of R).
If not -- that is if at least one argument is NaI -- then, by definition, the

By which definition? Or is _this_ the definition?


  result is NaI. The NaI case will not be considered further.

When f is an operator written in infix form, a corresponding notation is used,
  e.g.
  <754>
      zz = xx +_b64 yy
means that the intervals xx and yy, of arbitrary formats, are combined by the
  mathematical interval addition operation to give an intermediate result
      xx + yy,
  that is enclosed in the smallest b64 (double-precision) interval to give
  the delivered result zz.
  </754>

10. <754>Suppose f is a "formatOf" elementary floating point operation. When all the operands xx_i have a format with the same radix as F, then the interval result (1) can be evaluated in terms of the "formatOf" operation f with the appropriate outward rounding. (This is clear for the four basic operations
  and SQRT; it may be less obvious for transcendental functions.)

It is not less obvious but wrong unless _optimal_ directed rounding
is available.


  If some xx_i have a different radix from that of F, format conversions as
defined in 9a are necessary and the situation becomes more complicated. As an example, consider forming the product of two d64 (decimal64) intervals xx and yy, to give a b32 (binary32) result zz. We may convert xx and yy to b32 first,
  which may be written:
      zz1 = hull_b32(xx) *_b32 hull_b32(yy)
  or may form the product in d64 and then convert:
      zz2 = hull_b32(xx *_d64 yy).

  Example. Let xx and yy be the singleton intervals [x,x] and [y,y] where
x = 5^16 and y = 5^(-16), both of which are exactly represented in d64. When
  the b32 product is formed in each of the two ways above we find
      zz1 = [3f7ffffd, 3f800001]  in hexadecimal
  (note the number 1 is 3f800000 hex), whereas
      zz2 = [1,1]  exactly.
  Thus zz1 is too wide by 3 ulps at its lower bound.
  </754>

It is probably not the role of P1788 to say which of these patterns of radix
  conversion, or something else, should be done in practice.

We'd not specify the details but must say something about the required
accuracy, or how to communicate to users the accuracy they can expect.
(This is why the Vienna Proposal has the options 'tightest' and
'accurate'. Requiring accurate probably leaves the intended freedom,
and gives users assurance of high quality.)


What is important in the example above is that each of zz1 and zz2 encloses the mathematical (and impractical to compute) result of the sharp interval
  b32-extension of the multiplication function as defined in 9b:
    zz = xx *_b32 yy.
  For zz1 this is because hull_b32(xx) and hull_b32(yy) enclose xx and yy
  respectively. For zz2 it is because xx *_d64 yy encloses the mathematical
  result xx * yy.

  The property being used here is that for any F, the sharp F-extension is
inclusion-isotonic. This argument is completely general (whether the system
  is 754-conforming or not) and underlies the next item.

11. Interval evaluation of expressions.
Let E be an explicit real expression in arguments x_1, ..., x_n. It can be
  written as a sequence of steps
      v = e(u_1, ..., u_k)              (2)
  where e is a point elementary function, and each u_i is either one of the
  inputs x_j or a previously calculated v. One (or for a vector expression,
  more) of the v's is nominated as the output of E. This defines a point
  function from R^n to R^m:
      (y_1, ..., y_m) = y = f(x) = f(x_1, ..., x_n).
  The (natural) domain D_f of f is the set of points x in R^n for which
  evaluation succeeds, in the sense that for each step (2), the argument
  (u_1, ..., u_k) lies in the domain of e.

  Let an interval aformat be specified for each of the input arguments
  x_1, ..., x_n. For each e in the sequence of steps (2), let an interval
  aformat be specified as its destination format. Then, given arbitrary
  interval inputs xx_1, ..., xx_n of the specified formats, the "interval
  evaluation" of E consists of a sequence of steps

Probably: ''_an_ interval evaluation'', not _the_, since the nature of
ee_F is not specified.

On the ther hand, taking 9b fully seriously, ee_F denotes the
sharp F-evaluation, which might not even be implemented!?

Thus there seems to be some inconsistency in the notation defined.


      vv = ee_F(uu_1, ..., uu_k)              (3)
  where
  *     ee_F is an interval F-extension of e.
  *     uu_j are interval results from previous steps.
  This defines a (possibly vector) interval function
      (yy_1, ..., yy_m) = yy = ff(xx) = ff(xx_1, ..., xx_n)           (4)
where each input xx_j has the aformat specified for x_j, and each yy_i has
  the destination aformat specified for the operation e that produces it.

  Note.
- This discussion only applies to interval expressions that are algebraic,
    in the sense of 5 above. Expressions that contain, hull or intersection
    operations, or involve point-valued functions of intervals such as the
    midpoint, are excluded.

It is much better make this restriction part of the definition of an expression!



It is left open here -- but has a significant influence on quality of implem- entation -- just what interval F-extension ee_F is chosen in each step (3).

Conflicting with the defined meaning of ee_F pointed out above.


For expressions just involving the four basic operations, and when only one
  format is involved, the standard may require that ee_F is the sharp
F-extension. In more complicated situations the choice of ee_F may depend on
  the compiler, the optimisation level, switches or pragmas defined by the
  standard, or other factors.

If you leave things open one paragraph ahead, you don't need to discuss
here how we might close this gap. This is only confusing.


<754>At a step (3) where the input intervals have the same radix as the output, it is natural that ee_F be implemented by an interval version of a "formatOf" operation as described in item 10, but this is not necessary. At a step where some input has a different radix from the output, it is left unspecified how
  the needed radix conversions are done (the example in item 10 illustrates
  alternatives).
  </754>

However these ee_F operations are implemented, it is easy to see that Moore's
  Basic Theorem of Interval Arithmetic holds for any such mixed-format
  calculation:

Theorem (BTIA). For any interval inputs xx = (xx_1, ..., xx_n), the result yy = ff(xx) of (4) encloses the exact range, range(f, xx) of the real function
  f over xx.



I sincerely hope that this proposal goes through some iterations
of refinement before it is seconded and thereby accepted as the
text to vote upon!


Arnold Neumaier