Re: Motion 31: V04.2 Revision of proposed Level 1 text
> Subject: Re: Motion 31: V04.2 Revision of proposed Level 1 text
> From: John Pryce <j.d.pryce@xxxxxxxxxxxx>
> Date: Tue, 7 Feb 2012 15:28:39 +0000
> To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>
>
> P1788, in particular Vincent, Dmitry, Dan
>
> On 26 Jan 2012, at 13:19, Vincent Lefevre wrote:
> > On 2012-01-25 11:07:04 -0800, Dmitry Nadezhin wrote:
> >> 2) fma(x,y,z)
> >> The operation fma is mentioned in definitions, but the document
> >> doesn't clarify the status of the interval version of fma. Is it required,
> >> recommended or none ?
> >
> > At Level 1, it is the same as x * y + z. I wonder whether it should
> > be defined. Level 2 could introduce a generic concept of fused
> > operations, and recommend or require some of them.
> > (If there is a dot product, why not a sum of n intervals?)
> >
> Oh dear, fma is meant to be required.
>
> >> 3) negate(x) = -x
> >> The unary minus is an operator in most programming languages.
> >> Shouldn't we require interval version of negate ?
> >
> > This is equivalent to 0 - x, and since { 0 } can be exactly
> > represented by [0,0], negate(x) may not be really necessary.
> > Ditto for recip(x), since { 1 } can also be exactly represented
> > by [1,1].
>
> There are good arguments in either direction. We need a consistent policy on when derived operations (ones expressible in terms of existing ones) should have a separate definition in the standard.
>
> If you take the argument against fma to its conclusion, you'll decide not to give a definition for atan (derivable from atan2) or sinh and cosh (derivable from exp). IMO this is going too far.
>
> My attempt at a consistent policy is informal and incomplete:
>
> (a) If an operation is defined (for floating point) in all the common scientific languages,
> that's a good reason to include it. That means include negate(x) (as required).
>
> (b) If it is a required operation in 754, we should make it required. That means fma should
> be required. Also, fma is a formatOf operation (754§5.4.1) so an implementation, that
> supports the mentioned types in a 754-conforming way, shall support, e.g.,
> ww_binary64 = fma_binary64(xx_binary128, yy_binary32, zz_binary64)
>
> with the indicated inf-sup types of the inputs and output, with a "tightest" result, i.e.
> the binary64infsup hull of the mathematical result. Other types shall provide fma, but
> are neither required to make it "tightest" nor required to support mixed-type operations.
>
> (c) Some cases puzzle me. We have sqr(x)=x^2 required because in interval arithmetic it
> behaves much better than x*x. But it is the case of (required) pown(x,p) when p=2. And
> the case for recip, which is pown(x,p) when p=-1, is even weaker.
>
> Should we remove these (and similar cases such as rsqrt(x)) from the Level 1 lists, and
> say they just rename existing operations? I regard negate(x) as different, since
> defining it as 0-x converts a 2-real argument to a 1-real argument operation.
>
> Dan, I think you have the experience to devise a consistent policy for this issue. Are you willing to submit a motion on it? I think my arguments (a,c) are just about Level 1, but (b) straddles Levels 1 and 2.
>
> John

Way back when 754 was still a dream we discussed the
need for negate(x) on the grounds that it might have
to behave in a particular manner WRT the sign of zero
in the context of directed roundings & branch cuts in
the complex plane. But, after a discussion that
lasted far far longer than the issue called for, we
decided that the sign of zero should be the same as
the expression +0 - x. So that argument was rendered
moot but we included negate(x) anyway on the grounds
that it was just a sign flip in an era when floating-
point was done in software.

The sign of zero is moot once again in the context
of intervals. But I think your point (a) that most
languages have a unary minus is still valid.

We did NOT include a reciprocal(x). While it is
true that a + negate(b) == a - b under all
circumstances, the expression a*reciprocal(b) != a/b.
The left side has two rounding errors while the right
side has only one.

Again, this is moot for intervals at level 1. But
I suspect that A*reciprocal(B) tends to be slightly
wider than A/B at level 2. Though, you guys are the
experts.
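
A minimal Python sketch of that suspicion at level 2, assuming binary64
endpoints, strictly positive intervals, and tightest outward rounding of each
exact endpoint. The helper names (round_down, round_up, div, recip, mul) are
illustrative only and are not taken from any 1788 draft. Requires Python 3.9+
for math.nextafter.

```python
from fractions import Fraction
import math

def round_down(q):
    """Largest binary64 value <= the exact rational q."""
    f = float(q)  # nearest double to q
    return math.nextafter(f, -math.inf) if Fraction(f) > q else f

def round_up(q):
    """Smallest binary64 value >= the exact rational q."""
    f = float(q)  # nearest double to q
    return math.nextafter(f, math.inf) if Fraction(f) < q else f

# Intervals are (lo, hi) pairs of doubles with 0 < lo <= hi. That assumption
# keeps each endpoint formula to a single case.
def div(a, b):
    return (round_down(Fraction(a[0]) / Fraction(b[1])),
            round_up(Fraction(a[1]) / Fraction(b[0])))

def recip(b):
    return div((1.0, 1.0), b)

def mul(a, b):
    return (round_down(Fraction(a[0]) * Fraction(b[0])),
            round_up(Fraction(a[1]) * Fraction(b[1])))

A = (3.0, 3.0)
B = (3.0, 3.0)
direct = div(A, B)            # one outward rounding per endpoint
via_recip = mul(A, recip(B))  # two outward roundings per endpoint
print(direct, via_recip)
```

In this instance A/B is the point interval [1,1], while A*recip(B) strictly
contains 1, because 1/3 is rounded outward once in recip and the product is
rounded outward again.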

The requirement for FMA is new to 754-2008. The
result is the exact value of a*b + c, evaluated as a
single expression and rounded only once. There is a
new exceptional case associated with it: FMA(0,inf,NaN)
= NaN need not signal Invalid whereas 0*inf + NaN = NaN
does.

Again, this is moot for intervals. But for the same
reason that Ulrich wants his dot product, the level 2
FMA will be narrower than A*B + C. So your point (b)
is good in this case.
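
The narrowing can be sketched in Python for point intervals, assuming binary64
endpoints and tightest outward rounding. The names round_down, round_up and
hull, and the particular values of a, b and c, are chosen for illustration and
do not come from the 1788 text. Requires Python 3.9+ for math.nextafter.

```python
from fractions import Fraction
import math

def round_down(q):
    """Largest binary64 value <= the exact rational q."""
    f = float(q)  # nearest double to q
    return math.nextafter(f, -math.inf) if Fraction(f) > q else f

def round_up(q):
    """Smallest binary64 value >= the exact rational q."""
    f = float(q)  # nearest double to q
    return math.nextafter(f, math.inf) if Fraction(f) < q else f

def hull(q):
    """Tightest binary64 interval containing the exact rational q."""
    return (round_down(q), round_up(q))

# Point intervals [a,a], [b,b], [c,c]; all three values are exact doubles.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = 2.0**-60

# Fused: a*b + c is formed exactly, then rounded outward once.
fused = hull(Fraction(a) * Fraction(b) + Fraction(c))

# Unfused: the product is rounded outward, then the sum is rounded outward.
p_lo, p_hi = hull(Fraction(a) * Fraction(b))
unfused = (round_down(Fraction(p_lo) + Fraction(c)),
           round_up(Fraction(p_hi) + Fraction(c)))

print(fused, unfused)
```

Here a*b = 1 - 2^-60 is not representable, but a*b + c = 1 is, so the fused
result is the point interval [1,1] while the unfused result strictly contains 1.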

Further, while I think it is a good policy to include
those things that 754 included, I really think we
should examine each one for its suitability in 1788.
The more I learn about intervals the more I see it
is not just floating-point on 2 numbers. In many
cases it will be straightforward that the interval
version of a 754 function is justified if only
because it is constructed out of the floating-point
version. However, in many cases it will either be:
(1) constructed in some entirely different manner
or (2) turn out to be something strange when removed
from the context of floating-point. We need to take
some care in this.

Then your point (c) is well taken that some functions
like square(X) = X^2 take on a special meaning in the
interval context that is qualitatively different from
X*X. In Clause 9 of 754, we make a point of defining
the various power-like functions so that things like
x^2 & 1/x come out right. We should take the same
care in 1788 WRT how these things behave for intervals.
Our users will not thank us for this attention to
detail but they will surely curse us if we get it
wrong.
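
A small level 1 illustration of why square(X) differs from X*X, written as a
Python sketch with exact integer endpoints. The functions mul and sqr are
illustrative definitions for bounded nonempty intervals, not the standard's
wording.

```python
def mul(x, y):
    """Interval product: range of u*v with u in x and v in y, chosen independently."""
    products = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return (min(products), max(products))

def sqr(x):
    """Interval square: range of u*u for a single u in x."""
    lo, hi = x
    lo2, hi2 = lo * lo, hi * hi
    if lo <= 0 <= hi:
        # The interval contains zero, so the minimum of u*u is 0.
        return (0, max(lo2, hi2))
    return (min(lo2, hi2), max(lo2, hi2))

X = (-1, 2)
product = mul(X, X)  # treats the two factors as independent
square = sqr(X)      # knows both factors are the same value
print(product, square)
```

For X = [-1,2] the product X*X is [-2,4], while square(X) is [0,4], which is
the true range of x^2 over X.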

In analogy to square(X), we might also consider some
polynomial evaluation functions on the grounds that
real improvements are possible over the naive
polynomial expression. I don't know if there are
other things which make no difference in floating-
point but can be improved in intervals but I suspect
there are. Trot them out if you know them.
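
One hedged example of such an improvement, at level 1 with exact rational
endpoints: evaluating p(x) = x^2 - x over X = [0,1]. The rewriting used here
(completing the square) is just one illustrative choice and is not a method
proposed in this thread. The helpers sub and sqr are hypothetical names.

```python
from fractions import Fraction

def sub(x, y):
    """Interval difference x - y."""
    return (x[0] - y[1], x[1] - y[0])

def sqr(x):
    """Interval square: range of u*u for a single u in x."""
    lo, hi = x
    lo2, hi2 = lo * lo, hi * hi
    if lo <= 0 <= hi:
        return (0, max(lo2, hi2))
    return (min(lo2, hi2), max(lo2, hi2))

X = (Fraction(0), Fraction(1))
half = (Fraction(1, 2), Fraction(1, 2))
quarter = (Fraction(1, 4), Fraction(1, 4))

# Naive expression: X appears twice and is treated as two independent values.
naive = sub(sqr(X), X)

# Same polynomial written as (x - 1/2)^2 - 1/4, where X appears only once.
rewritten = sub(sqr(sub(X, half)), quarter)

print(naive, rewritten)
```

The naive expression yields [-1,1], while the rewritten form yields [-1/4,0],
which is the exact range of p over [0,1].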

Finally, I think all of your points (a), (b), & (c)
should be considered in both level 1 & level 2 contexts.
We will make some decisions at level 1 but may need to
consider the level 2 functions to make others.
(For example, Nate & I are currently trying to work
out a problem with Dmitry's approach to radius()
which came about because of the difference between
the level 1 & level 2 definitions of this function.
It is not always a slam dunk.)

As for me making a motion on the topic, I think I am
unqualified to do so for some of the reasons I've
outlined. I may be able to advise others on the
floating-point implications of standardizing some
larger expression. But I don't know the algorithms
like you (plural) do.

Dan