
Re: Motion 31: V04.2 Revision of proposed Level 1 text



> Subject: Re: Motion 31: V04.2 Revision of proposed Level 1 text
> From: John Pryce <j.d.pryce@xxxxxxxxxxxx>
> Date: Tue, 7 Feb 2012 15:28:39 +0000
> To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>
> 
> P1788, in particular Vincent, Dmitry, Dan
> 
> On 26 Jan 2012, at 13:19, Vincent Lefevre wrote:
> > On 2012-01-25 11:07:04 -0800, Dmitry Nadezhin wrote:
> >> 2) fma(x,y,z)
> >> The operation fma is mentioned in definitions, but the document
> >> doesn't clarify the status of the interval version of fma. Is it
> >> required, recommended, or neither?
> > 
> > At Level 1, it is the same as x * y + z. I wonder whether it should
> > be defined. Level 2 could introduce a generic concept of fused
> > operations, and recommend or require some of them.
> > (If there is a dot product, why not a sum of n intervals?)
> > 
> Oh dear, fma is meant to be required.
> 
> >> 3) negate(x) = -x
> >> The unary minus is an operator in most programming languages.
> >> Shouldn't we require an interval version of negate?
> > 
> > This is equivalent to 0 - x, and since { 0 } can be exactly
> > represented by [0,0], negate(x) may not be really necessary.
> > Ditto for recip(x), since { 1 } can also be exactly represented
> > by [1,1].
> 
> There are good arguments in either direction. We need a consistent policy on when derived operations (ones expressible in terms of existing ones) should have a separate definition in the standard. 
> 
> If you take the argument against fma to its conclusion, you'll decide not to give a definition for atan (derivable from atan2) or sinh and cosh (derivable from exp). IMO this is going too far. 
> 
> My attempt at a consistent policy is informal and incomplete:
> 
> (a) If an operation is defined (for floating point) in all the common scientific languages,
>   that's a good reason to include it. That means we include negate(x) (as required).
> 
> (b) If it is a required operation in 754, we should make it required. That means fma should
>   be required. Also, fma is a formatOf operation (754§5.4.1), so an implementation that
>   supports the mentioned types in a 754-conforming way shall support, e.g.,
>       ww_binary64 = fma_binary64(xx_binary128, yy_binary32, zz_binary64)
> 
>   with the indicated inf-sup types of the inputs and output, with a "tightest" result, i.e.
>   the binary64 inf-sup hull of the mathematical result. Other types shall provide fma, but
>   are neither required to make it "tightest" nor required to support mixed-type operations.
> 
> (c) Some cases puzzle me. We have sqr(x)=x^2 required because in interval arithmetic it
>   behaves much better than x*x. But sqr is just the special case p=2 of the (required)
>   pown(x,p). And the case for recip, which is pown(x,p) with p=-1, is even weaker.
> 
>   Should we remove these (and similar cases such as rsqrt(x)) from the Level 1 lists, and
>   say they just rename existing operations? I regard negate(x) as different, since
>   defining it as 0-x converts a 2-real-argument operation into a 1-real-argument one.
> 
> Dan, I think you have the experience to devise a consistent policy for this issue. Are you willing to submit a motion on it? I think my arguments (a,c) are just about Level 1, but (b) straddles Levels 1 and 2.
> 
> John

	Way back when 754 was still a dream we discussed the
	need for negate(x) on the grounds that it might have
	to behave in a particular manner WRT the sign of zero
	in the context of directed roundings & branch cuts in
	the complex plane.  But, after a discussion that
	lasted far far longer than the issue called for, we
	decided that the sign of zero should be the same as
	the expression +0 - x.  So that argument was rendered
	moot but we included negate(x) anyway on the grounds
	that it was just a sign flip in an era when floating-
	point was done in software.

	The sign of zero is moot once again in the context
	of intervals.  But I think your point (a) that most
	languages have a unary minus is still valid.

	We did NOT include a reciprocal(x).  While it is
	true that a + negate(b) == a - b under all
	circumstances, a*reciprocal(b) is not always equal
	to a/b: the left side suffers two roundings while
	the right side suffers only one.

	Again, this is moot for intervals at level 1.  But
	I suspect that A*reciprocal(B) tends to be slightly
	wider than A/B at level 2.  Though, you guys are the
	experts.
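
	A small C sketch (mine, purely illustrative; not
	from 754 or the 1788 draft) that just counts operand
	pairs where the doubly rounded a*(1.0/b) differs
	from the singly rounded a/b:

	    #include <stdio.h>

	    int main(void) {
	        /* a/b rounds once; a*(1.0/b) rounds the reciprocal and
	           then the product, so the two can differ by an ulp.  */
	        int diffs = 0;
	        for (double a = 1.0; a <= 50.0; a += 1.0)
	            for (double b = 1.0; b <= 50.0; b += 1.0)
	                if (a * (1.0 / b) != a / b)
	                    ++diffs;
	        printf("pairs in [1,50]^2 with a*(1/b) != a/b: %d\n", diffs);
	        return 0;
	    }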

	The requirement for FMA is new in 754-2008.  Its
	result is a*b + c evaluated as a single operation,
	i.e. the exact product and sum rounded only once.
	There is a new exception case associated with it:
	FMA(0,inf,NaN) = NaN need not signal Invalid,
	whereas 0*inf + NaN = NaN, done as two operations,
	does.

	Again, this is moot for intervals.  But for the same
	reason that Ulrich wants his dot product, the level 2
	FMA will be narrower than A*B + C.  So your point (b)
	is good in this case.
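
	A tiny C illustration of that single rounding (again
	mine, not from either standard): fma(a, a, -p)
	recovers the exact rounding error of the product
	p = a*a, which the two-rounding expression a*a - p
	cannot.

	    #include <math.h>
	    #include <stdio.h>

	    int main(void) {
	        double a = 1.0 + ldexp(1.0, -27); /* a*a needs 55 significand bits */
	        double p = a * a;                 /* product rounded once           */
	        double err_fma   = fma(a, a, -p); /* exact a*a - p, single rounding */
	        double err_naive = a * a - p;     /* rounded product cancels to 0   */
	        printf("err via fma:  %.17g\n", err_fma);
	        printf("err two-step: %.17g\n", err_naive);
	        return 0;
	    }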

	Further, while I think it is a good policy to include
	those things that 754 included, I really think we
	should examine each one for its suitability in 1788.
	The more I learn about intervals the more I see it
	is not just floating-point on 2 numbers.  In many
	cases it will be straightforward that the interval
	version of a 754 function is justified if only
	because it is constructed out of the floating-point
	version.  However, in many cases it will either
	(1) be constructed in some entirely different manner
	or (2) turn out to be something strange when removed
	from the context of floating-point.  We need to take
	some care in this.

	Then your point (c) is well taken that some functions
	like square(X) = X^2 take on a special meaning in the
	interval context that is qualitatively different from
	X*X.  In Clause 9 of 754, we make a point of defining
	the various power-like functions so that things like
	x^2 & 1/x come out right.  We should take the same
	care in 1788 WRT how these things behave for intervals.
	Our users will not thank us for this attention to
	detail but they will surely curse us if we get it
	wrong.
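
	To make that concrete, here is a rough C sketch (my
	own, Level-1 style with no outward rounding) of why
	sqr(X) is not the same operation as X*X when X
	straddles zero:

	    #include <stdio.h>

	    typedef struct { double lo, hi; } interval;

	    /* naive interval product: min/max of the four endpoint
	       products; outward rounding deliberately ignored here */
	    static interval mul(interval x, interval y) {
	        double p[4] = { x.lo*y.lo, x.lo*y.hi, x.hi*y.lo, x.hi*y.hi };
	        interval r = { p[0], p[0] };
	        for (int i = 1; i < 4; i++) {
	            if (p[i] < r.lo) r.lo = p[i];
	            if (p[i] > r.hi) r.hi = p[i];
	        }
	        return r;
	    }

	    /* interval square: range of t^2 over t in x, never negative */
	    static interval sqr(interval x) {
	        double a = x.lo*x.lo, b = x.hi*x.hi;
	        interval r;
	        r.hi = (a > b) ? a : b;
	        r.lo = (x.lo <= 0.0 && x.hi >= 0.0) ? 0.0 : ((a < b) ? a : b);
	        return r;
	    }

	    int main(void) {
	        interval x = { -1.0, 2.0 };
	        interval xx = mul(x, x), x2 = sqr(x);
	        printf("X*X    = [%g, %g]\n", xx.lo, xx.hi); /* [-2, 4] */
	        printf("sqr(X) = [%g, %g]\n", x2.lo, x2.hi); /* [ 0, 4] */
	        return 0;
	    }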

	In analogy to square(X), we might also consider some
	polynomial evaluation functions on the grounds that
	real improvements are possible over the naive
	polynomial expression.  I don't know if there are
	other things which make no difference in floating-
	point but can be improved in intervals but I suspect
	there are.  Trot them out if you know them.
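
	As one toy illustration (mine, not from the draft):
	for X = [0,2] and p(x) = x^2 - 2x + 1, term-by-term
	interval evaluation gives sqr(X) - 2*X + 1
	= [0,4] - [0,4] + 1 = [-3,5], while the equivalent
	centered form sqr(X - 1) = sqr([-1,1]) = [0,1] is
	the exact range.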

	Finally, I think all of your points (a), (b), & (c)
	should be considered in both level 1 & level 2 contexts.
	We will make some decisions at level 1 but may need to
	consider the level 2 functions to make others.

	(For example, Nate & I are currently trying to work
	out a problem with Dmitry's approach to radius()
	which came about because of the difference between
	the level 1 & level 2 definitions of this function.
	It is not always a slam dunk.)

	As for me making a motion on the topic, I think I am
	unqualified to do so for some of the reasons I've
	outlined.  I may be able to advise others on the
	floating-point implications of standardizing some
	larger expression.  But I don't know the algorithms
	like you (plural) do.


				Dan