
Re: Suggestion of annex D



Florent,

your proposal looks good to me, but you need to address the problem of what to return when the limit is zero. I have put a rough suggestion at the end of this reply. A principle is also required for -0 when the function's domain does not include negative values. These two items are the main reason for this reply.

Florent de Dinechin wrote:
Most of our special case questions are not problems of floating-point arithmetic, but problems on more general mathematics.
The principled view is that all these cases should return a NaN, which programmers test for and deal with as is appropriate to their application. The practical view is that for certain cases certain definite values are more convenient for most programmers, saving them the need to test for NaN. Those who don't like the choice would have to test for the special case before the calculation, rather than test for NaN afterwards. This is not too much of a burden provided the test is not too difficult, i.e. not something that would require as much work as calculating the original function. But ultimately it is a matter of taste and thus of consensus.
For example, concerning pow, there is a consensus among the mathematicians that 0^0=1 is convenient.
Which makes a general rule more general, hence less fiddly and less likely to lead to errors.
There is no such consensus on 0^infty, 1^infty, etc.
Actually, 0^infty should equal 0, by the limiting principle you describe below. Perhaps you meant infty^0. In this case almost identical arguments can be used to justify infty^0 = 1.
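For reference, the cases named above can be checked against an existing implementation. This is only a minimal sketch, assuming Python's math.pow, which is documented to return 1.0 for pow(x, 0.0) and pow(1.0, x); the dictionary name "cases" is just for illustration. Note that the 1^infty result reflects the C99-style convention, not a limit argument.

```python
import math

inf = math.inf

# The special cases discussed above, evaluated with Python's math.pow.
cases = {
    "0^0": math.pow(0.0, 0.0),
    "0^inf": math.pow(0.0, inf),
    "inf^0": math.pow(inf, 0.0),
    "1^inf": math.pow(1.0, inf),
}

for name, value in cases.items():
    print(f"{name} = {value}")
```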
The previous function is undefined there.
.
.
.
Once this clarification is made, the second idea in this suggestion is that the operation returns NaN for any input on which the function is undefined. This is the easy consistent option, but it differs from the C99 standard.
Disagreeing with the C standard, or any other standard for that matter, is the correct choice when they specify bad choices. The result will be that users will have the choice of the more considered IEEE 754 function or an inferior version.

I have only seen the draft C standard, but its suggestion for pow(x,y) does not inspire confidence. My main objection is that it should be split into two functions: one taking integer exponents, and a second taking real exponents but applied only to non-negative bases.
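A rough sketch of what such a split might look like follows. The names pow_int and pow_real are hypothetical, and both simply delegate to Python's math.pow after checking their domain; returning NaN for a negative base in pow_real is one possible choice, assumed here for illustration.

```python
import math

def pow_int(x: float, n: int) -> float:
    """Hypothetical integer-exponent power: any base, integer n only."""
    if not isinstance(n, int):
        raise TypeError("pow_int requires an integer exponent")
    return math.pow(x, n)

def pow_real(x: float, y: float) -> float:
    """Hypothetical real-exponent power: non-negative bases only."""
    if x < 0.0:
        # Assumed policy: a negative base is outside the domain.
        return math.nan
    return math.pow(x, y)

print(pow_int(-2.0, 3))     # negative base is fine with an integer exponent
print(pow_real(-2.0, 3.0))  # outside the domain of the real-exponent version
```

With this split, a caller states up front which mathematical function is meant, so the question of negative bases with non-integer exponents never arises in pow_int.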
.
.
.
Now to the main purpose of my reply.
For an n-ary function f and floating-point numbers X1,..Xn,
when the limit of f(x1,...xn) as (x1,...xn) tends to (X1,..Xn) exists and is a real number c , the operation associated to f(X1,...Xn) shall return c rounded in the prevailing rounding mode. When this limit is an infinity, the operation shall return this infinity. When this limit does not exist, (X1,..Xn) shall be considered as an invalid operand. In the case when one of the Xi is +0 (resp. -0), the limit shall be considered using positive (resp. negative) values only of the corresponding xi.
Do you intend this to apply to all limit points, i.e. not just to those involving infinity? I think this is the correct choice, but then you must consider discontinuities, the most common of which are branch points. These cases would always require special consideration. Fortunately, there is a strong consensus for most branch points. Just as fortunately, the consensus is usually covered by the cases of the type x tends to +0 and x tends to -0.
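Two familiar branch cuts show how the sign of zero selects the side of the cut. This is a small sketch using Python's math.atan2 and cmath.sqrt, chosen only as convenient existing implementations; the variable names are illustrative.

```python
import cmath
import math

# atan2 has a branch cut along the negative real axis; the sign of the
# zero argument selects the side from which the cut is approached.
above = math.atan2(0.0, -1.0)    # y tends to +0
below = math.atan2(-0.0, -1.0)   # y tends to -0

# The complex square root has the same cut, and the sign of the zero
# imaginary part plays the same role.
root_above = cmath.sqrt(complex(-1.0, 0.0))
root_below = cmath.sqrt(complex(-1.0, -0.0))

print(above, below)
print(root_above, root_below)
```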

The case when the limit is zero needs to be dealt with. What is needed is something like: when the limit at (X1, ..., Xn) is 0, then if f(x1, ..., xn) >= 0 in a region around (X1, ..., Xn), +0 should be returned; otherwise, if it is <= 0 in a region around (X1, ..., Xn), -0 should be returned; otherwise +0 should be returned. This suggestion has a bias towards +0, i.e. if f(x1, ..., xn) is identically zero around (X1, ..., Xn), then return +0, and if f(x1, ..., xn) takes both strictly positive and strictly negative values arbitrarily close to (X1, ..., Xn), return +0. In the first of these cases, -0 would be preferable if the function is < 0 immediately outside the region where it is identically zero.
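The rule can be sketched for a function of one variable as follows. This is only an illustration: the function name zero_with_sign, the sampling radius, and the idea of deciding the sign by sampling nearby points are all assumptions made here, not part of the proposal, and sampling is a heuristic that cannot prove the sign condition holds in a whole region.

```python
import math

def zero_with_sign(f, X, radius=1e-3, samples=50):
    """Pick +0.0 or -0.0 for a zero limit at X by sampling f near X."""
    values = []
    for k in range(1, samples + 1):
        h = radius * k / samples
        values.append(f(X - h))
        values.append(f(X + h))
    if all(v >= 0 for v in values):
        return 0.0    # non-negative nearby (includes identically zero)
    if all(v <= 0 for v in values):
        return -0.0   # non-positive nearby
    return 0.0        # mixed signs: bias towards +0

for f in (lambda x: x * x, lambda x: -x * x, lambda x: x ** 3):
    print(math.copysign(1.0, zero_with_sign(f, 0.0)))
```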

Another important case is when the limit as x tends to zero is zero, for a function which is undefined for x < 0. My strong preference is that f(-0) should equal f(+0), which should equal f(zero). I am using "zero" to indicate the mathematical value, as opposed to the floating-point constructs +/-0. However, the only example of this case I can see in the standard is sqrt(), where sqrt(-0) is specified to be -0. I do not know the justification for this choice, but it seems problematic, as -infty's can turn up unexpectedly where programmers would assume they can only get positive values.
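The sqrt(-0) concern can be seen in a few lines. This sketch assumes Python's math.sqrt, which returns -0.0 for -0.0. Because Python raises ZeroDivisionError on float division by zero, the helper ieee_reciprocal is a hypothetical stand-in that emulates the IEEE 754 result of 1/x at a signed zero.

```python
import math

def ieee_reciprocal(x: float) -> float:
    """Hypothetical helper: 1/x with IEEE 754 behaviour at +/-0."""
    if x == 0.0:
        return math.copysign(math.inf, x)
    return 1.0 / x

root = math.sqrt(-0.0)                   # -0.0
sign_of_root = math.copysign(1.0, root)  # exposes the sign of the zero
reciprocal = ieee_reciprocal(root)       # the sign of the zero propagates

print(root, sign_of_root, reciprocal)
```

A programmer who assumes that a square root is never negative would expect the reciprocal to be +infty here.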

Peter Henderson
