
Re: motion elementary functions



Dan, Hossam, P1788 members

On 13 Oct 2009, at 14:02, Dan Zuras Intervals wrote:
	We had great difficulty deciding on the correct definition
	of x^y for 754.  In the end, including all of pown, powr,
	& pow was the compromise I LEAST desired.  And I, like
	Hossam as I remember, argued for pow, the most general
	of the three.

I am VERY sympathetic to the idea of having one pow, which is to be the most general, in some sense. In principle... But I have doubts. Three different definitions of power are being put forward:

(a) Naive: x^n for integer n, defined for n > 0 as x*x*...*x (n times), and extended to n <= 0 for nonzero x, in the unique way that makes the index law x^(m+n) = x^m x^n hold.
(b) Exp-based: x^y defined as exp(y log(x)).
(c) Rational: the more subtle definition x^r for rational r, based on (a) and on the n-th root function.

Let Q_o (o for odd) be the set of rationals that have odd denominator when reduced to lowest terms. The set Z of integers (zero included) is a subset of Q_o because n=n/1 in lowest terms.

Domains of definition:
(a) "x^n" is defined for all (real x, integer n) except for (0,n)
        with n < 0.
    Yes: 0^0 is defined and is 1. This is, of course, convention,
    but x^0 = 1 for all x != 0 in this definition, so it is
    perverse not to extend it by continuity to x = 0.
(b) "x^y" is defined for all (real x>0, real y).
    But 0^0 is unequivocally NOT defined here. For good reason,
    as the joint limit as (x,y) -> (0,0) does not exist.
(c) "x^r" is defined, for rational r = p/q in lowest terms, to be
          (q-th root of x)^p
        whenever this exists.
        It actually exists iff
          q-th root of (x^p)
        exists, and they are equal, so either can be used as
        the definition. Specifically, x^r exists for
        -   all (real x>0, rational r)
        -   (x=0, all rational r >= 0)
        -   all (real x<0, r in Q_o).
    I don't think this quite matches what Dan said. But for x<0
    the remark
        "It can even be evaluated efficiently as -exp(y*log(|x|))"
    seems true.
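
A minimal sketch of definition (c) in C++, to make the three domain cases concrete. The name pow_rational and the use of NaN to mean "undefined" are assumptions for illustration, not anything proposed for 754 or P1788. The exponent is passed as an integer pair (p, q) so that it stays exact; the magnitude is only approximated by std::pow. Note that for x<0 the sign works out as (-1)^p, so the -exp(y*log(|x|)) form covers odd p only: for example (-8)^(2/3) = 4. Zero to a negative power is treated as undefined, as in (a).

```cpp
#include <cmath>
#include <cstdlib>
#include <limits>

// Hypothetical helper, not from any library: x^(p/q) following definition (c).
// Returns NaN where the power is undefined (an assumed convention).
double pow_rational(double x, long p, long q) {
    const double undefined = std::numeric_limits<double>::quiet_NaN();
    if (std::isnan(x) || q == 0) return undefined;

    // Reduce p/q to lowest terms with q > 0 (Euclid's algorithm).
    if (q < 0) { p = -p; q = -q; }
    long a = std::labs(p), b = q;
    while (b != 0) { long t = a % b; a = b; b = t; }
    p /= a; q /= a;                      // a = gcd(|p|, q) >= 1 because q > 0

    if (x == 0.0) {
        if (p > 0) return 0.0;
        if (p == 0) return 1.0;          // 0^0 = 1, as in definition (a)
        return undefined;                // zero to a negative power
    }

    const double r = static_cast<double>(p) / static_cast<double>(q);
    if (x > 0.0) return std::pow(x, r);  // defined for every rational r

    // x < 0: the q-th root exists only for odd q, i.e. r in Q_o.
    if (q % 2 == 0) return undefined;
    const double magnitude = std::pow(-x, r);
    return (p % 2 != 0) ? -magnitude : magnitude;   // sign is (-1)^p
}
```

With this sketch pow_rational(-32, 1, 5) is approximately -2, while pow_rational(-32, 1, 2) is NaN because 1/2 is not in Q_o.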

It is easy to prove:
Theorem.
These three functions agree. That is, at any (x,y) for which more than one of them is defined, they take the same value.

Corollary.
We can define a function Pow(x,y) on the union of the domains of these functions, to equal whichever of them is defined at that point.
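
As a numerical illustration of the theorem (not a proof), here is a rough C++ sketch of definitions (a) and (b) with a spot check on their common domain, x > 0 with integer exponent. The names pown_naive, powr_exp and agree are hypothetical, NaN stands in for "undefined", and the tolerance is an arbitrary choice.

```cpp
#include <cmath>
#include <limits>

// Definition (a): integer power by repeated multiplication (hypothetical name).
// NaN marks "undefined", an assumed convention.
double pown_naive(double x, long long n) {
    if (n < 0) {
        if (x == 0.0) return std::numeric_limits<double>::quiet_NaN();
        return 1.0 / pown_naive(x, -n);  // index law; ignores n == LLONG_MIN
    }
    double result = 1.0;                 // empty product, so x^0 = 1 even at x = 0
    for (long long i = 0; i < n; ++i) result *= x;
    return result;
}

// Definition (b): exp(y*log(x)), defined only for x > 0.
double powr_exp(double x, double y) {
    if (!(x > 0.0)) return std::numeric_limits<double>::quiet_NaN();
    return std::exp(y * std::log(x));
}

// Spot check of the theorem at one point. Returns false outside the common
// domain, because a NaN operand makes the comparison false.
bool agree(double x, int n, double rel_tol = 1e-12) {
    const double a = pown_naive(x, n);
    const double b = powr_exp(x, n);
    return std::fabs(a - b) <= rel_tol * std::fabs(a);
}
```

The two definitions differ exactly where the text says they do: pown_naive(0, 0) is 1, whereas powr_exp(0, 0) is NaN.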

Hossam, Dan:
Is this Pow(x,y) the same as the pow(x,y) that you want P1788 to use?

(***) Or, horrors, do you actually want to use the definition -exp(y*log(|x|)) for ALL x<0 and all y??

My doubts
---------
Recall Michel's sage words:
On 3 Sep 2009, at 11:42, Michel Hack wrote:
	The various examples where domain violations are mishandled if the
	wrong choice is made with regard to returning or not returning NaI
	demonstrate that the choice is needed.  The problem now is that
	this requires somebody to make the choice, i.e. understand what is
	going on.  In other words, the programmer has to THINK (oh my, how
	old-fashioned can you get).

This all-singing-all-dancing Pow(x,y) is in the PL/I tradition of "give a meaning to more or less everything, and worry later what inconsistencies may arise". But there are genuine subtleties. Take an example like Hossam's:

(i)    double z = Pow(-32,0.2);
A clever compiler will do this in exact arithmetic and define z = -2 by using definition (c) above. A not so clever compiler may convert 0.2 to binary FP, and produce z = undefined, presumably NaN?

(ii)   double y = 0.2;
       double z = Pow(-32,y);
This will certainly give z = undefined, unless the implementation does (***) above, for which I see no mathematical justification.

(iii) interval_b64 yy("[0.2]"); //constructor of double interval from string
       interval_b64 zz = Pow(-32,yy); //(point)^(interval)
Here yy is an interval containing the exact real 0.2, which is not representable in binary FP, so yy is necessarily of nonzero width; hence zz is also of nonzero width and cannot possibly be [-2,-2] exactly.

(iv)   interval_b64 xx(-32); // constructs [-32,-32]
       interval_b64 zz = Pow(xx,0.2); //(interval)^(point)
A VERY clever compiler may produce zz = [-2,-2] exactly here, but I wouldn't bank on it.
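
To make the width claim in (iii) concrete, here is a sketch of how a tightest double enclosure of a ratio such as 1/5 might be computed. Interval and enclose_ratio are hypothetical stand-ins (interval_b64 itself is not specified here), and the sketch assumes q > 0, round-to-nearest, finite operands and no overflow or underflow. It uses std::fma to get the sign of q*d - p with a single rounding.

```cpp
#include <cmath>
#include <limits>

struct Interval { double lo, hi; };   // hypothetical stand-in for interval_b64

// Tightest double interval containing the real number p/q
// (assumes q > 0, round-to-nearest, no overflow or underflow).
Interval enclose_ratio(double p, double q) {
    const double inf = std::numeric_limits<double>::infinity();
    const double d = p / q;                      // nearest double to p/q
    const double residual = std::fma(q, d, -p);  // q*d - p, rounded only once
    if (residual == 0.0) return {d, d};          // p/q is exactly representable
    if (residual > 0.0)  return {std::nextafter(d, -inf), d};  // d lies above p/q
    return {d, std::nextafter(d, inf)};          // d lies below p/q
}
```

For 1/5 this gives lo < hi, with hi equal to the double written 0.2, whereas enclose_ratio(1, 2) collapses to the point interval [0.5, 0.5].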

Of course in radix 10 it's a different story (at least for y's that can be expressed exactly in decimal FP). But for binary FP I can't see that this very general Pow will do more than cause grief to users who thought, wrongly, that it saved them from having to THINK. That's why I like the distinction of integer power x^n and real power x^y into different functions.
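
The binary FP point can be checked directly. Every finite double is a rational whose denominator is a power of two, so it lies in Q_o only if it is an integer. The sketch below, with hypothetical names Dyadic, exact_fraction and in_Q_o, extracts the exact fraction of a double; it assumes a finite argument and the usual 53-bit binary64 format.

```cpp
#include <cmath>
#include <cstdint>

// A finite double written exactly as num / 2^exp2, with num odd or zero
// (hypothetical helper type). exp2 <= 0 means the value is an integer.
struct Dyadic { std::int64_t num; int exp2; };

Dyadic exact_fraction(double v) {                  // assumes v is finite
    int e = 0;
    const double m = std::frexp(v, &e);            // v = m * 2^e, 0.5 <= |m| < 1
    std::int64_t num =
        static_cast<std::int64_t>(std::ldexp(m, 53));  // exact: m has <= 53 bits
    int exp2 = 53 - e;                             // v = num / 2^exp2
    if (num == 0) return {0, 0};                   // zero is the integer 0/1
    while (num % 2 == 0) { num /= 2; --exp2; }     // reduce to lowest terms
    return {num, exp2};
}

// v is in Q_o iff its reduced denominator is odd, i.e. v is an integer.
bool in_Q_o(double v) { return exact_fraction(v).exp2 <= 0; }
```

For the double nearest 0.2 this yields 3602879701896397 / 2^54, so in_Q_o(0.2) is false, which is why (ii) cannot return -2 under definition (c).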

No doubt I have misunderstood Hossam and Dan. So please put me right.

Best wishes

John