
Re: text2interval again / a digression on fixed-point iterations, Mathematica Intervals, point-value vs interval constants



On 3/8/2013 2:14 AM, Arnold Neumaier wrote:
On 03/08/2013 08:39 AM, Richard Fateman wrote:
On 3/7/2013 8:52 PM, Michel Hack wrote:
Richard Fateman wrote:
On 3/7/2013 12:51 AM, Arnold Neumaier wrote:

That is, something like:

set_rounding_mode(negative_infinity)
y:=sin(x)

is not the kind of thing supported anywhere I know of as a way to get
a lower bound on sin(x). Perhaps you know of such a system?

One can do it in Java and in C++.

The documentation does not say that sin/cos/ ... etc observe the rounding modes
at all.

I looked at the source code for sin, e.g. http://www.netlib.org/fdlibm/k_sin.c, and there is no mention of observing the rounding mode. It is certainly not the case that evaluating a polynomial P at x with the rounding mode set to toward-negative-infinity will produce a lower bound on P(x): consider (1-x^2). If x^2 is rounded down, the subtraction 1-x^2 comes out too large, giving an upper bound instead.
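A toy example may make the point vivid. The sketch below (my own illustration, in Python with exact rationals and a made-up two-decimal-digit format that always rounds toward negative infinity, not IEEE doubles) shows that rounding x^2 down pushes 1 - x^2 up, so "evaluate everything rounded down" yields an upper bound on P(x) = 1 - x^2, not a lower one:

```python
import math
from fractions import Fraction

def round_down(q, digits=2):
    # Round a rational toward -infinity onto a two-decimal grid:
    # a toy stand-in for a round-toward-negative-infinity mode.
    scale = 10 ** digits
    return Fraction(math.floor(q * scale), scale)

x = Fraction(34, 100)       # 0.34, exactly representable in the toy format
exact = 1 - x * x           # true value of P(x): 0.8844

sq = round_down(x * x)      # 0.1156 rounds DOWN to 0.11
naive = round_down(1 - sq)  # 1 - 0.11 = 0.89, already exact in two digits

print(naive > exact)        # True: the all-rounded-down result is ABOVE P(x)
```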

I found a sqrt routine that makes an effort to find out the rounding mode for one step in its calculation, but, as far as I can tell from a cursory inspection, it does so not to affect the final result, but to converge to the expected round-to-nearest value regardless of the prevailing rounding mode.



Perhaps I am being unrealistic in thinking that intervals can peacefully coexist with machine scalars, and that the standard should encourage that, not handicap it.

There exist several interval packages (e.g., in Java and C++) using IEEE directed rounding support to provide rigorous results.

The standard is supposed to unify the different approaches and to remove some inefficiencies of the current packages that used to need ugly and/or slow workarounds.
I have not looked at these recent libraries (except the ones my students or I happen to have written). Using IEEE directed rounding may be slow, and in some environments it is apparently not reliable.


But if you insist on doing number parsing as part of the standard, then
"1/10" is a far more explicit and obvious notation than "0.1".

Parsing arbitrary rationals or arbitrary decimal floats makes little difference, either in programming effort or in the resulting speed.
I agree. After all, 0.123 could be written 123/10^3. The difference is in the semantics of the notation "0.1" inside text2interval and outside it, in the host language. If the host language's literal constants allow 1/10, then it is plausible for 1/10 to mean the same thing everywhere.
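For what it is worth, parsing a literal exactly and then producing the tightest enclosing pair of doubles takes only a few lines. A sketch (my own hypothetical helper, not the standard's text2interval) in Python:

```python
from fractions import Fraction
from math import inf, nextafter

def tightest_enclosure(text):
    # Parse the literal exactly as a rational ("0.1", "123/1000", ...),
    # then bracket it by the nearest doubles on each side.
    q = Fraction(text)                  # exact value of the text
    d = float(q)                        # correctly rounded nearest double
    if Fraction(d) < q:
        return d, nextafter(d, inf)     # nearest double lies below q
    if Fraction(d) > q:
        return nextafter(d, -inf), d    # nearest double lies above q
    return d, d                         # q is exactly representable

lo, hi = tightest_enclosure("0.1")      # 1/10 is not a double: lo < 1/10 < hi
print(tightest_enclosure("0.25"))      # exactly representable: (0.25, 0.25)
```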


Note that text2interval is supposed to be used mainly for conversion from human-written text in some program (though it could be exploited for other tasks).
Then it probably doesn't do the job.
It seems to me that it is defined to convert from some human mathematician's notes into an interval program, or perhaps from one interval program to another.
If the human-written text is in a program that is NOT an interval program, then 0.1 means something different in the program than in text2interval. This is illustrated below in a fixed-point iteration (which you supplied).



If the mathematician was so inclined to prove something about x in
[1/10, 2/10] with a computer, then he (or she) should first make
sure that [0.1,0.2] meant the same as [1/10,2/10].

For a mathematician, these indeed mean precisely the same.
That would be a mathematician who

(a) is accidentally or deliberately oblivious to computing technology.
[This is not a moral judgment -- I certainly do not require
mathematicians to be computer literate. Cf. Gauss,
for example. :) ]

or

(b) knows about computer representation of numbers like 0.1,
but has learned that this should be disregarded when
written as "0.1" in an interval library.

or

(c) is a bit vague when it comes to numbers -- perhaps
an applied mathematician working on an investigation
in which the exact meaning of 0.1 only needs to be
"close enough for government work."

By contrast, many people using computers, even mathematicians doing scientific computing or experimental mathematics who are familiar with programming and such, might truly be confused -- at least if they cared enough -- that 0.1 and "0.1" were ever so slightly different, and by why they have to discard their previous understanding.




Indeed, interval methods were partly invented in order that one can do rigorous mathematics even though a computer with inexact computations is involved. See, e.g., my paper on computer-assisted proofs,
     http://www.mat.univie.ac.at/~neum/ms/caps.pdf
thanks, I was not aware of this paper.

You emphasize (correctly, in my view) that the success of interval
methods is in part due to the simplicity of expressing such computations
in (say) Matlab.  The transparency of such proofs often
benefits from  essentially the same program texts being used to express
algorithms and to execute the interval versions.  And, I think, having
the constants in each program text mean the same thing.


There are, of course, other mechanical ways of proving theorems, but it
is nice to see some examples of intervals doing a job.

Sadly  (for mathematicians) the real market for proofs is in
proving that computer programs conform to specifications.  People in
general are much more concerned that a bridge will not
fall down than about mathematical conjectures.  Presumably
global optimization plays an important role in some practical
matters too.  A plan that targets mathematicians to the exclusion
of others (like engineers, programmers)  may not be ideal.

Unless I am missing something in the problem statement,
I think the solution to systems of (algebraic) inequalities can
be decided rigorously by cylindrical algebraic decomposition (CAD)
in computer algebra systems.  Interval methods may be a
vastly superior technique since CAD is slow.  I suppose this
deserves more study.




Are we going
to protect the mathematician from all possible representation-
related issues in the scalar world outside intervals?
YES -- that's precisely what text2interval() can do: make the program
portable to environments with different underlying representations,
while at the same time allowing the properties of the representation
to be fully exploited.
This goal does not seem objectively satisfiable, unless I am misunderstanding something. It seems to me that programs that work in double-float may fail in single-float -- not in the validity of an enclosure, but in the usefulness of the answer, namely tightness. If your goal is to test whether two intervals intersect, your program may not work the same everywhere.

Yes. Decisions based upon an interval program must have three-way branches, depending on whether a positive decision, a negative decision, or no decision is reached. This is not really different from the usual try-catch mechanism where the inability of a program to give an answer (since it may fail) is accounted for explicitly.
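A minimal sketch of such a three-way branch (my own illustration in Python, with intervals as plain (lo, hi) pairs; not any particular package's API):

```python
def certainly_less(a, b):
    # Three-way decision: True (positive decision), False (negative),
    # or None when the intervals overlap and no decision is reached.
    alo, ahi = a
    blo, bhi = b
    if ahi < blo:
        return True       # every point of a is below every point of b
    if alo >= bhi:
        return False      # no point of a is below any point of b
    return None           # overlap: the widths are too large to decide

print(certainly_less((1.0, 2.0), (3.0, 4.0)))   # True
print(certainly_less((3.0, 4.0), (1.0, 2.0)))   # False
print(certainly_less((1.0, 3.5), (3.0, 4.0)))   # None
```

The None branch then plays the role of the catch clause: the caller must handle it explicitly, e.g. by bisecting or by retrying at higher precision.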

Not reaching a decision is not counted as a failure in logic, but only as a failure in deciding a problem.
OK, I accept your re-definition of "decision" for this purpose.
As there are undecidable problems in mathematics anyway, the failure to decide something is not that severe.
I guess that here I disagree with your last statement on severity.

The fact that some class of problems is in general undecidable means to me that no one can expect you to provide a fool-proof algorithm to solve every member of that class of problems. It does not
give you permission to fail on (easily solvable) members of that class.

The problem of "equivalence to zero of a symbolic expression" is undecidable, yet almost
everything done by computer algebra systems relies on making such decisions.



What correctly programmed interval methods (and related methods based on directed rounding) however assure is that _whenever_ a decision is arrived at, it is provably correct.

Moreover, for many problems one can restart an algorithm in case of failure with higher and higher precision until one gets the answer. Indeed, this is done for quite a number of computational geometry algorithms, where correct performance depends on making certain branching decisions in a way guaranteed to be mathematically correct -- otherwise visibly wrong artifacts may be produced.

Until one gets the answer, or runs out of patience.


Perhaps one can show that in some sense a better representation will
always get
an answer that is at least as good as a worse representation.

There are certain such results, based on the inclusion monotonicity of interval arithmetic, and there are other results based on asymptotic properties, namely that ultimately arbitrarily accurate bounds are obtainable.
See, e.g., my book on interval methods,
    A. Neumaier,
    Interval Methods for Systems of Equations,
    Encyclopedia of Mathematics and its Applications 37,
    Cambridge Univ. Press, Cambridge 1990.



The value is determined by the mathematical problem posed, which is
usually in text format.
I really don't see this the same way you do. There are applications in
which some physical problem starts with some uncertainty.  A scientist
will not know the uncertainty exactly and so will add some "slop" to it.

Of course, there are also these applications. In this case the precise bounds would not matter, but even in this case it is useful to know that the result obtained is correct for the problem exactly as specified.

In contrast, pure floating-point calculations may result in answers that are arbitrarily off the true answer, and sometimes without warning.
A simple example is the harmless-looking program
    x=0.2;
    for k=1:30, x=6*x-1; end;
which has the exact result 0.2. But the computed result in double precision floating-point arithmetic is orders of magnitude away.
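The effect is easy to reproduce in any binary64 environment; a quick check in Python (equivalent to the Matlab loop above):

```python
# 1/5 is a fixed point of x -> 6*x - 1, but the double nearest 0.2
# misses 1/5 by about 1.1e-17, and each step multiplies that error by 6.
# After 30 steps the initial error alone has grown by 6^30, about 2.2e23.
x = 0.2
for _ in range(30):
    x = 6 * x - 1
print(x)    # millions, nowhere near 0.2
```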

This would be revealed by an interval computation of the form
    xx=1/text2interval('0.2');
    for k=1:30, xx=4*xx-2; end;
(?? typo? Do you want the same program? Anyway, I'll stick with the first one.)
However, your suggestion to use instead
    x=0.2;
    xx=num2interval(x,x);
    for k=1:30, x=4*x-2; end;
would
- either (if num2interval does not move the bounds) give an initial xx
  not containing the correct data,

Of course, in my view the initial data expressed as 0.2 (or, to be explicit about double floats, 0.2d0) would not be 1/5. It would be 0.200000000000000011102230246.... if expressed in higher precision.

So you are incorrect in claiming that the number is not correct. The number is what it is, and
in particular it is not the same mathematical quantity as 1/5.

And the operation x=6*x-1 is subject to the ordinary rules of computer arithmetic, and is entirely predictable and computable given some axiomatic encoding of the arithmetic.

hence no guarantee about the
  subsequent behavior would be possible;
Actually the behavior is entirely deterministic, and many people derive excellent guarantees about subsequent behavior. They conclude, by various methods that you are familiar with, that their answers are within certain bounds -- for example, that a routine for sin() returns an answer within 1 ULP without doing any interval arithmetic at all.

Now of course they may have made mistakes in their derivations, and their proofs may be wrong. It is sometimes possible to verify some properties heuristically by testing. And it is not easy.

Are there hazards with proof by interval computation?  Sure.
Some people will create sequences of interval arithmetic operations which are not the same as the point-value computations they are attempting to enclose, because the sequences they compose (or that their compiler composes out of calls to a library) are incorrect; or perhaps they use initial values different from those of the point-value computations, which may be the ones they want to prove something about.

Perhaps one example of the mismatch is the one that you provide.

You have apparently proven that 0.2d0 is a fixed point of that iteration, when it is not.

Exact computation with the rational number equal to 0.2d0,
namely x = 3602879701896397/18014398509481984,
followed by
for k:1 thru 30 do x:6*x-1;
results in
the value 41178229774373/16777216, or about 2454413.8,
not 1/5.
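The Maxima computation above can be checked independently with exact rational arithmetic; a sketch in Python:

```python
from fractions import Fraction

# The rational number exactly equal to the double 0.2d0.
x = Fraction(3602879701896397, 18014398509481984)

for _ in range(30):
    x = 6 * x - 1    # exact rational arithmetic: no rounding anywhere

print(x)             # 41178229774373/16777216, about 2454413.8
```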



You may be interested in this, though. In Mathematica 9.0.0.0 which I just tried,

x=0.2
Do[x = 6*x - 1, {k, 1, 30}]

results in x set to 6.5451*10^6 (no surprise, at least to people who might run this example in double-float in other languages).

But if the initial value in Mathematica is set to

x = 0.20000000000000000000

the final value is
0.*10^3

.... How could that be?

It happens because Mathematica does significance arithmetic for software floats, and in this case it decides that the number has lost all significance, so it prints it as 0.
This is not just any old zero, since x+100 is also 0.
The vendor of Mathematica would like to convince you that significance arithmetic is like interval arithmetic, but, uh, better?

I've argued with the Mathematica people that what they are doing is not
a good idea, but they insist it is a feature.

For what it is worth, if in Mathematica one initializes x to an Interval around 0.2d0
the iteration returns  Interval[{-1.30902*10^7, 1.7999*10^7}].
initializing around
x = Interval[0.20000000000000000000000000]
returns
Interval[{0.*10^-3, 0.39}]

initializing with Interval[1/5]  returns
Interval[{1/5, 1/5}]

initializing with Interval[{1/5, 11053696126774156250317/703687441776641}],
which includes BOTH 1/5 and 0.2d0, returns
Interval[{1/5, 11053696126774156250317/703687441776641}]
which is about [1/5, 1.57*10^7]

Go figure.

Anyway, 0.2d0 is not a fixed point of the iteration.  1/5 is a fixed point.

- or (if num2interval moves both bounds by one ulp) approximately
double the width of the final result.
starting with Interval[{1/5-eps,1/5+eps}]  where eps is about 5.55*10^-17,
gives an interval (still using Mathematica...) of
Interval[{-2.9453*10^7, 3.43618*10^7}]
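Redoing that start in exact rational interval arithmetic separates the genuine error amplification from Mathematica's outward rounding. A sketch (mine, with Fractions as interval endpoints): since 6 > 0, each step maps [lo, hi] to [6*lo - 1, 6*hi - 1], so the width grows by exactly a factor of 6 per step and ends up 6^30 times the initial 2*eps, about 2.45*10^7. Mathematica's still-wider result presumably reflects its additional outward rounding at each step.

```python
from fractions import Fraction

eps = Fraction(1, 2**54)          # about 5.55e-17, roughly one ulp of 0.2
lo = Fraction(1, 5) - eps
hi = Fraction(1, 5) + eps

for _ in range(30):
    # 6 is positive, so the image of [lo, hi] under x -> 6*x - 1
    # is [6*lo - 1, 6*hi - 1]; exact, with no outward rounding.
    lo, hi = 6 * lo - 1, 6 * hi - 1

width = hi - lo                   # 6**30 * 2 * eps == 3**30 / 2**23
print(float(width))               # about 2.45e7
```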


The same kind of problems may arise in calculations where it is far less easy to analyse the reasons for this behavior.
Yes

The standard should make these things computable reliably and without unnecessary wide results.
Except that, as I've shown above, it is computing reliably about a different problem, in the case where the problem being analyzed has to do with the evaluation of a function on a computer (point-value wise).





  This affects integers as well as various
approximations to real-number arithmetic.  By providing appropriate
primitives we can avoid this issue, to the extent that representation
issues will affect the quality of an implementation (e.g. tightness),
but not the correctness (i.e. containment).
It seems to me that correctness is easily achieved by providing routines
that always return Entire, since
that will enclose any valid answer.

True, but in the standard as currently proposed it can be proved that the overestimation in interval function evaluation tends to zero as the width of the interval and the accuracy of computation go to zero; a prerequisite for successful applications to global optimization.
This is fine in theory, but the arithmetic in a computer is necessarily of finite precision, and limiting processes don't work when you are trying to slice a ULP. Perhaps if the whole enterprise were cast in terms of arbitrary-precision arithmetic, such a proof would make more sense. But now I'm just being increasingly argumentative, and should immediately apologize. Sorry.



I suppose there is a circumstance  in which just
returning Entire (without bothering to compute anything!) is incorrect.

It is never incorrect in an interval context, but it can often be avoided, and it is avoided more often if one does not widen intervals unnecessarily (such as moving each bound of a round-to-nearest result by one ulp, thus doubling the width unnecessarily).
OK.


That is, if
the program never halts, and you didn't notice that, so returning Entire
was incorrect.

If a program never halts it cannot return Entire, since returning requires halting.
I was thinking that if the "real" program never halted, then the "fake" program that returned Entire would in this case be incorrect. The real program would not return, and so a faithful interval-correct program would have to loop infinitely too.



Anyway, even a better version of correctness is still quite easy, at least for bounded intervals and rational operations. Quality matters.

I recall, but cannot spot, some discussion of 0.5 vs 0.50.

According to general consensus, these are the same numbers, both in
mathematics and in computer science. So no discussion is needed.
I think physicists disagree. So does Mathematica (which, as you may understand, is not in my opinion a paragon of good design), which distinguishes (as I illustrated above) numbers by how many digits are provided. That is, 0.2 is the double float 0.2d0 (not 1/5), while 0.2000000000000000....0 with more than about 18 zeros is one of those arbitrary-precision floats; how precise depends on how many zeros. And that number, regardless of how many zeros are appended, is not 1/5 either.


Sorry for going on for so long.
RJF




Arnold Neumaier