
Re: Definition of intervals as subsets of R - the bad news




Sometimes "undefined behavior" is appropriate.  Here is a C example showing why accessing the wrong member of a union is undefined:
        union  {
                float  f;
                char  c [4];
        }  u;
        char  first;
        u.f = 1.0f;
        first = u.c [0];

That's undefined for several reasons:
1.  Not all computers use IEEE 754, which didn't exist until a decade and a half after C.  So C can't and doesn't rely on 754 semantics and the float representation is undefined.  In particular:
      1a.  The number of bytes in a float isn't specified.  That doesn't matter here but would in other examples.
      1b.  The size and position of the sign, exponent and fraction in a float aren't specified.
      1c.  Whether a float is normalized isn't specified.
      1d.  The base of a float (binary? decimal? hex? other?) isn't specified.
     So accessing the first byte gives results that depend on the machine architecture.
2.  Some computers are big endian, some are little endian, and some are mixed endian.  So even if you know "float" means "IEEE 754-2008 binary32" and understand the bit pattern for 1.0f, you don't know which byte of that bit pattern is being accessed.  Computers have been built on which it would be any of three of the four possible bytes.  The byte ordering is undefined.
3.  The size of a char isn't necessarily one 8-bit byte.  The size of a byte isn't necessarily 8 bits (e.g., C can run with a 36-bit word of four 9-bit bytes).  So the number of bits included in what's stored in first is undefined.
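
To see what a particular implementation actually does, something like the following rough sketch could be used.  The helper names first_byte_of_float and describe_float are made up for illustration; the macros come from the standard <limits.h> and <float.h> headers.  Copying into an unsigned char array with memcpy is one way to look at the bytes of the float.

```c
#include <assert.h>
#include <float.h>   /* FLT_RADIX, FLT_MANT_DIG */
#include <limits.h>  /* CHAR_BIT */
#include <stdio.h>   /* snprintf */
#include <string.h>  /* memcpy */

/* Hypothetical helper: returns the lowest-addressed byte of a float's
   representation.  The result depends on the float format, the byte
   order and the byte width of the machine. */
unsigned char first_byte_of_float(float f)
{
    unsigned char bytes[sizeof f];
    memcpy(bytes, &f, sizeof f);
    return bytes[0];
}

/* Hypothetical helper: writes a one-line summary of the parameters that
   this implementation chose for char and float.  Returns what snprintf
   returns. */
int describe_float(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size,
                    "CHAR_BIT=%d sizeof(float)=%zu FLT_RADIX=%d FLT_MANT_DIG=%d",
                    CHAR_BIT, sizeof(float), FLT_RADIX, FLT_MANT_DIG);
}
```

Assuming IEEE 754 binary32 and 8-bit bytes, 1.0f is the bit pattern 0x3F800000, so first_byte_of_float(1.0f) gives 0x00 on a little endian machine and 0x3F on a big endian one.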

C programmers handle that, usually by relying on implementation defined functions and values.  Java was intended to give the same answers everywhere, so it omits unions from the language.  As a result, if you need access to details like the exponent of a float, you use functions written in C.  Each approach has its advantages.
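
As one illustration of the library route (a sketch only, assuming a C99 <math.h>), the exponent of a float can be obtained with frexpf without looking at the bits at all.  The wrapper name binary_exponent is hypothetical.

```c
#include <assert.h>
#include <math.h>   /* frexpf, isfinite */

/* Hypothetical helper: for finite nonzero x, returns the integer e such
   that x == m * 2^e with 0.5 <= |m| < 1.  frexpf always works in powers
   of two, whatever the internal representation of float is. */
int binary_exponent(float x)
{
    int e = 0;
    assert(isfinite(x) && x != 0.0f);
    (void)frexpf(x, &e);   /* the mantissa m is discarded here */
    return e;
}
```

For example, binary_exponent(1.0f) is 1 because 1.0 = 0.5 * 2^1, and binary_exponent(8.0f) is 4 because 8.0 = 0.5 * 2^4.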

Interval Arithmetic will also have either "undefined behavior" or "implementation defined behavior".  An obvious example:  What if I try to convert "@$Q" to an interval?  Do I get an error message, or an exception, or an error flag?  If I get an error message, exactly what does it say?  If that was something read in from the keyboard, can the program recover and reprompt?  If so, exactly what does the user see?  On Windows?  In MS DOS?  On a cell phone?
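
To make the question concrete, here is a sketch of just one of those options: a conversion that reports bad input through a status code and leaves the message, the reprompt and everything else the user sees to the caller.  All of the names (interval, interval_status, parse_interval) and the "[lo, hi]" syntax are assumptions for illustration, not anything proposed for the standard, and the sketch ignores outward rounding of the bounds.

```c
#include <assert.h>
#include <stdio.h>   /* sscanf */

typedef struct { double lo, hi; } interval;   /* hypothetical type */

typedef enum {
    INTERVAL_OK,
    INTERVAL_SYNTAX_ERROR,   /* text is not of the form "[lo, hi]" */
    INTERVAL_BOUNDS_ERROR    /* lo > hi, or a bound is NaN */
} interval_status;

/* Hypothetical conversion: parses "[lo, hi]" into *out.  On failure it
   returns an error status and leaves *out untouched. */
interval_status parse_interval(const char *text, interval *out)
{
    double lo, hi;
    char close, extra;
    int n;

    assert(text != NULL && out != NULL);
    /* The trailing %c only converts if something follows the bracket. */
    n = sscanf(text, " [ %lf , %lf %c %c", &lo, &hi, &close, &extra);
    if (n != 3 || close != ']')
        return INTERVAL_SYNTAX_ERROR;
    if (!(lo <= hi))
        return INTERVAL_BOUNDS_ERROR;
    out->lo = lo;
    out->hi = hi;
    return INTERVAL_OK;
}
```

With this sketch, parse_interval("@$Q", &iv) returns INTERVAL_SYNTAX_ERROR.  What the program then says to the user is exactly the part that stays implementation defined.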

We will find places for the "undefined" and "implementation defined" blemishes.  We should minimize them.  In some cases the right way is to delegate the decision to the language.  We should not allow them to violate the fundamental axioms.

- Ian          Toronto IBM Lab   8200 Warden   D2-445   905-413-3411

----- Forwarded by Ian McIntosh/Toronto/IBM on 16/03/2009 06:15 PM -----
From: Gabriel Dos Reis <gdr@xxxxxxxxxxxxxxxxxxxxxxxx>
Date: 14/03/2009 01:56 PM
Please respond to: Gabriel Dos Reis <gdr@xxxxxxxxxxxxxxxxxxxxxxxx>
To: Ian McIntosh/Toronto/IBM@IBMCA
Subject: Re: Definition of intervals as subsets of R - the bad news





On Thu, Feb 19, 2009 at 2:08 PM, R. Baker Kearfott <rbk@xxxxxxxxxxxxx> wrote:
> Ah, I think I'm beginning to see this point of view:
> It might not be so bad to leave explicitly undefined
> certain cases which seldom occur and for which definition
> of them would be confusing and possibly require performance
> hits.

I would think 'implementation defined' is better than 'undefined
behaviour'.  The difference is that the implementation is required
to document the behaviour.

"Undefined behaviour" really taints the whole program, and should
be avoided as much as possible.

-- Gaby