
Re: Two technical questions on IEEE Std 754-2008



From: Charles Stevens <charles.stevens@xxxxxxxx>
To: <forieee@xxxxxxxxxxxxxx>
CC: IEEE 754 <stds-754@xxxxxxxxxxxxxxxxx>
Subject: RE: Two technical questions on IEEE Std 754-2008
Date: Wed, 23 Feb 2011 07:35:57 -0700



Much history snipped.  

Hmm. I'm not quite sure what you mean here
but you WILL have to provide for the values
+infinity & -infinity if you want to conform
to 754-2008.

Recognising binary encoding is not necessary.

Actually, we are not, strictly speaking, claiming "conformance" to
754-2008, we are making reference to it for the five formats we support
in data, the two formats we support as "standard intermediate data
items" in arithmetic and for numeric function return values, and for the
simple arithmetic operations.  For this revision, we are sticking with
the existing definitions for the more complex arithmetic (exponentiation
and arithmetic functions like LOG, SIN, etc.) to avoid conflict with
existing implementations and the history of COBOL.

        Well, if you are not going to claim conformance then
        most of these questions are moot but for the practical
        ones of how to implement anything sensible.

        But let's not be too hasty here.  There may be some
        wiggle room available to you.  Maybe not enough to
        conform.  But enough to make your variance much more
        slight than screwing around with the arithmetic.


We do recognize infinities and NaN's, just not during arithmetic.  We
recognize them enough to cause an exception condition at run time if one
tries.  For either one, where both arguments are, say, decimal128,
    MOVE AN-INFINITY TO AN-ITEM
works just fine, but
    COMPUTE AN-ITEM = AN-INFINITY
results in a fatal run-time exception condition (for which the tools
exist in the language to assist in recovery).  The first is a bitwise
transfer, the second is treated as arithmetic.

        Now you see, this is quite similar to the conforming
        behavior of 754 in the case where both the overflow
        & divide-by-zero alternate exception attributes are
        enabled.

        Remember that for the moment.
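
A minimal sketch of that comparison, using Python's decimal module as a stand-in (it is neither COBOL nor a 754-2008 conformance claim). The context parameters mimic decimal128, and the helper name attempt is hypothetical. One difference worth noting: here the traps fire where an infinity would be created, while merely copying or consuming an existing infinity raises nothing, so this is similar to the COBOL draft's behavior rather than identical to it.

```python
from decimal import Context, Decimal, DivisionByZero, Overflow

# decimal128-like context: 34 digits, with only these two traps enabled.
TRAPPING = Context(prec=34, Emin=-6143, Emax=6144,
                   traps=[DivisionByZero, Overflow])


def attempt(operation, *operands):
    """Run one context operation; return its result or the trapped signal's name."""
    try:
        return operation(*operands)
    except (DivisionByZero, Overflow) as exc:
        return type(exc).__name__


an_infinity = Decimal("Infinity")
an_item = an_infinity            # plain copy, like MOVE: no arithmetic, no trap

# Operations that would otherwise deliver an infinity are stopped instead.
div_result = attempt(TRAPPING.divide, Decimal(1), Decimal(0))
ovf_result = attempt(TRAPPING.multiply, Decimal("9E6144"), Decimal(10))

print(an_item, div_result, ovf_result)
```

With the traps left out of the context, the same two operations would quietly return Infinity and only raise flags.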


Class (numeric, infinity, NaN) tests are all provided for, as well as
the extraction of signs from infinity.  And, by the way, we have no
interest in the payload of NaN's.

One of our primary goals is maintaining philosophical continuity with
50+ years of COBOL history, and the concept of infinity, and the concept
of dealing with infinity arithmetically, are foreign to the language.
Maybe in a future revision we can provide yet another mode of arithmetic
that is FULLY conformant to IEEE 754-2008 (or its successors), but as it
is in the FCD, we already support FOUR modes of arithmetic:
    NATIVE (the original):  the implementor does whatever he wants
        whenever he wants, and gives whatever answers he wants
    STANDARD (2002):  Arithmetic is performed with all operands and the
        result in a single form; unfortunately, the ability to specify a
        data item in that form wasn't included
    STANDARD-DECIMAL (FCD):  Arithmetic is performed using decimal128,
        content is always normal from the view of the program
    STANDARD-BINARY (FCD):  Arithmetic is performed using binary128,
        content is always normal from the view of the program
In the FCD, we also provide the ability to declare user data items in
forms defined as equivalent to binary32, binary64, binary128, decimal64
and decimal128.  For the "standard intermediate data item" used when
standard arithmetic was in effect, there wasn't a corresponding way to
declare a data item as being exactly in that form.  That's one of the
strengths of the revisions to the 2002 standard.

        OK, let's all agree that old man Cobol has been working
        well for most of our lives & is loath to change now.
        Still, you may be introducing yet another arithmetic
        mode here & there may be room to accommodate it.
        The old man need not buy a new house but he might get
        along with new double glazed windows & save some money.

        What I am thinking of is contained in clauses 7 & 8.
        The default behavior is to accept NaNs & infinities as
        any other number without anything more serious than a
        flag being raised.  But what you are suggesting is
        quite close to some of the alternate behaviors that
        are found in clause 8.
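
For contrast, a small sketch of the default (non-stop) behavior described above, again using Python's decimal module as an assumed stand-in for a 754-style decimal128 arithmetic. The context name NON_STOP is hypothetical. With no traps enabled, exceptional operations deliver an infinity or a NaN and raise a flag that the program may inspect later.

```python
from decimal import Context, Decimal, DivisionByZero, InvalidOperation, Overflow

# decimal128-like context with every trap disabled: exceptions only set flags.
NON_STOP = Context(prec=34, Emin=-6143, Emax=6144, traps=[])

inf = NON_STOP.divide(Decimal(1), Decimal(0))    # delivers Infinity, flags DivisionByZero
inf_plus_one = NON_STOP.add(inf, Decimal(1))     # infinity plus one is still infinity
nan = NON_STOP.subtract(inf, inf)                # delivers NaN, flags InvalidOperation

print(inf, inf_plus_one, nan)
print("division by zero flagged:", NON_STOP.flags[DivisionByZero])
print("invalid operation flagged:", NON_STOP.flags[InvalidOperation])
print("overflow flagged:", NON_STOP.flags[Overflow])
```

Switching a language's default from this mode to a trapping one changes only which signals appear in traps; the arithmetic results for finite operands are the same either way.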

        If you are to be at variance with 754, I think you
        might be better off changing the defaults rather than
        varying the fundamentals of the arithmetic.

        Look at these clauses & see if something can be done
        along these lines.


One of the reasons we decided to incorporate features of (again, not
claim conformance with) IEEE 754 is that, as far as I know for the first
time, an industry-standard specification outside of COBOL provided
explicit support for the 31-decimal-digit numeric values that COBOL
requires.

        Yep.  That & merging 854 into 754 was why we put it
        there.


And nothing bad should happen should you
encounter one.

Calculations involving infinities are much more meaningful in the
scientific world than they are in the business world.  In COBOL we need
to define the EXACT answer as to what happens when you add 1 to
+infinity, and why that's the RIGHT answer from a business (generally
speaking, financial) standpoint.  While people dealing primarily with
dollars and cents do expect to do accurate arithmetic on VERY large
values, they do not expect to do accurate arithmetic on INFINITELY large
values.

        I will grant you that the existence of an infinite
        amount of money is bad if only because it infinitely
        dilutes the finite amount of money that I have.

        Still, one does financial calculations on money that
        involve quantities other than drachmas or shekels.

        Much of Wall Street is involved in financial forecasting
        these days.  This involves calculating the probabilities
        of things, as well as sums, differences, products, &
        ratios of them.  As the probabilities are often very low,
        a Monte Carlo model may be constructed that both
        overflows & underflows in intermediate results & STILL
        manages to come up with valid investment suggestions.

        (Infinite ratios of probabilities end up choosing the
        larger of the two & zero ratios choose the smaller.
        One need never know they are there if they are compared
        by less than.)
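
A toy illustration of that parenthetical, using Python binary floats. The function names and the step values are made up for the example. The running product is allowed to overflow to infinity or underflow to zero, and the final less-than comparison still picks a side.

```python
def likelihood_ratio(step_ratios):
    """Multiply per-step ratios; the product may overflow to inf or underflow to 0."""
    ratio = 1.0
    for step in step_ratios:
        ratio *= step
    return ratio


def pick(ratio):
    """Choose hypothesis A unless the ratio of A to B is below one."""
    return "B" if ratio < 1.0 else "A"


overflowed = likelihood_ratio([1e200, 1e200])     # product exceeds the double range
underflowed = likelihood_ratio([1e-200, 1e-200])  # product falls below the double range

print(overflowed, pick(overflowed))
print(underflowed, pick(underflowed))
```

The sketch uses multiplication because Python itself raises an error on float division by zero. A NaN ratio would compare false and also select A, so a real model would still want to check for NaN separately.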

        To croak on overflows or underflows would prevent an
        otherwise valid financial program from delivering those
        valid financial results.

        I recently ran into this problem in a finite state
        machine Monte Carlo model of DNA.  It also overflowed
        & underflowed routinely because the probabilities of
        any given random string of DNA rapidly become small as
        the string becomes large.  The programmer, knowing this,
        chose to represent the log of the probabilities rather
        than the probabilities themselves.  Just fine but
        correct handling of sums of probabilities involves the
        use of antilogs which caused both infinities & NaNs
        to kill the results.  It was a simple fix to get it
        working again.
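
The message does not say what the fix was. One common way to add probabilities held as logs without the antilogs underflowing is to factor out the larger log first (often called log-sum-exp). The sketch below is that generic technique, not necessarily what was done in the DNA model, and the function names are hypothetical.

```python
import math


def naive_log_sum(log_p, log_q):
    """log(p + q) via direct antilogs; breaks once exp() underflows to zero."""
    return math.log(math.exp(log_p) + math.exp(log_q))


def stable_log_sum(log_p, log_q):
    """log(p + q) computed relative to the larger term, so exp() stays in range."""
    hi, lo = max(log_p, log_q), min(log_p, log_q)
    if hi == -math.inf:          # both probabilities are exactly zero
        return -math.inf
    return hi + math.log1p(math.exp(lo - hi))


log_p = log_q = -1000.0          # each probability is about e**-1000

try:
    naive = naive_log_sum(log_p, log_q)
except ValueError:
    naive = None                 # both antilogs underflowed, so log(0.0) was rejected

print(naive, stable_log_sum(log_p, log_q))
```

Python reports the failed logarithm as an exception. In a non-stop arithmetic the same naive expression would instead deliver minus infinity and carry on.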

        I guess I have digressed a bit.  These programs today
        are run in languages other than Cobol for these reasons
        & others.  If you are making a standards decision to
        give up on those applications, what you are doing may
        be just fine in your context.

        Still, look at clauses 7 & 8.  You may be able to do
        something with them.


As you have observed, this is not possible by
looking at the numbers themselves. The two
encodings overlap in their bit values with
each representing different numbers & each
rejecting different bit patterns as senseless.
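
To make the overlap concrete, here is a hedged sketch that reads one 128-bit pattern both ways. It assumes the decimal128 field layout as I understand it from 754-2008 (1 sign bit, a 17-bit combination field, a 110-bit trailing significand, exponent bias 6176); all function names are hypothetical, and infinities and NaNs are simply reported as None.

```python
BIAS = 6176                      # decimal128 exponent bias
TRAILING_BITS = 110              # width of the trailing significand field
MAX_COEFFICIENT = 10**34 - 1     # largest canonical 34-digit coefficient


def _small(a, b, c):
    return 4 * a + 2 * b + c     # digit 0..7 from three bits


def _large(c):
    return 8 + c                 # digit 8..9 from one bit


def decode_declet(d):
    """Decode one 10-bit densely packed decimal group into 0..999."""
    p, q, r, s, t, u, v, w, x, y = ((d >> i) & 1 for i in range(9, -1, -1))
    if v == 0:
        digits = _small(p, q, r), _small(s, t, u), _small(w, x, y)
    elif (w, x) == (0, 0):
        digits = _small(p, q, r), _small(s, t, u), _large(y)
    elif (w, x) == (0, 1):
        digits = _small(p, q, r), _large(u), _small(s, t, y)
    elif (w, x) == (1, 0):
        digits = _large(r), _small(s, t, u), _small(p, q, y)
    elif (s, t) == (0, 0):
        digits = _large(r), _large(u), _small(p, q, y)
    elif (s, t) == (0, 1):
        digits = _large(r), _small(p, q, u), _large(y)
    elif (s, t) == (1, 0):
        digits = _small(p, q, r), _large(u), _large(y)
    else:
        digits = _large(r), _large(u), _large(y)
    return 100 * digits[0] + 10 * digits[1] + digits[2]


def fields(bits):
    """Split a 128-bit pattern into sign, combination field, trailing field."""
    trailing = bits & ((1 << TRAILING_BITS) - 1)
    return bits >> 127, (bits >> TRAILING_BITS) & 0x1FFFF, trailing


def as_decimal_encoding(bits):
    """(sign, coefficient, exponent) under the decimal encoding; None for inf/NaN."""
    sign, comb, trailing = fields(bits)
    top5 = comb >> 12
    if top5 >> 1 == 0b1111:
        return None
    if top5 >> 3 != 0b11:
        exp_hi, lead = top5 >> 3, top5 & 0b111
    else:
        exp_hi, lead = (top5 >> 1) & 0b11, 8 + (top5 & 1)
    coefficient = lead
    for shift in range(100, -1, -10):            # 11 declets, most significant first
        coefficient = coefficient * 1000 + decode_declet((trailing >> shift) & 0x3FF)
    return sign, coefficient, ((exp_hi << 12) | (comb & 0xFFF)) - BIAS


def as_binary_encoding(bits):
    """(sign, coefficient, exponent) under the binary encoding; None for inf/NaN."""
    sign, comb, trailing = fields(bits)
    if comb >> 13 == 0b1111:
        return None
    if comb >> 15 != 0b11:
        biased = comb >> 3
        coefficient = ((comb & 0b111) << TRAILING_BITS) | trailing
    else:
        biased = (comb >> 1) & 0x3FFF
        coefficient = ((8 | (comb & 1)) << TRAILING_BITS) | trailing
    if coefficient > MAX_COEFFICIENT:            # non-canonical: treated as zero
        coefficient = 0
    return sign, coefficient, biased - BIAS


# The integer 1024 with exponent 0, laid out as a binary-encoded decimal128.
pattern = (BIAS << 3) << TRAILING_BITS | 1024

print(as_binary_encoding(pattern))
print(as_decimal_encoding(pattern))
```

Both readings succeed without complaint and give unrelated numbers, which is why the pattern alone cannot say which encoding produced it.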

In order to do what you want sensibly you will
need the AmIBinary?() or AmIDecimal?() predicates
I mentioned before.

We say throughout the draft that we only deal with the decimal
encodings, in no uncertain terms.  From the user's, and the
implementor's, view, we don't "know about" binary encodings.  If a user
should manage to put a binary-encoded decimal into a decimal128 item in,
say, a file, or passed into a program across a wire, the results of
treating the content as a numeric value are not currently specified.
What I'm proposing is that we make the algebraic value in such a case
explicitly undefined and leave it at that.

        Just fine.  But, as I mentioned, it does not
        solve the problem of invalid data coming in
        from other sources.  In many (or most) cases
        that data will just sail right through entirely
        unnoticed & your programs will offer up what
        amounts to garbage with no error detected at
        all.

        Not a good situation for a language standard
        to explicitly permit.

        Either practically or legally.


. . .

Moreover, even if we did all this, there's no way for a COBOL program to
tell which encoding "somebody out there" used to produce a numeric value
when they sent it to us as a decimal128 item, so the dragons are still
out there.

        Of course.  But as a standards body you can
        demand of such dragons that they inform you
        of the nature of their number encodings.
        Then you can convert or not as is the case.
        But in this case, failure to conform is
        THEIR fault not yours.  When the market crash
        occurs the lawyers will have to go after
        THEM not you because you warned them what
        would happen if they misbehaved.

        As a standards body you will not have blessed
        a dangerous product.


While you could support the reencoding operation
in software on your decimal encoded machine, you
will find that the Intel folks intend to support
binary encoding in hardware. If they also
support conversion to decimal in hardware, you
may find it best to ask the sending machine to
convert its data to decimal before it gets to
you. But it works either way.

The Intel plan is nice to know; either way, the information will have to
get into decimal encoding before it gets to COBOL (in a fashion
invisible to the user) whether that's directly supported in hardware or
software.  COBOL doesn't care.  Does that mean that COBOL programs might
run less efficiently on Intel than one might hope they would?  Yeah,
well ...

        Maybe.  Maybe not.

        In your current instantiation, probably yes.
        Given that you will likely be doing the
        decimal encoded arithmetic in software on
        an Intel machine.  But you can arrange
        things so that future versions of the
        standard are able to play both ways.

        After all, for almost everything you don't
        need to know.  It is only in converting
        from one to the other that you have to ask
        the question:  What are we really doing here?


Here there be dragons & undetected bugs will
bash your program on the rocks.

Yes.  I understand. 

The only winning strategy is not to play.
Don't allow such bit patterns onto your
machine until they are both recognised &
converted.

We can't prevent users from putting binary-encoded information into a
data item that is documented as requiring the decimal encoding.  And we
don't make specific reference in the FCD to the individual functions
available in 754-2008.

        Well, as I am suggesting, you can document
        a standard way of handling numbers in other
        formats.  It gives you (1) a way of doing it,
        (2) the due diligence of having accounted for
        the problem, & (3) someone to blame when they
        don't follow your standard.


The IsDecimalEncoding() predicate is the
answer.

I'm unclear as to how to require the inclusion of a "predicate" to a
128-bit value coming in from "out there somewhere", say, as a field in a
record without requiring the addition of something to the storage
requirements, and that doesn't seem to me to be practical.  We don't
know where the data is coming from, and we have no control over how the
sender encoded it, or whether he's willing to inform us as to which
encoding he used in the process.

        It is quite simple.
        You implement the IsDecimalEncoding() predicate.
        For the moment it will always return True for
        your conforming systems.
        You require other (perhaps non-cobol) systems
        to send you their value of IsDecimalEncoding()
        for their system.
        Then you compare those values & only convert
        the incoming data when they differ.
        You also have to convert the outgoing data if
        you want to play nice with them.
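
The steps above can be sketched roughly as follows. Everything here is hypothetical: the function names, the idea that the sender's flag arrives once per system rather than per field, and the reencode callback, which stands in for whatever hardware or software conversion a real system would supply.

```python
from typing import Callable


def is_decimal_encoding() -> bool:
    """Local predicate; always True on a system that uses only the decimal encoding."""
    return True


def accept_decimal128(raw: bytes,
                      sender_is_decimal_encoding: bool,
                      reencode: Callable[[bytes], bytes]) -> bytes:
    """Admit one incoming decimal128 item, converting only when the encodings differ."""
    if len(raw) != 16:
        raise ValueError("a decimal128 item is exactly 16 bytes")
    if sender_is_decimal_encoding == is_decimal_encoding():
        return raw                   # same encoding on both sides: pass through
    return reencode(raw)             # encodings differ: convert before use


# Stand-in converter that only records that it was asked; NOT a real re-encoding.
conversions = []


def fake_reencode(raw: bytes) -> bytes:
    conversions.append(raw)
    return raw


item = bytes(16)
same = accept_decimal128(item, True, fake_reencode)
other = accept_decimal128(item, False, fake_reencode)
print(len(conversions))
```

Because the flag describes the sending system and not each value, it adds nothing to the storage of individual 128-bit items. Outgoing data would go through the mirror-image check.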

        If you write something along these lines into
        the standard then the problem is solved at
        least at the standards level.  For failure to
        follow this protocol becomes failure to conform
        to your standard & all bets are off for the
        other guy.  Not you.


Maybe I don't understand what is meant here by the term "predicate". 

        You probably do understand.
        I am probably being obtuse in my explanation.
        I am well known for that.


I really appreciate all your help here, Dan, and I think we're moving in
the right direction, given the time limits we face and the current
status of our FCD.

Sincerely,

    -Chuck Stevens


        You are quite welcome.
        Cobol is an interesting case for us.
        Just as 754 is interesting for you.
        But look into clauses 7 & 8.
        You may find a good answer there.
        I'm not sure it will get you up to
        conforming.
        But a variance in the default
        exception attributes will be much
        more slight & therefore much easier
        both to understand & to justify
        than a variance in the fundamental
        behavior of the arithmetic.

        Good luck,

                        Dan

