RE: round32 ( round64 ( X ) ) ?= round32 ( X )
There is a case where it works like you expect:
round_32( basic_op_64( convert_32_to_64(x_32), convert_32_to_64(y_32) ) )
IBM used this in the initial RS6000.
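
A rough way to check this case empirically is sketched below. It is not the RS/6000 implementation, and the helper names (to_f32, round_exact, count_mismatches) are made up for illustration. It assumes Python floats are IEEE binary64 and that struct's "f" format rounds to binary32 with ties to even. Only +, * and / on operands in [1, 2] are tried, so every result is positive and in the normal binary32 range.

```python
import operator
import random
import struct
from fractions import Fraction

def to_f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value (ties to even)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def round_exact(x: Fraction, p: int = 24) -> Fraction:
    """Round an exact positive rational once to p significant bits, ties to even."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1                          # now 2**e <= x < 2**(e+1)
    ulp = Fraction(2) ** (e - p + 1)    # spacing of p-bit values near x
    n, rem = divmod(x, ulp)             # n is an integer, 0 <= rem < ulp
    if 2 * rem > ulp or (2 * rem == ulp and n % 2 == 1):
        n += 1
    return n * ulp

def count_mismatches(trials: int = 2000, seed: int = 754) -> int:
    """Count cases where round32(op64(x32, y32)) differs from a single rounding."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x = to_f32(rng.uniform(1.0, 2.0))
        y = to_f32(rng.uniform(1.0, 2.0))
        for op in (operator.add, operator.mul, operator.truediv):
            via_double = to_f32(op(x, y))                       # round to 64, then to 32
            direct = round_exact(op(Fraction(x), Fraction(y)))  # exact result, rounded once
            bad += Fraction(via_double) != direct
    return bad

print(count_mismatches())
```

No mismatches should turn up here, because the 53-bit intermediate carries more than twice the 24 bits of the operands.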
From: stds-754@xxxxxxxx [mailto:stds-754@xxxxxxxx] On Behalf Of Golliver, Roger
Sent: Thursday, March 31, 2011 4:01 PM
To: Peter Lawrence; STDS-754@xxxxxxxxxxxxxxxxx
Subject: RE: round32 ( round64 ( X ) ) ?= round32 ( X )
It is known, but not well known.
The Java language designers didn't know, and didn't talk to their own in-house
numerics experts, like David Hough.
See above link for some suggestions for how to avoid the double rounding error.
From: stds-754@xxxxxxxx [mailto:stds-754@xxxxxxxx] On Behalf Of Peter Lawrence
Sent: Thursday, March 31, 2011 12:55 PM
Subject: round32 ( round64 ( X ) ) ?= round32 ( X )
My apologies in advance if this is trivial and/or nonsense, but I did
not find the answer in a quick scan of David Goldberg's "What Every Computer
Scientist Should Know About Floating-Point Arithmetic", nor in other more
specifically IEEE-754 documents that I have.
Consider the effect of first rounding (round-to-nearest-even) to some number of
bits, followed by another rounding to a smaller number of bits. The question is
whether that is always the same as directly rounding to the smaller number of bits.
Is the following observation mathematically correct (round-to-nearest-even)?
The commas are for readability; the semicolons indicate where rounding
is to take place:
  round to 10 bits, followed by round to 5:
    1.aaaa0,10000;0xxx  ==>  1.aaaa0,10000
    1.aaaa0;10000       ==>  1.aaaa0
  directly round to 5 bits:
    1.aaaa0;10000,0xxx  ==>  1.aaaa1
(The "0xxx" and "0000,0xxx" are some of the bits of some mathematically exact
result which are not all zeros, which would be represented by a non-zero
"sticky bit" in an actual hardware implementation. In the first case the
sticky bit gets truncated; in the second case the sticky bit causes a round up.)
if the above is a correct observation, then
round32 ( round64 ( X ) ) is not always equal to round32 ( X )
which seems sort of counter-intuitive; at least I started out thinking it would
always be equal, but thought I had better prove it first, and then came up with
this counterexample. If it is true, I wonder if it is well known or not.
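
For the actual 32- and 64-bit formats, the same pattern can be reproduced with X = 1 + 2^-24 + 2^-60, a value chosen here for illustration. This is only a sketch: it assumes Python floats are IEEE binary64, that float() of a Fraction is correctly rounded, and that struct's "f" format rounds to binary32 with ties to even. The helper name to_f32 is made up.

```python
import struct
from fractions import Fraction

def to_f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value (ties to even)."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Exact value just above the midpoint of two adjacent binary32 numbers.
X = 1 + Fraction(1, 2**24) + Fraction(1, 2**60)

# round32(round64(X)): the 2^-60 term is lost in binary64, leaving an exact tie.
via_double = to_f32(float(X))

# round32(X): pick whichever binary32 neighbour is closer to the exact X.
lo, hi = 1.0, 1.0 + 2.0**-23
direct = min((lo, hi), key=lambda c: abs(Fraction(c) - X))

print(via_double, direct, via_double == direct)
```

Rounding through 64 bits should give 1.0 here, while the single rounding gives 1 + 2^-23.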