Re: Encoding of compressed intervals
Michel,
Thank you for explaining all these complications.
I didn't notice that the qNaN and sNaN encodings in 6.2.1 are a SHOULD instead of a SHALL.
So I will try to reformulate following your comments.
It is necessary to define the Level 3 representation of compressed intervals
before defining their bit encoding and octet encoding.
Now I define it as a disjoint sum (a C union):
Level 3 representation = FP x FP + DEC
When a compressed interval x is a bare interval, its representation is a pair
of two floating-point numbers (inf(x), sup(x)).
When a compressed interval x is a bare decoration, its representation is
the decoration x itself.
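In C terms this could look like the following tagged union (a minimal
sketch; the type and field names are mine, and I use float for the
bin32 bounds):

    #include <stdint.h>

    typedef uint8_t decoration_t;      /* ill, emp, trv, def, dac, com */

    typedef struct {
        int is_bare_decoration;        /* the tag of the disjoint sum */
        union {
            struct { float inf, sup; } bounds;  /* FP x FP */
            decoration_t dec;                   /* DEC     */
        } u;
    } compressed_interval;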
Now we can think about the Level 4 bit encoding.
I take bin32 as an example (k=32, w=8).
The intended length of the bit encoding of a compressed interval is 2*k.
We need a tag that distinguishes whether a bit string encodes a pair or a bare decoration.
A tag is usually a prefix of the bit string.
Most prefixes are already busy encoding the first element of a pair.
The busy prefixes of length (w+1) are (for bin32)
0_00000000 .. 0_11111110
1_00000000 .. 1_11111110
Also, the busy prefixes of length k are
0_11111111_00000000000000000000000
1_11111111_00000000000000000000000
So we can pick a free prefix of length (w+2)
1_11111111_1 = 1^(w+2)
Now we have (2*k-w-2) spare bits.
We might choose any position baseDec with w+2 <= baseDec <= 2*k-8
and put the 8-bit encoding of the decoration into bits[baseDec:baseDec+7].
We need not care how bits[0:k-1] and bits[k:2*k-1] would be interpreted as floating-point data.
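As a sketch (the helper name and the convention of keeping the 2*k-bit
string in a uint64_t with bits[0] as the most significant bit are mine),
the encoding of a bare decoration for bin32 could be computed like this:

    #include <assert.h>
    #include <stdint.h>

    uint64_t encode_bare_decoration(uint8_t dec, unsigned baseDec)
    {
        assert(baseDec >= 10 && baseDec <= 56);  /* w+2 <= baseDec <= 2*k-8 */
        uint64_t bits = ~(uint64_t)0 << (64 - 10);    /* free prefix 1^(w+2) */
        bits |= (uint64_t)dec << (64 - baseDec - 8);  /* bits[baseDec:baseDec+7] */
        return bits;
    }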
I tried to write out the bit strings for a few choices of baseDec (underscores are for readability only):
baseDec=10
ill 1_11111111_10000000000000000000000_0_00000000_00000000000000000000000
emp 1_11111111_10000001000000000000000_0_00000000_00000000000000000000000
trv 1_11111111_10000010000000000000000_0_00000000_00000000000000000000000
def 1_11111111_10000100000000000000000_0_00000000_00000000000000000000000
dac 1_11111111_10000110000000000000000_0_00000000_00000000000000000000000
com 1_11111111_10001000000000000000000_0_00000000_00000000000000000000000
baseDec=24
ill 1_11111111_10000000000000000000000_0_00000000_00000000000000000000000
emp 1_11111111_10000000000000000000010_0_00000000_00000000000000000000000
trv 1_11111111_10000000000000000000100_0_00000000_00000000000000000000000
def 1_11111111_10000000000000000001000_0_00000000_00000000000000000000000
dac 1_11111111_10000000000000000001100_0_00000000_00000000000000000000000
com 1_11111111_10000000000000000010000_0_00000000_00000000000000000000000
baseDec=56
ill 1_11111111_10000000000000000000000_0_00000000_00000000000000000000000
emp 1_11111111_10000000000000000000000_0_00000000_00000000000000000000010
trv 1_11111111_10000000000000000000000_0_00000000_00000000000000000000100
def 1_11111111_10000000000000000000000_0_00000000_00000000000000000001000
dac 1_11111111_10000000000000000000000_0_00000000_00000000000000000001100
com 1_11111111_10000000000000000000000_0_00000000_00000000000000000010000
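Under the same assumptions, the decoding test would first recognize the
prefix and then extract the octet (a sketch; returns -1 when the string
encodes a pair rather than a bare decoration):

    #include <stdint.h>

    int decode_bare_decoration(uint64_t bits, unsigned baseDec)
    {
        const uint64_t prefix = ~(uint64_t)0 << (64 - 10);
        if ((bits & prefix) != prefix)
            return -1;                       /* no 1^(w+2) prefix: a pair */
        return (int)((bits >> (64 - baseDec - 8)) & 0xFF);
    }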
Your proposal (0.0, 2.0, 4.0, 8.0, 12.0, 16.0):
ill 1_11111111_10000000000000000000000_0_00000000_00000000000000000000000
emp 1_11111111_10000000000000000000000_0_10000000_00000000000000000000000
trv 1_11111111_10000000000000000000000_0_10000001_00000000000000000000000
def 1_11111111_10000000000000000000000_0_10000010_00000000000000000000000
dac 1_11111111_10000000000000000000000_0_10000010_10000000000000000000000
com 1_11111111_10000000000000000000000_0_10000011_00000000000000000000000
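A sketch of this variant (helper name mine; I assume each 32-bit word is
read with bits[0] as its most significant bit, and that float is bin32):

    #include <stdint.h>
    #include <string.h>

    void encode_decoration_as_fp(uint8_t dec, uint32_t out[2])
    {
        float f = (float)dec;           /* 2 -> 0x40000000, 16 -> 0x41800000 */
        out[0] = 0xFFC00000u;           /* first word: prefix 1^(w+2), rest 0 */
        memcpy(&out[1], &f, sizeof f);  /* second word: decoration as a float */
    }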
For decimal formats we can choose a prefix of length 6,
1_11111
and we have (2*k-6) spare bits.
Do we need to care how the substrings bits[0:k-1] and bits[k:2*k-1] of the encoding
of a bare decoration can be interpreted as floating-point data?
Or is any choice good?
Should the prefix be a parameter of the compressed-interval encoding, or can we
always use the prefix '1'^(w+2)?
-Dima
----- Original Message -----
From: mhack@xxxxxxx
To: stds-1788@xxxxxxxxxxxxxxxxx
Sent: Friday, June 27, 2014 7:01:09 AM GMT +04:00 Abu Dhabi / Muscat
Subject: Encoding of compressed intervals
On Thu, 26 Jun 2014 18:16:20 -0700 (PDT), Dima wrote:
> Let us take the encoding of bin32 as a 32-bit string.
> We number bits from left to right.
> bits[0] is the sign bit.
> bits[1:8] are the exponent bits.
> bits[9:31] are the significand field.
Almost: bits[9:31] are the *trailing* significand field.
The full significand includes the hidden unit bit (zero for subnormals).
> IEEE 754-2008 3.4 and 6.2.1 say that the following bit strings encode sNaNs
> bits[1:9]=111111110 && bits[10:31]!=0000000000000000000000
> and the following bit strings encode qNaNs
> bits[1:9]=111111111
> The payload is encoded in bits[10:31].
Not quite. The above is the *recommended* way to distinguish sNaN from
qNaN, as some existing platforms (I think HP's PA-RISC) use the opposite
convention. And the payload is *encoded* in bits[10:31], which is not
the same as saying that the payload *is* bits[10:31] interpreted as an
integer. (By contrast, the trailing significand of DFP NaNs *is* the
integer encoded therein -- and there is no sNaN/qNaN ambiguity.)
> We could encode decoration dec in the payload of an sNaN in the following way:
> bits[0:10]=01111111111
> bits[11:18]="decoration octet"
> bits[19:31]=0000000000000 .
We could, but that would (a) go beyond IEEE 754-2008, and (b) be an sNaN
on PA-RISC but a qNaN on most other platforms. It would be better to pick:
bits[0:11]=011111111x10 (where x may be platform-specific)
bits[12:19]="decoration octet" (would also look better in a dump)
bits[20:31]=000000000000 .
This would however completely bypass most platforms' getNaNcode() function,
if they had one. I hope that platforms will at some point have such a
function, and its counterpart setNaNcode(), with NaNcodes defined to be
nonnegative integers less than 10**6 to make sure that they are convertible
without loss among all 754-2008 formats.
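For illustration only -- these functions do not exist today -- here is a
binary64 sketch using the recommended quiet-bit convention and a
right-aligned code placement (the placement itself would be
platform-specific):

    #include <stdint.h>
    #include <string.h>

    double setNaNcode(uint32_t code)    /* assumes code < 10**6 */
    {
        uint64_t bits = 0x7FF8000000000000ull | code;  /* qNaN + code */
        double d;
        memcpy(&d, &bits, sizeof d);
        return d;
    }

    uint32_t getNaNcode(double d)       /* assumes d carries such a code */
    {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        return (uint32_t)(bits & 0xFFFFF);  /* 20 bits suffice for < 10**6 */
    }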
You would also have to define the rule separately for all 754-2008 formats,
or define it generically based on size, with one rule for BFP and another for
DFP (well, the latter would be trivial since DFP has unambiguous NaNcodes,
but only as *unsigned integers* and NOT as bitstrings -- the latter could
look very strange when DPD encoding is used.)
NONE of these complications arise if we encode compressed decorations as
ordinary small floating-point integers, compatible with full decorations
when interpreted as unsigned integers.
> Some platforms name the sign bit bit[0] and other platforms name the
> sign bit bit[31]. This is not a problem if the sign bit of bin32 is
> at the same position as the sign bit of int32.
Bit numbering is primarily a documentation issue (though it does become
significant when there are bit-insert or bit-extract instructions, with
a variable bit index, in the ISA).
> Do you mean that there are some platforms where the positions of the sign bit
> in bin32 and int32 are different?
No, the sign bit has never been an issue. Decorations are small unsigned
integers in my view. Also, the sign of a NaN is explicitly meaningless in 754,
and may be lost during conversions. Indeed, in Java there is a single NaN,
with all bits defined, i.e. no room to encode anything.
What I was talking about is how a NaN-supporting atof() or atoff() encodes
NaNcodes:
                  Platform A:           Platform B:
------------------------------------------------------------------
atof("nan(1)")    7FF0 0000 0000 0001   7FFC 0000 0000 0000
atof("nan(2)")    7FF0 0000 0000 0002   7FFA 0000 0000 0000
atof("nan(3)")    7FF0 0000 0000 0003   7FFE 0000 0000 0000
atof("nan(4)")    7FF0 0000 0000 0004   7FF9 0000 0000 0000
atof("nan(8)")    7FF0 0000 0000 0008   7FF8 8000 0000 0000
atof("nan(16)")   7FF0 0000 0000 0010   7FF8 4000 0000 0000
atoff("nan(1)")   7F80 0001             7FE0 0000
atoff("nan(2)")   7F80 0002             7FD0 0000
atoff("nan(3)")   7F80 0003             7FF0 0000
atoff("nan(4)")   7F80 0004             7FC8 0000
atoff("nan(8)")   7F80 0008             7FC4 8000
atoff("nan(16)")  7F80 0010             7FC2 4000
Note that, when the binary64 NaNs of platform A are converted
to binary32, the payload is lost; for platform B it is preserved.
Platform B can also convert those BFP NaNs to DFP and back without
losing the payload (provided it does not exceed the maximum NaNcode
of the narrowest format, namely the 999999 of DFP32).
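To illustrate why a left-aligned placement survives narrowing (this is
only one such scheme, consistent with the binary64 values above, not a
claim about platform B's actual rule): write the code's bits LSB-first
just below the quiet bit; narrowing truncates low significand bits, so
any code below 2**22 is preserved:

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t nan64(uint32_t code)   /* quiet bit 51, code below it */
    {
        uint64_t bits = 0x7FF8000000000000ull;
        for (int i = 50; code != 0; --i, code >>= 1)
            bits |= (uint64_t)(code & 1) << i;
        return bits;
    }

    static uint32_t decode(uint64_t payload, int top)  /* read LSB-first */
    {
        uint64_t code = 0;
        for (int i = top, pos = 0; i >= 0; --i, ++pos)
            code |= ((payload >> i) & 1) << pos;
        return (uint32_t)code;
    }

    int main(void)
    {
        for (uint32_t c = 1; c <= 16; ++c) {
            uint64_t p64 = nan64(c) & ((1ull << 51) - 1); /* 51 payload bits */
            uint64_t p32 = p64 >> 29;                     /* keep the top 22 */
            printf("nan(%2u) decodes as %u (wide), %u (narrowed)\n",
                   c, decode(p64, 50), decode(p32, 21));
        }
        return 0;
    }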
Any conversion would normally also convert sNaN to qNaN. The atof()
of glibc supports the nan(nn) syntax, but appears to generate sNaNs
(as shown above), which is probably a bug.
Michel.