Lee Winter wrote:
On that note I recommend that the values representing the SEM
fields be encoded as simple unsigned integers. Such an approach
will not harm communication between 754-based systems but will
enhance communication involving non-754-based systems. I prefer
that the integers be represented in decimal because it is the
default base that trivializes the necessary IO routines.
The text-form interchange format provides this, and also eliminates
the coding issue for Empty as it could simply be given as "Empty".
(The international community already tolerates words like Inf and NaN.)
The cost of transformation between internal and exchange formats must
be considered, however, and the space requirement may not be negligible
either when large arrays are exchanged between systems. That's why a
representation along the lines proposed by John makes sense.
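
To make the text form concrete, here is a minimal sketch, assuming "SEM" means the sign, biased exponent and trailing significand fields of a binary64 datum, each written as an unsigned decimal integer. The function names and the space-separated layout are my own illustration, not part of any motion.

```python
import struct

def to_sem_text(x: float) -> str:
    """Render a binary64 value as 'S E M' using unsigned decimal fields."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63                      # 1 bit
    exponent = (bits >> 52) & 0x7FF        # 11-bit biased exponent
    mantissa = bits & ((1 << 52) - 1)      # 52-bit trailing significand
    return f"{sign} {exponent} {mantissa}"

def from_sem_text(text: str) -> float:
    """Inverse of to_sem_text (hypothetical layout: three decimal integers)."""
    sign, exponent, mantissa = (int(field) for field in text.split())
    bits = (sign << 63) | (exponent << 52) | mantissa
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

print(to_sem_text(1.0))    # 0 1023 0
print(to_sem_text(-2.5))   # 1 1024 1125899906842624
```

The round trip is exact, but the text is roughly two to three times the size of the 8-byte binary datum, which is the space concern mentioned above.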
It is true that being close to an internal format leads to the temptation
of performing arithmetic on the exchange format. In a typed language we
can keep this under control by defining syntactically separate types for
exchange and internal formats, possibly with implicit cast conversions.
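
A rough sketch of the type separation, assuming a hypothetical wrapper class ExchangeDouble that holds the 8 exchange bytes and deliberately defines no arithmetic. Python checks this only at run time and has no implicit casts, so the conversions are explicit here; a statically typed language could reject the misuse at compile time.

```python
import struct
from dataclasses import dataclass

@dataclass(frozen=True)
class ExchangeDouble:
    """Opaque exchange-format value (assumed layout: big-endian binary64)."""
    raw: bytes

    @classmethod
    def from_internal(cls, x: float) -> "ExchangeDouble":
        # Explicit conversion: internal format -> exchange format.
        return cls(struct.pack(">d", x))

    def to_internal(self) -> float:
        # Explicit conversion: exchange format -> internal format.
        return struct.unpack(">d", self.raw)[0]

e = ExchangeDouble.from_internal(0.5)
try:
    e + e                     # no __add__ defined on the exchange type
except TypeError:
    print("arithmetic on the exchange format is rejected")
print(e.to_internal() + e.to_internal())   # arithmetic on the internal format
```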
Ideally this approach should also get around NaN propagation uncertainty
-- but if platforms have difficulty preserving NaN payloads across plain
assignments (Java?), the <NaN,Num> encoding suggested by Nate may be what
is needed. Even without propagation rules, we still have to settle how
payloads are defined. Ideally we would like to encode small integers, but
only DFP specifies the encoding; for BFP it is not specified at all. (IBM Z,
which supports a BFP<->DFP conversion machine instruction, does specify
the encoding for BFP to make sure small (<2**22) integer NaN payloads
survive across any sequence of radix or size conversions, but I have
not heard of anybody else following suit. Note that this issue also
arises when converting between BFP and decimal strings (e.g. atof(),
when the POSIX notation "NaN(nnn)" is evaluated).)
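
To illustrate what "encoding a small integer payload" could mean for BFP, here is a sketch under one assumed convention: the payload is right-aligned in the trailing significand of a binary64 quiet NaN and kept below 2**22, the number of payload bits left in binary32 once the quiet bit is taken. Since the encoding is unspecified for BFP, this placement is only my assumption, and nothing here guarantees that hardware conversions preserve it. The code works on bit patterns only, so it does not depend on how a platform propagates NaNs.

```python
import math
import struct

EXP_ALL_ONES = 0x7FF << 52     # binary64 exponent field, all ones
QUIET_BIT = 1 << 51            # most significant fraction bit
PAYLOAD_LIMIT = 1 << 22        # small-integer payloads only

def nan_bits_with_payload(payload: int) -> int:
    """Bit pattern of a quiet NaN carrying a small integer (assumed layout)."""
    if not 0 <= payload < PAYLOAD_LIMIT:
        raise ValueError("payload must satisfy 0 <= payload < 2**22")
    return EXP_ALL_ONES | QUIET_BIT | payload

def payload_of(bits: int) -> int:
    """Extract the payload: all fraction bits below the quiet bit."""
    return bits & (QUIET_BIT - 1)

def bits_to_float(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

bits = nan_bits_with_payload(1234)
print(hex(bits))                         # 0x7ff80000000004d2
print(payload_of(bits))                  # 1234
print(math.isnan(bits_to_float(bits)))   # True
```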
Of course, Motion 29 is deliberately silent on decoration encoding, as
we have not yet settled on the definition. I'm afraid this is in fact
going to be messy, as there are too many possibilities, all of them
with considerable baggage... Perhaps text representations are all that
will survive, after all.
On this tangent, I might note that separating exchange formats from
arithmetic formats does open some new possibilities, such as stealing
a few low-order fraction bits to hold decoration bits. That would
preserve power-of-two sizes and permit relatively painless conversion
between internal and external formats, at least for BFP. (This does
not work for DFP with DPD encoding, though it would be ok with BID.)
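
A sketch of the bit-stealing idea for binary64, assuming 3 decoration bits (the count and the names are arbitrary choices of mine). The low-order fraction bits are simply cleared, which truncates the magnitude toward zero; a real exchange format for interval endpoints would need directed rounding, and NaNs whose payload sits only in the stolen bits would need special care. Neither is handled here.

```python
import struct

DEC_BITS = 3                       # assumed number of stolen fraction bits
DEC_MASK = (1 << DEC_BITS) - 1

def _to_bits(x: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def _from_bits(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def pack_with_decoration(x: float, dec: int) -> int:
    """Return a 64-bit word: x with its low fraction bits replaced by dec."""
    if not 0 <= dec <= DEC_MASK:
        raise ValueError("decoration does not fit in the stolen bits")
    return (_to_bits(x) & ~DEC_MASK) | dec

def unpack_with_decoration(word: int):
    """Return (value with stolen bits zeroed, decoration)."""
    return _from_bits(word & ~DEC_MASK), word & DEC_MASK

word = pack_with_decoration(1.5, 5)
print(unpack_with_decoration(word))   # (1.5, 5)
```

The word stays at 64 bits, and conversion in either direction is a mask and an OR, which is the "relatively painless" part.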
Michel.
---Sent: 2011-11-19 22:19:22 UTC