Re: conversions [was Importing NaNs]
> Unfortunately, there is an increasing problem with wire data,
> which arrives (by file or network) in an endian format that is
> not the memory format selected by the boot mode (it doesn't
> matter if it matches the actual processor). Real applications can
> have a significant fraction of their processing capacity eaten by
> software endianness conversion. This is what I meant by "foreign
> formats", and the problem applies to radix format conversion too.
The endianness conversion problem would seem to be significantly
simpler than a radix conversion, especially for processors with
a Translate instruction (with which a single instruction suffices
to reverse a byte stream end-to-end).
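As a rough software illustration of how little work byte reversal involves, here is a minimal Python sketch. The function names (swap32, reverse_bytes, swap_words) are hypothetical, chosen for this example only, and the code says nothing about how a hardware Translate instruction is implemented.

```python
def swap32(word):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")


def reverse_bytes(buf):
    """Reverse a whole byte string end-to-end."""
    return buf[::-1]


def swap_words(buf, width=4):
    """Reverse the bytes inside each fixed-width word of a stream."""
    if len(buf) % width:
        raise ValueError("stream length is not a multiple of the word width")
    return b"".join(buf[i:i + width][::-1] for i in range(0, len(buf), width))
```

No arithmetic on the values is needed, only a fixed permutation of bytes, which is why this is so much cheaper than a radix conversion.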
> Representation of interchange data as text may be formally
> adequate, and practically adequate for file datasets, but the
> conversion load is much too great for practical network and
> streaming applications. All those x/10 and x%10's just won't work
> for sample streams in the gigasample/second range.
>
> So the data has to arrive in binary,
(Unless one has a decimal arithmetic unit, of course. :-))
> and format conversion has to take no more than the order of an
> add time.
>
> ...
>
> It seems to me that a similar problem will exist w/r/t radix. At
> very least, radix conversions should be both defined and
> required, or your native systems will work fine on their own
> data, but not be able to talk with systems using the opposite
> radix save by prohibitively expensive textual representations.
Conversion between text and decimal in software is about 20x-30x
faster than similar conversions between text and binary, because
no radix conversion is required (though it is certainly not
free), so in very many cases a textual representation is
certainly practical. Direct conversion between BFP and DFP is
not necessarily the best approach either, as complex hardware would
be required. It might be better to provide means to extract
significands and exponents, for example.
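To make "extract significands and exponents" concrete, the following Python sketch splits a finite value into an integer significand and an exponent in its own radix. It assumes the binary value is an IEEE 754 binary64 (Python's float) and uses the standard math and decimal modules; binary_parts and decimal_parts are hypothetical helper names, not operations proposed for the standard.

```python
import math
from decimal import Decimal


def binary_parts(x):
    """Return integers (m, e) with x == m * 2**e exactly (finite x only)."""
    frac, exp = math.frexp(x)    # x == frac * 2**exp, 0.5 <= |frac| < 1
    m = int(frac * (1 << 53))    # exact: binary64 carries 53 significand bits
    return m, exp - 53


def decimal_parts(d):
    """Return integers (c, e) with d == c * 10**e exactly (finite d only)."""
    sign, digits, exp = d.as_tuple()
    coeff = int("".join(map(str, digits)))
    return (-coeff if sign else coeff), exp
```

With the parts available as integers, a conversion between radices can be built from integer arithmetic in software rather than needing dedicated conversion hardware.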
So, perhaps, it would seem that the standard only needs to define
either binary to/from decimal or binary to/from string. Decimal
to/from string can be a one-to-one exact mapping.
(It might be desirable for the standard to define such a mapping,
although such a mapping already exists and is in various
standards, and will also be in the XML Schema decimal datatype,
so perhaps does not need to be re-specified.)
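The one-to-one nature of the decimal/string mapping can be checked with a short Python sketch. It uses Python's decimal module purely as a convenient stand-in for a decimal floating-point type; survives_text is a hypothetical name for this example.

```python
from decimal import Decimal


def survives_text(d):
    """True if decimal -> string -> decimal keeps sign, digits and exponent."""
    back = Decimal(str(d))
    return back.as_tuple() == d.as_tuple()


# A trailing zero is part of the decimal value's representation and is kept.
kept = str(Decimal("1.20"))

# By contrast, a binary double cannot hold 0.1 exactly, so constructing a
# decimal from the float 0.1 yields a different value than the string "0.1".
binary_is_inexact = Decimal(0.1) != Decimal("0.1")
```

The round trip preserves the coefficient and exponent because no radix change takes place, whereas the binary case necessarily involves a conversion.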
> Moreover, the problem arises even in single-radix interchange if
> the binary representations differ - 2's complement vs.
> sign/magnitude for example. Hence I urge the Standard to define
> a canonical binary Wire Form, and require that all conforming
> implementations support conversion between Wire Form and native
> concrete representation in both directions. The Wire Form should be
> one that is relatively easy to convert to any concrete
> representation supported by the Standard - I note that you have
> already precluded some of the more bizarre possible concrete
> representations such as Gray (Grey?) code. It need not have the
> same bitwidth as the corresponding native form, nor even be very
> bit efficient at all, but must be fixed width to avoid
> serializing the stream and to avoid problems with loss of phase
> coherency between transmitter and receiver.
Any such form would be inherently radix 10, radix 2, or some
other radix. In the first two cases, it would be easily
convertible to only one or other set of concrete representations,
and so would have no advantage over the usual (network byte
order) form for one radix and be seriously inferior for the
other. In the third case, it would perform poorly for both
sets, which (though fair) would not encourage implementation.
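For the binary case, the "usual (network byte order) form" is already fixed width and cheap to produce. A minimal Python sketch, assuming the value is an IEEE 754 binary64 and using the standard struct module (WIRE_FORMAT, to_wire and from_wire are hypothetical names for this example):

```python
import struct

WIRE_FORMAT = "!d"                        # network byte order, 8-byte double
WIRE_SIZE = struct.calcsize(WIRE_FORMAT)  # fixed width, independent of value


def to_wire(x):
    """Encode a float as a fixed-width big-endian byte string."""
    return struct.pack(WIRE_FORMAT, x)


def from_wire(buf):
    """Decode a fixed-width big-endian byte string back to a float."""
    (x,) = struct.unpack(WIRE_FORMAT, buf)
    return x
```

A receiver whose native format is also radix 2 needs at most a byte swap here; a radix 10 receiver would still face a full radix conversion, which is the asymmetry described above.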
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Mike Cowlishaw, IBM Fellow
IBM UK (MP5), PO Box 31, Birmingham Road, Warwick, CV34 5JL
mailto:mfc@xxxxxxxxxx -- http://www2.hursley.ibm.com/decimal