
Re: possible decision on interchange representation



John Pryce replied to my posting:
> Maybe, after a similar sentence listing relevant parameters, you can
> suggest, without requiring, a way in which such parameter data could be
> included as a file-header, e.g. as you wrote on 2014 Jun 19, at 14:45:
>
>   "some headers are carefully structured so that they include a 16-bit
>   field known to hold a small (<256) integer, from which it is easy to
>   recover ordinary (non-messed-up) Endianness"

How about the following:  We can define a *standard type signature* -- at
least for 754-conforming implementations (which is what 14.4 is about):
(This text would follow the description of the interchange encoding,
including that of a "small integer" decoration, that I'm working on.)

   A 754-conforming implementation *shall* provide, for each interchange
   encoding, a *standard type signature*, which is a 16-byte string laid
   out as follows:
      ieee1788           8 bytes, Ascii, any case
      bin | bid | dpd    3 bytes, Ascii, defines FP encoding
      '\0'               1 byte, null char (8 zero bits)
      nnnn               4 bytes, size in bytes, stored as native int32

   The size is that of the interchange object, consisting of the two
   floating-point datums and the decoration.  From this the size of
   an individual floating-point datum and of the decoration can be
   derived (assuming only that the former is a multiple of the latter,
   and that the size of the latter is a power of two), as well as the
   Endianness of the representation (assuming that the size is less
   than 2**24, which is presumably large enough to cover any extended
   format being contemplated in the near future).
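
A minimal sketch of how a reader might recover the byte order from the size field, assuming the 16-byte layout above. The names `sig_order` and `sig_byte_order` are hypothetical and not part of the proposal. The field is read both ways, and only a nonzero reading below 2**24 is accepted. Under that bound alone a few byte patterns (for example 00 01 00 00) read plausibly in both orders; the sketch reports those as undetermined rather than guessing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Possible outcomes of inspecting the 4-byte size field. */
enum sig_order { SIG_LITTLE, SIG_BIG, SIG_UNKNOWN };

/* Hypothetical helper: read bytes 12..15 of the signature in both byte
   orders and keep the reading that is nonzero and below 2**24.
   Returns SIG_UNKNOWN if neither reading, or both readings, qualify. */
enum sig_order sig_byte_order(const unsigned char sig[16], uint32_t *size)
{
    assert(sig != NULL && size != NULL);
    const unsigned char *p = sig + 12;
    uint32_t le = (uint32_t)p[0] | (uint32_t)p[1] << 8
                | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    uint32_t be = (uint32_t)p[3] | (uint32_t)p[2] << 8
                | (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24;
    int le_ok = le != 0 && le < (UINT32_C(1) << 24);
    int be_ok = be != 0 && be < (UINT32_C(1) << 24);

    if (le_ok && !be_ok) { *size = le; return SIG_LITTLE; }
    if (be_ok && !le_ok) { *size = be; return SIG_BIG; }
    return SIG_UNKNOWN;
}
```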

   The flexibility in the size of the decoration is to accommodate
   alignment and padding issues -- or systems whose "small integer"
   datatype is a word and not a byte.  (It still assumes that a word
   consists of 2, 4 or 8 bytes, which seems general enough today.)

   For example, in C, the corresponding structure describing the type
   used in the Basic Standard (Chapter C) would be:
      struct { char format[12]; int32_t itemlength; } sig = { "ieee1788bin", 17 };

   (If ld=2**d is the size of the decoration, and lf=k*ld is the size
   of a floating-point datum, then itemlength = (2k+1)*2**d, from which
   both k and d can be recovered, and hence also lf and ld.)
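
The recovery of k and d can be sketched as follows, assuming itemlength = (2k+1)*2**d exactly as stated. The names `item_layout` and `split_itemlength` are hypothetical. The lowest set bit of itemlength gives ld = 2**d, and the remaining odd factor gives 2k+1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes in bytes of the decoration (ld) and of one floating-point datum (lf). */
struct item_layout { uint32_t ld; uint32_t lf; };

/* Split itemlength = (2k+1) * 2**d into ld = 2**d and lf = k * ld.
   Returns 0 on success, -1 if itemlength is 0 or a pure power of two
   (k = 0, so there is no room for the two datums). */
int split_itemlength(uint32_t itemlength, struct item_layout *out)
{
    assert(out != NULL);
    if (itemlength == 0)
        return -1;
    uint32_t ld  = itemlength & (~itemlength + 1u); /* lowest set bit = 2**d */
    uint32_t odd = itemlength / ld;                 /* odd factor 2k+1 */
    uint32_t k   = (odd - 1) / 2;
    if (k == 0)
        return -1;
    out->ld = ld;
    out->lf = k * ld;
    return 0;
}
```

For itemlength = 17 this yields ld = 1 and lf = 8, matching the example of two 8-byte datums and a one-byte decoration.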

   Such a type signature could be included in a header that accompanies
   interchange-encoded intervals for export, to achieve fairly universal
   portability.
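
A reader of such a header might validate the first 12 bytes along these lines. This is only a sketch: `sig_encoding` and `eq_nocase` are hypothetical names, and comparing the 3-byte encoding tag without regard to case is an assumption, since the layout above states "any case" only for the 8-byte prefix.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Compare n bytes ignoring ASCII case; returns 1 on match, 0 otherwise. */
static int eq_nocase(const unsigned char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower(a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Returns 0 for bin, 1 for bid, 2 for dpd, or -1 if the first 12 bytes
   do not form a signature prefix. The size field is not examined here. */
int sig_encoding(const unsigned char sig[16])
{
    static const char *const enc[] = { "bin", "bid", "dpd" };
    assert(sig != NULL);
    if (!eq_nocase(sig, "ieee1788", 8) || sig[11] != '\0')
        return -1;
    for (int i = 0; i < 3; i++)
        /* Assumption: the encoding tag is matched case-insensitively too. */
        if (eq_nocase(sig + 8, enc[i], 3))
            return i;
    return -1;
}
```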

We would also need an entry for "standard type signature" in Chapter 2.

When I was thinking about this, it occurred to me that, for bulk data
exchange (where a binary interchange format is desired, as opposed to
standard text representation), it is often the case that all items
would have the same decoration, except possibly for Empty, and that
a portable interchange format for bare intervals would make sense.
(An importer could apply newDec() to each imported bare interval.)

My suggested itemlength-based encoding could almost handle this case:
the decoration would be absent if itemlength were not an odd multiple
of 1, 2, 4 or 8 (that is, if it were a multiple of 16).  There is a
possibility of ambiguity, however,
for some unusual combinations (e.g. 24+24+8=56 could also be 28+28+0).
(Note that 754-2008 extendable formats have sizes that are a multiple
of four bytes.)  I'm not sure however that I want to burden a standard
signature with an additional twist to avoid this ambiguity -- assuming
that we'd actually want to support a bare-interval interchange encoding.
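
The ambiguity can be illustrated with a small check that lists which readings of a given itemlength are plausible. This is a sketch under stated assumptions: `itemlength_readings` and `plausible_datum` are hypothetical names, and the test for a plausible datum size (2, 4 or 8 bytes, or any multiple of 4 bytes from 16 up) is an assumption modelled on the 754-2008 binary interchange widths, not something fixed by the text above.

```c
#include <assert.h>
#include <stdint.h>

enum { READ_DECORATED = 1, READ_BARE = 2 };

/* Assumption: plausible datum sizes are 2, 4 or 8 bytes, or any
   multiple of 4 bytes from 16 bytes upward. */
static int plausible_datum(uint32_t lf)
{
    return lf == 2 || lf == 4 || lf == 8 || (lf >= 16 && lf % 4 == 0);
}

/* Returns a bit mask of the readings consistent with itemlength:
   READ_DECORATED for lf + lf + ld with ld in {1, 2, 4, 8},
   READ_BARE for lf + lf with no decoration. */
int itemlength_readings(uint32_t itemlength)
{
    assert(itemlength != 0);
    int readings = 0;
    uint32_t ld = itemlength & (~itemlength + 1u); /* lowest set bit */
    uint32_t k  = (itemlength / ld - 1) / 2;       /* from odd factor 2k+1 */

    if (ld <= 8 && k > 0 && plausible_datum(k * ld))
        readings |= READ_DECORATED;
    if (itemlength % 2 == 0 && plausible_datum(itemlength / 2))
        readings |= READ_BARE;
    return readings;
}
```

With these assumptions, 17 and 24 admit only the decorated reading, 16 admits only the bare reading, and 56 admits both (24+24+8 and 28+28).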

(One possibility would be to record the size of the decoration explicitly
in what is now shown as a null byte -- this might make checking slightly
more awkward in C, but really just slightly.  However, so far nobody has
voiced an interest in an interchange format for bare intervals.)

Michel.

P.S.  Baker asked whether I could update the svn repository directly.
      I've never used that, so I don't know, but I would prefer to
      offer plaintext as in the above; others could then integrate
      this in the appropriate manner.
---Sent: 2014-06-22 16:30:46 UTC