Re: Michel's comments on interchange representation
Michel
(and Dmitry, Baker)
On 2014 Jun 18, at 14:28, Michel Hack wrote:
> (John wrote:)
>> Am I missing something crucial here?
>
> Yes! Namely the fact that Endianness is defined for individual typed
> fields, such as int16, int32, int64, float, double, etc. -- and not for
> aggregates. Not too long ago, especially for non-IEEE floating-point
> formats, the rules were much messier than simple byte reversal, as for
> example bytes reversed in pairs within the representation.
>
> So we CAN define the interval interchange format as suggested in C6.2,
> and apparently as originally intended, namely as an ordered triple of
> standard objects, each object being represented at Level 4 in its usual
> format for the platform. This does not resolve portability issues due
> to differing Endianness, but it reduces them to the same problem faced
> by almost every other interchange format...
I think you and Dmitry should work this out between you and produce the revised wording.
Baker, since this change is substantive, does it need a separate motion?
> So the text of C6.2 is fine, but the example should point out explicitly
> that it shows a Big-Endian layout. Better yet, show both layouts, and
> perhaps mention one or two current machine architectures for each.
So it seems you are suggesting:
- Make the main text in §14.4 closer to Ned's wording in C6.2.
- Don't try to remove the portability difficulty, but make it (as you say below) "standard".
That sounds OK to me. Anyone object?
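To illustrate what "show both layouts" could look like, here is a minimal Python sketch. It assumes two binary64 bounds followed by a one-byte decoration; that field order, the one-byte decoration, and the name pack_interval are assumptions for illustration only, not wording from C6.2 or §14.4. It uses the standard struct module, where ">" selects big-endian and "<" little-endian, both without padding.

```python
import struct

def pack_interval(inf, sup, dec, byte_order):
    """Pack (inf, sup, dec) as two binary64 fields plus one unsigned byte.

    byte_order is ">" (big-endian) or "<" (little-endian). The field order
    and the one-byte decoration are assumptions made for this sketch.
    """
    return struct.pack(byte_order + "ddB", inf, sup, dec)

# 128 is the current code for "com".
big = pack_interval(1.0, 2.0, 128, ">")
little = pack_interval(1.0, 2.0, 128, "<")

print(big.hex())     # 3ff0000000000000 4000000000000000 80 (without spaces)
print(little.hex())  # 000000000000f03f 0000000000000040 80 (without spaces)
```

Each typed field is byte-reversed on its own; the 17-byte aggregate is not reversed as a whole, and the single decoration byte is the same in both layouts.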
> This brings me to the encoding of the decorations. The current layout
> fits well with the global-bitstring approach (which has perhaps not yet
> been ruled out, as Baker suggests we may need a separate vote), but it
> becomes awkward when we describe the triple in terms of existing formats:
> we just invented a new datatype! In byte-oriented systems a single byte
> does not have Endianness issues -- but on word-oriented machines (which
> IEEE 754 does not rule out) it does raise a whole new set of issues,
> namely how to align the 8-bit item in a larger container. It would be
> much better in my opinion to map decorations on "small integers", which
> is a fairly standard datatype (called "char" in C -- which may however
> have more than 8 bits). Then the decorations would be stored in whatever
> way small integers are stored, and portability issues become standard,
> even though they don't go away.
>
> Next question then is why were the particular values chosen? (I know,
> it's because somebody *was* thinking about concatenating bit strings.)
> ill 0
> trv 32
> def 64
> dac 96
> com 128
>
> Right away, we run into an issue for some implementations: 128 is
> not "small enough" for CHAR when CHAR is considered to be signed!
>
> Would it not have been better to use 0 through 4, as I seem to recall
> we had at one point?
Now 0 through 4 has the advantage of being simple and natural. I was rather attracted to Dmitry's "multiply by 32" (or shift 5 bits left) because of his argument "If an implementer wishes to invent new decorations in between the existing ones, this lets them do it easily".
However, that is only a possibility for some time in the future, and even then it is not obvious that it will save anyone much work, whereas "small integer" exists now. And if there are indeed systems that take CHAR to be signed, Dmitry's formula causes a problem: com = 128 does not fit in a signed 8-bit value.
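The signed-CHAR point can be checked with a short Python sketch. It assumes an 8-bit signed small integer, modelled here by the "b" (signed char) format of the standard struct module; the names CURRENT and fits_signed_byte are made up for this illustration.

```python
import struct

# Decoration codes from the current layout ("multiply by 32").
CURRENT = {"ill": 0, "trv": 32, "def": 64, "dac": 96, "com": 128}

def fits_signed_byte(value):
    """Return True if value can be stored in an 8-bit signed integer."""
    try:
        struct.pack("b", value)  # "b" is the signed char format
        return True
    except struct.error:
        return False

for name, value in CURRENT.items():
    print(name, value, fits_signed_byte(value))
```

Under that assumption only com is out of range; the other four codes fit.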
One could consider a compromise: say, shift 3 bits left (or 2, or 4). But KISS applies. If I understand correctly, Michel's principle is that one should take a decorated interval as a conceptual concatenation of 3 standard datatypes, and not go any nearer to bitstrings than that.
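For comparison, here is a rough Python sketch of what each shift would give: the five codes, the number of unused codes left between neighbouring decorations, and whether the largest code fits in a signed 8-bit value. The helper names encode_all and summary are hypothetical, and the 8-bit signed range is an assumption about the "small integer" type.

```python
NAMES = ["ill", "trv", "def", "dac", "com"]

def encode_all(shift):
    """Map decoration ranks 0..4 to codes by shifting left by `shift` bits."""
    return {name: rank << shift for rank, name in enumerate(NAMES)}

def summary(shift):
    """Summarise one candidate encoding."""
    codes = encode_all(shift)
    return {
        "codes": codes,
        "free_between": (1 << shift) - 1,  # unused codes between neighbours
        "fits_signed_8bit": max(codes.values()) <= 127,
    }

for shift in (0, 2, 3, 4, 5):
    print(shift, summary(shift))
```

Shift 0 is the plain 0 through 4 encoding and shift 5 is the current one; any shift up to 4 keeps com at or below 64, so only shift 5 exceeds the signed 8-bit range.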
I'm OK with that approach, and in that case I favour decorations being "small integers" from 0 to 4, on KISS grounds.
Comments from others please.
John Pryce