Re: P1788 input/output
On Wed, 26 Jun 2013 13:15:47 +0100,
John Pryce <prycejd1@xxxxxxxxxxxxx> wrote:
> (1)
> It leaves to the implementation the detail of how to read strings
> from an input stream. But it can be very annoying if it varies
> between implementations. Suppose I wish to read a decorated interval
> XX = (xx,dx) from a stream that starts
>   | [1.2, 3.4]_monday morning ...
>   | ^ (position of stream pointer, = next character to be read)
> The | characters are in case mail systems lose whitespace at
> the start of the line, which they seem to do.
>
> This is an error because "mon" is not a valid decoration name. Where
> is the stream pointer after it has read XX? It could be any of the
> marked ^ below:
>   | [1.2, 3.4]_monday morning ...
>   | ^ ^ ^
> depending on how the parsing is done, and the method of error
> recovery; and I'm sure other positions are plausible. Even
>   | [1.2, 3.4]_definitely ...
>   | ^
> which has a valid decoration string "def", isn't trivial. If allowed,
> it sets xx=[1.2,3.4] and dx=def. But do we demand whitespace after
> the "def", i.e. should the valid syntax be
>   | [1.2, 3.4]_def initely ...
>
> I hesitate to get 1788 into such language-defined matters but maybe
> we need to say something.
I wouldn't go down that route either. Parsing strings is something
languages already support. We only have to deal with how to convert a
string into an interval: either the string is valid or it is not. The
position in the stream doesn't need to concern us. A user who wants to
parse a stream has to buffer the input appropriately and present
separate strings to the constructors.
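A minimal sketch of that "valid or not" contract, in Python. Everything
named here is hypothetical: parse_decorated is only a stand-in for a real
constructor, the literal grammar is deliberately tiny, and the set of
decoration names is an assumption (the discussion above only confirms
"def" as valid). The point is that the whole string is matched, so no
stream position is ever exposed.

```python
import re
from fractions import Fraction

# Assumed decoration names; the thread itself only confirms "def".
DECORATIONS = {"com", "dac", "def", "trv", "ill"}

# Hypothetical, deliberately small grammar: "[lo, hi]" with an optional "_name".
_LITERAL = re.compile(
    r"\s*\[\s*([+-]?\d+(?:\.\d+)?)\s*,\s*([+-]?\d+(?:\.\d+)?)\s*\]"
    r"(?:_([a-z]+))?\s*"
)

def parse_decorated(s):
    """Return (lo, hi, decoration) for a complete string, else raise ValueError.

    The endpoints are kept as exact rationals; rounding to a floating-point
    interval is out of scope for this sketch.
    """
    m = _LITERAL.fullmatch(s)  # the entire string must match, no partial reads
    if m is None:
        raise ValueError(f"not an interval literal: {s!r}")
    lo, hi, dec = Fraction(m.group(1)), Fraction(m.group(2)), m.group(3)
    if lo > hi:
        raise ValueError(f"lower bound exceeds upper bound: {s!r}")
    if dec is not None and dec not in DECORATIONS:
        raise ValueError(f"unknown decoration: {dec!r}")
    return lo, hi, dec
```

With this shape, "[1.2, 3.4]_def" is accepted, while "[1.2, 3.4]_monday"
and "[1.2, 3.4]_definitely" are rejected as whole strings. Deciding where
one token ends and the next begins stays with the caller.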
> (2)
> At present it specifies
> - *free-form* input: text2interval(s), no conversion specifier;
> - *formatted* output: interval2text(x, cs), conversion specifier cs
> required.
>
> I think we need free-form output. Let's make the cs on output
> optional: omitting it gives some vanilla layout like the specifier
> %g, in C's fprintf.
>
> Ned's experience shows a need for formatted input, in order to read
> files laid out like
>   1.234 5.678
>   2.345 6.789
>   3.456 7.890
>   ...
> (describing the intervals [1.234,5.678] [2.345,6.789]
> [3.456,7.890] ...)
>
> where one *mustn't* read the numbers into floating point values
> because that loses enclosure. Instead it must be equivalent to
> text2interval("[1.234,5.678]"), text2interval("[2.345,6.789]"), ...
>
> My feeling: 1788 should not *require* the above feature because it is
> easy for a user to implement correctly in a couple of lines of code:
>   - read the numbers as strings
>   - embed them in '[', ',', ']' characters
>   - pass the result to text2interval
As above, I'm on your side here.
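Those three steps can be sketched in Python as follows. The text2interval
below is only a hypothetical stand-in for a conforming constructor: it
accepts plain "[lo,hi]" literals and rounds the endpoints outward to
binary64, which is the property that reading the numbers directly as
floats would not guarantee. The helper name read_interval_line is also
an assumption.

```python
import math
from fractions import Fraction

def text2interval(s):
    """Hypothetical stand-in for the P1788 constructor.

    Parses "[lo,hi]" exactly and rounds outward, so the returned pair of
    floats encloses the decimal values written in the string.
    Requires Python 3.9+ for math.nextafter.
    """
    body = s.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError(f"not an interval literal: {s!r}")
    lo_text, hi_text = body[1:-1].split(",")
    lo, hi = Fraction(lo_text.strip()), Fraction(hi_text.strip())
    if lo > hi:
        raise ValueError(f"lower bound exceeds upper bound: {s!r}")
    lo_f, hi_f = float(lo), float(hi)          # round to nearest first
    if Fraction(lo_f) > lo:                    # then push outward if needed
        lo_f = math.nextafter(lo_f, -math.inf)
    if Fraction(hi_f) < hi:
        hi_f = math.nextafter(hi_f, math.inf)
    return lo_f, hi_f

def read_interval_line(line):
    lo_text, hi_text = line.split()            # read the numbers as strings
    literal = f"[{lo_text},{hi_text}]"         # embed them in '[', ',', ']'
    return text2interval(literal)              # pass the result to text2interval
```

Calling read_interval_line("1.234 5.678") is then equivalent to
text2interval("[1.234,5.678]"), and the numbers never pass through a
round-to-nearest float conversion on their own.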
> But maybe we should say implementations *should* provide such a
> feature?
I wouldn't even go for a "should" here. The format seems quite arbitrary
to me. What if the two numbers are separated by two spaces, or four
spaces, or a tab, or a simple comma, and so on? What about four
numbers on one line specifying a pair of intervals? For the programmer
this is easy to do in accordance with their own format; for us, picking
one format would be an arbitrary choice.
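To illustrate how little user code such a layout needs, here is a rough
Python sketch that accepts any mix of spaces, tabs and commas as
separators and any even count of numbers per line. The name
interval_literals is hypothetical; the function only builds the strings
that would then be handed to text2interval.

```python
import re

def interval_literals(line):
    """Turn one line of numbers into "[lo,hi]" strings, two numbers per interval.

    The numbers are kept as text; converting them is left to the constructor.
    """
    tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
    if len(tokens) % 2 != 0:
        raise ValueError(f"odd number of endpoints in line: {line!r}")
    return [f"[{lo},{hi}]" for lo, hi in zip(tokens[0::2], tokens[1::2])]
```

A line holding four numbers yields two literals, and changing the
separator convention means changing one regular expression in the
caller's code rather than anything in the standard.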
Cheers,
Christian