Re: Motion 46: NO
Vincent, P1788
On 2013 Jul 19, at 11:28, Vincent Lefevre wrote:
> I vote NO on Motion 46 (Interval literals).
>
> The text contains various ambiguities about literals of host language
> and locales.
I agree with most of Vincent's criticisms. Chair, since this is a vote on the ideas, not actual text, are we allowed to carry on as if appropriate friendly amendments have been proposed and accepted?
A few seem to pose a dilemma or at least to require discussion.
> In particular:
>
> langNumLit {number literal of host language}
>
> For instance, if the host language is C, what would a number literal be?
> A number literal in a C source (the C standard says "integer constant"
> and "floating constant")? An input of strtod(), which is the numeric
> version of what text2interval is for intervals? This is different as
> strtod() is sensitive to locales concerning the decimal-point character
> and also case sensitiveness I assume.
>
> Now, are interval literals intended to be sensitive to locales, at least
> in some contexts like in C?
It seems the locale issue will be both tricky and verbose if we address it in detail (Turkish "i" vs "ı", comma vs period in numbers, etc.). So I think this standard should apply to the "default locale", and implementations *may* make locale-specific variations. Would that do? My understanding of locales is shaky.
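To make the comma-vs-period problem concrete, here is a minimal Python sketch. It is illustrative only: the function name `readings` and the deliberately simplified number syntax (sign, digits, optional fraction) are assumptions, not part of the motion text. It lists every way the text between "[" and "]" can be read as one or two numbers, given the decimal separator in force.

```python
import re

def readings(body, decimal_sep):
    """Return every reading of `body` as one number or as 'number,number'.

    `body` is the text between '[' and ']'. `decimal_sep` is the decimal-point
    character in force ('.' in the default locale, ',' in e.g. fr_FR).
    The number syntax here is a simplified stand-in, not the real grammar.
    """
    num = re.compile(rf"[+-]?[0-9]+(?:{re.escape(decimal_sep)}[0-9]+)?")
    found = []
    # Reading 1: the whole body is a single number (a point interval).
    if num.fullmatch(body):
        found.append((body,))
    # Reading 2: some comma is the separator between lower and upper bound.
    for i, ch in enumerate(body):
        if ch == ",":
            lo, hi = body[:i], body[i + 1:]
            if num.fullmatch(lo) and num.fullmatch(hi):
                found.append((lo, hi))
    return found

print(readings("1,5", "."))    # one reading: bounds 1 and 5
print(readings("1,5", ","))    # two readings: the point 1,5 or bounds 1 and 5
print(readings("1,2,3", ","))  # two readings: 1 | 2,3  or  1,2 | 3
```

With "." as the decimal separator every string has at most one reading; with "," the same strings become ambiguous, which is the argument for fixing the default locale.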
Second, I was well aware that "number literal of host language" is complicated when one gets down to detail, but I continue to think that for the full standard it is better than restricting to some basic syntax. That is: a fuller syntax for the full standard, and a basic syntax for the proposed basic standard, on the lines suggested by Dmitry.
I propose the meaning of "number literal of host language" be defined precisely for the commonest languages. This should go in an Annex, referred to by the interval literal clause.
> What if in the host language, the decimal-point character is a comma ","
> (i.e. the same character as the number separator in infSupIntvl), like
> for strtod() in some European locales such as fr_FR? This leads to an
> ambiguity as the comma would be used for two different purposes.
>
> Moreover, in the second example, about C/C++, I don't think that it is a
> good idea to accept floating-suffixes as said in "This is ignored within
> an interval literal: 1.2345 and 1.2345f both denote the mathematical
> number 1.2345", since in C/C++, 1.2345f only means a float number, which
> is mathematically different from the decimal value 1.2345. Why would a
> user put a floating-suffix for something that is not a float number?
> Accepting floating-suffixes would lead to confusion. Note that these
> suffixes make sense only in a source; they are not accepted by strtod(),
> where they are regarded as an error.
For C, I hadn't realised that "input of strtod()" is different from "integer constant" and "floating constant" in source. Would it do to define "number literal of host language" to mean "input of strtod() in default locale"? I agree with what Vincent says: a floating suffix makes no sense in this context and should be an error.
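Vincent's point that 1.2345f is mathematically different from the decimal value 1.2345 can be checked exactly. The following is a small Python sketch, not C: the helper name `to_binary32` is an assumption, and it simply rounds a Python float (binary64) to binary32 via `struct`, which is what a C float constant would hold on the usual IEEE 754 platforms.

```python
import struct
from fractions import Fraction

def to_binary32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# The mathematical number denoted by the decimal string, held exactly.
decimal_value = Fraction("1.2345")

# The values actually stored for 1.2345 (double) and 1.2345f (float).
as_double = Fraction(1.2345)
as_float = Fraction(to_binary32(1.2345))

print(as_double == decimal_value)  # False: 1.2345 is not a binary fraction
print(as_float == decimal_value)   # False, and a different value again
print(float(as_float - decimal_value))
```

Neither stored value equals 1.2345, and the two differ from each other, so a suffix that is silently ignored inside an interval literal would say one thing and mean another.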
> I think that the standard should fully specify the format of interval
> literals or specify nothing.
Yes
> If specification is provided, it should
> have a note saying that for case sensitiveness, concepts like locales
> are ignored.
Not sure what this means.
> An implementation may provide functions that accept other
> formats and/or take locales into account, but these functions are out
> of the scope of this standard.
Yes
Can we say *everything* about interval literals is case-insensitive? Then one can abolish the "anycase" notion and say a valid IL is one that, after converting to, say, lower case, obeys the given grammar.
I have no truck with complications such as (if I understand Vincent aright) Turkish I becoming a different letter when converted to lowercase. That's why we should stick to a "default" locale.
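A minimal Python sketch of "convert to lower case, then check against the grammar" follows. Everything named here is an assumption for illustration: `ascii_lower`, `is_literal`, and a much-reduced grammar (special intervals and the inf-sup form only, with no spaces). Lowercasing is restricted to the 26 ASCII letters, so the result does not depend on any locale.

```python
import re
import string

# Map only A-Z to a-z; every other character is left untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def ascii_lower(text):
    """Locale-independent lowercasing of ASCII letters only."""
    return text.translate(_ASCII_LOWER)

# Reduced, hypothetical grammar written entirely in lower case.
_NUM = r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:e[+-]?[0-9]+)?"
_LITERAL = re.compile(rf"\[(?:empty|entire|{_NUM}(?:,{_NUM})?)\]")

def is_literal(text):
    """True if `text`, after ASCII lowercasing, matches the reduced grammar."""
    return _LITERAL.fullmatch(ascii_lower(text)) is not None

print(is_literal("[Empty]"))        # True
print(is_literal("[1.5E3,2e3]"))    # True
print(is_literal("[ENT\u0130RE]"))  # False: dotted capital I is not ASCII
```

With this approach the Turkish letters never enter the picture: U+0130 is not an ASCII letter, so it is left alone and the string simply fails to match. For comparison, Python's full Unicode `str.lower()` maps U+0130 to "i" followed by a combining dot, which is the kind of complication a fixed default locale avoids.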
> Other minor remarks:
>
> Is it intended to accept "[empty]" but not "[ empty ]" or more generally
> "[" {sp} "empty" {sp} "]"? Ditto for "[entire]".
Good point. What do people think? On output, people might prefer a table of intervals to look like
[1.234, 5.678]
[    empty   ]
[2.345, 6.789]
...
rather than
[1.234, 5.678]
[empty]
[2.345, 6.789]
...
These outputs, if produced by interval2text, consist of ILs; which suggests strings of the form "[" {sp} "empty" {sp} "]" should be accepted on input, i.e. should be ILs.
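The round-trip argument can be sketched in a few lines of Python. The names `format_special` and `parse_special` are hypothetical stand-ins for the relevant parts of interval2text and text2interval, and only the special intervals are handled.

```python
import re

# "[" {sp} ("empty" | "entire") {sp} "]", with sp taken to be a plain space.
_SPECIAL = re.compile(r"\[ *(empty|entire) *\]")

def format_special(name, width):
    """Pad a special interval to `width` characters so table columns line up."""
    return "[" + name.center(width - 2) + "]"

def parse_special(text):
    """Return 'empty' or 'entire' if `text` is a special interval, else None."""
    m = _SPECIAL.fullmatch(text)
    return m.group(1) if m else None

row = format_special("empty", len("[1.234, 5.678]"))
print(repr(row))
print(parse_special(row))        # the padded output reads back as 'empty'
print(parse_special("[empty]"))  # the unpadded form is accepted too
```

If the grammar allowed only "[empty]", the padded row would be output that the same library could not read back in.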
> The term "Constants" for "[empty]" and "[entire]" is rather inadequate
> since "[1]" is also a constant. I would say "Special intervals".
Yes, that is a better name.
> Concerning the uncertain form:
>
> * It is said that r is a non-negative decimal integer literal. Is it
> intended to allow 0?
Yes. 3.45?0 has the same value as [3.45].
> * This isn't really important, but I would accept any case for letters,
> i.e. "d", "u" and "e" used here, for consistency with the other parts.
Yes, see above.
> * In the examples, I would add:
> -10?u [-10,-9.5]
> to make sure ulp is well-understood on powers of the radix and that
> direction characters are also well-understood on negative numbers.
Good example. Yes.
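The two examples just discussed (3.45?0 and -10?u) can be checked with a small Python sketch of the uncertain form. This is a hedged illustration, not the motion's grammar: `parse_uncertain` is a hypothetical name, exponents and unbounded radii are left out, and the default radius is assumed to be half an ulp of the last digit, which is what Vincent's example [-10,-9.5] implies.

```python
import re
from fractions import Fraction

# sign, integer digits, optional fraction, '?', optional radius r, optional d/u
_UNCERTAIN = re.compile(r"([+-]?)([0-9]+)(?:\.([0-9]*))?\?([0-9]*)([dDuU]?)")

def parse_uncertain(text):
    """Return exact (lower, upper) bounds of a literal of the form m?r[d|u]."""
    m = _UNCERTAIN.fullmatch(text)
    if m is None:
        raise ValueError(f"not an uncertain-form literal: {text!r}")
    sign, int_part, frac_part, radius, direction = m.groups()
    frac_part = frac_part or ""
    # One unit in the last written digit of m.
    ulp = Fraction(1, 10 ** len(frac_part))
    centre = Fraction(int(int_part + frac_part), 10 ** len(frac_part))
    if sign == "-":
        centre = -centre
    # r counts ulps; when r is absent the radius is assumed to be ulp/2.
    r = Fraction(int(radius)) * ulp if radius else ulp / 2
    direction = direction.lower()
    lower = centre if direction == "u" else centre - r
    upper = centre if direction == "d" else centre + r
    return lower, upper

print(parse_uncertain("3.45?0"))  # both bounds equal 69/20, i.e. [3.45]
print(parse_uncertain("-10?u"))   # bounds -10 and -19/2, i.e. [-10,-9.5]
```

Exact rationals are used so that the bounds printed are the mathematical ones, before any outward rounding to a floating-point format.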
> In the "anycase" example, if locales are chosen to be ignored, it may
> be better to use: anycase("ai") matches any of "ai", "Ai", "aI", "AI"
> (see discussion about the dotless "i" and "I" with dot in Turkish).
See above.
> For decorated intervals, I think that connectChar should be specified
> (see above about full specification).
Shall we fix on "_" then? What do people think?
> And shouldn't decorationLit be case-insensitive?
See above.
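If "_" is fixed as connectChar and decorations are case-insensitive, splitting a decorated literal is straightforward. A minimal Python sketch follows; `split_decorated` is a hypothetical name, and the decoration names com, dac, def, trv, ill are assumed here rather than taken from this thread.

```python
CONNECT_CHAR = "_"
# Assumed set of decoration names, stored in lower case.
DECORATIONS = {"com", "dac", "def", "trv", "ill"}

def split_decorated(text):
    """Split 'bare_dec' into (bare, dec); dec is None when no '_' is present."""
    bare, sep, dec = text.rpartition(CONNECT_CHAR)
    if not sep:
        return text, None
    # Only ASCII letters are case-folded, so no locale is involved.
    if not dec.isascii() or dec.lower() not in DECORATIONS:
        raise ValueError(f"unknown decoration: {dec!r}")
    return bare, dec.lower()

print(split_decorated("[1,2]_com"))  # ('[1,2]', 'com')
print(split_decorated("[1,2]_DAC"))  # ('[1,2]', 'dac')
print(split_decorated("[1,2]"))      # ('[1,2]', None)
```

Because "_" cannot occur in a bare interval literal under this assumption, splitting at the last "_" is unambiguous.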
John Pryce