
Re: Motion 46: NO



On 2013-07-19 17:59:48 +0100, John Pryce wrote:
> It seems the locale issue will be both tricky and verbose if we
> address it in detail (Turkish "i" vs "ı", comma vs period in
> numbers, etc). So I think this standard should apply to the "default
> locale", and implementations *may* make locale-specific variations.
> Would that do? My understanding of locales is shaky.

I think this is fine. I hope that implementers will understand that
if the standard requires anycase("inf") to be recognized, this means
"inf", "INF", etc. and possibly other locale-dependent strings.
Rejecting "INF" in tr_TR because of the dotless "I" would be
unexpected behavior IMHO.

> Second, I was well aware that "number literal of host language" is
> complicated when one gets down to detail, but I continue to think
> that for the full standard it is better than restricting to some
> basic syntax. Fuller syntax for full standard; a basic syntax for
> proposed basic standard, on the lines suggested by Dmitry.
> 
> I propose the meaning of "number literal of host language" be
> defined precisely for the commonest languages. This should go in an
> Annex, referred to by the interval literal clause.

This should just be recommendations. Some forms may be rejected for
good reasons (e.g. floating-suffixes and octal). Note that in C,
when an integer is expected, the fact that the first digit is 0
doesn't always mean that it will be interpreted as an octal value.
For instance, for the %e printf specifier:

  [...] The exponent always contains at least two digits, and only
  as many more digits as necessary to represent the exponent. [...]

So if the exponent is 8 or 9, it will be represented as 08 or 09,
and of course, this mustn't be interpreted as octal.
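
A minimal sketch of the leading-zero point, in Python for brevity rather than C. Python's "%e" formatting pads the exponent to at least two digits in the same way; the helper names parse_exponent and parse_with_base_guess are hypothetical, chosen only for illustration.

```python
# Python's %-formatting pads the %e exponent to at least two digits.
formatted = "%e" % 1e8
mantissa, exponent = formatted.split("e")   # exponent is the text "+08"


def parse_exponent(text):
    """Read an exponent field as a plain decimal integer (hypothetical helper)."""
    return int(text, 10)        # base 10: a leading zero has no special meaning


def parse_with_base_guess(text):
    """Read an integer while guessing the base from its prefix (hypothetical helper).

    int(text, 0) applies Python's source-literal rules, under which a
    leading zero followed by other digits is rejected; None is returned then.
    """
    try:
        return int(text, 0)
    except ValueError:
        return None


print(formatted, parse_exponent(exponent), parse_with_base_guess("08"))
```

The decimal reader accepts the padded exponent, while the base-guessing reader does not, which is why an exponent field should always be read as decimal.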

> For C, I hadn't realised that "input of strtod()" is different from
> "integer constant" and "floating constant" in source. Would it do to
> define "number literal of host language" to mean "input of strtod()
> in default locale"? I agree with what Vincent says: a floating
> suffix makes no sense in this context and should be an error.

Yes.

> Can we say *everything* about interval literals is case-insensitive?

Yes, this would be much better.

> Then one can abolish the "anycase" notion and say a valid IL is one
> that, after converting to, say, lower case, obeys the given grammar.

Yes. In Unicode, "Default Case Folding" (used for case-insensitive
matching) is based on the lower case version, so that it is better
to choose this one:

  Unicode Default Case Folding is built on the toLowercase(X)
  transform, with some adaptations specifically for caseless matching.
  Context-dependent mappings based on the casing context are not used.

The case folding rules are given by:

  http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
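
As a rough illustration of caseless matching by default case folding, here is a sketch in Python, whose str.casefold() applies the Unicode case folding without consulting the locale. The function name is_inf_literal is hypothetical, and comparing against the bare string "inf" is an assumption made for the example, not the grammar of the draft.

```python
def is_inf_literal(text):
    """Caseless comparison against "inf" (hypothetical helper).

    str.casefold() does not depend on the current locale, so the result
    is the same under tr_TR as under any other locale.
    """
    return text.casefold() == "inf"


for candidate in ("inf", "INF", "Inf", "\u0131nf", "\u0130NF"):
    # \u0131 is the dotless lowercase i, \u0130 the dotted capital I.
    print(repr(candidate), is_inf_literal(candidate))
```

With this approach "INF" is accepted whatever the locale, while the forms written with the Turkish dotless "ı" or dotted "İ" are not folded to "inf"; accepting them as well would be one of the locale-specific variations left to implementations.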

> > Is it intended to accept "[empty]" but not "[ empty ]" or more generally
> > "[" {sp} "empty" {sp} "]"? Ditto for "[entire]".
> Good point. What do people think? On output, people might prefer a
> table of intervals to look like
>   [1.234, 5.678]
>   [   empty    ]
>   [2.345, 6.789]
>   ...
> rather than
>   [1.234, 5.678]
>      [empty]    
>   [2.345, 6.789]
>   ...

Yes, the former looks nicer.
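
A possible reading of "[" {sp} "empty" {sp} "]" as a pattern, sketched in Python. It assumes that {sp} means zero or more space characters and that the literal is case-folded before matching; SPECIAL and special_interval are hypothetical names.

```python
import re

# "[" {sp} ("empty" | "entire") {sp} "]", with {sp} taken as zero or more spaces.
SPECIAL = re.compile(r"\[ *(empty|entire) *\]")


def special_interval(text):
    """Return "empty" or "entire" if text is such a literal, else None."""
    match = SPECIAL.fullmatch(text.casefold())
    return match.group(1) if match else None


for literal in ("[empty]", "[   empty    ]", "[ Entire ]", "[emp ty]"):
    print(repr(literal), special_interval(literal))
```

Both the compact and the padded table layouts above are accepted, while whitespace inside the keyword is still rejected.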

> > Concerning the uncertain form:
> > 
> > * It is said that r is a non-negative decimal integer literal. Is it
> > intended to allow 0?
> Yes. 3.45?0 has the same value as [3.45].

But [3.45?] will be [3.445,3.455]. It is a bit unusual that a missing
(or default) value isn't equivalent to 0.
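
To make the difference concrete, here is a sketch in Python of the two cases discussed. It assumes, from the examples in this thread only, that the radius is r units in the last place of the number, and half a unit when r is omitted. It handles only an unsigned decimal number followed by "?" and an optional r, without brackets, exponent or other markers; uncertain_bounds is a hypothetical name.

```python
import re
from decimal import Decimal

# Unsigned decimal number, "?", optional non-negative integer radius r.
UNCERTAIN = re.compile(r"(\d+(?:\.\d*)?)\?(\d*)")


def uncertain_bounds(text):
    """Return (lower, upper) as exact Decimal values for a literal such as "3.45?2"."""
    match = UNCERTAIN.fullmatch(text)
    if match is None:
        raise ValueError("not an uncertain-form literal: %r" % text)
    centre = Decimal(match.group(1))
    # One unit in the last place of the number as written, e.g. 0.01 for 3.45.
    ulp = Decimal(1).scaleb(centre.as_tuple().exponent)
    if match.group(2):
        radius = Decimal(match.group(2)) * ulp
    else:
        radius = ulp / 2        # assumed default when r is omitted
    return centre - radius, centre + radius


print(uncertain_bounds("3.45?0"))   # zero radius
print(uncertain_bounds("3.45?"))    # default radius of half a unit
```

Under these assumptions an explicit r of 0 gives a point interval, whereas omitting r gives an interval one unit wide.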

> > For decorated intervals, I think that connectChar should be specified
> > (see above about full specification).
> Shall we fix on "_" then? What do people think?

I think so.

-- 
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)