P1788: M0046 Interval Literals PASSES

To: stds-1788 <STDS-1788@xxxxxxxxxxxxxxxxx>
Subject: P1788: M0046 Interval Literals PASSES
From: "Corliss, George" <george.corliss@xxxxxxxxxxxxx>
Date: Sun, 4 Aug 2013 12:07:46 +0000
Cc: "Corliss, George" <george.corliss@xxxxxxxxxxxxx>

P1788,

Motion 46.01 Interval Literals PASSES: Yes - 40; No - 2; Needed for Quorum: 23

Digest of discussion follows.

My sense from reading is that there are 2-3 "friendly amendments" embedded in the discussion. I suggest that a friendly amended version be stated explicitly. Since this was not a standard text vote, I suggest we NOT re-vote on the amended text; it will re-appear as standard text some day soon.

I sense that there remain a few points on which there is not consensus. Perhaps those can be formulated by either side for a straight up or down vote to (semi-)resolve those issues.

George Corliss,
P1788 Voting Tabulator

Begin forwarded message:

2013/7/13 Ralph Baker Kearfott <rbk5287@xxxxxxxxxxxxx>

P-1788:

The voting period for Motion 46 herewith
begins. Voting will continue until after Saturday, August 3, 2013.
Voting on this motion will proceed according to the rules for
position papers (quorum and simple majority).
Comment can continue during voting, but the motion
cannot be changed during voting.

I have forwarded the email from our overall technical editor
with the motion as updated on July 13, 2013. (The motion
consists of a 2-line statement, a clarification, and a file
in PDF. The email also contains some explanation of plans
for a two-tiered standard, and how it might impact this motion.)

Webmaster: Please update the web page as follows:

1. Please post the updated motion and its clarification.

2. Please post the corresponding PDF document.

3. Please update the motion's status.

Acting secretary: Please record the transaction in the minutes.

NOTE: ALTHOUGH THE PDF CONTAINS PAGES FROM THE DRAFT TEXT,
THIS MOTION IS ON CONTENT, RATHER THAN ACTUAL WORDING.
IF THIS MOTION PASSES, WE WILL HAVE A SEPARATE VOTE ON THE
ACTUAL WORDING.

The motion will appear in the private area of the IEEE P-1788 site:

http://grouper.ieee.org/groups/1788/private/Motions/AllMotions.html

As usual, please contact me if you need the password to the private
area.

Best regards,

Baker (acting as chair, P-1788)

==================================================================
==================================================================

On 07/13/2013 03:54 AM, John Pryce wrote:

Baker, P1788

Chair, are we ready to vote?

I apologise for having written about this motion "Here is text
to vote on", or similar. Well, yes, but motion 46 is not about
the actual wording but about the content. (BTW that means it
passes on simple majority, not two thirds.)

This is important because the motion is in danger of being
endlessly bogged down in the debate

"Do we want a small, basic standard or a larger, fuller-featured
standard?"

As it relates to motion 46:
- Do we want rational number literals such as "22/7"? NO!! (e.g.
Jürgen) YES!! (e.g. Dmitry)
- Do we want other number literals to follow (a) the syntax of
the host language,
or (b) a minimal syntax whose productions are common to all
widely used languages?
[I started with (b), which was criticised; changed to (a);
now there is pressure to go back to (b).]
- etc.

It seems clear we need both a "full" standard, and a "basic" one
that is a subset.

We have a willing candidate to be "Technical Editor for the
Basic Standard" (TEBS) and I hope we can shortly announce he has
accepted the chair's formal invitation. His role is to create
the basic subset and, I hope, a simplified document that
describes only that subset. The basic standard will be angled
toward ease of implementation.

That being so, I ask us to vote on motion 46 concentrating on
the principles. Please ignore issues of what details are in or
out of the subset: you will have your say on these when the TEBS
makes his proposals.

======
Motion 46, revision of 13 July 2013.
======
The syntax and semantics of interval literals shall be as
specified in the attached extract from Draft 7.3.

======
Clarification
======

- The TEBS will choose a subset to form the definition of
interval literals in the basic standard. ("Subset" means any
literal that conforms to the basic standard also conforms to the
full standard.)

- The main principles you are voting on, as I see it, are:

1. Interval literals (ILs) have a mathematical value. Converting
them to finite precision intervals is a separate operation.

2. ILs are what the Level 2 constructor text2interval(), of any
finite precision type, takes as input.

3. ILs have a close relation to interval I/O: it shall be possible
to write an internal interval to an IL, and read an IL to an
internal interval, preserving containment in either direction.
(Not directly covered by this motion, but relevant.)

4. Inf-sup form "[1.2,3.4]" and uncertain form "12.345?6" are both
a Good Thing.

- Whether number literals follow the host language syntax or a
simple language-independent syntax, is left to the TEBS to decide.
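
Principles 1 and 2 above separate the exact mathematical value of a literal from its conversion to a finite precision type. A minimal Python sketch of that separation follows. It assumes binary64 as the target type and finite bounds within binary64 range; the helper name hull_binary64 is hypothetical (it is not text2interval and does not come from the draft), and math.nextafter needs Python 3.9 or later.

```python
import math
from fractions import Fraction

def hull_binary64(lo: Fraction, hi: Fraction) -> tuple[float, float]:
    """Return the tightest binary64 interval containing the exact [lo, hi].

    Assumes lo <= hi and that both bounds are finite and within binary64 range.
    """
    lo_f = float(lo)                      # rounded to nearest
    if Fraction(lo_f) > lo:               # nearest rounding went up: step down
        lo_f = math.nextafter(lo_f, -math.inf)
    hi_f = float(hi)                      # rounded to nearest
    if Fraction(hi_f) < hi:               # nearest rounding went down: step up
        hi_f = math.nextafter(hi_f, math.inf)
    return lo_f, hi_f

# Stage 1: the literal "[0.1,0.2]" denotes the exact interval [1/10, 1/5].
exact_lo, exact_hi = Fraction("0.1"), Fraction("0.2")

# Stage 2: a separate operation converts it to a finite precision interval.
lo_f, hi_f = hull_binary64(exact_lo, exact_hi)
```

Keeping stage 1 in exact rationals means the same literal can be converted to any finite precision type without re-parsing the text.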

======
Notes
======
- In the new text I have added a definition of what "last place"
(an integer) and "unit in last place" (ulp) mean in this
context, since I learned they have more than one meaning and are
unfamiliar to some.
[Example. For the decimal strings 123 and 123. , as well as 0
and 0. , the last place is 0 and one ulp is 1. For .123 and
0.123 , as well as .000 and 0.000 , the last place is −3 and one
ulp is 0.001.]

- I hope to have corrected an error in the definition of
"exponent field" for uncertain form, which was inconsistent as
to whether the prefix character 'e' was included or not.

- This was the original version of the motion.

======
Motion
======
The syntax and semantics of interval literals shall be
- as specified in Draft 7.1 circulated as
20130402Level1and2textV7.1Sent.pdf;
- with the addition of the singleton interval form [x] which
is equivalent to [x,x].

The standard will not at this stage include a facility for
named constants such as pi to be included in the definition
of an interval literal.

--
---------------------------------------------------------------
R. Baker Kearfott, rbk@xxxxxxxxxxxxx (337) 482-5346 (fax)
(337) 482-5270 (work) (337) 993-1827 (home)
URL: http://interval.louisiana.edu/kearfott.html
Department of Mathematics, University of Louisiana at Lafayette
(Room 217 Maxim D. Doucet Hall, 1403 Johnston Street)
Box 4-1010, Lafayette, LA 70504-1010, USA
---------------------------------------------------------------

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: Motion 46: finalise interval literals
Date: July 17, 2013 8:44:28 AM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-12 16:21:52 +0200, Vincent Lefevre wrote:

On 2013-07-12 14:37:47 +0100, John Pryce wrote:

BTW, what denotes infinity in a Turkish C program?

Interpretation of a C program is locale-independent. But locales
affect strtod(). I haven't done any test yet. There are 4 possibilities
to write "inf", but I don't know which ones are recognized in practice
and whether there may be errors in the documentation or the C
library.

I had raised the issue in 2005 and 2006 for the revision of the
IEEE 754 standard, but it has been ignored. The final version of
the IEEE 754-2008 standard says:

Conversion of external character sequences "inf" and "infinity"
(regardless of case) with an optional preceding sign, to a supported
floating-point format shall produce an infinity (with the same sign
as the input).

So, in Turkish locales, where "i" and "I" are different letters
(even in a case-insensitive way[*]), "INF" is not guaranteed to
produce an infinity.

[*] You can see the dot over the "i" as an accent.

The C99 standard says:

7.20.1.3 The strtod, strtof, and strtold functions
[...]
3 The expected form of the subject sequence is an optional plus or minus
sign, then one of the following:
— a nonempty sequence of decimal digits optionally containing a
decimal-point character, then an optional exponent part as defined
in 6.4.4.2;
— a 0x or 0X, then a nonempty sequence of hexadecimal digits
optionally containing a decimal-point character, then an optional
binary exponent part as defined in 6.4.4.2;
— INF or INFINITY, ignoring case
— NAN or NAN(n-char-sequence_opt), ignoring case in the NAN part,
[...]
5 In other than the "C" locale, additional locale-specific subject
sequence forms may be accepted.

(and AFAIK, C11 says the same thing). So, this is the opposite: "INF"
will give an infinity, but nothing is guaranteed for "inf" in Turkish
locales.

I've done some tests against the glibc and reported the following bug:

http://sourceware.org/bugzilla/show_bug.cgi?id=15744

Note also the behavior concerning the decimal-point character.

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Begin forwarded message:

From: Ian McIntosh <ianm@xxxxxxxxxx>
Subject: P1788: Motion 46 - YES
Date: July 19, 2013 10:04:17 AM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

I vote YES on Motion 46 Interval Literals, with some comments:

1. The mapping to an implementation may or may not require the text string to be a quoted string, but that is a language mapping or library mapping issue not needing to be part of the standard.

2. There has been concern over the standard needing unlimited storage when long text strings are used to define a literal. To me that's not a serious concern. Pick some length, say one million, which is a small enough object to allocate one or two of compared to the total memory use of most compilers or programs. Which of us would ever write a program containing a literal over that number of digits long? Or even 100 digits? We can either set a maximum text length in the standard, or leave the maximum length an implementation decision.

3. The same is true for rational strings. The memory usage isn't a real problem, and the implementation effort should be smaller than the time we've spent debating them.

4. Fairly simple things that will be used by many are usually best put in a standard, instead of being repeatedly reimplemented. Aside from the efficiency and convenience, it reduces the risk of programmer error or suboptimal approaches or precision. Considering that, I would include the literal constants [PI], [E] and possibly a couple others. How many programmers have PI memorized to double precision? How many non-mathematicians know how to calculate it without looking it up? Would that give the best precision? It's easier for an implementation to provide than for each programmer to figure out the details, or for us to debate it.

These issues can be resolved later, and regardless of the outcome wouldn't justify me voting no. The important thing is to have a function to construct intervals from text strings.

- Ian McIntosh IBM Canada Lab Compiler Back End Support and Development

"Corliss, George" ---07/19/2013 06:46:49 AM---P1788: Voting is in progress for

From: "Corliss, George" <george.corliss@xxxxxxxxxxxxx>
To: Ian McIntosh/Toronto/IBM@IBMCA
Date: 07/19/2013 06:46 AM
Subject: P1788: Motions 45 & 46 - Please vote

P1788:

Voting is in progress for

Motion 45. Exact Dot Product Revision
Current tally: Yes - 15; No - 5; quorum - 31
Voting extends through Monday, 29 July, 2013

Motion 46. Interval literals
Current tally: Yes - 12; No - 1; quorum - 23
Voting extends through Saturday, 3 August, 2013.

Details of each motion are below.

PLEASE VOTE.

George Corliss
P1788 Voting Tabulator

Begin forwarded message:

> From: Ralph Baker Kearfott <rbk5287@xxxxxxxxxxxxx>
> Subject: Fwd: Motion P1788/M0045.02:DotProduct -- final text
> Date: July 9, 2013 10:59:24 AM CDT
> To: stds-1788 <STDS-1788@xxxxxxxxxxxxxxxxx>
> Reply-To: <rbk@xxxxxxxxxxxxx>
>
> -------- Original Message --------
> Subject: Motion P1788/M0045.02:DotProduct -- final text
> Date: Tue, 9 Jul 2013 15:22:25 +0100
> From: John Pryce <j.d.pryce@xxxxxxxxxx>
> To: rbk@xxxxxxxxxxxxx
>
>
>
> P1788
> This is the text to be voted on. Webmaster, please put this on the web site. I am happy to keep the rationale as originally written, but of course all subsequent discussion in the group is relevant.
> In response to a nit from someone I changed [vectors whose length] "is given as non-integral, zero or negative" to "is not given as a positive integer". I hope this isn't some dreadful solecism.
> John Pryce
>
> ======
> Motion
> ======
> 1. An implementation of Exact Dot Product EDP and Complete Arithmetic CA be no longer required by P1788. They should be treated as a recommended way to achieve the broader aim of evaluating highly accurate sums and dot products, which has many uses in interval computing.
>
> 2. The current text on EDP and CA (11.11.11 in the current draft) be moved to Level 3 with minor revisions and replaced at Level 2 by the following text:
>
> ---start of text---
> Reduction operations.
>
> An implementation that provides 754-conforming interval types shall provide the four reduction operations sum, dot, sumSquare and sumAbs of IEEE 754-2008 §9.4, correctly rounded. These shall be provided for the parent formats of each such type.
>
> Correctly rounded means that the returned result is defined as follows.
> - If the exact result is defined as an extended-real number, return this after rounding to the relevant format according to the current rounding mode. An exact zero shall be returned as +0 in all rounding modes.
> - Otherwise return NaN.
>
> All other behavior, such as overflow, underflow, setting of IEEE 754 flags, raising of exceptions, and behavior on vectors whose length is not given as a positive integer, shall be as specified in IEEE 754-2008 §9.4. In particular, evaluation is as if in exact arithmetic up to the final rounding, with no possibility of intermediate overflow or underflow.
>
> Intermediate overflow could result from adding an extremely large number N of large terms of the same sign. The implementation shall ensure this cannot occur. This is done by providing enough leading carry bits in an accumulator, or equivalent, so that the N required is unachievable with current hardware. [Note: For example, Complete Arithmetic for IEEE 754 binary64, parameterized as recommended by Kulisch and Snyder, requires around 2^88 terms before overflow can occur.]
>
> It is recommended that these operations be based on an implementation of Complete Arithmetic as specified in §X.Y.
> ---end of text---

[attachment "20130713IntervalLiterals.pdf" deleted by Ian McIntosh/Toronto/IBM]
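
The phrase "as if in exact arithmetic up to the final rounding" in the Motion 45 text quoted earlier in this message can be illustrated with a small Python sketch. It is not Complete Arithmetic and not an implementation of IEEE 754-2008 §9.4: it uses exact rationals from the standard library, supports round-to-nearest only, and assumes finite inputs whose exact result is within binary64 range. The name exact_dot is hypothetical.

```python
import math
from fractions import Fraction

def exact_dot(xs, ys):
    """Dot product evaluated exactly, then rounded once to binary64 (nearest)."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    # Every float is an exact rational, so products and sums carry no error.
    total = sum((Fraction(x) * Fraction(y) for x, y in zip(xs, ys)), Fraction(0))
    return float(total)  # the single, final rounding

xs = [1e200, -1e200, 1.0]
ys = [1e200, 1e200, 1.0]

# Naive evaluation: 1e200 * 1e200 overflows to inf, and inf - inf gives nan.
naive = sum(x * y for x, y in zip(xs, ys))

# Exact evaluation: the two huge terms cancel and the result is 1.
exact = exact_dot(xs, ys)
```

The example shows the intermediate overflow that the motion text requires an implementation to rule out; a production implementation would use an accumulator rather than rationals.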

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Motion 46: NO
Date: July 19, 2013 5:28:39 AM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

I vote NO on Motion 46 (Interval literals).

The text contains various ambiguities about literals of host language
and locales. In particular:

langNumLit {number literal of host language}

For instance, if the host language is C, what would a number literal be?
A number literal in a C source (the C standard says "integer constant"
and "floating constant")? An input of strtod(), which is the numeric
version of what text2interval is for intervals? This is different as
strtod() is sensitive to locales concerning the decimal-point character
and also case sensitiveness I assume.

Now, are interval literals intended to be sensitive to locales, at least
in some contexts like in C?

What if in the host language, the decimal-point character is a comma ","
(i.e. the same character as the number separator in infSupIntvl), like
for strtod() in some European locales such as fr_FR? This leads to an
ambiguity as the comma would be used for two different purposes.
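
The ambiguity can be made concrete with a short Python sketch. The decimal-comma number syntax below is a hypothetical simplification (optional sign, digits, optional comma and digits), not the grammar in the draft, and comma_readings is an illustrative name.

```python
import re

# Hypothetical decimal-comma number: optional sign, digits, optional ",digits".
_NUMBER = re.compile(r"[+-]?[0-9]+(?:,[0-9]+)?")

def comma_readings(literal: str) -> list[tuple[str, str]]:
    """List every way to split "[...]" into two decimal-comma bounds."""
    body = literal.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected a bracketed inf-sup literal")
    body = body[1:-1]
    readings = []
    for i, ch in enumerate(body):
        if ch != ",":
            continue
        # Treat this comma as the bound separator, the others as decimal points.
        lo, hi = body[:i], body[i + 1:]
        if _NUMBER.fullmatch(lo) and _NUMBER.fullmatch(hi):
            readings.append((lo, hi))
    return readings

ambiguous = comma_readings("[1,2,3]")      # two valid readings
unambiguous = comma_readings("[1,5,2,5]")  # only one valid reading
```

With a decimal comma, "[1,2,3]" can be read either as the interval from 1 to 2,3 or as the interval from 1,2 to 3, so the text alone does not determine the value.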

Moreover, in the second example, about C/C++, I don't think that it is a
good idea to accept floating-suffixes as said in "This is ignored within
an interval literal: 1.2345 and 1.2345f both denote the mathematical
number 1.2345", since in C/C++, 1.2345f only means a float number, which
is mathematically different from the decimal value 1.2345. Why would a
user put a floating-suffix for something that is not a float number?
Accepting floating-suffixes would lead to confusion. Note that these
suffixes make sense only in a source; they are not accepted by strtod(),
where they are regarded as an error.

I think that the standard should fully specify the format of interval
literals or specify nothing. If specification is provided, it should
have a note saying that for case sensitiveness, concepts like locales
are ignored. An implementation may provide functions that accept other
formats and/or take locales into account, but these functions are out
of the scope of this standard.

Other minor remarks:

Is it intended to accept "[empty]" but not "[ empty ]" or more generally
"[" {sp} "empty" {sp} "]"? Ditto for "[entire]".

The term "Constants" for "[empty]" and "[entire]" is rather inadequate
since "[1]" is also a constant. I would say "Special intervals".

Concerning the uncertain form:

* It is said that r is a non-negative decimal integer literal. Is it
intended to allow 0?

* This isn't really important, but I would accept any case for letters,
i.e. "d", "u" and "e" used here, for consistency with the other parts.

* In the examples, I would add:
-10?u [-10,-9.5]
to make sure ulp is well-understood on powers of the radix and that
direction characters are also well-understood on negative numbers.
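
The reading of ulp and direction characters can be sketched in Python as follows. This is a simplification under stated assumptions, not the draft grammar: the exponent field is omitted, only lower-case "u" and "d" are accepted, and an omitted radius is taken to mean half an ulp (inferred from the "-10?u" example above). The function name uncertain_to_bounds is hypothetical.

```python
import re
from fractions import Fraction

# sign, integer digits, optional fraction digits, "?", optional radius, optional direction
_UNCERTAIN = re.compile(r"([+-]?)([0-9]*)(?:\.([0-9]*))?\?([0-9]*)([ud]?)")

def uncertain_to_bounds(text: str) -> tuple[Fraction, Fraction]:
    """Return the exact (lower, upper) bounds of an uncertain-form literal."""
    match = _UNCERTAIN.fullmatch(text)
    if match is None:
        raise ValueError(f"not an uncertain-form literal: {text!r}")
    sign, int_digits, frac_digits, radius_digits, direction = match.groups()
    frac_digits = frac_digits or ""
    if not (int_digits or frac_digits):
        raise ValueError("the number part needs at least one digit")

    last_place = -len(frac_digits)            # 0 for "123", -3 for "0.123"
    ulp = Fraction(10) ** last_place          # 1 for "123", 1/1000 for "0.123"
    m = Fraction(int(int_digits + frac_digits)) * ulp
    if sign == "-":
        m = -m

    # Assumption: an omitted radius means half an ulp.
    radius = Fraction(int(radius_digits)) * ulp if radius_digits else ulp / 2
    lo = m if direction == "u" else m - radius   # "u": uncertainty upward only
    hi = m if direction == "d" else m + radius   # "d": uncertainty downward only
    return lo, hi
```

Under these assumptions "12.345?6" gives [12.339, 12.351], and "-10?u" gives [-10, -9.5] because one ulp of "10" is 1 and the direction is taken on the number line, not on the magnitude.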

In the "anycase" example, if locales are chosen to be ignored, it may
be better to use: anycase("ai") matches any of "ai", "Ai", "aI", "AI"
(see discussion about the dotless "i" and "I" with dot in Turkish).

For decorated intervals, I think that connectChar should be specified
(see above about full specification). And shouldn't decorationLit be
case-insensitive?

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Begin forwarded message:

From: John Pryce <j.d.pryce@xxxxxxxxxx>
Subject: Re: Motion 46: NO
Date: July 19, 2013 11:59:48 AM CDT
To: Vincent Lefevre <vincent@xxxxxxxxxx>
Cc: <stds-1788@xxxxxxxxxxxxxxxxx>

Vincent, P1788

On 2013 Jul 19, at 11:28, Vincent Lefevre wrote:

I vote NO on Motion 46 (Interval literals).

The text contains various ambiguities about literals of host language
and locales.

I agree with most of Vincent's criticisms. Chair, since this is a vote on the ideas, not actual text, are we allowed to carry on as if appropriate friendly amendments have been proposed and accepted?

A few seem to pose a dilemma or at least to require discussion.

In particular:

langNumLit {number literal of host language}

For instance, if the host language is C, what would a number literal be?
A number literal in a C source (the C standard says "integer constant"
and "floating constant")? An input of strtod(), which is the numeric
version of what text2interval is for intervals? This is different as
strtod() is sensitive to locales concerning the decimal-point character
and also case sensitiveness I assume.

Now, are interval literals intended to be sensitive to locales, at least
in some contexts like in C?

It seems the locale issue will be both tricky and verbose if we address it in detail (Turkish "i" vs "ı", comma vs period in numbers, etc). So I think this standard should apply to the "default locale", and implementations *may* make locale-specific variations. Would that do? My understanding of locales is shaky.

Second, I was well aware that "number literal of host language" is complicated when one gets down to detail, but I continue to think that for the full standard it is better than restricting to some basic syntax. Fuller syntax for full standard; a basic syntax for proposed basic standard, on the lines suggested by Dmitry.

I propose the meaning of "number literal of host language" be defined precisely for the commonest languages. This should go in an Annex, referred to by the interval literal clause.

What if in the host language, the decimal-point character is a comma ","
(i.e. the same character as the number separator in infSupIntvl), like
for strtod() in some European locales such as fr_FR? This leads to an
ambiguity as the comma would be used for two different purposes.

Moreover, in the second example, about C/C++, I don't think that it is a
good idea to accept floating-suffixes as said in "This is ignored within
an interval literal: 1.2345 and 1.2345f both denote the mathematical
number 1.2345", since in C/C++, 1.2345f only means a float number, which
is mathematically different from the decimal value 1.2345. Why would a
user put a floating-suffix for something that is not a float number?
Accepting floating-suffixes would lead to confusion. Note that these
suffixes make sense only in a source; they are not accepted by strtod(),
where they are regarded as an error.

For C, I hadn't realised that "input of strtod()" is different from "integer constant" and "floating constant" in source. Would it do to define "number literal of host language" to mean "input of strtod() in default locale"? I agree with what Vincent says: a floating suffix makes no sense in this context and should be an error.

I think that the standard should fully specify the format of interval
literals or specify nothing.

Yes

If specification is provided, it should
have a note saying that for case sensitiveness, concepts like locales
are ignored.

Not sure what this means.

An implementation may provide functions that accept other
formats and/or take locales into account, but these functions are out
of the scope of this standard.

Yes

Can we say *everything* about interval literals is case-insensitive? Then one can abolish the "anycase" notion and say a valid IL is one that, after converting to, say, lower case, obeys the given grammar.
I have no truck with complications such as (if I understand Vincent aright) Turkish I becoming a different letter when converted to lowercase. That's why we should stick to a "default" locale.
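
One way to read the "default locale" proposal is that case folding touches only the ASCII letters A to Z. A minimal Python sketch under that assumption (the helper names are hypothetical):

```python
import string

# Fold only the 26 ASCII capitals; every other character is left untouched.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def ascii_lower(text: str) -> str:
    """Lower-case ASCII letters only, independent of any locale."""
    return text.translate(_ASCII_FOLD)

def is_infinity_token(text: str) -> bool:
    """True for "inf" or "infinity" in any mix of ASCII upper and lower case."""
    return ascii_lower(text) in ("inf", "infinity")

# Python's str.upper() is Unicode-aware: it maps dotless i (U+0131) to "I",
# so a comparison based on upper() would also accept the Turkish spelling.
unicode_fold_accepts_dotless = "\u0131nf".upper() == "INF"

# The ASCII-only fold leaves U+0131 alone, so the token is rejected.
ascii_fold_accepts_dotless = is_infinity_token("\u0131nf")
```

With this rule a valid literal is one that obeys the grammar after ascii_lower, and the Turkish dotted and dotless letters never enter the picture.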

Other minor remarks:

Is it intended to accept "[empty]" but not "[ empty ]" or more generally
"[" {sp} "empty" {sp} "]"? Ditto for "[entire]".

Good point. What do people think? On output, people might prefer a table of intervals to look like
[1.234, 5.678]
[ empty ]
[2.345, 6.789]
...
rather than
[1.234, 5.678]
[empty]
[2.345, 6.789]
...
These outputs, if produced by interval2text, consist of ILs; which suggests strings of the form "[" {sp} "empty" {sp} "]" should be accepted on input, i.e. should be ILs.
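
A short Python sketch of a reader that accepts the padded forms is given here. It assumes {sp} means zero or more space characters, uses a hypothetical simplified decimal syntax for the bounds, and also covers the singleton form [x]; parse_bracketed is an illustrative name, not part of the draft.

```python
import re
from fractions import Fraction

# Hypothetical simplified number syntax: optional sign, digits, optional ".digits".
_NUM = r"[+-]?[0-9]+(?:\.[0-9]+)?"
# re.ASCII keeps the case-insensitive match to ASCII letters only.
_SPECIAL = re.compile(r"\[ *(empty|entire) *\]", re.IGNORECASE | re.ASCII)
_INF_SUP = re.compile(rf"\[ *({_NUM}) *(?:, *({_NUM}) *)?\]", re.ASCII)

def parse_bracketed(text: str):
    """Parse "[empty]", "[entire]", "[x]" or "[l,u]", allowing padding spaces."""
    match = _SPECIAL.fullmatch(text)
    if match:
        return (match.group(1).lower(),)
    match = _INF_SUP.fullmatch(text)
    if match is None:
        raise ValueError(f"not a bracketed interval literal: {text!r}")
    lo = Fraction(match.group(1))
    # The singleton form [x] is equivalent to [x,x].
    hi = Fraction(match.group(2)) if match.group(2) is not None else lo
    if lo > hi:
        raise ValueError("lower bound exceeds upper bound")
    return ("interval", lo, hi)

# Every row of the padded table above is accepted on input.
table = ["[1.234, 5.678]", "[ empty ]", "[2.345, 6.789]"]
parsed = [parse_bracketed(row) for row in table]
```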

The term "Constants" for "[empty]" and "[entire]" is rather inadequate
since "[1]" is also a constant. I would say "Special intervals".

Yes, that is a better name.

Concerning the uncertain form:

* It is said that r is a non-negative decimal integer literal. Is it
intended to allow 0?

Yes. 3.45?0 has the same value as [3.45].

* This isn't really important, but I would accept any case for letters,
i.e. "d", "u" and "e" used here, for consistency with the other parts.

Yes, see above.

* In the examples, I would add:
-10?u [-10,-9.5]
to make sure ulp is well-understood on powers of the radix and that
direction characters are also well-understood on negative numbers.

Good example. Yes.

In the "anycase" example, if locales are chosen to be ignored, it may
be better to use: anycase("ai") matches any of "ai", "Ai", "aI", "AI"
(see discussion about the dotless "i" and "I" with dot in Turkish).

See above.

For decorated intervals, I think that connectChar should be specified
(see above about full specification).

Shall we fix on "_" then? What do people think?

And shouldn't decorationLit be case-insensitive?

See above.

John Pryce

Begin forwarded message:

From: Ian McIntosh <ianm@xxxxxxxxxx>
Subject: Re: Motion 46: NO
Date: July 19, 2013 2:00:22 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

JP> I propose the meaning of "number literal of host language" be defined precisely for the commonest languages.
. . .
VL> > For decorated intervals, I think that connectChar should be specified
VL> > (see above about full specification).
JP> Shall we fix on "_" then? What do people think?

I like "_" but if C++ and Fortran are in the list of commonest languages it is awkward.
- In Fortran, "_" is a separator between value digits and kind digits. You could write "123_4" meaning the same as "123" but of kind 4, which typically is a 4 byte / 32 bit integer.
- In C++11, "_" may be the first character of a user-defined suffix for a user-defined literal. You could write "123_xx" to mean the same as "123" but to be processed by the user-defined literal function that handles literals with "_xx" suffixes, which could for example return a value of type "xx". One could define suffixes and functions so that 1.5_feet + 6_inches equalled 2_feet, or that 0.1_down was converted with rounding towards negative infinity and 0.1_up towards positive infinity. Interesting opportunities, but for 1788 with interesting interactions.

Since there are conflicts between common languages it's impossible to define one literal syntax common to them all, and using "_" for a 1788 meaning when it already has two meanings for common language literals may cause additional confusion. If we use it, we might need a statement that our meaning overrides any language meanings.

- Ian McIntosh IBM Canada Lab Compiler Back End Support and Development

From: John Pryce <j.d.pryce@xxxxxxxxxx>
To: Ian McIntosh/Toronto/IBM@IBMCA
Date: 07/19/2013 01:02 PM
Subject: Re: Motion 46: NO

Vincent, P1788

On 2013 Jul 19, at 11:28, Vincent Lefevre wrote:
> I vote NO on Motion 46 (Interval literals).
>
> The text contains various ambiguities about literals of host language
> and locales.

I agree with most of Vincent's criticisms. Chair, since this is a vote on the ideas, not actual text, are we allowed to carry on as if appropriate friendly amendments have been proposed and accepted?

A few seem to pose a dilemma or at least to require discussion.

> In particular:
>
> langNumLit {number literal of host language}
>
> For instance, if the host language is C, what would a number literal be?
> A number literal in a C source (the C standard says "integer constant"
> and "floating constant")? An input of strtod(), which is the numeric
> version of what text2interval is for intervals? This is different as
> strtod() is sensitive to locales concerning the decimal-point character
> and also case sensitiveness I assume.
>
> Now, are interval literals intended to be sensitive to locales, at least
> in some contexts like in C?

It seems the locale issue will be both tricky and verbose if we address it in detail (Turkish "i" vs "ı", comma vs period in numbers, etc). So I think this standard should apply to the "default locale", and implementations *may* make locale-specific variations. Would that do? My understanding of locales is shaky.

Second, I was well aware that "number literal of host language" is complicated when one gets down to detail, but I continue to think that for the full standard it is better than restricting to some basic syntax. Fuller syntax for full standard; a basic syntax for proposed basic standard, on the lines suggested by Dmitry.

I propose the meaning of "number literal of host language" be defined precisely for the commonest languages. This should go in an Annex, referred to by the interval literal clause.

> What if in the host language, the decimal-point character is a comma ","
> (i.e. the same character as the number separator in infSupIntvl), like
> for strtod() in some European locales such as fr_FR? This leads to an
> ambiguity as the comma would be used for two different purposes.
>
> Moreover, in the second example, about C/C++, I don't think that it is a
> good idea to accept floating-suffixes as said in "This is ignored within
> an interval literal: 1.2345 and 1.2345f both denote the mathematical
> number 1.2345", since in C/C++, 1.2345f only means a float number, which
> is mathematically different from the decimal value 1.2345. Why would a
> user put a floating-suffix for something that is not a float number?
> Accepting floating-suffixes would lead to confusion. Note that these
> suffixes make sense only in a source; they are not accepted by strtod(),
> where they are regarded as an error.

For C, I hadn't realised that "input of strtod()" is different from "integer constant" and "floating constant" in source. Would it do to define "number literal of host language" to mean "input of strtod() in default locale"? I agree with what Vincent says: a floating suffix makes no sense in this context and should be an error.

> I think that the standard should fully specify the format of interval
> literals or specify nothing.
Yes

> If specification is provided, it should
> have a note saying that for case sensitiveness, concepts like locales
> are ignored.
Not sure what this means.

> An implementation may provide functions that accept other
> formats and/or take locales into account, but these functions are out
> of the scope of this standard.
Yes

Can we say *everything* about interval literals is case-insensitive? Then one can abolish the "anycase" notion and say a valid IL is one that, after converting to, say, lower case, obeys the given grammar.
I have no truck with complications such as (if I understand Vincent aright) Turkish I becoming a different letter when converted to lowercase. That's why we should stick to a "default" locale.

> Other minor remarks:
>
> Is it intended to accept "[empty]" but not "[ empty ]" or more generally
> "[" {sp} "empty" {sp} "]"? Ditto for "[entire]".
Good point. What do people think? On output, people might prefer a table of intervals to look like
[1.234, 5.678]
[ empty ]
[2.345, 6.789]
...
rather than
[1.234, 5.678]
[empty]
[2.345, 6.789]
...
These outputs, if produced by interval2text, consist of ILs; which suggests strings of the form "[" {sp} "empty" {sp} "]" should be accepted on input, i.e. should be ILs.
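
To make the point concrete, here is a minimal Python sketch of a permissive reader for the two special intervals. The name parse_special and the reading of {sp} as "zero or more whitespace characters" are assumptions made for illustration; neither comes from the motion text.

```python
import re

# Assumed grammar: "[" {sp} ("empty" | "entire") {sp} "]",
# applied after case folding so that "[ Empty ]" is accepted too.
_SPECIAL = re.compile(r"\[\s*(empty|entire)\s*\]")


def parse_special(text):
    """Return "empty" or "entire" if text is a special interval, else None."""
    match = _SPECIAL.fullmatch(text.casefold())
    return match.group(1) if match else None


for candidate in ("[empty]", "[ Empty ]", "[ENTIRE]", "[ 1, 2 ]"):
    print(repr(candidate), "->", parse_special(candidate))
```

With a reader like this, either table layout shown above could be fed back in as input.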

> The term "Constants" for "[empty]" and "[entire]" is rather inadequate
> since "[1]" is also a constant. I would say "Special intervals".
Yes, that is a better name.

> Concerning the uncertain form:
>
> * It is said that r is a non-negative decimal integer literal. Is it
> intended to allow 0?
Yes. 3.45?0 has the same value as [3.45].

> * This isn't really important, but I would accept any case for letters,
> i.e. "d", "u" and "e" used here, for consistency with the other parts.
Yes, see above.

> * In the examples, I would add:
> -10?u [-10,-9.5]
> to make sure ulp is well-understood on powers of the radix and that
> direction characters are also well-understood on negative numbers.
Good example. Yes.

> In the "anycase" example, if locales are chosen to be ignored, it may
> be better to use: anycase("ai") matches any of "ai", "Ai", "aI", "AI"
> (see discussion about the dotless "i" and "I" with dot in Turkish).
See above.

> For decorated intervals, I think that connectChar should be specified
> (see above about full specification).
Shall we fix on "_" then? What do people think?

> And shouldn't decorationLit be case-insensitive?
See above.

John Pryce

Begin forwarded message:

From: Michel Hack <mhack@xxxxxxx>
Subject: Re: Motion 46 -- Vincent's objections
Date: July 19, 2013 2:04:07 PM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

(I changed the subject line because I'm not voting right now; nor were
Ian and John.)

We have to make clear that the term "interval literal" is used in a
non-standard way. We use it as a shorthand for the argument to the
text2interval() operation.

No interval or floating-point standard has any say in what the syntax
of any particular language should be: what we wish to standardize is
the semantics, i.e. the expected behaviour. So we have no say in how
a language supports its literals -- including interval literals for
languages (like Sun Fortran) that support them as first-class types.
This was clearly stated in 754-2008 for example.

What we CAN and SHOULD standardize are interchange formats, i.e. the
behaviour of the text2interval() and interval2text() operations (and
not their actual names, or the syntax of their invocation).

For this purpose, interval2text() should be precise, and text2interval()
permissive, and of course the latter must accept what the former produces.

In a world with widely different conventions it is useful to have the
concept of locale-dependency, but one must also be able to be universally
precise. I don't have much experience with the locale issues of atof(),
say, but I did trip over such issues in the AIX libc implementation, and
I find it troublesome that there is no guaranteed way to recognize a
floating-point literal. I would have preferred if the so-called C-locale
notation was ALWAYS accepted, and others might be accepted too, especially
since there is no conflict with digit-separators which, though possible on
output, are (as far as I recall) NOT accepted on input: 1,234.56 for
example is recognized as 1 (the rest is junk). So in a European locale,
where one would write 1234,56 there should be no ambiguity if 1234.56 is
entered, because 1.234,56 would not be acceptable as equivalent to 1234,56
and would (in the absence of additional C-locale interpretation) be taken
to be 1.
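
The longest-prefix behaviour described above can be sketched in Python. This is only a toy model of strtod-style scanning, not the libc routine: the function name numeric_prefix and its decimal_point parameter are hypothetical, leading whitespace is not skipped, and special values such as "inf" are not handled.

```python
import re


def numeric_prefix(text, decimal_point="."):
    """Return the longest initial part of text that reads as a decimal number.

    decimal_point stands in for the locale's decimal-point character.
    """
    point = re.escape(decimal_point)
    pattern = rf"[+-]?(?:\d+(?:{point}\d*)?|{point}\d+)(?:[eE][+-]?\d+)?"
    match = re.match(pattern, text)
    return match.group(0) if match else ""


# C-locale style: the comma is not part of the number, so scanning stops at it.
print(numeric_prefix("1,234.56"))
# Comma-as-decimal-point style: the period is not part of the number.
print(numeric_prefix("1.234,56", decimal_point=","))
print(numeric_prefix("1234,56", decimal_point=","))
```

Under these assumptions a digit separator is never consumed, so the remainder of the string is left over as junk in both conventions.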

So I think we should define a standard format for text2interval() arguments,
and we could compare them to the C locale for clarity. We would obviously
not refuse a locale-dependent variant -- it is always permissible to provide
additional functions, or perhaps even the same function which ALSO allows
locale-dependent forms, together with whatever might be required to avoid
conflict with the standard syntax.

Turkish locales might have difficulty with Inf vs inf, but I can imagine
worse: what if a locale specifies right-to-left strings? I know that numeric
literals are written in the same order as for left-to-right scripts, but what
about the order of bounds in the pair that denotes an interval? Is the low
bound the first or the leftmost element?

Getting back to comma vs point. C syntax uses comma for separating elements
of an initializer, among other things. How would that interact with numeric
literals containing a comma? That's why locale has no impact on C literals,
even though it does affect some common library functions.

Michel.
---Sent: 2013-07-19 19:40:04 UTC

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: Motion 46: NO
Date: July 19, 2013 5:05:16 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-19 17:59:48 +0100, John Pryce wrote:

It seems the locale issue will be both tricky and verbose if we
address it in detail (Turkish "i" vs "ı", comma vs period in
numbers, etc). So I think this standard should apply to the "default
locale", and implementations *may* make locale-specific variations.
Would that do? My understanding of locales is shaky.

I think this is fine. I hope that implementers will understand that
if the standard requires anycase("inf") to be recognized, this means
"inf", "INF", etc. and possibly other locale-dependent strings.
Rejecting "INF" in tr_TR because of the dotless "I" would be
unexpected behavior IMHO.

Second, I was well aware that "number literal of host language" is
complicated when one gets down to detail, but I continue to think
that for the full standard it is better than restricting to some
basic syntax. Fuller syntax for full standard; a basic syntax for
proposed basic standard, on the lines suggested by Dmitry.

I propose the meaning of "number literal of host language" be
defined precisely for the commonest languages. This should go in an
Annex, referred to by the interval literal clause.

This should just be recommendations. Some forms may be rejected for
good reasons (e.g. floating-suffixes and octal). Note that in C,
when an integer is expected, the fact that the first digit is 0
doesn't always mean that it will be interpreted as an octal value.
For instance, for the %e printf specifier:

[...] The exponent always contains at least two digits, and only
as many more digits as necessary to represent the exponent. [...]

So if the exponent is 8 or 9, it will be represented as 08 or 09,
and of course, this mustn't be interpreted as octal.
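
A small Python illustration of the same point, relying on the fact that Python's % operator follows the C printf convention for %e. The helper name format_and_reread is made up for this sketch.

```python
def format_and_reread(value):
    """Format value with %e, then read the resulting text back as a number."""
    text = "%e" % value   # exponent is written with at least two digits
    return text, float(text)


text, back = format_and_reread(1e8)
print(text, back == 1e8)
```

The exponent digits "08" are read as decimal here; a reader that treated a leading 0 as an octal marker would reject them, since 8 is not an octal digit.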

For C, I hadn't realised that "input of strtod()" is different from
"integer constant" and "floating constant" in source. Would it do to
define "number literal of host language" to mean "input of strtod()
in default locale"? I agree with what Vincent says: a floating
suffix makes no sense in this context and should be an error.

Yes.

Can we say *everything* about interval literals is case-insensitive?

Yes, this would be much better.

Then one can abolish the "anycase" notion and say a valid IL is one
that, after converting to, say, lower case, obeys the given grammar.

Yes. In Unicode, "Default Case Folding" (used for case-insensitive
matching) is based on the lower case version, so that it is better
to choose this one:

Unicode Default Case Folding is built on the toLowercase(X)
transform, with some adaptations specifically for caseless matching.
Context-dependent mappings based on the casing context are not used.

The case folding rules are given by:

http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
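
As a rough sketch of what "convert, then match" could look like, Python's str.casefold() applies Unicode case folding without consulting the locale. The function name matches_anycase is hypothetical and only mirrors the "anycase" notion under discussion.

```python
def matches_anycase(keyword, text):
    """Caseless comparison based on Unicode case folding (locale-independent)."""
    return text.casefold() == keyword.casefold()


print(matches_anycase("inf", "INF"))        # ASCII upper case
print(matches_anycase("inf", "Inf"))        # mixed case
print(matches_anycase("inf", "\u0130NF"))   # capital I with dot above
print(matches_anycase("inf", "\u0131nf"))   # dotless small i
```

Because no locale is involved, plain ASCII "INF" is accepted everywhere, while the two Turkish-specific letters do not fold to ASCII "i" and so are not accepted by this particular sketch.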

Is it intended to accept "[empty]" but not "[ empty ]" or more generally
"[" {sp} "empty" {sp} "]"? Ditto for "[entire]".

Good point. What do people think? On output, people might prefer a
table of intervals to look like
[1.234, 5.678]
[ empty ]
[2.345, 6.789]
...
rather than
[1.234, 5.678]
[empty]
[2.345, 6.789]
...

Yes, the former looks nicer.

Concerning the uncertain form:

* It is said that r is a non-negative decimal integer literal. Is it
intended to allow 0?

Yes. 3.45?0 has the same value as [3.45].

But [3.45?] will be [3.445,3.455]. It is a bit unusual that a missing
(or default) value isn't equivalent to 0.
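
The examples quoted in this thread (3.45?0, 3.45? and -10?u) can be reproduced with a short Python sketch. It assumes that the radius r counts units in the last written digit, that a missing r means half such a unit, and that "u" and "d" keep only the upper or lower half. The name parse_uncertain is hypothetical, and exponents and enclosing brackets are left out.

```python
import re
from decimal import Decimal

# number "?" [radius] [direction], matched after case folding
_UNCERTAIN = re.compile(r"([+-]?\d+(?:\.\d*)?)\?(\d*)([du]?)")


def parse_uncertain(text):
    """Return exact (lower, upper) bounds of an uncertain-form literal."""
    match = _UNCERTAIN.fullmatch(text.strip().casefold())
    if match is None:
        raise ValueError("not an uncertain-form literal: %r" % text)
    number, radius, direction = match.groups()
    mid = Decimal(number)
    # One ulp is one unit in the last digit that was written.
    fraction_digits = len(number.partition(".")[2])
    ulp = Decimal(1).scaleb(-fraction_digits)
    r = Decimal(radius) * ulp if radius else ulp / 2
    lower = mid if direction == "u" else mid - r
    upper = mid if direction == "d" else mid + r
    return lower, upper


for literal in ("3.45?0", "3.45?", "-10?u"):
    print(literal, parse_uncertain(literal))
```

Decimal arithmetic keeps the bounds exact, so the sketch says nothing about outward rounding to a particular interval type.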

For decorated intervals, I think that connectChar should be specified
(see above about full specification).

Shall we fix on "_" then? What do people think?

I think so.

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: Motion 46: NO
Date: July 19, 2013 5:23:17 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-20 00:05:16 +0200, Vincent Lefevre wrote:

Concerning the uncertain form:

* It is said that r is a non-negative decimal integer literal. Is it
intended to allow 0?

Yes. 3.45?0 has the same value as [3.45].

But [3.45?] will be [3.445,3.455]. It is a bit unusual that a missing
(or default) value isn't equivalent to 0.

Actually 3.45? will be [3.445,3.455]. But is there a reason not to
use "[" / "]" for uncertain intervals? IMHO [3.45?] would be better,
and would allow easier parsing of streams of numbers and literals,
such as:

17 [1,2] [3.0?] -5 [3]

The first character of a literal would immediately determine whether
it is a number or an interval.

And you could also allow spaces in such interval literals for more
readability.
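
A minimal Python sketch of that idea, assuming every interval literal runs from "[" to the first "]" and everything else is a whitespace-delimited number. The function name classify is hypothetical.

```python
import re

# Either a bracketed literal (spaces allowed inside) or a run of non-blanks.
_TOKEN = re.compile(r"\[[^\]]*\]|\S+")


def classify(stream):
    """Split a text stream into ("interval" | "number", token) pairs."""
    return [
        ("interval" if token.startswith("[") else "number", token)
        for token in _TOKEN.findall(stream)
    ]


print(classify("17 [1,2] [3.0?] -5 [3]"))
print(classify("[ 1, 2 ] 4"))
```

Only the first character of each token is inspected to decide its kind; the contents of the brackets would still need a full parse.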

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: Motion 46: NO
Date: July 19, 2013 5:30:30 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-19 15:00:22 -0400, Ian McIntosh wrote:

I like "_" but if C++ and Fortran are in the list of commonest
languages it is awkward.
- In Fortran, "_" is a separator between value digits and kind digits.
You could write "123_4" meaning the same as "123" but of kind 4, which
typically is a 4 byte / 32 bit integer.

But there would be no ambiguities, because what follows "_" would
be letters (not digits) in a decorated interval literal.
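
That rule can be sketched in Python as follows. The function name split_decorated is hypothetical, and "com" is used only as a sample decoration name; the set of valid names is not taken from this thread.

```python
def split_decorated(text):
    """Split "body_decoration" at the last "_" if the suffix is purely letters."""
    body, separator, suffix = text.rpartition("_")
    if separator and suffix.isalpha():
        return body, suffix.casefold()
    return text, None


print(split_decorated("[1,2]_com"))   # letters after "_": a decoration
print(split_decorated("[1,2]"))       # no "_": a bare interval
print(split_decorated("123_4"))       # digits after "_": Fortran-style kind
```

A Fortran-style kind suffix is left alone because what follows "_" is digits rather than letters.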

- In C++ 11, "_" may be the first character of a user defined suffix for a
user defined literal. You could write "123_xx" to mean the same as "123"
but to be processed by the user defined literal function that handles
literals with "_xx" suffixes, which could for example return a value of
type "xx". One could define suffixes and functions so that 1.5_feet +
6_inches equalled 2_feet, or that 0.1down was converted with rounding
towards negative infinity and 0.1up towards positive infinity. Interesting
opportunities, but for 1788 with interesting interactions.

In C++ 11, can you write literals of the form [123]_xx ?

Since there are conflicts between common languages it's impossible to
define one literal syntax common to them all, and using "_" for a 1788
meaning when it already has two meanings for common language literals may
cause additional confusion. If we use it, we might need a statement that
our meaning overrides any language meanings.

"_" was mainly for text2interval(). "Real" literals of interval type
in a language may need some transformation. Indeed I don't expect all
"commonest languages" to accept literals like [1,2] directly.

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>

John, P-1788,

On 07/19/2013 11:59 AM, John Pryce wrote:

Vincent, P1788

On 2013 Jul 19, at 11:28, Vincent Lefevre wrote:

I vote NO on Motion 46 (Interval literals).

The text contains various ambiguities about literals of host language
and locales.

I agree with most of Vincent's criticisms. Chair, since this is a vote on the ideas, not actual text, are we allowed to carry on as if appropriate friendly amendments have been proposed and accepted?

I see no harm in altering the exact text, as long as the general framework
is followed if the vote passes. (We will need to vote on the actual text,
anyway, in that case.) The "no" votes can help explain what to do
next if the motion doesn't pass. (Would it mean we shouldn't have literals
at all, or something else?)

A few seem to pose a dilemma or at least to require discussion.

In particular:

langNumLit {number literal of host language}

For instance, if the host language is C, what would a number literal be?
A number literal in a C source (the C standard says "integer constant"
and "floating constant")? An input of strtod(), which is the numeric
version of what text2interval is for intervals? This is different as
strtod() is sensitive to locales concerning the decimal-point character
and also case sensitiveness I assume.

Now, are interval literals intended to be sensitive to locales, at least
in some contexts like in C?

It seems the locale issue will be both tricky and verbose if we address it in detail (Turkish "i" vs "ı", comma vs period in numbers, etc). So I think this standard should apply to the "default locale", and implementations *may* make locale-specific variations. Would that do? My understanding of locales is shaky.

We might define "default locale." I have been working at this long enough to
remember systems that only had the 26 upper case Roman letters, plus one or
two punctuation marks. I'm not even sure they had "[" and "]".
My own (perhaps biased) opinion of a "default" is standard
128 character ASCII, but the expansion of that acronym explicitly contains
the word "American" in it (/American Standard Code for Information Interchange/). There
are more recent possibilities that may be more inclusive, but perhaps less universally
available in systems and programming languages (Unicode?). In any case, it might not be totally
clear what the default locale should be.

Baker

Begin forwarded message:

From: Ralph Baker Kearfott <rbk5287@xxxxxxxxxxxxx>
Subject: Re: Motion 46: NO
Date: July 19, 2013 5:36:38 PM CDT
To: Ian McIntosh <ianm@xxxxxxxxxx>
Cc: <stds-1788@xxxxxxxxxxxxxxxxx>
Reply-To: <rbk@xxxxxxxxxxxxx>

On 07/19/2013 02:00 PM, Ian McIntosh wrote:

Since there are conflicts between common languages it's impossible to define one literal syntax common to them all, and using "_" for a 1788 meaning when it already has two meanings for common language literals may cause additional confusion. If we use it, we might need a statement that our meaning overrides any language meanings.

I think that's a good point (about meaning conflicts). I'm
wondering if we can word the standard in such a way
to avoid such conflicts (by specifying, say, the locale's
expression to express the three most significant
decimal digits). The downside to that, of course, would
be a less clear and weaker standard. I'll duck out of
that one ...

Baker

--

---------------------------------------------------------------
R. Baker Kearfott, rbk@xxxxxxxxxxxxx (337) 482-5346 (fax)
(337) 482-5270 (work) (337) 993-1827 (home)
URL: http://interval.louisiana.edu/kearfott.html
Department of Mathematics, University of Louisiana at Lafayette
(Room 217 Maxim D. Doucet Hall, 1403 Johnston Street)
Box 4-1010, Lafayette, LA 70504-1010, USA
---------------------------------------------------------------

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: Motion 46 -- Vincent's objections
Date: July 19, 2013 6:07:53 PM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-19 15:04:07 -0400, Michel Hack wrote:

In a world with widely different conventions it is useful to have the
concept of locale-dependency, but one must also be able to be universally
precise. I don't have much experience with the locale issues of atof(),
say, but I did trip over such issues in the AIX libc implementation, and
I find it troublesome that there is no guaranteed way to recognize a
floating-point literal. I would have preferred if the so-called C-locale
notation was ALWAYS accepted, and others might be accepted too, especially

This is what we do in MPFR:

Parsing follows the standard C `strtod' function with some
extensions. After optional leading whitespace, one has a subject
sequence consisting of an optional sign (`+' or `-'), and either
numeric data or special data. The subject sequence is defined as
the longest initial subsequence of the input string, starting with
the first non-whitespace character, that is of the expected form.

The form of numeric data is a non-empty sequence of significand
digits with an optional decimal point, and an optional exponent
consisting of an exponent prefix followed by an optional sign and
a non-empty sequence of decimal digits. A significand digit is
either a decimal digit or a Latin letter (62 possible characters),
with `A' = 10, `B' = 11, ..., `Z' = 35; case is ignored in bases
less or equal to 36, in bases larger than 36, `a' = 36, `b' = 37,
..., `z' = 61. The value of a significand digit must be strictly
less than the base. The decimal point can be either the one
defined by the current locale or the period (the first one is
accepted for consistency with the C standard and the practice, the
second one is accepted to allow the programmer to provide MPFR
numbers from strings in a way that does not depend on the current
locale). The exponent prefix can be `e' or `E' for bases up to
10, or `@' in any base; it indicates a multiplication by a power
of the base. In bases 2 and 16, the exponent prefix can also be
`p' or `P', in which case the exponent, called _binary exponent_,
indicates a multiplication by a power of 2 instead of the base
(there is a difference only for base 16); in base 16 for example
`1p2' represents 4 whereas `1@2' represents 256. The value of an
exponent is always written in base 10.
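
The difference between the two exponent prefixes can be checked with a toy Python model. It does not call MPFR; the function toy_value is hypothetical and handles only integer significands in bases up to 36, with the `@' prefix and, in bases 2 and 16, the `p' prefix.

```python
from fractions import Fraction


def toy_value(text, base):
    """Evaluate digits, optionally followed by '@exp' or 'pexp' (exp in base 10)."""
    text = text.casefold()
    if "@" in text:
        digits, _, exponent = text.partition("@")
        scale = Fraction(base) ** int(exponent)   # power of the base
    elif base in (2, 16) and "p" in text:
        digits, _, exponent = text.partition("p")
        scale = Fraction(2) ** int(exponent)      # binary exponent
    else:
        digits, scale = text, Fraction(1)
    return int(digits, base) * scale


print(toy_value("1p2", 16))   # 1 * 2**2
print(toy_value("1@2", 16))   # 1 * 16**2
```

Fractions are used so that negative exponents stay exact.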

In C, strtod is required to accept the locale-dependent version (well,
this is how the glibc developers have interpreted the standard), but
can accept more in other than the "C" locale:

In other than the "C" locale, additional locale-specific subject
sequence forms may be accepted.

since there is no conflict with digit-separators which, though possible on
output, are (as far as I recall) NOT accepted on input: 1,234.56 for
example is recognized as 1 (the rest is junk). So in a European locale,
where one would write 1234,56 there should be no ambiguity if 1234.56 is
entered, because 1.234,56 would not be acceptable as equivalent to 1234,56
and would (in the absence of additional C-locale interpretation) be taken
to be 1.

So I think we should define a standard format for text2interval() arguments,
and we could compare them to the C locale for clarity. We would obviously
not refuse a locale-dependent variant -- it is always permissible to provide
additional functions, or perhaps even the same function which ALSO allows
locale-dependent forms, together with whatever might be required to avoid
conflict with the standard syntax.

Turkish locales might have difficulty with Inf vs inf, but I can imagine
worse: what if a locale specifies right-to-left strings?

AFAIK, this notion is only for *display* purpose. "Inf" in such a locale
would still be the "Inf" string. No differences with finite numeric
literals.

I know that numeric literals are written in the same order as for
left-to-right scripts, but what about the order of bounds in the
pair that denotes an interval? Is the low bound the first or the
leftmost element?

The first, but I would say that the full interval literal would be
written left-to-right.

Otherwise the first character would be "]", not "[".

One may also wonder whether a BOM would be accepted as the first character
of a literal. Possibly as an implementation-dependent variant. Or
perhaps we should introduce a notion of canonicalization.

Getting back to comma vs point. C syntax uses comma for separating
elements of an initializer, among other things. How would that
interact with numeric literals containing a comma?

A C source is always interpreted under the C locale (or similar,
e.g. to allow non-ASCII characters). This means that the decimal
point character in a C source is always a point.

That's why locale has no impact on C literals, even though it does
affect some common library functions.

This is for a practical reason: a C source needs to have the same
interpretation whatever the locale. Well, the character encoding
matters, and a bit more (for instance, GCC removes spaces at the
end of each line). But that's all, even though this initial
transformation is unspecified by the C standard.

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: Motion 46: NO
Date: July 19, 2013 6:25:38 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-19 17:31:19 -0500, Ralph Baker Kearfott wrote:

We might define "default locale." I have been working at this long
enough to remember systems that only had the 26 upper case Roman
letters, plus one or two punctuation marks.

Like the Apple ][.

I'm not even sure they had "[" and "]".

These characters aren't even part of the invariant subset of
ISO/IEC 646:

http://en.wikipedia.org/wiki/ISO/IEC_646

That's why C had trigraphs for them.

But IMHO, one can consider that one has at least the intersection
of ASCII and EBCDIC.

My own (perhaps biased) opinion of a "default" is standard 128
character ASCII, but the expansion of that acronym explicitly
contains the word "American" in it (/American Standard Code for
Information Interchange/). There are more recent possibilities that
may be more inclusive, but perhaps less universally available in
systems and programming languages (Unicode?). In any case, it might
not be totally clear what the default locale should be.

In C, the notion of locale is distinct from the notion of character
set or encoding. C implementations may be based on various character
sets, not necessarily based on ASCII. This is unspecified, but there
are minimal requirements.

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>

Begin forwarded message:

From: John Pryce <j.d.pryce@xxxxxxxxxx>
Subject: Re: Discussion of interval literal syntax
Date: July 20, 2013 6:04:08 AM CDT
To: Vincent Lefevre <vincent@xxxxxxxxxx>
Cc: <stds-1788@xxxxxxxxxxxxxxxxx>

Folk

On 2013 Jul 20, at 00:30, Vincent Lefevre wrote:

Requiring that all interval literals start with "[" and end with
the first "]" can help...

I rather like the conciseness of uncertain form *without* the brackets, 1.234?5 not [1.234?5], but the convenience for implementers should have high priority.

Straw poll please from those with experience in parsing. Does Vincent's suggestion make it easier to parse interval literals? E.g. to create an LR(1) grammar to do the parsing. Especially in the context of reading them as data from a stream.

John Pryce

Begin forwarded message:

From: Dmitry Nadezhin <dmitry.nadezhin@xxxxxxxxxx>
Subject: Re: Discussion of interval literal syntax
Date: July 20, 2013 9:45:49 AM CDT
To: <j.d.pryce@xxxxxxxxxx>
Cc: <vincent@xxxxxxxxxx>, <stds-1788@xxxxxxxxxxxxxxxxx>

I don't see much difference in parsing difficulty between with and without brackets.

-Dima

Begin forwarded message:

From: Michel Hack <mhack@xxxxxxx>
Subject: Re: Discussion of interval literal syntax
Date: July 20, 2013 1:03:44 PM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

John Pryce asked:

On 2013 Jul 20, at 00:30, Vincent Lefevre wrote:

Requiring that all interval literals start with "[" and end with
the first "]" can help...

Straw poll please from those with experience in parsing. Does
Vincent's suggestion make it easier to parse interval literals?

It does not matter much in the context of text2interval(), where
we know that an interval is expected. It would help however in
an implicit-typing language, and perhaps even more in interpretive
or just-in-time-compilation settings.

Michel.
---Sent: 2013-07-20 18:11:45 UTC

Begin forwarded message:

From: John Pryce <j.d.pryce@xxxxxxxxxx>
Subject: Re: Discussion of interval literal syntax
Date: July 20, 2013 3:28:15 PM CDT
To: Michel Hack <mhack@xxxxxxx>
Cc: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

Michel

On 2013 Jul 20, at 19:03, Michel Hack wrote:

John Pryce asked:

On 2013 Jul 20, at 00:30, Vincent Lefevre wrote:

Requiring that all interval literals start with "[" and end with
the first "]" can help...

Straw poll please from those with experience in parsing. Does
Vincent's suggestion make it easier to parse interval literals?

It does not matter much in the context of text2interval(), where
we know that an interval is expected. It would help however in
an implicit-typing language, and perhaps even more in interpretive
or just-in-time-compilation settings.

Yes, particularly for a function that reads objects from a file or standard input, and determines on the fly what kind of object it is.

It just strikes me that an important language of that kind is Matlab (and its clones). Unfortunately there is an unavoidable clash: Matlab uses brackets to construct arrays so they are unavailable for intervals. Here is the "input" function being used with in effect dynamic-typing (is that the same as implicit-typing?):

while true, x=input('Give x: '), end

Give x: 1.23 %scalar
x = 1.2300
Give x: [1.2,3.4;5.6,7.8] %2 by 2 array
x =

1.2000 3.4000
5.6000 7.8000

Give x: {1,2,3,4} %cell vector
x =

{
[1,1] = 1
[1,2] = 2
[1,3] = 3
[1,4] = 4
}

etc.

When a 1788 implementation is built in Matlab, it would be nice to overload "input" so that it can read intervals using text2interval(), but that would need a syntax change! Any ideas?

John Pryce

Begin forwarded message:

From: Ian McIntosh <ianm@xxxxxxxxxx>
Subject: Re: Motion 46 -- Right to left intervals
Date: July 23, 2013 5:54:16 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

Spaces and exponents?

It looks like spaces indicate where "words" are, and within brackets the order of some numbers is reversed, but values with exponents aren't. Maybe values with exponents aren't recognized by fribidi as being numbers?

- Ian McIntosh IBM Canada Lab Compiler Back End Support and Development

Begin forwarded message:

From: Ian McIntosh <ianm@xxxxxxxxxx>
Subject: Re: Motion 46: NO - C++ 11 question
Date: July 23, 2013 6:37:32 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

Vincent asked: In C++ 11, can you write literals of the form [123]_xx ?

Unfortunately no, it's not that flexible. The [ would be treated as an opening subscript operator and depending on context that would either be an error or not do what you want for an interval literal. In the right context the ]_xx would be treated as a closing subscript operator followed by a symbolic name _xx.

It's possible some future C++ standard could allow this, but not C++ 11.

What would work is to put the bracketed part of the literal in quotes, like "[123]"_xx, where the suffix _xx has been defined to produce an interval from a string. Handlers for multiple suffixes can produce values of the same type, and each handler could include whatever decoration that suffix implied.

- Ian McIntosh IBM Canada Lab Compiler Back End Support and Development

Begin forwarded message:

From: "G. William (Bill) Walster" <bill@xxxxxxxxxxx>
Subject: Re: P1788 input/output
Date: July 20, 2013 10:37:34 AM CDT
To: John Pryce <prycejd1@xxxxxxxxxxxxx>
Cc: Vincent Lefevre <vincent@xxxxxxxxxx>, stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

All,

Regarding I/O and literals, given that Fortran remains the language primarily devoted to numerically intense computing and clearly the language with the most elaborate and flexible I/O syntax and semantics, might it be worthwhile to consider its latest standard in addition to C and C++'s?

Cheers,

Bill

On 7/19/13 11:12 PM, John Pryce wrote:

Vincent, P1788

This query by Vincent is ambiguous:

On 2013 Jun 28, at 13:18, Vincent Lefevre wrote:

Moreover I suppose that if T is a subset of T' and x is an interval
of type T (so that hull_T'(x) = x), it is *not* required that

public2interval(interval2public(x),T') = x

i.e. the behavior of interval2public is strongly type-dependent, not
based on a Level 1 function. But perhaps the above equality should be
recommended.

In the latest draft I renamed these as exact2interval and interval2exact. So the query is: If
(*) s = interval2exact(x) (x is a T-interval so this means T-interval2exact(x))
then is
(**) x' = T'-exact2interval(s)
equal to x ?

Well, "no" in the normal sense of Level 2 equality because equal(x,x') is simply undefined when x and x' have different types. BTW Vincent's hull_T'(x) = x above has no meaning at Level 2 for that reason.

But should be "yes" at Level 1 since (*) maps x to an exact text encoding of its Level 1 value, and (**) should recover this value.

But actually "maybe" (this is a *design decision*). I think it's what Vincent really means. Namely, T'-exact2interval(s) is guaranteed to be defined, and equal to x', when string s is produced by T'-interval2exact applied to a T'-interval x'. But it needn't even be *defined* for the s from (*), which is produced by T-interval2exact.

Here's an example to fix ideas.
Let T be inf-sup binary32.
Let T' be inf-sup binary64.
So T is a subset of T'.
Let x = [xlo,xhi] be a nonempty interval of type T or T'.
Then s = interval2exact(x) is "[slo, shi]" (meaning concatenation of "[" slo "," shi "]" ) where strings slo, shi are the result of outputting xlo and xhi in hex-significand form with enough digits to represent them exactly, which I think is 6 digits for T and 14 digits for T'. So s for x of type T might be
s = "[6.789abp+5, 7.89abcp+5]"
and for x of type T' the (mathematically same) interval might be output as
s' = "[6.789ab00000000p+5, 7.89abc00000000p+5]"
if the implementation decided always to print the full number of hex digits, and not suppress trailing zeros.
(Recall p+5 means "times (two raised to power 5)" where the 5 is decimal.)

Should T'-exact2interval() be able to read s? Should T-exact2interval() be able to read s'? My view is Yes. In fact the current draft of I/O says that for 754-conforming types, T-exact2interval() is the same as T-text2interval() (for any type T), so it can read any interval literal, so it can read s or s' equally well.

Is that sensible?
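
To make the question concrete, here is a minimal sketch of writing from binary32 and reading into binary64, and of refusing the reverse direction when it would be inexact. Everything here is an assumption for illustration: the struct names and the functions interval2exact, exact2interval64 and exact2interval32 are hypothetical stand-ins, not the draft's operations. The sketch uses C's %a conversion, which emits a "0x" prefix and its own choice of leading digit, so its strings look different from the examples above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical stand-ins for T = inf-sup binary32 and T' = inf-sup binary64.
struct Interval32 { float lo; float hi; };
struct Interval64 { double lo; double hi; };

// %a prints a finite binary floating-point value exactly in hex-significand form.
inline std::string hex_of(double v) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%a", v);
    return std::string(buf);
}

// Sketch of interval2exact; a float converts to double without error.
inline std::string interval2exact(double lo, double hi) {
    return "[" + hex_of(lo) + ", " + hex_of(hi) + "]";
}

// Sketch of T'-exact2interval. Assumes each bound in the text is exactly
// representable in binary64 (true for anything interval2exact wrote).
inline bool exact2interval64(const std::string& s, Interval64& out) {
    if (s.size() < 5 || s.front() != '[' || s.back() != ']') return false;
    const std::size_t comma = s.find(',');
    if (comma == std::string::npos) return false;
    const std::string a = s.substr(1, comma - 1);
    const std::string b = s.substr(comma + 1, s.size() - comma - 2);
    char* end = nullptr;
    out.lo = std::strtod(a.c_str(), &end);
    if (end == a.c_str()) return false;
    out.hi = std::strtod(b.c_str(), &end);
    if (end == b.c_str()) return false;
    return true;
}

// Sketch of T-exact2interval: refuse text whose bounds are not binary32 values.
inline bool exact2interval32(const std::string& s, Interval32& out) {
    Interval64 wide;
    if (!exact2interval64(s, wide)) return false;
    out.lo = static_cast<float>(wide.lo);
    out.hi = static_cast<float>(wide.hi);
    return static_cast<double>(out.lo) == wide.lo &&
           static_cast<double>(out.hi) == wide.hi;
}

inline void round_trip_demo() {
    const Interval32 x = {1.5f, 2.25f};
    const std::string s = interval2exact(x.lo, x.hi);   // written from T
    Interval64 y;
    const bool widened = exact2interval64(s, y);        // read into T'
    assert(widened && y.lo == 1.5 && y.hi == 2.25);
    const std::string t = interval2exact(0.1, 0.2);     // written from T'
    Interval32 z;
    const bool narrowed = exact2interval32(t, z);       // 0.1 is not a binary32 value
    assert(!narrowed);
}
```

Returning false in the narrowing case is one possible design; rounding outward instead would keep containment but would no longer be an exact read.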

A follow-up query is whether a (new? change to existing?) Level 2 equality-comparison operation is needed to test equal(x,x') when x,x' are of different types, one being a subset of the other. Relevant especially for 754-conforming types, as above.

John Pryce

Begin forwarded message:

From: John Pryce <j.d.pryce@xxxxxxxxxx>
Subject: Re: P1788 input/output
Date: July 20, 2013 2:58:25 PM CDT
To: "G. William (Bill) Walster" <bill@xxxxxxxxxxx>
Cc: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

Bill

On 2013 Jul 20, at 16:37, G. William (Bill) Walster wrote:

Regarding I/O and literals, given that Fortran remains the language primarily devoted to numerically intense computing and clearly the language with the most elaborate and flexible I/O syntax and semantics, might it be worthwhile to consider its latest standard in addition to C and C++'s?

Definitely a good idea. Volunteers to do it?

However, the I/O and literals design is fairly language-independent, except for my recent half-baked idea of how the cs conversion specifier might be designed, which is in the C fprintf style. Would you show us some ways in which Fortran-style I/O has advantages for intervals? Implied DOs in I/O are one brilliant feature that is fairly unique to Fortran, but I'm not sure that's relevant to intervals as such.

John Pryce

Begin forwarded message:

From: "G. William (Bill) Walster" <bill@xxxxxxxxxxx>
Subject: Re: P1788 input/output
Date: July 20, 2013 3:27:43 PM CDT
To: John Pryce <j.d.pryce@xxxxxxxxxx>
Cc: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

John,

We followed this principle when implementing interval data types in Sun's Fortran: There will be no integer or real syntax- or systematic-feature in standard Fortran 95 that does not have an interval counterpart. That is, intervals will not be a second class data type. This required two items to be developed and implemented: interval formats; and "widest need expression evaluation" and its application to I/O. The second is required to guarantee containment in mixed mode expressions involving intervals as well as integers and reals.

I hope that nothing in P1788 will preclude first class interval data types from being implemented in Fortran and other languages.

Thus, while I am not a "language lawyer", when it comes to definitions of terms and constructs having to do with literals used to represent intervals, I think it might be wise to consult and cite the Fortran standard and not just the C and C++ standards.

Cheers,

Bill

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: P1788 input/output
Date: July 23, 2013 5:41:05 AM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-22 10:59:31 -0400, Michel Hack wrote:

This gets us back to "recoverable representations" which I had brought
up a number of times in the past. The problem here is that one has to
record or remember (a) the original type, and (b) the fact that this
was indeed the result of intervalToRecoverable() -- because recovery
will have to apply compatible rounding to recover the original interval.
One of the conversion directions is likely to (technically) violate
the containment rule, so one always has to be mindful of context. Both
intervalToRecoverable() and recoverableToInterval() could use to-nearest,
or one could use outward rounding and the other inward rounding.

The primary goal of recovery is to use the same type. Then there's the
question of whether you want to allow exact2interval to be applied on
a different type. If yes, then I agree that one needs to require an
exact representation. For instance, if exact2interval is applied on
a different type, it shall either return an exact result or raise a
language- or implementation-defined exception.

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Begin forwarded message:

From: Michel Hack <mhack@xxxxxxx>
Subject: Re: P1788 input/output
Date: July 23, 2013 5:49:24 AM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

Vincent Lefèvre replied to my note on intervalToRecoverable():

The primary goal of recovery is to use the same type. Then there's the
question of whether you want to allow exact2interval to be applied on a
different type. If yes, then I agree that one needs to require an exact
representation. For instance, if exact2interval is applied on a different
type, it shall either return an exact result or raise a language- or
implementation-defined exception.

This is possible only if the recoverable representation includes type
information. And the issue of deliberate containment violation in one
or the other direction remains: warnings about mis-use would be needed.
One way would be for the type encoding to be such that the result is
syntactically invalid for textToInterval(), so that it can be read back
successfully only through recoverableToInterval() to a matching format,
as Vincent suggested above (but I would not call it "exactToInterval()").

The nice thing about intervalToExact() is that its result can be read
back by regular textToInterval(), recovering the original type exactly,
but preserving containment in all cases, in and out. There is no need
for extra information.

I have no problem permitting small integers to be expressed in decimal,
but large integers benefit from exponential notation in the proper radix.

Michel.
---Sent: 2013-07-23 11:06:21 UTC

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: P1788 input/output
Date: July 23, 2013 6:41:35 AM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-23 06:49:24 -0400, Michel Hack wrote:

Vincent Lefèvre replied to my note on intervalToRecoverable():

The primary goal of recovery is to use the same type. Then there's the
question of whether you want to allow exact2interval to be applied on a
different type. If yes, then I agree that one needs to require an exact
representation. For instance, if exact2interval is applied on a different
type, it shall either return an exact result or raise a language- or
implementation-defined exception.

This is possible only if the recoverable representation includes type
information.

No, in either case, you don't need to record type information.

* If just recoverable is chosen, the user is assumed to know the type
(just like when he uses a function, he needs to know the prototype).

* If exact representation is chosen, type information is useless
as the representation is exact, so that its meaning is known for
every type (but some types may not be able to use it exactly).
I agree that it is less error-prone.

And the issue of deliberate containment violation in one
or the other direction remains: warnings about mis-use would be needed.

Yes.

One way would be for the type encoding to be such that the result is
syntactically invalid for textToInterval(), so that it can be read back
successfully only through recoverableToInterval() to a matching format,
as Vincent suggested above (but I would not call it "exactToInterval()").

Yes.

But note that there should also be warnings for the "[x]" form.
It can also easily be mis-used.

The nice thing about intervalToExact() is that its result can be read
back by regular textToInterval(), recovering the original type exactly,
but preserving containment in all cases, in and out. There is no need
for extra information.

I agree.

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Begin forwarded message:

From: Michel Hack <mhack@xxxxxxx>
Subject: Re: P1788 input/output
Date: July 23, 2013 10:24:51 AM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

Vincent Lefèvre replying to my response concerning exactToInterval() etc:

This is possible only if the recoverable representation includes type
information.

No, in either case, you don't need to record type information.

The "possible only" applied to your suggestion of reporting an error
when recoverableToInterval() was used with the wrong target type.

(just like when he uses a function, he needs to know the prototype).

In good programming environments the compiler or linker would complain
about a mismatch. Also remember that recoverable representations are
often used externally; this matches the case of separately-compiled
functions, where the responsibility may fall on the linker -- assuming
the compiler and linker tag external names with a type hash or such.

Michel.
---Sent: 2013-07-23 15:29:27 UTC

Begin forwarded message:

From: John Pryce <j.d.pryce@xxxxxxxxxx>
Subject: Re: P1788 input/output: exact text representation
Date: July 23, 2013 3:39:28 PM CDT
To: Michel Hack <mhack@xxxxxxx>, Vincent Lefevre <vincent@xxxxxxxxxx>
Cc: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

Vincent, Michel, P1788

On 2013 Jul 23, at 16:24, Michel Hack wrote:

Vincent Lefèvre replying to my response concerning exactToInterval() etc:

This is possible only if the recoverable representation includes type
information.

No, in either case, you don't need to record type information.

The "possible only" applied to your suggestion of reporting an error
when recoverableToInterval() was used with the wrong target type.

(just like when he uses a function, he needs to know the prototype).

In good programming environments the compiler or linker would complain
about a mismatch. Also remember that recoverable representations are
often used externally; this matches the case of separately-compiled
functions, where the responsibility may fall on the linker -- assuming
the compiler and linker tag external names with a type hash or such.

Michel.

IMO we should Keep It Simple with exact text representation of intervals.
- I don't want to include type information. Let's stick to the case where
the user knows what type to read back to.
- I don't specially wish to specify such a representation for decorated intervals
though it wouldn't be hard.
Let such things wait for a revision of the standard in due course.

However, I think Michel's observation about writing from one type and reading to another is both perceptive and valuable.

Namely suppose type T' is wider than type T, meaning that regarded as sets of mathematical intervals, T is a subset of T'.

Then if
xx is a T-interval
and we write xx out by
s = T-interval2exact(xx).
Michel says string s *should* be readable by T'-exact2interval and recover xx exactly:
yy = T'-exact2interval(s)
should be defined, and should have the same interval value as xx (though of a different type).

E.g. to fix ideas, let T be inf-sup binary32 and T' be inf-sup binary64. Then interval yy should be just xx converted from single to double precision, but

In fact, with the spec as given so far, this happens automatically if T and T' are 754-conforming types. I'm working on writing a suitable "should" statement for general types.

John Pryce

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: P1788 input/output
Date: July 23, 2013 7:20:06 PM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-23 11:24:51 -0400, Michel Hack wrote:

Vincent Lefèvre replying to my response concerning exactToInterval() etc:

This is possible only if the recoverable representation includes type
information.

No, in either case, you don't need to record type information.

The "possible only" applied to your suggestion of reporting an error
when recoverableToInterval() was used with the wrong target type.

I didn't suggest to report an error in such a case. Only in the case
where exact representation would be used.

(just like when he uses a function, he needs to know the prototype).

In good programming environments the compiler or linker would complain
about a mismatch.

This is not always possible (see variadic functions in C).
And some languages like C are not strongly typed: if you use

typedef int my_own_integer;

the compiler or linker can't see the difference between int and
my_own_integer, because my_own_integer is just an alias for int.
You can't create a real new "int".
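
The alias behaviour can be checked directly. The sketch below is in C++ rather than C so that the compiler can verify the claim with static_assert; StrongInteger and is_plain_int are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef int my_own_integer;            // alias: the very same type as int

// Hypothetical wrapper, the usual way to obtain a genuinely distinct type.
struct StrongInteger {
    int value;
};

static_assert(std::is_same<int, my_own_integer>::value,
              "a typedef introduces an alias, not a new type");
static_assert(!std::is_same<int, StrongInteger>::value,
              "a struct wrapper is a distinct type");

// Overload resolution can separate StrongInteger from int,
// but it can never separate my_own_integer from int.
inline bool is_plain_int(int) { return true; }
inline bool is_plain_int(StrongInteger) { return false; }

inline void typedef_demo() {
    const my_own_integer a = 3;
    const StrongInteger b = {3};
    assert(is_plain_int(a));
    assert(!is_plain_int(b));
}
```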

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Begin forwarded message:

From: Christian Keil <c.keil@xxxxxxxxxxxxx>
Subject: Re: P1788 input/output: exact text representation
Date: July 24, 2013 12:31:28 PM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

Am Tue, 23 Jul 2013 21:39:28 +0100
schrieb John Pryce <j.d.pryce@xxxxxxxxxx>:

However, I think Michel's observation about writing from one type and
reading to another is both perceptive and valuable.

Namely suppose type T' is wider than type T, meaning that regarded as
sets of mathematical intervals, T is a subset of T'.

Then if
xx is a T-interval
and we write xx out by
s = T-interval2exact(xx).
Michel says string s *should* be readable by T'-exact2interval and
recover xx exactly: yy = T'-exact2interval(s)
should be defined, and should have the same interval value as xx
(though of a different type).

E.g. to fix ideas, let T be inf-sup binary32 and T' be inf-sup
binary64. Then interval yy should be just xx converted from single to
double precision, but

In fact, with the spec as given so far, this happens automatically if
T and T' are 754-conforming types. I'm working on writing a suitable
"should" statement for general types.

This seems to fall out of the current spec automatically, yes. But if
it doesn't in a more general way, then I would prefer to keep it simple
here.

If the importing system supports the originally used type, there is
always the possibility to read to the correct format and convert to a
wider one afterwards. The situation where the original format is not
supported seems quite special to me and probably doesn't deserve extra
handling if it complicates the standard, does it?

Christian

Begin forwarded message:

From: Vincent Lefevre <vincent@xxxxxxxxxx>
Subject: Re: P1788 input/output: exact text representation
Date: July 25, 2013 8:59:04 AM CDT
To: stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>

On 2013-07-24 19:31:28 +0200, Christian Keil wrote:

Am Tue, 23 Jul 2013 21:39:28 +0100
schrieb John Pryce <j.d.pryce@xxxxxxxxxx>:

However, I think Michel's observation about writing from one type and
reading to another is both perceptive and valuable.

Namely suppose type T' is wider than type T, meaning that regarded as
sets of mathematical intervals, T is a subset of T'.

Then if
xx is a T-interval
and we write xx out by
s = T-interval2exact(xx).
Michel says string s *should* be readable by T'-exact2interval and
recover xx exactly: yy = T'-exact2interval(s)
should be defined, and should have the same interval value as xx
(though of a different type).

E.g. to fix ideas, let T be inf-sup binary32 and T' be inf-sup
binary64. Then interval yy should be just xx converted from single to
double precision, but

In fact, with the spec as given so far, this happens automatically if
T and T' are 754-conforming types. I'm working on writing a suitable
"should" statement for general types.

This seems to fall out of the current spec automatically, yes. But if
it doesn't in a more general way, then I would prefer to keep it simple
here.

If the importing system supports the originally used type, there is
always the possibility to read to the correct format and convert to a
wider one afterwards. The situation where the original format is not
supported seems quite special to me and probably doesn't deserve extra
handling if it complicates the standard, does it?

Yes, I think that guaranteeing that a string output by some
T-interval2exact is readable by any T'-exact2interval is impossible
in general. So, the user should avoid such combinations. The standard
could make some recommendations in specific cases, in particular
because the result would be more intuitive (probably more for OO
languages).

--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Begin forwarded message:

From: Dmitry Nadezhin <dmitry.nadezhin@xxxxxxxxxx>
Subject: Re: Motion P1788/MOO46.02:IntervalLiteratals -- NO
Date: August 2, 2013 8:34:58 PM CDT
To: <stds-1788@xxxxxxxxxxxxxxxxx>

I vote NO for motion 46.02.

I would vote YES if the main text of the standard contained a portable concrete syntax of interval literals
together with a phrase that implementations may extend this syntax.

I think that the standard text should be self-contained and shouldn't delegate the definition
of the portable syntax to another document (the basic standard).
According to motion 46.02, the full standard says about implementations that
- textToInterval(s) parses interval literals in host-specific syntax;
- intervalToText(X, cs) may produce interval literals in host-specific syntax.
It ensures only that an implementation can read back interval literals produced by itself.

The basic standard could restrict the full standard by saying that:
- textToInterval(s) can parse only interval literals in the portable syntax;
- intervalToText(X, cs) always emits interval literals in the portable syntax.
This ensures that one implementation of the basic standard can read back interval literals
produced by another implementation of the basic standard. It doesn't ensure interoperability
between an implementation of the basic standard and an implementation of the full standard.

If the full standard said:
- textToInterval(s) of every implementation can parse interval literals in the portable syntax
(and possibly in extended language-specific syntax too);
- every implementation has a conversion specifier "cs" (call it the portable conversion specifier)
such that intervalToText(X, cs) is in the portable syntax
(and other specifiers may emit literals in extended language-specific syntax),
then this would ensure that every implementation of the full standard can parse intervals produced
by any other implementation of the full standard with the portable "cs".

-Dima

Begin forwarded message:

From: John Pryce <j.d.pryce@xxxxxxxxxx>
Subject: Re: Motion P1788/MOO46.02:IntervalLiteratals -- NO
Date: August 3, 2013 2:20:20 PM CDT
To: Dmitry Nadezhin <dmitry.nadezhin@xxxxxxxxxx>
Cc: <stds-1788@xxxxxxxxxxxxxxxxx>

Dmitry

On 2013 Aug 3, at 03:34, Dmitry Nadezhin wrote:

I vote NO for motion 46.02 .

I would vote YES if the main text of the standard contained portable concrete syntax

I am broadly in agreement with what you write but need to think over how the "portable" concrete syntax should be handled -- whether via the distinction between basic and full standard, or within the full standard. (I think you would like the latter.) Your view is similar to that of Michel:

On 2013 Jul 19, at 21:04, Michel Hack wrote:

What we CAN and SHOULD standardize are interchange formats, i.e. the
behaviour of the text2interval() and interval2text() operations (and
not their actual names, or the syntax of their invocation).

For this purpose, interval2text() should be precise, and text2interval()
permissive, and of course the latter must accept what the former produces.

Yes, I think interval2text() *should* generate literals in the to-be-defined portable syntax, and am sorry I didn't put this in the motion.

John Pryce
