Re: Bare decorations (was ...level 2 datums)
John Pryce wrote:
On 17 Oct 2010, at 17:48, Nate Hayes wrote:
...the bulk of your Level 2 examples and arguments are expressed in terms
of bits, bytes and language bindings. I've yet to hear you clearly
formulate your position purely in Level 2 terms.
It is possible to discuss Level 2 in the abstract, but there is NO way to
define a specific Level 2 floating-point or interval datatype without
describing it by some sort of formula, which usually amounts to a
representation.
I'm sorry, John, but this is not true.
It is done all the time. A good reference is Bertrand Meyer's book on
object-oriented design where he describes an abstract datatype (ADT) purely
in mathematical terms by:
-- its types (i.e., its mathematical sets)
-- its functions
-- its axioms
-- its preconditions (if any)
He then maps these ADTs to representations, i.e., classes.
This makes the same distinctions between Level 2 (ADT) and Level 3
(representation) as we understand it here in P1788.
Even for IEEE 754, Dan already admits NaNs are Level 2. The NaNs are also
distinct (disjoint sets) from the finite numbers and the +-Infinities.
Hence, the ADT of "floating-point data" is defined by:
-- its types: the set R* (finite numbers, infinities), and the set NaN
-- its functions: +, -, *, /, sqrt, etc.
-- its axioms: "any arithmetic operation involving a NaN returns a NaN",
etc.
Some Level 2 examples are:
sqrt(4) = 2 // the result is a bare number, i.e., element of R*
sqrt(-1) = NaN // the result is a bare decoration, i.e., element of NaN
No representation or reference to Level 3 is needed to make this
distinction.
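A minimal sketch of this Level 2 view, written in Python only for illustration. The names ExtendedReal, NaN, FPData and fp_sqrt are hypothetical and come from neither IEEE 754 nor any P1788 draft; the only point is that the two sets can be kept disjoint without saying anything about bits or bytes:

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ExtendedReal:
    """Element of R*: a finite number or +-infinity (hypothetical name)."""
    value: float

    def __post_init__(self):
        # Keep the two sets disjoint: a NaN is never an element of R*.
        if math.isnan(self.value):
            raise ValueError("NaN is not an element of R*")


@dataclass(frozen=True)
class NaN:
    """Element of the set NaN, disjoint from R* (hypothetical name)."""


# Level 2 "floating-point data" is the union of the two disjoint sets.
FPData = Union[ExtendedReal, NaN]


def fp_sqrt(x: FPData) -> FPData:
    # Axiom: any arithmetic operation involving a NaN returns a NaN.
    if isinstance(x, NaN):
        return NaN()
    # Outside the domain of the real square root: the result is a NaN.
    if x.value < 0:
        return NaN()
    return ExtendedReal(math.sqrt(x.value))
```

Here fp_sqrt(ExtendedReal(4.0)) yields an element of R*, while fp_sqrt(ExtendedReal(-1.0)) yields an element of NaN. The sketch ignores rounding and the finiteness of any particular format.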
If you are not familiar with this concept, I think it would do you well to
take a look at Meyer's book.
So I'm totally unconvinced that what you say above is true. In fact, I know
it is not true.
The more relevant question in my mind is whether or not P1788 is going to
continue using the Levels model or abandon it. As best I understand, the
latter appears to be what you are arguing for (abandoning the Levels model).
Specific examples are helpful, to suggest what consequences our decisions
at Level 2 might have at Level 3. Apart from those, my discussion is
always at level 2.
Language bindings are an excellent way to discuss level 2 in an accessible
way. As I said, a language binding must mirror level 2 semantics, so
discussing how an implementation in Matlab, C++, etc. should behave at the
user program level, tells us a lot about how level 2 should and should not
behave.
The behavior is specified at Level 2.
The implementation of the behavior is specified at Level 3 and beyond
(including language bindings).
Nate Hayes wrote:
(JDP) Consider 754 floating point. Let Real denote a particular level 2
format.
My understanding is that a "format" is a Level 3 concept (see below),
hence from my perspective you're already off-topic.
In 5.1 and 5.2.1 of the draft standard text, the level 2 meaning of
"format" is explicitly defined. OK, motion 19 changes the language to
"datatype" at level 2 and "format" at level 3. I wrote motion 19, so I
know the distinction; so please interpret my usage in the context common
sense suggests.
The name "Real" I used may have been poorly chosen. I'll change it to
"FPData" here & below, so my sentence is now
"Consider 754 floating point. Let FPData denote a particular level 2
format..."
My point was that in IEEE 754 "floating-point data" is Level 2, which is
then mapped into a Level 3 representation (see quote from 754-2008 in
previous e-mail). These Level 3 representations are then further mapped into
computer languages via language bindings. So there are many levels involved:
Level 2: abstract data types (ADT)
Level 3: representations of ADTs
Level 4: encodings of representations into bits and bytes
Level 5: language bindings
I don't really care what names we use, or even whether we use the same names
as IEEE 754. I'm more concerned that, whatever names we choose, we use them
consistently and make clear distinctions between Level 2 abstract concepts
and Level 3 representations.
... i.e. the set of datums of that format. But I am happy for it to denote
some language binding of such a set, also.
Well, this is where we need to be _very_ careful about what we are talking
about.
As I mention above, even IEEE 754 has operations that return Level 2 objects
belonging to disjoint Level 2 sets (e.g., sqrt(4) returns a finite number
but sqrt(-1) returns NaN; and NaNs do not belong to the Level 2 set of
finite numbers).
However at Level 3, a binary64 is capable of representing elements from both
of these Level 2 sets: finite numbers and NaNs.
Hence the language binding in C++, for example, of a "double" to binary64
is not restricted to Level 2 objects from a single Level 2 set: a "double"
may represent either a finite number or a NaN, and these belong to disjoint
Level 2 sets.
At the same time, a "double" _does_ represent an element of the set of
"floating-point data".
But this is because "floating-point data" is the union of all the disjoint
Level 2 sets:
-- finite numbers and infinities, i.e., R*
-- NaNs
and by coincidence it just happens that all of these Level 2 objects can be
uniformly encoded into 8 bytes of information.
So when talking about language bindings, it's important not to "fuse"
together the disjoint sets of Level 2 objects. All of them may be
represented in the language by a binding such as "double", but the binding
represents the union of those sets only because it is conveniently possible
to encode every element in some desired fixed-size binary image.
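To make the distinction concrete, here is a small sketch in Python, whose standard struct module packs a float into the 8-byte binary64 image. The helper names encode_binary64 and level2_set are hypothetical and used only for illustration:

```python
import math
import struct


def encode_binary64(x: float) -> bytes:
    # Level 3/4 view: every datum is packed into the same 8-byte image.
    return struct.pack(">d", x)


def level2_set(x: float) -> str:
    # Level 2 view: report which of the disjoint sets the datum belongs to.
    return "NaN" if math.isnan(x) else "R*"


for datum in (2.0, float("inf"), float("nan")):
    print(level2_set(datum), len(encode_binary64(datum)))
```

Each datum occupies 8 bytes regardless of which Level 2 set it comes from. The shared encoding is a property of the representation, not of the sets.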
These are the distinctions I want to be sure do not get lost, either in this
discussion or in a future IEEE 1788 standard.
As long as the distinctions are clearly and consistently made, I do not care
what we call these various Levels.
According to 754 Table 3.1, FPData IS a set comprising finitely many
members of the extended reals R*, together with one thing called NaN,
distinct from any extended-real number...
Just as in
754, which put NaN, as well as numbers, into the level 2 set FPData.
No. YOU put NaN into FPData, and "FPData" in this case is actually a
language binding of a format, neither of which are a Level 2 set.
With respect, Nate, one might think you are obfuscating by objecting to
this.
Oh, really?
Nice try, John...
I would like to draw attention to the fact you substituted "FPData" for
"Real" in the above quotes (i.e., you changed the original e-mail texts).
This substitution dramatically changes the entire meaning of your original
argument, as well as my responses, e.g., the verbatim meaning of the above
conversation "quote" is that your "comment":
Just as in
754, which put NaN, as well as numbers, into the level 2 set FPData.
is correct, but my "response":
No. YOU put NaN into FPData, and "FPData" in this case is actually a
language binding of a format, neither of which are a Level 2 set.
is now wrong (or at least ambiguous).
So who is really being obfuscational here?
At least this is evidence you are finally changing your position... even
though it appears in doing so you now wish to take credit for my original
view. ;-)
754-2008, section 3.2 Specification Levels, second paragraph, last
sentence says,
"A floating-point datum, which can be a signed
zero, a finite non-zero number, signed infinity,
or a NaN (Not-a-Number), ..."
I know, John. I was the one who quoted this paragraph to you, before you
changed your position (e.g., the "modified" e-mail texts above).
In any case, at least in changing your position you now agree with my
original argument:
-- finite numbers and +-Infinity are elements of R*
-- NaNs are not elements of R*, i.e., NaN is a disjoint set from R*
-- Level 2 "floating-point data" is the union of these two disjoint
sets: R* and NaN
This is consistent with my example above:
sqrt(4) = 2 // returns an element of R*
sqrt(-1) = NaN // returns an element of NaN
Hence at Level 2, IEEE 754 has operations which return Level 2 objects
belonging to disjoint Level 2 sets; there is also "floating-point data",
which is the union of all the Level 2 sets.
None of this requires Level 3 representations or language bindings to
specify or understand.
Since you now agree to all of the above, I don't comprehend why we need to
change the Level 2 model presented in Motion 8.02, since it is analogous:
-- bare intervals are elements of the set overline-IR
-- bare decorations (NaI) are not elements of overline-IR, and hence
are a set disjoint from overline-IR
-- Level 2 "interval data" is the union of these two disjoint sets: bare
intervals (overline-IR) and bare decorations (NaI)
Of course, there are also decorated intervals. But this concept is unique to
P1788 (Level 2 "interval data" also includes the set of decorated
intervals).
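The analogy can be sketched the same way in Python. All class names below, and the use of a plain string as a decoration tag, are assumptions made for illustration only; they are not taken from Motion 8.02 or any P1788 text:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class BareInterval:
    """Element of overline-IR (hypothetical name)."""
    lo: float
    hi: float


@dataclass(frozen=True)
class BareDecoration:
    """A bare decoration such as NaI; the tag is a placeholder string."""
    tag: str


@dataclass(frozen=True)
class DecoratedInterval:
    """An interval paired with a decoration tag (placeholder string)."""
    interval: BareInterval
    tag: str


# Level 2 "interval data": the union of three disjoint sets.
IData = Union[BareInterval, BareDecoration, DecoratedInterval]


def level2_set(x: IData) -> str:
    # Report which of the disjoint Level 2 sets an interval datum belongs to.
    if isinstance(x, BareInterval):
        return "bare interval"
    if isinstance(x, BareDecoration):
        return "bare decoration"
    if isinstance(x, DecoratedInterval):
        return "decorated interval"
    raise TypeError("not an element of IData")
```

As with FPData, nothing here refers to a representation: the three sets are distinguished purely by what kind of mathematical object each element is.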
IMO language binding is a good way to discuss level 2, as said above. A
valid program is executable mathematics: else why write it? Writing a type
like FPData in a program specifies a particular piece of mathematics,
namely a finite set.
In IEEE 754 FPData is the union of set R* and set NaN. Language bindings
such as "double" are capable of representing elements of either of these
sets. Hence "double" can represent _all_ elements of FPData.
However, the fact "double" has these properties is incidental to the Level 2
concepts.
Note that IEEE 754 does not have the concept of decorated numbers. This is
why language binding "double" can easily represent any element of FPData,
i.e., one can encode either an element of R* or NaN into 8 bytes.
Since P1788 has the concept of decorated intervals, it is not the case that
a single language binding "interval" will typically represent any element of
IData ("interval data"). More likely, it will represent only elements of
IData that belong to bare intervals (overline-IR) or bare decorations (NaI).
A second language binding "dinterval" will likely be needed to represent
elements of IData that are the elements of decorated intervals.
Note this means bare decorations (NaI) _are_ elements of IData, i.e.,
"interval data". However, as I've always said from the beginning: this does
_not_ mean bare decorations (NaI) are elements of overline-IR, i.e., the set
of "bare intervals".
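A rough sketch of how the two bindings might partition IData, again in Python with hypothetical names throughout. The binding labels "interval" and "dinterval" are the ones used above, but the mapping is only the likely outcome described here, not something any draft specifies:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BareInterval:
    lo: float
    hi: float


@dataclass(frozen=True)
class BareDecoration:
    tag: str  # placeholder, e.g. "NaI"


@dataclass(frozen=True)
class DecoratedInterval:
    interval: BareInterval
    tag: str  # placeholder decoration tag


def in_idata(x) -> bool:
    # Membership in Level 2 "interval data".
    return isinstance(x, (BareInterval, BareDecoration, DecoratedInterval))


def in_overline_ir(x) -> bool:
    # Membership in the set of bare intervals only.
    return isinstance(x, BareInterval)


def binding_for(x) -> str:
    # Assumed mapping: "interval" carries bare intervals and bare
    # decorations; "dinterval" carries decorated intervals.
    if isinstance(x, (BareInterval, BareDecoration)):
        return "interval"
    if isinstance(x, DecoratedInterval):
        return "dinterval"
    raise TypeError("not an element of IData")
```

With nai = BareDecoration("NaI"), in_idata(nai) holds while in_overline_ir(nai) does not, even though binding_for(nai) is the same "interval" binding that carries bare intervals.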
It appears to me this is the crucial point you miss in your recent position
paper.
It's also why trying to inject Level 3 concepts and language bindings into
the discussion is confusing and unproductive, IMO, at least until as a group
P1788 can settle on the Level 2 terminology.
Nate Hayes wrote:
You are the one advocating NaI should actually "be an interval" at Level
2, which is quite a strange concept, if you ask me.
No. I am saying NaI is [=we should choose it to be] one of the Level 2
*interval datums*, in exactly the same way as NaN is [=754 chose it to be]
one of the Level 2 *floating-point datums*.
This was already accomplished a long time ago in Motion 8.02; as I've been
pointing out over and over in this and previous e-mails. Hence, what you
mention is already the status-quo.
I believe the point you keep missing is that NaI is a bare decoration.
They are the same.
You were one of the people who voted NO to Motion 7 because you wanted NaI
to carry a payload of information, and this is exactly what a bare
decoration is.
So what is the purpose of your position paper?
Nate Hayes