IEEE 754R minutes from February 20, 2003

Attendance and minutes review

The IEEE 754R committee met at Intel building SC 12, Santa Clara campus, on February 20, 2003. In attendance were

David Hough posted a summary of the events of the January meeting, but Dick Delp's January minutes are still pending.

Draft review preview

The discussions from the January meeting led to major changes in the draft. The biggest item was adding decimal, but we also got rid of signaling NaNs. There was also no draft review in January, and consequently previously unsettled items remained unsettled.

Hough noted four potentially contentious points which he felt we should preview before the draft review:

  1. What are the names of the basic formats, and which formats should be required?
  2. When and how should extended formats and expression evaluation disciplines be discussed?
  3. When redundant representations are allowed, should there be a canonical representation?
  4. What should we do with signaling NaNs?

Names of formats

We first looked at names of the basic formats in binary and decimal. The issue of names arose, in part, because single precision has different parameters in decimal and binary, and we thought it might be confusing to use that terminology.

Three naming conventions were proposed:

There was some argument about whether we were encouraging a particular language binding by picking format names. IEEE 754-1985 did not require languages to adopt its format names, and indeed neither C nor Fortran uses single as the keyword for the single precision type.

Kahan noted that there was already a standard for describing data formats -- IEEE 1596.5. He suggested we follow the lead of that document. He also thought Hough's 32-bit unit naming scheme had merit because it suggested the right granularity: we would not be able to describe a 37-bit format, and so might better be able to avert our eyes if someone designed such a format.

Though the focus of our argument was which names we should use for basic formats, we also wandered into names for extended formats and integers. Hough observed that for portable C code, integer formats are already frequently named by type and length in bits (e.g. int32), since the sizes of the C basic types vary from platform to platform. We also observed that measuring length in 32-bit units meant we could not readily describe the Intel 80-bit extended format; for the Itanium 82-bit internal extended format, we could not use 32-bit or 8-bit units. Some thought an inability to describe those formats would be a benefit; some thought it would be a deficiency; some thought it irrelevant; and at least one person noted that a byte has not always meant eight bits.
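
As an aside on the integer analogy: C99's <stdint.h> already standardizes exactly this type-plus-width-in-bits naming. A minimal illustration (nothing here is specific to 754):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* C99 names integer formats by signedness and width in bits,
           independent of how wide the platform's int or long happen to be. */
        int32_t  i = -42;            /* exactly 32 bits */
        uint64_t u = 1ULL << 40;     /* exactly 64 bits, unsigned */
        printf("%d %llu\n", i, (unsigned long long)u);
        return 0;
    }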

We eventually voted on names consisting of the radix together with the format length in bits. The vote was fifteen for and two against.

Required formats

After discussing what to call basic formats, we turned to the question of which of those formats would be required. Each proposal intended that an implementation comply for whichever radix it provided, so an implementation could provide only decimal or only binary. We considered three proposals:

Hough suggested that we require only one format and let the market decide which formats are important. Cowlishaw noted that in decimal, the compact (32-bit) format may be useful for storage, but he did not expect it to be used much for arithmetic. Schwarz agreed, and said he could see the possibility of dropping support for the decimal compact format at some point.

Kahan thought we would send the wrong message by requiring double but not single precision. The Matlab environment supports double but not single, a choice which Kahan thought was regrettable. Single precision is useful for bulk data storage, and there are computations which are aided by use of mixed single and double precisions. Furthermore, it would be vexing to portable numerical software authors to be unable to rely on the presence of at least a single precision format across platforms.

Several committee members noted that it is possible to implement a narrow format with a sufficiently wider format and a conversion instruction. Kahan noted that the original RS/6000 was optimized for double precision arithmetic, but could handle single precision data. He was unsure why later versions of the architecture optimized single precision arithmetic; Hauser thought it was to improve SPECmarks. Hauser also noted that an implementation could, in principle, provide hardware support for only double precision, and implement in software the conversion operations needed to manage single precision. The standard mandates only capabilities, not speeds, though as Kahan often notes, anything too slow won't be run.
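
To make the narrow-via-wide point concrete, here is a minimal C sketch (the helper name is ours, not anything from the draft). Because double carries 53 significand bits, more than the 2*24 + 2 a 24-bit format needs, the second rounding below is provably innocuous for add, subtract, multiply, and divide:

    #include <stdio.h>

    /* Hypothetical helper: single-precision addition implemented with
       double-precision arithmetic plus format conversions.  The
       float-to-double conversions are exact; the double add rounds at
       most once; converting back to float rounds again, but the
       combined result equals a correctly rounded single-precision add. */
    static float add32_via_64(float a, float b) {
        double wide = (double)a + (double)b;
        return (float)wide;
    }

    int main(void) {
        /* Same result as a native single-precision add: 0.300000012 */
        printf("%.9g\n", add32_via_64(0.1f, 0.2f));
        return 0;
    }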

Thomas noted that the committee's general policy has been to make features required rather than optional. Not requiring all formats seems a break with that policy. Zuras thought we would send the wrong message by not requiring all formats for at least one radix, though some of them might be in software. Hauser countered that DSP designers find binary single alone to be a useful subset.

From preliminary polls, we determined that most of the committee preferred the tower of precisions proposal. Hough then moved that an implementation providing binary arithmetic must support single; single and double; or single, double, and quad; and that the same tower hold for decimal. Either binary or decimal is required, but implementors need not support both. The vote was 17 for and 1 against.

Extended formats

Hough argued that we should discuss extended precisions not in the context of formats, but rather in the context of expression evaluation. He noted that, as it stands, the extended formats are under-utilized because of inconsistent language bindings and hardware support. Optimizers sometimes use those formats, but too often they use them inconsistently.

Kahan responded that we must continue to describe the extended formats in the standard. If we omit their description, we'll never be able to access them, and there are many computations which are made easier by a few extra bits in the exponent and significand. Results will not be exactly reproducible across platforms, since the details of the extended precision may vary; but the style of arithmetic is predictable, and that is sufficient for programs written in a certain style. The 128 bit format is not yet a viable alternative, since it remains too expensive: the few people willing to pay for fast quad arithmetic are also willing to pay analysts sufficiently skilled to do the job without quad precision.
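
As one concrete instance of a few extra exponent bits paying off, here is a sketch assuming long double maps to the Intel 80-bit extended format, as it does with most x86 compilers:

    #include <math.h>
    #include <stdio.h>

    /* x*x + y*y can overflow double even when the final square root is
       well in range; the 15-bit exponent of the 80-bit extended format
       absorbs the intermediate, so no scaling tricks are needed. */
    static double hypot_via_extended(double x, double y) {
        long double xx = (long double)x * (long double)x;
        long double yy = (long double)y * (long double)y;
        return (double)sqrtl(xx + yy);
    }

    int main(void) {
        double big = 1e300;  /* big*big = 1e600 overflows double */
        printf("%g\n", hypot_via_extended(big, big)); /* ~1.41421e+300 */
        return 0;
    }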

Someone noted that the historical problem with extended formats is that they are underdefined; Hough retorted that they are not underdefined in practice, since there is only one popular extended format. Darcy thought there was utility in providing a description of arithmetic formats beyond the basic formats, parameterized by width and exponent range. Zuras and Hough questioned the utility of describing anything but the Intel 80-bit format; Kahan and Riedy disagreed. Thomas noted that he would prefer to provide a definition rather than invite funny arithmetic by omitting one.

Hauser noted that in the current standard, there can be as many as three formats involved: two input operands and one output. He suggested that we could reasonably define arithmetic for any arithmetic formats by specifying that all operands are converted to the widest format involved. Kahan responded that, at present, the destination must be wider than the operands, and the standard specifies that the arithmetic be done as though the operands were in the destination format. Hauser agreed, but thought we should define expression evaluation in a way that permits arbitrary formats (subject to the constraint that the destination be wider than the operands).
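
A small worked example of the style Kahan describes, with a destination wider than the operands (ordinary C, nothing from the draft): two 24-bit significands multiply to at most 48 bits, which fits in double's 53, so a float-times-float product delivered to a double destination is exact.

    #include <stdio.h>

    int main(void) {
        float a = 1.0f + 0x1p-23f;   /* 1 + 2^-23, exact in float */
        float b = 1.0f - 0x1p-23f;
        /* As though the operands were in the destination format: the
           product (1 - 2^-46) is computed with no rounding at all. */
        double wide = (double)a * (double)b;
        printf("%.17g\n", wide);     /* 0.99999999999998579 */
        return 0;
    }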

At this point, the discussion stalled, and Zuras declared that it was time for a break. Hough asked for guidance on the next step in editing. Any extended format discussion he could write, he warned, would not say much. Kahan thought that was unfair: information about the lower bounds on the available precision and exponent range for extended formats is useful.

Redundant and canonical representations

Hough raised the possibility of specifying a canonical representation when the encoding has redundancies. For example, in the proposed decimal encoding, there are many possible representations for positive infinity; should we specify in the format section a canonical representation to be returned when a new infinity is generated?

Those who supported specifying canonical representations were motivated by testing. Others replied that the whole point of redundant representations is that they are indistinguishable unless you look at the bits. The primary objection to specifying canonical representations in the format section, though, was that we had not yet discussed whether the redundant encodings are really equivalent. In the context of decimal, we have a proposal to encode additional information via the choice of representation. Until we have discussed the implications of that proposal, it seems premature to specify canonical forms for redundantly encoded ordinary values. For redundantly encoded special values like NaN and infinity, we did not all agree on the usefulness of canonical representations. Also, there may be value in permitting implementations to use the redundancy for non-arithmetic purposes (like storing information for retrospective diagnostics).
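
To make the redundancy concrete for ordinary values, a small sketch, not tied to any particular encoding proposal: in a decimal format whose coefficients have at most 7 digits, a value such as 100 belongs to a whole cohort of numerically equal representations, and a canonical-form rule would have to single out one of them.

    #include <stdio.h>

    int main(void) {
        /* Enumerate the cohort of 100 in a decimal format with 7-digit
           coefficients: each line is coefficient * 10^exponent = 100. */
        long long coeff = 100;
        int exp = 0;
        while (coeff <= 9999999) {
            printf("%lld * 10^%d\n", coeff, exp);
            coeff *= 10;
            exp -= 1;
        }
        return 0;
    }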

After the discussion, the committee voted to remove the discussion of canonical representations from the current draft: 17 for, 1 against.

Signaling NaNs redux

We began with a brief discussion of Eric Shade's use of signaling NaNs. Shade uses signaling NaNs to represent non-numeric data in his dynamically-typed language. Quiet NaNs then represent either ordinary invalid results from numerical operations (like 0/0) or results from arithmetic operations applied to non-arithmetic data. The fact that the NaN is signaling is unimportant in Shade's current implementation -- he just needs a set of NaNs which cannot be produced by ordinary arithmetic. Kahan suggested that the signaling behavior would eventually be important to Shade, since his users will care about the difference between a loaded or arithmetically produced NaN and a NaN that occurred because of a type violation.

Zuras argued that Shade could accomplish his goal without signaling NaNs, and that Gentleman's application is less compelling because his code is, at present, lost and gone. Kahan countered that the fact that two instances came up means we will find more, possibly hundreds. Liu put the position more strongly: we are incapable of doing due diligence. Signaling NaNs are a part of the IEEE 754 standard, and it would be irresponsible to remove them in 754R. Kahan agreed that he felt surly about having signaling NaNs in the standard, but thought we were stuck with them. He felt it would only be worthwhile to get rid of signaling NaNs if they were intrinsically horribly expensive, and he was unsure why others felt they were so expensive. Our efforts, said Kahan, should not be directed toward expunging signaling NaNs, but toward supporting them in an economical way.

Kahan also argued that signaling NaNs have several uses, but that they would be far more useful if they were properly supported by the programming language community. Hauser noted that a primary reason signaling NaNs did not enter the C standard is that the C committee found no compelling use. Thomas noted that the C99 group did try to support signaling NaNs, but failed. While he didn't remember all the reasons behind the C99 group's decisions, he said a few were related to the lack of specification about where conversion takes place. Does a move of a signaling NaN raise a signal or not? It's unclear whether assignments, function calls, or text output should cause the signal. Another point is optimization. The C99 group tried to be careful about preserving signal semantics, but if the semantics of signaling NaNs are carefully respected, then even optimizations such as rewriting 1*x as x are illegal. Liu noted that Sun faced the same problem with signaling NaNs on function return -- the interface for producing a signaling NaN on the x86 cannot produce sNaNs in the float or double format, since the 80-bit signaling NaN it creates is turned into a quiet NaN by the conversion to a narrower type.
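
A sketch of the optimization problem Thomas raised, assuming a machine that follows the quiet-bit convention the draft specifies (such as the x86). The bit pattern below is ours; C99 offers no portable way to write a signaling NaN, which is itself part of the problem the C99 group faced.

    #include <fenv.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #pragma STDC FENV_ACCESS ON

    int main(void) {
        /* Exponent all ones, significand nonzero with its top bit clear:
           a signaling NaN under the draft's convention. */
        uint32_t bits = 0x7f800001u;
        float x;
        memcpy(&x, &bits, sizeof x);

        feclearexcept(FE_INVALID);
        volatile float y = 1.0f * x;  /* must raise invalid for signaling x */
        (void)y;
        printf("invalid raised: %s\n",
               fetestexcept(FE_INVALID) ? "yes" : "no");

        /* An optimizer that rewrites 1.0f*x as x eliminates the exception;
           honoring signaling NaN semantics therefore forbids the rewrite. */
        return 0;
    }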

Hough, Darcy, Hauser, and Scott all noted that any usefulness of signaling NaNs should be balanced against implementation costs. The hardware cost of signaling NaNs may not be great, but the semantics are complicated, and that causes problems for language designers, library designers, and validators. Hough also noted that signaling NaNs without traps are not particularly useful.

Thomas also raised the issue of compatibility; his concern was seconded by Riedy. By removing signaling NaNs, we would invalidate all existing implementations, unless we included a grandfather clause. While it would be possible to force all NaNs to be quiet by installing an invalid trap handler, that would be a performance hit whenever invalid operations occurred.

Hough noted that we'd already invalidated PA-RISC by specifying how signaling NaNs are distinguished from quiet NaNs in 754R. Hauser recalled that MIPS once used the same conventions as PA-RISC for quiet and signaling NaNs. MIPS-based machines are still alive and well; they are used in embedded systems, and NEC makes MIPS-based machines for Japan.

Kahan moved that the draft contain a specification for signaling NaNs which will distinguish them from quiet NaNs when, at the very least, arithmetic operations act on them. The motion passed: 11 for, 5 against.

Draft review

We made it through the first page of section 3.1 in the draft review; then we ran out of steam and broke for dinner. We decided to replace the two definitions for binary floating point number and decimal floating point number by a general definition for floating point number followed by two shorter definitions for the binary and decimal cases. Thomas objected to the reference to algebraic operations in the glossary entry for NaN; he suggested just listing them instead. James suggested that we specify the behavior for four types of operations: arithmetic, transcendental, copy, and comparison. Then we argued about the definition of normal number; some members thought it was insufficiently clear that a normal number has a normalized representation, but it may have other representations as well.
