The third meeting of the IEEE 754R revision group was held Wednesday 14 March 2001 at 1:00 pm at Network Appliance, Bob Davis chair. These notes are intended to give an informal flavor of the discussion. Attending were David Bindel, Joe Darcy, Bob Davis, David Hough, Jim Hull, W Kahan, Alex Liu, Jason Riedy, David Scott, Neil Toda, Dan Zuras.
A mailing list has been established for this work. Send a message "subscribe stds-754" to firstname.lastname@example.org to join. Knowledgeable persons with a desire to contribute positively toward an upward-compatible revision are encouraged to subscribe and participate.
The next meeting was scheduled for Weds 11 Apr at Network Appliance. The subsequent meeting will be Weds May 9, a late afternoon-early evening meeting at UC Berkeley.
The draft notes of the previous meeting were corrected: 854 was never intended to encompass any radix but 10.
Scope and Purpose. The existing purpose of the 754R effort refers to identical results. This has never been a literal, universal aim of 754 or 854: it runs afoul of various good and bad hardware and software features in systems that are still considered to conform to 754. If this aim is not considered worth pursuing in 754R, perhaps the stated purpose should be amended accordingly. For instance, the optimal cache-blocking algorithm for matrix multiply depends on cache size, which varies among machines and so affects the order of roundoff, even though identical error bounds, valid for all implementations, could be derived in this case.
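[Illustration of the roundoff-order point: a minimal Python sketch, assuming IEEE 754 double arithmetic as found on common hardware. The helper name sum_in_order is made up for this example. The same three values summed in two orders give results that differ in the last bit, although both satisfy the same error bound.]

```python
def sum_in_order(values):
    """Accumulate left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total

data = [0.1, 0.2, 0.3]
forward = sum_in_order(data)             # (0.1 + 0.2) + 0.3
backward = sum_in_order(reversed(data))  # (0.3 + 0.2) + 0.1

print(forward, backward, forward == backward)
```

A blocked matrix multiply reorders its inner sums in the same way when the block size changes, so bitwise-identical results across machines would require fixing the summation order as well as the arithmetic.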
754R-200x will be a new standard, and it is conceivable that it might not be strictly upward compatible.
The programming environment is what's standardized - it must at least be predictable and preferably controllable (by the programmer, and economically). In the Java and Marketing papers on his website, Kahan mentions rules of thumb to achieve these aims.
So we need a proposed revised Scope and Purpose of the standard for the next meeting. 754 was (generally misunderstood to be) a hardware standard. 754R should be more clearly a language standard. Hough recommends casting 754R as a set of requirements on a C9X implementation. Others thought language bindings to C9X would be suitable for an appendix. Still others commented that the important thing was a test suite (e.g. ucbtest), which in effect defines what is specified for each language. Kahan responded that test suites arise by funding graduate students.
Bob Davis commented that he hoped the viewpoint would be what would be right in 25 years rather than what seems convenient in the short term.
Resolution of 854 discrepancies with 754 - Kahan. There were two known discrepancies - the question of whether conversion to integral value in integer or floating format should raise inexact, and whether the recommended function specifications were congruent.
As to the inexact integral issue, an explicit conversion like y = AINT(x) or i = INT(x) or i = (int) x; should not raise inexact; there is no error due to insufficient precision, which is the usual meaning of inexact, and the meaning of inexact would be overloaded if it were also used as an indication of a non-integral value. In contrast, an implicit conversion such as Fortran's i = x might well raise an inexact exception for the purpose of warning of an error in a program that intended to manipulate integral values within floating formats - common in many early RISC designs that omitted integer multiplication - or to detect an inappropriate compiler optimization of e.g. (a*b)/c.
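[Illustration: the two behaviors discussed above can be observed side by side in Python's decimal module, which is a decimal arithmetic and not a 754 binary implementation, so this is only an analogy. Decimal.to_integral_value rounds to an integral value without signaling Inexact, while Decimal.to_integral_exact signals it. The wrapper name convert is made up for this sketch.]

```python
from decimal import Context, Decimal, Inexact, ROUND_DOWN

def convert(value, signal_inexact):
    """Truncate value to an integral Decimal; report whether Inexact was raised."""
    ctx = Context(traps=[])  # fresh context: all flags clear, nothing trapped
    d = Decimal(value)
    if signal_inexact:
        result = d.to_integral_exact(rounding=ROUND_DOWN, context=ctx)
    else:
        result = d.to_integral_value(rounding=ROUND_DOWN, context=ctx)
    return result, bool(ctx.flags[Inexact])

quiet = convert("2.75", signal_inexact=False)  # like an explicit AINT(x)
noisy = convert("2.75", signal_inexact=True)   # like a warning on implicit i = x
exact = convert("3", signal_inexact=True)      # already integral: nothing to flag
print(quiet, noisy, exact)
```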
[As to the appendix functions, Joe Darcy subsequently reported: With regard to 754 vs. 854, the standards give different definitions of how logb behaves on subnormals; 754 returns Emin - 1 and 854 returns Emin. With the 854 definition, all subnormals can be detected as finite numbers where scalb(x, -logb(x)) is between 0 and 1. This is true for most, but not all, subnormals with the 754 definition. In my version of the recommended functions, I also provide a fully normalizing version so scalb(x, -logbn(x)) would always be in [1, 2). Neither standard provides a good definition of scalb and neither nails down the behavior of copysign with a NaN sign argument.]
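[Illustration of the logb discrepancy: a minimal Python sketch for IEEE double, handling finite nonzero arguments only. The function names logb_754, logb_854, logbn and scalb are stand-ins written for this example from the description above; they are not library routines. scalb is modeled with math.ldexp.]

```python
import math
import sys

EMIN = -1022  # minimum normal exponent of IEEE double

def is_subnormal(x):
    return x != 0.0 and abs(x) < sys.float_info.min

def logbn(x):
    """Fully normalizing exponent: frexp returns m in [0.5, 1), so subtract 1."""
    _, e = math.frexp(abs(x))
    return e - 1

def logb_754(x):
    return EMIN - 1 if is_subnormal(x) else logbn(x)

def logb_854(x):
    return EMIN if is_subnormal(x) else logbn(x)

def scalb(x, n):
    return math.ldexp(x, n)

tiny = 5e-324                   # smallest positive subnormal, 2**-1074
big = sys.float_info.min - tiny  # largest subnormal

for name, f in (("754", logb_754), ("854", logb_854), ("logbn", logbn)):
    print(name, scalb(tiny, -f(tiny)), scalb(big, -f(big)))
```

With the 854 definition both scaled values fall strictly between 0 and 1. With the 754 definition the largest subnormal scales to a value in [1, 2), so the test fails for subnormals of magnitude at least 2**(Emin - 1).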
1596 - David James edited down the descriptive text mostly to bit patterns. Kahan noted that pedagogically, understanding comes easier when floating-point numbers are regarded as scaled integers rather than scaled fractions. Encoding pictures are not needed in the standard, but could be relegated to an appendix.
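[Illustration of the scaled-integer view: a minimal Python sketch, assuming IEEE double with a 53-bit significand and a finite, normal argument. The helper name as_scaled_integer is made up for this example. It writes x as n * 2**q with n an integer.]

```python
import math

def as_scaled_integer(x):
    """Return (n, q) with x == n * 2**q and n an integer of at most 53 bits."""
    m, e = math.frexp(x)         # x == m * 2**e with 0.5 <= |m| < 1
    n = int(math.ldexp(m, 53))   # scaling by a power of two is exact
    return n, e - 53

n, q = as_scaled_integer(0.1)
print(n, q)                      # integer significand and its exponent
print(math.ldexp(n, q) == 0.1)   # the pair reproduces the original double
```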
ACTION - Kahan and David James should get together to compactify the external representation descriptions as much as possible, and no more.
Extended formats - From a C point of view, "long double" might be double-extended or it might be quad. A guaranteed quad format should have a different name in C - like "quad."
Hough suggested that single-extended be dropped from the standard. This was tabled until the next meeting.
Dynamically-variable precision is another proposed extension. Zuras commented that the standard should specify fixed higher precisions and not (yet) dynamically-variable precision. Hough commented that the real use of the latter was in interval arithmetic, and there was no consensus on how that might be specified syntactically in C. So this issue will be continued next time.
Transcendentals - these are too hard to standardize now because nobody knows how to trade off precision vs performance. If less than correct rounding is specified (as Java first did by specifying the algorithm) then it is always possible that somebody will come up with a non-standard algorithm that is more accurate AND faster. This can't happen with a correct rounding specification, but correctly-rounded transcendental functions seem to be inherently much more expensive than almost-correctly-rounded ones. One could instead standardize properties that approximate functions are supposed to obey - but anomalies still abound. All these points argue against standardizing transcendental functions now.
[Joe Darcy subsequently reported: Java currently finesses this issue by providing two versions of the math library. The StrictMath version must use the fdlibm algorithms; the Math version can use any algorithm that has 1 ulp accuracy and is semi-monotonic (IIRC, the accuracy requirements vary somewhat by function).]
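[Illustration of what an accuracy requirement stated in ulps measures: a minimal Python sketch, assuming IEEE double and using the decimal module at 50 digits as a stand-in for the exact value. The helper names ulp and ulp_error are made up for this example, and the measured exp error depends on the platform's math library. A correctly rounded result is within 0.5 ulp; a 1 ulp requirement is looser.]

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # reference precision, far beyond double

def ulp(y):
    """Unit in the last place of a normal double y."""
    _, e = math.frexp(abs(y))
    return math.ldexp(1.0, e - 53)

def ulp_error(computed, reference):
    """Error of a double result against a Decimal reference, in ulps."""
    err = abs(Decimal(computed) - reference)  # Decimal(float) converts exactly
    return float(err / Decimal(ulp(computed)))

sqrt_err = ulp_error(math.sqrt(2.0), Decimal(2).sqrt())  # sqrt is correctly rounded
exp_err = ulp_error(math.exp(1.0), Decimal(1).exp())     # library-dependent
print(sqrt_err, exp_err)
```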