Re: agenda
> > Note that the Intel/AMD IA32 implementation of double-extended is obsolete,
> > with modern Intel/AMD CPU's focusing their performance efforts on the new
> > SSE instruction set
>
> As a heavy user of extended-precision arithmetic, I fervently
> hope that is not the case!
It's explicit in the AMD64 reference manual and perhaps implicit in Intel's
current IA32 documentation: the extended precision registers are for
backward compatibility and the target for performance is the SSE2
64-bit registers.
While the "extended" may have something to do with that - it does have
small costs associated with saving odd-sized temporaries or the whole
register file on a context switch - the main problem with IA32's extended
register file is its stack architecture and its small size. Both
make standard optimizations like
loop unrolling and software pipelining difficult. The stack architecture
also contributes to unique kinds of 754 invalid exceptions when over-pushed
or over-popped, which led some Windows versions to take control of the
trap enable mask from user mode.
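
To make the register-pressure point concrete, here is a minimal sketch in C of
a dot product unrolled by four with independent accumulators. The function
names dot_simple and dot_unrolled4 are made up for illustration. The unrolled
loop wants four live partial sums plus temporaries for the loaded operands and
products, which fits comfortably in a flat register file but crowds the
eight-entry x87 stack, where operands also have to be shuffled to the stack top.

```c
#include <stddef.h>

/* Baseline: one accumulator, so every add waits on the previous one. */
double dot_simple(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Unrolled by four: s0..s3 are independent dependency chains, so the
 * hardware can overlap their adds, provided all four stay in registers. */
double dot_unrolled4(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        s0 += x[i] * y[i];

    /* Combine the partial sums pairwise. */
    return (s0 + s1) + (s2 + s3);
}
```

Note that the unrolled version adds the terms in a different order, so its
rounded result can differ from the simple loop's in the last bits; whether a
compiler may make this transformation on its own depends on its floating-point
options.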
Thus in the future, most high-performance computing applications on IA32
and AMD processors will be targeted to the SSE2 registers rather than the
extended-precision stack. That's why I don't think it really matters what
754R does about extended - IA32-compatible implementations will implement
IA32 extended and nobody else will implement IEEE extended no matter what
754R says. The fates of the
other IEEE extended architectures that come to mind - 8070, ELXSI, m68k,
IA64 - were all sealed by considerations other than extended precision
arithmetic, but neither can one say that extended helped much.
Of the many graphics
and signal processing systems that use the general idea of single-extended for
computing on single-precision data, evidently none of the successful ones
provide full IEEE single-extended with gradual underflow, rounding modes,
and exceptions.
So in my view IEEE extended is an accidental victim of architectural decisions
made during the "reduce the semantic gap" era epitomized by the Intel 432.
My first job after Berkeley in 1976 was microcoding floating-point for a stack
architecture implemented on AMD 2900 bit slices. Later, by the time I got to
Sun in 1984, everybody had accepted the Berkeley
RISC gospel and had an instruction-set design project going.
That was a bit too late for PC's, however, and by virtue of volume,
a large part of the industry
is now stuck with incrementally evolving the IA32 PC architecture toward
a higher-performance yet upwardly-compatible architecture that is now getting
appallingly complex.
The lesson for 754R might be that too much higher performance and not enough
upward compatibility with 754 might cause our efforts to be ignored.
Or is the lesson that
754R can only hope to influence new instruction set architectures
anyway, so it might as well not be unduly bound by backward compatibility?
That is why our discussion of extended precision carries so much weight.