
Re: Please listen to Ulrich here...



On 25.08.2013 00:10, David Lester wrote:
Ulrich,

What are the interrupt properties of this instruction?

As you'll recall, what clinched the RISC vs CISC debate was the discovery that the VAX 780 "return
from subroutine" was actually slower than doing the job one register at a time. The reason was that
to get the latency down, RSUB was interruptible.

How many long accumulators do we need? With ARM there are up to eight different PC and SP registers
corresponding to the different register sets for different interrupt modes. Similarly with other
RISC processors, such as SPARC.

Are we going to require _eight_ accumulators for compliant ARM processors? Are we expecting that the instruction will complete in a single clock cycle, as usual? If not why not?

Why are extended double and quad precision not part of your proposed standard?

Dave
These are interesting questions indeed. Answering them all would take more time than I have right now. I remember that years ago I discussed the interrupt question with colleagues who implemented compilers. The majority of them were of the opinion that a dot product is an elementary operation which should not be interrupted. But for large parallel machines different answers may be possible.

I agree that the position paper for Motion 9 is too general. I attach a shortened and simplified version. It requires only the exact dot product (EDP) for double precision, and I would be fully satisfied if the standard provided just this.
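
For readers less familiar with the EDP, here is a minimal sketch of what it asks for in double precision: every product and every partial sum is kept exact, and rounding happens only once at the end. The sketch uses Python's `fractions.Fraction` as a software stand-in for the long accumulator; the function names `exact_dot` and `naive_dot` are hypothetical and are not taken from the attached paper. It handles finite inputs only.

```python
from fractions import Fraction


def exact_dot(xs, ys):
    """Dot product with a single rounding at the end (illustrative only)."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    acc = Fraction(0)  # stands in for the long accumulator
    for x, y in zip(xs, ys):
        # Fraction(float) is exact, so neither the product nor the
        # running sum is rounded here.
        acc += Fraction(x) * Fraction(y)
    return float(acc)  # the only rounding: to the nearest double


def naive_dot(xs, ys):
    """Ordinary floating-point dot product, rounded after every operation."""
    s = 0.0
    for x, y in zip(xs, ys):
        s += x * y
    return s


xs = [2.0**60, 1.0, -(2.0**60)]
ys = [1.0, 1.0, 1.0]
print(naive_dot(xs, ys))  # 0.0 (the 1.0 is absorbed by 2**60 and lost)
print(exact_dot(xs, ys))  # 1.0
```

A hardware long accumulator would give the same result as `exact_dot` with fixed-point arithmetic rather than rationals; the rational version is chosen here only because it is short and easy to check.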

How much of the long accumulator should be supported by fast hardware is to some extent a matter of taste. I would say a sufficiently large portion (see the section "Hardware Accumulation Window" in my book). This question is, of course, connected with the question of how many long accumulators should be supported by fast hardware. The answer to these questions may be application dependent.
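
To give a sense of scale for this question, the following sketch computes how many bits are needed to hold any product of two doubles as a fixed-point number. It derives the figure from `sys.float_info` and assumes the platform `float` is IEEE 754 binary64. The guard-bit count of 92 is an assumption made here so that the total is a multiple of 64; it is not a figure stated in this thread.

```python
import sys


def product_span_bits(info=sys.float_info):
    """Bits needed to cover every product of two floats in fixed point."""
    # The largest product magnitude is below 2**(2 * max_exp).
    high = 2 * info.max_exp
    # The smallest nonzero product is the square of the smallest
    # subnormal, 2**(min_exp - mant_dig), which fixes the lowest bit.
    low = 2 * (info.mant_dig - info.min_exp)
    return high + low


def accumulator_bits(guard_bits, info=sys.float_info):
    """Full accumulator width: product span plus guard bits for carries."""
    return guard_bits + product_span_bits(info)


print(product_span_bits())   # 4196
print(accumulator_bits(92))  # 4288, i.e. 67 words of 64 bits
```

A hardware accumulation window would cover only part of this span in fast hardware, which is why the portion to support is a design choice rather than a fixed number.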

Best regards
Ulrich



On 23 Aug 2013, at 21:42, Ulrich Kulisch wrote:

Dear Ian,

Thank you for writing. I think all the problems you see are solved. I assumed that readers of my mails are more or less familiar with the contents of my book:
Computer Arithmetic and Validity - Theory, Implementation, and Applications. De Gruyter 2008, second edition 2013. You will probably find it in your library.

I attach copies of a few pages. You should not be shocked by the large exponent ranges of the IEEE data formats. See the section "Hardware Accumulation Window" in my book. When we implemented these ideas for IBM on /370 mainframes in the 1980s, we used only about 1200 bits for the long accumulator, and I am not aware of any problem for which this was not enough.

With best regards
Ulrich



--
Karlsruher Institut für Technologie (KIT)
Institut für Angewandte und Numerische Mathematik
D-76128 Karlsruhe, Germany
Prof. Ulrich Kulisch

Phone: +49 721 608-42680
Fax: +49 721 608-46679
Email: ulrich.kulisch@xxxxxxx
www.kit.edu
www.math.kit.edu/ianm2/~kulisch/

KIT - University of the State of Baden-Württemberg
and National Research Center of the
Helmholtz Association

Attachment: DotProdP1788.pdf
Description: Adobe PDF document