
Re: Exact dot product



Siegfried,

.. And the loss is quite significant: interrupts.

Or perhaps I should say interrupts with sensible latency. I think if I
were doing this, I’d make the EDP a single instruction, with length,
two starting addresses and result held in four ordinary registers. The
whole instruction is then made non-interruptible. The down-side is that
interrupt latency is unpredictable (but probably large).

In a bolt-on co-processor this may not matter so much, as the stock
processor can service the interrupts whilst the co-processor just
gets on with things. However, notice a second effect: we cannot
use the EDP within a standard OS setup, without a lot of extras,
i.e. the EDP unit becomes an OS allocated resource.

So the next thought I had was: “Well what about supercomputing?
Is there a place for this idea there?” The Human Brain Project
is bringing me into contact with people I wouldn’t normally meet
such as the directors of the Swiss and German Supercomputer
Centres. The issues for them are: cheapness of the processors
(which are stock hardware), repeatability/check-pointing and
controllability, i.e. can the sys admin easily detect and delete
errant programs. If the EDP is implemented for 64-bit addressable
architectures (as it probably would be on a supercomputer) then
we’d need a way to abort the EDP units. And ideally a way
to restart them after a check-point.

All of this is pointing to the need for a read/write of the big
accumulator from/to memory. Then the split of the instructions
made by Jim, Jack and David makes sense. We can now interrupt
between each instruction, and save and restore the big
accumulator. If the processor is provided with enough big
accumulators, then this may not happen very often as we can
use the multiple accumulators as part of a RISC register
window.
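
To make the save/restore point concrete, here is a minimal software sketch of a long accumulator whose state can be spilled to memory between steps and reloaded later. It is only a model under stated assumptions, not the hardware design from the class report: the names LongAccumulator, fma, save, rounded and dot_interrupted are hypothetical, the inputs are assumed to be finite IEEE binary64 values, and a Python integer stands in for the wide fixed-point register.

```python
from fractions import Fraction

# Assumption: binary64 inputs. The smallest nonzero magnitude is 2**-1074,
# so every product is an integer multiple of 2**-2148.
FRAC_BITS = 2 * 1074


class LongAccumulator:
    """Hypothetical model of a long accumulator with exact accumulation."""

    def __init__(self, state=0):
        # Integer count of units of 2**-FRAC_BITS (the "big register").
        self.state = state

    def fma(self, x, y):
        # as_integer_ratio() is exact; the denominator is a power of two.
        nx, dx = x.as_integer_ratio()
        ny, dy = y.as_integer_ratio()
        shift = FRAC_BITS - (dx.bit_length() - 1) - (dy.bit_length() - 1)
        self.state += (nx * ny) << shift  # exact, no rounding

    def save(self):
        # Models writing the big accumulator out to memory.
        return self.state

    def rounded(self):
        # Single rounding at the very end (int / int division in CPython
        # is correctly rounded).
        return float(Fraction(self.state, 1 << FRAC_BITS))


def dot_interrupted(xs, ys, stop_at):
    """Dot product with a simulated interrupt after stop_at terms."""
    acc = LongAccumulator()
    for x, y in zip(xs[:stop_at], ys[:stop_at]):
        acc.fma(x, y)
    saved = acc.save()            # interrupt: spill accumulator to memory
    acc = LongAccumulator(saved)  # resume: reload it, possibly much later
    for x, y in zip(xs[stop_at:], ys[stop_at:]):
        acc.fma(x, y)
    return acc.rounded()


xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]

# Ordinary floating-point accumulation loses the middle term.
naive = 0.0
for x, y in zip(xs, ys):
    naive += x * y

exact = dot_interrupted(xs, ys, 2)
```

In this model the interrupt point makes no difference to the result, which is the property we would want from a real save/restore of the accumulator. The cost in hardware is that the saved state is a few thousand bits rather than one register.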

But now the issue is: “What is the optimal number of
long accumulators to have?” And that will depend on a lot
of awkward-to-measure features such as the OS, typical
job mix, and the threading properties of the programming
language used. (BTW the typical supercomputer use-case
will be data-basing; because it always is!)

I think the “TL; DR” take-home is: it’s certainly an interesting
idea, but I cannot see the business case for EDP in most of
the settings I’m familiar with. But perhaps I’ve missed
something?

Regards,

Dave Lester

ps I’m taking the caveat at the end of Section 2 on page 3 at face value:

     “While the user is not precluded from reading intermediate values
      between separate dot products, the EDPA cannot be interrupted
      while computing a dot product.”


> On 1 Jul 2015, at 06:00, Siegfried M. Rump <rump@xxxxxxxxxxxxx> wrote:
> 
> As you know, there are other fast methods for accurate
> summation and dot products, including ones that give
> reproducible results.
> If you want to stick to the long accumulator, utilizing
> the old idea of overlapping chunks in Malcolm's 1971
> Comm. ACM paper may further improve the performance
> without losing anything.
> "The long accumulator" may be regarded as Malcolm's
> approach without overlapping.
> 
> Siegfried M. Rump
> 
> 
> 
> On 19 Jun 2015, at 23:02, James Demmel <demmel@xxxxxxxxxxxxxxxxx> wrote:
> 
>> You might find the attached class project report interesting.
>> Two grad students here, David Biancolin and Jack Koenig (cc-ed)
>> used the hardware design tools being designed at Berkeley to
>> design a long accumulator, and measure its chip area and
>> power requirements at a fairly deep level of detail. Please contact
>> them if you have further questions.
>> 
>> Jim Demmel
>> 
> 
> 
> -- 
> =====================================================
> Prof. Dr. Siegfried M. Rump
> Institute for Reliable Computing
> Hamburg University of Technology
> Schwarzenbergstr. 95
> 21071 Hamburg
> Germany
> phone +49 40 42878 3027
> fax +49 40 42878 2489
> http://www.ti3.tuhh.de
> 
> and
> 
> Guest Professor at Waseda University
> Faculty of Science and Engineering
> 3-4-1 Okubo, Shinjuku-ku
> Tokyo 169-8555
> Japan
> phone/fax in Japan +81 3 5286 3330
> 