Re: Requiring the EDP in 754 (or 1788.1 for that matter)
On Sun, Jan 31, 2016 at 9:50 AM, David Lester <dlester@xxxxxxxxxxxx> wrote:
> On 30 Jan 2016, at 23:03, Ulrich Kulisch <ulrich.kulisch@xxxxxxx> wrote:
>
> But the real point is this: A correctly rounded dot product is slower by at
> least one order of magnitude
> than a possibly wrong computation of the dot product in conventional
> floating-point arithmetic.
> An exact dot product would be 6 times faster than the latter. So
> the EDP is at least 60 times faster than a possibly wrong correctly rounded
> dot product.
> Speed and accuracy are essential for acceptance and success of interval
> arithmetic.
>
My comments are interspersed below, mostly respectfully.
>
> No Ulrich, it is _you_ who is refusing to pay attention.
>
> (1) The world of chip design has moved on from 1985. Then, it was a
> handful of specialists like Seymour Cray, now everyone can do it.
That is not a change. Anyone could do it in 1964. The fact that few
did is not relevant to your premise.
> (2) We hit the “Wall” in the mid-1990s. You will never get processors
> with clock speeds of more than a few GHz at economic prices, unless
> there is a new non-silicon technology invented.
Yes.
> There’s nothing on the horizon yet.
No. In this domain your ignorance is showing. In fact there are
several. HP's commitment to memristors is very encouraging. See
"https://en.wikipedia.org/wiki/Memristor".
>
> (3) Thus the real interest in chip design lies not in the uni-processor,
> but many-core processors (hundreds of CPUs on each one square
> centimetre die).
No. Collectively we are already recycling (as scrap) GPUs with a
thousand or more processors. One hundred CISC processors per CPU chip
is an uninteresting goal in that regard. 1e5 FPU processors per chip
starts to get interesting. Especially if those processors are not for
typical (partial) implementations but for rigorous (complete)
implementations.
> (4) What I and David Baincolin from Berkeley have been pointing out
> is that the guts of what you wish to achieve can be achieved more
> economically using tweaks to stock chip designs, which do not
> have the bad side-effects your almost religious belief in a mandated
> hardware solution has.
I must have missed something. Other than space and bandwidth for
context switching (cf. barrel register files that only require a base
pointer reference to switch), what are the bad side effects of a
mandated hardware solution?
I suspect you might state the same kind of objection to modern CISC
GDT and IDT tables, which are not only large, but variable length.
> (5) In particular, if we posit the implementation of 1024-bit add/sub
> instructions working on the upper and lower half of the ARM
> NEON SIMD unit registers, then far from a factor of 60 you
> mention, we’d be in the territory of about factor two or four,
> asymptotically.
The above statement needs a qualifier. What are the asymptote
sampling points? 100? 100,000? 4e9? 1e100?
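For reference, this is roughly what software has to do today in place of the 1024-bit add/sub posited in (5). It is only a sketch in Python: the 64-bit limb width, the count of 16 limbs, and the names to_limbs, from_limbs and add_1024 are my assumptions for illustration, and nothing here models the NEON unit itself.

```python
LIMB_BITS = 64
LIMBS = 16                      # 16 limbs * 64 bits = 1024 bits
MASK = (1 << LIMB_BITS) - 1

def to_limbs(n):
    """Split a non-negative integer below 2**1024 into little-endian limbs."""
    return [(n >> (LIMB_BITS * i)) & MASK for i in range(LIMBS)]

def from_limbs(limbs):
    """Reassemble little-endian limbs into a single integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))

def add_1024(a, b):
    """Add two limb vectors; return (sum limbs, carry out of the top limb)."""
    out = []
    carry = 0
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s & MASK)    # keep the low 64 bits in this limb
        carry = s >> LIMB_BITS  # 0 or 1, rippled into the next limb
    return out, carry

# Worst case for the ripple: every limb passes a carry to the next one.
wrapped, carry_out = add_1024(to_limbs(2**1024 - 1), to_limbs(1))
small, small_carry = add_1024(to_limbs(2**64 - 1), to_limbs(1))
```

The loop is sequential because each limb waits on the carry from the one below it. How much a single wide instruction saves over this depends on the accumulator width and the vector length, which is why the sampling points matter.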
> However, this unit is an optional extra for high-end processors
> (CORTEX-A) and is not available on the billions of cores shipped
> each year which are usually CORTEX-M0, or CORTEX-M4.
Too bad for them. Your observation is a backward-looking datum, which
should not be used to constrain the future.
>
> (6) You appear to be working on the assumption that by mandating
> “hardware” EDP you will be expediting the appearance of said
> processors. This is not so. What will get them made is a good
> economic case.
Wrong. Cf. MC68000 vs i8086.
Do you know why IBM chose the i8086 at a time when the MC68000 was
winning over 70% of CPU design-ins (not just over Intel, but over all
competing CPUs)?
Because the weaknesses of the Intel offering made it possible for IBM
to purchase a substantial share of Intel's equity. That ownership
position gave IBM a lot of information and control. THAT was the
compelling "economic" case.
See also Mitre's idiocy in the selection of the fundamental
communications architecture for JTIDS. I claim that it is _never_
useful to get the wrong answer quickly. Much less the wrong answer
slowly.
Please, let us not help that happen again.
Personally, I have _way_ too much experience wrestling with pre-'754
numerical systems. And I have no interest in a system that is not as
rigorous.
IEEE '754-1985 was an unbelievably huge improvement. Counting those
blessings takes a while. But was it perfect? Certainly not. It has
terminology issues and unspecified behavior issues that are _still_
extant today. I'm not holding my breath.
However, there is simply no excuse for repeating those kinds of
defects and omissions. "Economic cases" be damned. I want the "right
answer" no matter what it takes.
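To make "possibly wrong" concrete, here is a minimal sketch in Python. The names naive_dot and exact_dot are hypothetical, and the exact version uses rational arithmetic as a software stand-in for a long accumulator. It is not a model of any proposed hardware EDP and says nothing about speed.

```python
from fractions import Fraction

def naive_dot(xs, ys):
    """Conventional dot product: one rounding per multiply and per add."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

def exact_dot(xs, ys):
    """Accumulate exactly as rationals, then round once at the very end."""
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        acc += Fraction(x) * Fraction(y)  # every double is an exact rational
    return float(acc)

xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]

naive_result = naive_dot(xs, ys)   # 0.0: the 1.0 is absorbed by 1e16
exact_result = exact_dot(xs, ys)   # 1.0: the true value, rounded once
```

Three terms are enough to lose every significant digit in the conventional version. Nothing is flagged and no exception is raised.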
>
> As you know (because we have discussed this before), ARM’s
> FPU is _not_ fully 754-R compliant. This is because their
> typical customer is not bothered by fully standards-compliant
> hardware.
OK, then we need a fuzzy "standard" for the "typical" customer and a
rigorous standard for people who actually know something about the
problem domain.
Do you approve of spreadsheets hiding rounding from users? If so,
please read more of Prof. Kahan's notes on such self-injuring
policies.
Do you use MICROS~1's spreadsheet? Do you know why open source
spreadsheets offer two independent collections of functions? One set
helps people whose interest is dominated by getting the right answer.
The second set helps people whose interest is dominated by getting a
compatible, but wrong, answer.
Personally, I am not interested in the second set.
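A small illustration of the kind of hiding I mean, sketched in Python rather than in any spreadsheet. The two-decimal display format is my assumption for the example; no particular product's behavior is being reproduced.

```python
a = 0.1 + 0.2
b = 0.3

# What a user would see with a two-decimal display format.
shown_a = f"{a:.2f}"
shown_b = f"{b:.2f}"

same_on_screen = shown_a == shown_b   # True: both display as 0.30
same_in_memory = a == b               # False: the stored doubles differ
difference = a - b                    # small, but not zero
```

The two cells look identical, yet a comparison or a subtraction on them says otherwise, and the user has no way to tell why from what is on the screen.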
>
> (7) You have never specified what you mean by “hardware-implementation”.
> Is it permissible to soft-trap the EDP-instruction? Are co-processors
> permissible? Are there, as I suspect, execution time constraints?
> Your rough calculations a few days ago, suggest you harbour the
> naive belief that each instruction is executed in a single
> clock-cycle.
A valid point.
>
> (8) Finally you might wish to comment on the national security implications
> of your EDP proposal. As mentioned in (5) above you are perilously
> close to proposing extremely fast integer operations, especially if
> you want to combine two EDP calculations done in parallel (see (3)
> above). A chip with these features may well suffer from export
> licence restrictions — if not now, then in the near future. This
> surely has implications on the number of devices which could be sold.
That issue is irrelevant. And it is mischaracterized. There are no
"national security implications" for hardware EDP. I happen to know
the ITAR restrictions pretty well. As soon as a single hardware EDP
design has been published in book or similar form, ITAR cannot be
applied to that technology.
For verification of the above claim talk to any experienced cryptographer.
> Ignoring these facts is a terrible service to the standard.
Absolutely agreed. (But I am not sure that I attributed the above
quote correctly).
Lee Winter
Nashua, New Hampshire
United States of America