
Re: Requiring the EDP in 754 (or 1788.1 for that matter)




On 30 Jan 2016, at 23:03, Ulrich Kulisch <ulrich.kulisch@xxxxxxx> wrote:

But the real point is this:  A correctly rounded dot product is slower by at least one order of magnitude 
than a possibly wrong computation of the dot product in conventional floating-point arithmetic.
An exact dot product would be 6 times faster than the latter. So
the EDP is at least 60 times faster than a possibly wrong correctly rounded dot product.
Speed and accuracy are essential for acceptance and success of interval arithmetic.

No, Ulrich, it is _you_ who is refusing to pay attention.

(1) The world of chip design has moved on from 1985. Then, it was a
      handful of specialists like Seymour Cray, now everyone can do it.

(2) We hit the “Wall” in the mid-1990s. You will never get processors
      with clock speeds of more than a few GHz at economic prices, unless
      there is a new non-silicon technology invented. There’s nothing on
      the horizon yet.

(3) Thus the real interest in chip design lies not in the uni-processor,
      but in many-core processors (hundreds of CPUs on each die of one
      square centimetre).

(4) What David Baincolin from Berkeley and I have been pointing out
     is that the guts of what you wish to achieve can be achieved more
     economically using tweaks to stock chip designs, which do not
     have the bad side-effects of the mandated hardware solution in
     which you have an almost religious belief.

(5) In particular, if we posit the implementation of 1024-bit add/sub
      instructions working on the upper and lower halves of the ARM
      NEON SIMD unit registers, then far from the factor of 60 you
      mention, we’d be in the territory of a factor of about two or
      four, asymptotically.

      However, this unit is an optional extra for high-end processors
      (Cortex-A) and is not available on the billions of cores shipped
      each year, which are usually Cortex-M0 or Cortex-M4.

(6) You appear to be working on the assumption that by mandating
      “hardware” EDP you will be expediting the appearance of said
     processors. This is not so. What will get them made is a good
     economic case.

      As you know (because we have discussed this before), ARM’s
      FPU is _not_ fully 754-R compliant. This is because their
      typical customer is not bothered by fully standards-compliant
      hardware.

(7) You have never specified what you mean by “hardware-implementation”.
      Is it permissible to soft-trap the EDP-instruction? Are co-processors
      permissible? Are there, as I suspect, execution time constraints?
      Your rough calculations a few days ago suggest you harbour the
      naive belief that each instruction is executed in a single clock-cycle.

(8) Finally, you might wish to comment on the national security implications
     of your EDP proposal. As mentioned in (5) above, you are perilously
     close to proposing extremely fast integer operations, especially if
     you want to combine two EDP calculations done in parallel (see (3)
     above). A chip with these features may well suffer from export
     licence restrictions, if not now, then in the near future. This
     surely has implications for the number of devices which could be sold.
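
To make points (5) and (8) concrete, here is a minimal sketch of why the EDP reduces to wide integer addition. It is an illustration under stated assumptions, not anyone's proposed hardware design: Python's unbounded integers stand in for a fixed-width accumulator, the inputs are assumed to be finite binary64 values, and the names to_fixed, exact_dot and naive_dot are hypothetical.

```python
# Sketch only: Python's unbounded ints stand in for a wide hardware accumulator.
SCALE = 1 << 1074  # 2**-1074 is the smallest positive binary64 value


def to_fixed(x: float) -> int:
    """Return x * 2**1074 as an exact integer (x must be finite)."""
    n, d = x.as_integer_ratio()  # x == n / d exactly, d a power of two <= 2**1074
    return n * (SCALE // d)


def exact_dot(xs, ys) -> float:
    """Accumulate every product exactly, then round once at the end."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += to_fixed(x) * to_fixed(y)  # integer multiply, then wide integer add
    return acc / (SCALE * SCALE)  # CPython int / int true division rounds once


def naive_dot(xs, ys) -> float:
    """Conventional floating-point dot product: rounds after every operation."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


xs = [1e20, 1.0, -1e20]
ys = [1.0, 1.0, 1.0]
print(naive_dot(xs, ys))  # 0.0: the 1.0 is absorbed by 1e20 before it cancels
print(exact_dot(xs, ys))  # 1.0
```

The inner loop is nothing but integer multiplication and long integer addition, which is why wide add/sub instructions on stock SIMD registers would cover most of the work, and also why the result starts to look like a fast big-integer unit.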

Regards,

Dave Lester

Ignoring these facts is a terrible service to the standard.