Re: Requiring the EDP in 754 (or 1788.1 for that matter)

To: Michel Hack <mhack@xxxxxxx>
Subject: Re: Requiring the EDP in 754 (or 1788.1 for that matter)
From: David Lester <dlester@xxxxxxxxxxxx>
Date: Wed, 27 Jan 2016 15:34:35 +0000
Cc: David Lester <dlester@xxxxxxxxxxxx>, stds-1788 <stds-1788@xxxxxxxxxxxxxxxxx>, stds-754 <stds-754@xxxxxxxxxxxxxxxxx>

> On 27 Jan 2016, at 14:27, Michel Hack <mhack@xxxxxxx> wrote:
> 
> 
> Now, the 754-2018 revision is going well, so perhaps we can revisit the
> initial restraint.  I urge those who care to join the 754 working group.

Hopefully, I have already done this, but if not, can you forward the following for me,
Michel?

> In the meantime I take every opportunity to remind my colleagues at IBM
> who are involved in processor design of the properties and advantages of
> Complete Arithmetic (as we had 30 years ago for HFP, easier due to the
> narrower exponent range of the old S/360 format) -- but I have no direct
> influence.

I’ve been wondering about the ARM NEON instructions and how to use the
vector registers as a long accumulator. If we assume the registers are
treated as 32 x 32-bit words, then the following four operations could
be implemented:

    EDP_ZERO             vmov.I32Q{0-15} #0
                         !! Might be a single instruction; might need replication

    EDP_SAVE    <addr>   !! A sequence of instructions to move vector registers to memory (slow)
    EDP_RESTORE <addr>   !! A sequence of instructions to move memory to vector registers (fast)

Of course the missing one is:

    EDP <addr_x> <addr_y>
                         !! MUF r0, <addr_x>, <addr_y>
                         !! split r0 into a signed mantissa, and an exponent
                         !!  .. and then the magic bit:

                         !!   AN INTEGER ADD to the 1024 bit vector in Q{0-15}.
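
To make the EDP step concrete, here is a minimal software sketch in Python of
the same idea: form the exact product, split it into a signed mantissa and an
exponent, and do an integer add into a 1024-bit two's-complement accumulator.
The function names, the restriction to binary32 inputs, and the choice of
2**-298 as the weight of accumulator bit 0 are my own assumptions for
illustration; this is not NEON code and says nothing about how the hardware
would do it.

```python
import math
from fractions import Fraction

ACC_BITS = 1024                 # accumulator width, as in the 1024-bit vector above
MASK = (1 << ACC_BITS) - 1
LSB_EXP = -298                  # assumed weight of bit 0: (2**-149)**2 for binary32

def split(x):
    """Return (m, e) with x == m * 2**e exactly and m a signed integer."""
    m, e = math.frexp(x)                  # x == m * 2**e with 0.5 <= |m| < 1
    return int(m * (1 << 53)), e - 53     # m * 2**53 is an exact integer

def edp_zero():
    """EDP_ZERO: clear the accumulator."""
    return 0

def edp_add(acc, x, y):
    """EDP: add the exact product x*y to the accumulator (mod 2**ACC_BITS).

    Assumes x and y are finite and exactly representable in binary32,
    so the product is a multiple of 2**LSB_EXP and fits in the accumulator.
    """
    mx, ex = split(x)
    my, ey = split(y)
    m, shift = mx * my, ex + ey - LSB_EXP
    if shift >= 0:
        term = m << shift
    else:
        # The discarded low bits must be zero, otherwise the input was not binary32.
        assert m & ((1 << -shift) - 1) == 0
        term = m >> -shift
    return (acc + term) & MASK            # the integer add, wrapping like hardware

def edp_value(acc):
    """Read the accumulator back as an exact rational number."""
    signed = acc - (1 << ACC_BITS) if acc >> (ACC_BITS - 1) else acc
    return Fraction(signed, 1 << -LSB_EXP)

def edp(xs, ys):
    """Exact dot product of two equal-length sequences of binary32 values."""
    acc = edp_zero()
    for x, y in zip(xs, ys):
        acc = edp_add(acc, x, y)
    return edp_value(acc)
```

For example, edp([2.0**60, 1.0, -2.0**60], [1.0, 1.0, 1.0]) returns exactly 1,
whereas summing the three products in double precision loses the 1.0. Rounding
to the destination format would happen once, when the accumulator is read back.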

Like you, I have no influence over ARM’s engineers, but a vector add/sub
instruction for 1024-bit accumulators would permit a very fast EDP, as
well as fast GMP, with all that that means for crypto applications.

I’ll suggest it to them.

Dave

> 
> Michel
