
Re: p1619 (disk): ciphertext-stealing, tweak-mapping, other



Laszlo,

On Dec 21, 2005, at 9:17 AM, laszlo@HARS.US wrote:
>
>>> EME* essentially does two passes of ECB mode AES, plus three  
>>> extra AES calls
> It means, that a parallel implementation can perform all of the ECB mode
> AES calls in the top and bottom row (a latency of 2), plus two of the
> extra AES operations in the leftmost column (Figure 2 of the EME*
> paper). We have a total delay of 4 AES operations, plus change. In case
> of XCB the question boils down to how parallelizable is the GHash
> operation. Because it is a hash function, it is basically sequential,
> is not it?

Actually, GHASH can be implemented in parallel, using an algebraic
trick.  See the last paragraph of Section 5 in
http://csrc.nist.gov/CryptoToolkit/modes/proposedmodes/gcm/gcm-revised-spec.pdf
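
To make the trick concrete, here is a minimal sketch, not taken from the
spec.  It assumes the usual view of GHASH as a polynomial in H evaluated
by Horner's rule, and splits the blocks into interleaved lanes that are
each evaluated in powers of H^k and then XORed together.  The function
names, the lane count, and the plain (non-bit-reflected) field
representation are my own assumptions for illustration; real GHASH uses
GCM's bit ordering and also hashes the lengths.

```python
# Illustrative GF(2^128) arithmetic: bit i of an int is the coefficient of x^i.
# Assumption: plain bit order, NOT GCM's bit-reflected convention.
R = (1 << 128) | 0x87  # x^128 + x^7 + x^2 + x + 1


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:  # degree reached 128: reduce
            a ^= R
    return result


def hash_serial(h, blocks):
    """Horner's rule: Y = (Y xor X_i) * H for each block, strictly in order."""
    y = 0
    for x in blocks:
        y = gf_mul(y ^ x, h)
    return y


def hash_lanes(h, blocks, lanes=4):
    """Same value, computed as `lanes` independent chains in powers of H^lanes.

    The per-lane loops do not depend on each other, so hardware could run
    them side by side; here they simply run one after another.
    """
    n = len(blocks)
    powers = [1]  # powers[e] == H^e for e = 0..lanes
    for _ in range(lanes):
        powers.append(gf_mul(powers[-1], h))
    h_k = powers[lanes]

    total = 0
    for j in range(lanes):
        idx = list(range(j, n, lanes))  # blocks j, j+lanes, j+2*lanes, ...
        if not idx:
            continue
        acc = 0
        for i in idx[:-1]:
            acc = gf_mul(acc ^ blocks[i], h_k)
        last = idx[-1]
        # The last block of a lane needs H^(n - last), which lies in 1..lanes.
        acc = gf_mul(acc ^ blocks[last], powers[n - last])
        total ^= acc
    return total
```

With k lanes, the dependent chain of multiplies shrinks from n to roughly
n/k, at the cost of precomputing H^2 through H^k and k-1 extra XORs.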

> This would give the advantage to EME*, unless there was a
> parallel version of the function h in XCB.
>

You mean advantage in terms of latency, right?  I'm not sure that  
this is the case, since both XCB and EME* need to do one pass over  
the data before any data can be output, and I suspect that the  
circuit depth of those two passes isn't much different.  It would be  
interesting to see a detailed comparison.  For that matter, it would  
be worthwhile to discuss the implementation scenarios enough to get a  
good idea of what the "success criteria" for wide-block modes like  
these are.  (E.g. since all of these modes require the data to be  
buffered, what critical path should be measured?  The path to output  
the first byte, or to output all of the bytes?)

> For sequential implementation the speed relations depend on the speed of
> AES vs. a GHash block operation. A GHash step can be implemented faster
> in HW, cannot it be?

Yes, the GF multiply can be done faster than an AES operation, though  
the details matter a lot.  IIRC a multiply can be done in a single  
clock using a circuit that is about 25% the size of a fully pipelined  
AES implementation.

> That would make the sequential XCB faster. How
> about the royalties?
>

Are we supposed to talk about royalties on an IEEE list?  At any  
rate, there has been a UC Davis statement on EME, but no statement on  
XCB.  ABL is IPR-free, but is more costly to implement than the other
modes.

David