Re: [P1619-2] another P1619.2 question: the EME2 mix function
Overall, I'm hoping we can find a way to avoid changing P1619.2, since we've successfully completed a second Sponsor ballot, and we've only got a little more time to submit the standard to IEEE RevCom before the project authorization (PAR) expires in December. Maybe it would be appropriate to provide some supplementary usage guidance instead of modifying the standard itself.
On Tue, Mar 23, 2010 at 11:07 AM, Laszlo Hars <laszlo.hars@xxxxxxxxxxx>
The EME2 issues in P1619.2 which would be interesting to discuss:
1. There should be some information in the text of the standard about the security vs. the amount of data encrypted. Especially, what a user can expect if more data is encrypted than this limit. Without this we would hear the same questions over and over again.
2. In the Bibliography section there is no reference to EME2, so the user cannot find security proofs, further information. The similarities and differences from the referenced EME* need to be noted.
3. If all wide encryption modes, which are constructed from 128-bit block ciphers, have similar inherent limits, it has to be told; otherwise a user might spend a lot of time for searching for a better alternative, in vain. If the limits were avoidable, that would have been noted, too.
4. Normally, when the limit for the encrypted data amount is approached, some minor, hardly detectable regularity could be expected. In the case of EME2 a very obvious problem is seen. When disks are formatted, when files get wiped, or when database records are initialized, the same data is written to a huge number of sectors, maybe with a sequence number in their first words. These create the situation I mentioned earlier: identical long ciphertext pairs appear (disregarding their first 16 byte blocks).
The probability that this happens is not large in a single drive, but over billions of encrypting disk drives it would be sooner or later found, and publicized, harming the reputation of the manufacturer. It does not matter that these drives have different keys. The found almost identical long ciphertexts reveal the almost equality of the corresponding plaintexts, which is a real leak of information, not just a theoretical weakness.
5. EME2 is too slow for sequential implementations in modern disk drives, where data is streamed at 12 Gb/s, or more. However, table B.2.1 of the Standard is not applicable for parallel implementations; in Galois fields multiplication with a power of x is not a shift-XOR operation, if products with smaller powers are not available. The user would need some guidance how complex is a 128x128 bit Galois multiplication in this special setting. Keep in mind that typical applications work on 4KB sectors, so one has to multiply with up to x^255, which is not of any simple structure when taken modulo of the fixed 128 degree polynomial. In HW implementations large lookup tables are expensive.
I am sorry that we did not raise these points earlier, but only now I had to look into actually implementing EME2 in HW.
Let's briefly discuss this at this Wednesday's SISWG meeting. I seem to recall that the point was brought up and discussed at the P1619.2 face-to-face meeting in San Diego a couple years ago, and the general consensus was that you should avoid encrypting an amount that is near the birthday bound of the underlying block cipher (AES).
Matt Ball, Chair, IEEE P1619 Security in Storage Working Group
Principal Software Engineer, Oracle USA, Inc.
500 Eldorado Blvd, Bldg #5 BRM05-212, Broomfield, CO 80021
Work: 303-272-7580, Cell: 303-717-2717