Thread Links Date Links
Thread Prev Thread Next Thread Index Date Prev Date Next Date Index

Re: [P1619-2] another P1619.2 question: the EME2 mix function


You are correct: the probability of a collision is pretty low.

When formatting a 2TB disk to 512 Byte sectors (~2^32 sectors), the
chance of a collision is about 2^-65. There are 10^9 encrypting disk
drives produced in a year, 50% increase in every year, so in 5 years
we could count 1+1.5+1.5^2+1.5^3+1.5^4 ~= 13.2 billion drives. If they
are reformatted 4 times in the average during this 5 year period, we
could see close to 2^36 times of the probability of a long collision:
2^-29 ~= 1e-9. Not too bad. If it is "non-negligible", depends on your

One can argue that the numbers above are off by an order of magnitude,
or two: not many users will watch for a collision, but on the other
hand, the capacity of disk drives will also rapidly increase. In any
case, the chance of accidental collisions is about one in a billion.
We can probably live with that, but ought to tell some numbers to the
users, which was my point.

Of course, actively searching for collisions in one disk drive can
write to the same location repeatedly. If the encryption is performed
in the host computer, the encrypted blocks could be overwritten in the
buffer, never actually get written to disk. Searching for the equality
of a few words of ciphertext blocks would need a second hard drive for
a sorted list, and only at a short match the attacker recreates and
checks the rest of the sectors.

With the crypto instructions of modern CPUs EME2 encryption could be
performed in about 5,000 clocks per sector, at a 2.5GHz processor half
a million sectors a second. A month long search gives: 30*24*60*60*5e5
= 1.3e12 ~= 2^40 encrypted sectors in one disk drive. It finds a
collision at a chance 2^-49, which is still low enough when 1,000 CPUs
of 12 cores each are running parallel.

Don't you agree that some numbers should be referenced in the text of
the standard?


On Tue, Mar 23, 2010 at 4:56 PM, Hal Finney <hal.finney@xxxxxxxxx> wrote:
> On Mon, Mar 22, 2010 at 4:17 PM, Laszlo Hars <laszlo.hars@xxxxxxxxxxx> wrote:
>> There might be a problem with EME2. Its mixing layer does not seem to be
>> very secure:
>> If the XOR of all the ciphertext blocks PPPi of top layer of encryptors
>> happens to be the same at two sets of input plaintext blocks, the mix value
>> M1 remains also the same, and so each XORed value M1*a^i in the mix layer
>> also remains the same. This happens at 50% chance among 2^64 encryption
>> operations. Thus, if we keep the P2...Pk plaintext blocks constant for some
>> k≤m<129, and vary the others, after 2^64 random tries we find two sets of
>> input at 50% chance such that the corresponding ciphertext blocks C2...Ck
>> are identical. It distinguishes EME2 from a random permutation. (When
>> varying the address of the target sector T* would change pseudo randomly. It
>> is just XORed to the PPPs, which does not affect the random search.)
>> A modern disk drive contains 2TB/512 ~= 2^32 sectors, which are filled up
>> pretty soon. There are close to 10^9 ~= 2^30 encrypting disk drives
>> manufactured a year, and so some user somewhere will find this strange
>> situation with equal C2...Ck blocks with non-negligible probability.
> This last part isn't correct. With 2^32 trials (sectors) the chance of
> a 128-bit collision in a single disk is only about 1 in 2^65 by
> standard birthday arguments (n^2 / 2H where n=2^32 trials and
> H=2^128). Therefore close to 2^64 disks would have to be used which
> would take over 10 billion years at present rates.
> Hal Finney