Thread Links Date Links
Thread Prev Thread Next Thread Index Date Prev Date Next Date Index

1619: LRW collision probabilities



I'm sorry about being so late with this email: I was asked to provide
the calculations of collision probabilities for LRW. The setting that
is considered is a known-plaintext attack, so pretty much everything
below assumes that the attacker knows all the plaintext.

The event that we need to defend against is "accidental collisions".
Recall that on plaintext block P at location I, the LRW encryption
procedure computes X=P xor (I*K2), then Y=AES(X) and the ciphertext
block is C=Y xor (I*K2). A collision is an event where we have

  X=P xor (I*K2) = P' xor (I'*K2)=X'

for some I'<>I. The reason that this is bad is because an attacker that
knows P,I and P',I' (and knows that this is a collision) can compute K2
for example by setting K2=(P xor P')/(I xor I'). (The same can also be
computed as K2=(C xor C')/(I xor I').)  Once K2 is known, LRW does not
provide much more security than plain ECB. (See some discussion at the
end about how an attacker can identify collisions.)


The question that was asked was whether the bound on the collision
probability (cf. Clause 5.1 in 1619-D5) should use the amount of
storage under one key or the number of block-encryptions under one
key. In other words, if we have a small storage that we overwrite many
times, do we get better collision probability than if we have a large
amount of storage that we only write once.

The answer for the most part is that the thing to count is the number
of block-encryptions. So writing the same small storage many times is
pretty much the same (in terms of the collisions) as having a large
storage. The computation is pretty easy and I demonstrate it below
with an example:

--------------------------------------------------------------------
Compare a setting where you have a 16GB storage (=2^34 bytes) that you
overwrite 1024 times to the setting where you have a 16TB storage (so
in both cases you perform 2^40 block-encryptions). Consider first the
case of small storage. Denote by P_{I,j} the j'th plaintext block that
was written in location I. A collision event occurs if there exist two
pairs (I,j) and (I',j') with I<>I' such that

  K2 = (P_{I,j} xor P_{I',j'}) / (I xor I')

The number of such pairs is 2^40*(2^30-1)*1024/2, because every pair
(I,j) can potentially collide with all other pairs *except* the ones that
have the same I component. Hence, a good estimate for the probability of
collisions (when K2 is chosen at random) is

 (2^40*(2^30-1)*1024)/(2*2^128) ~= 2^{-49}*(1-2^{-30})

In the case of large storage, of course, the number of possible event
collisions is (2^40 choose 2), and the estimate for collision probability
would be

 (2^40 choose 2) / 2^128 ~= 2^{-49}

So the difference between the two cases is only the factor of 1-2^{-30},
which is pretty much equal to one.
--------------------------------------------------------------------

An attacker can identify collisions by checking that P xor P' = C xor C'.
This event can also happen when there is no collision, but it provides
a very significant statistical evidence for a collision. (In fact even
if the attacker only knows a few bits of P and P', it can already check
that the above equation holds for the bits that it knows, hence getting
some statistical evidence for a collision.)

Searching for collisions is also easy: for each pair (P,C) the attacker
computes D=P xor C, and then it looks for pairs that yield the same value
of D. This can be done with time almost linear in the number of pairs (if
you have linear space) and many time/space tradeoffs are possible.

-- Shai