Re: Scrambling Principle
> I am working with Rich Taborek at nSerial and want to learn more about
> the scrambler/descrambler function. I was re-reading your
> presentation from the Kauai IEEE Plenary meeting in Nov 1999 titled
> "Low overhead coding proposal for 10Gb/s serial links." Here is the URL:
> On the slide labeled "Scrambling principle," (I think it's page 13)
> you show an example of the scrambler/descrambler parallel form. I am
> having a hard time understanding how the serial form and the parallel
> form are related. Can you explain how the parallel form is derived
> from the scrambler polynomial or give me a reference?
Unfortunately, the parallel scrambler I showed in Kauai had a different
polynomial than the serial scrambler, so you really had no hope of
figuring it out. Sorry about that. I had just cut-and-pasted some
examples that were handy without checking that they were consistent.
So let's work out the parallel version of the 3-bit scrambler example
shown in the URL above. The serial scrambler is given by:
D(t) ----------(X)------------------------------+--> S(t)
| ^ |
| | |
+--[ S(t-3) ]--+--[ S(t-2) ]----[ S(t-1) ]<----+
Where D(t) is input data, S(t) is the corresponding output data at
time t, where t=0,1,2,3... as time increases. Brackets "" denote
latched signals, or a delay of one clock time.
A recursion equation describing the scrambler is:
S(t) = S(t-3) + S(t-2) + D(t)
where "+" denotes modulo-2 addition (exclusive-or function). In
the figure, I have used "(X)" for the same function.
I find it convenient to rearrange the terms and add an offset to make
all the indices positive:
(i) S(3) = S(0) + S(1) + D(3)
(ii) D(3) = S(3) + S(1) + S(0) (subtraction and addition are equivalent)
This corresponds to a scrambler with a polynomial typically notated as
X^3 + X^1 + X^0
Now lets parallelize the function to make a 4-bit wide scrambler. I'll
drop the parenthesis now so S(4) == S4.
We would like to take the four data bits D5-8 and generate the scrambled
bits S5-8 in a single clocked operation.
From the recursion relation (i):
S5 = S2 + S3 + D5
S6 = S3 + S4 + D6
S7 = S4 + S5 + D7
S8 = S5 + S6 + D8
This can implemented as:
D5 ------(X)-+----> S5 output
S2+S3 --->+ +---[ S1 ]--
D6 ------(X)-+----> S6 output
S3+S4 --->+ +---[ S2 ]--
D7 ------(X)-+----> S7 output
S4+S5 --->+ +---[ S3 ]--
D8 ------(X)-+----> S8 output
S5+S6 --->+ +---[ S4 ]--
So, S5-S8 are generated in one clock cycle from D5-D8. One bank of
latches are used to store the previously computed values of S1-S4 and
these are used in the recurrence relation. The critical patch consists
of the S7 and S8 computations. S7 for instance is defined as
S7 = S5 + S4 + D7
S5 is not available in the storage latches, so must be generated by
S5 = S3 + S2 + D6
So the true equation for S7 is:
S7 = (S3 + S2 + D6) + S4 + D7
and can be readily implemented in the above diagram with a two-deep
cascade of 3-input XOR gates.
Extension of this example to X^58 + X^19 + 1 with a 64-wide data path
should be obvious. Please let me know if you find any errors.