Thread Links Date Links
Thread Prev Thread Next Thread Index Date Prev Date Next Date Index

Re: Scrambling Principle

Dear Don,

> I am working with Rich Taborek at nSerial and want to learn more about
> the scrambler/descrambler function.  I was re-reading your
> presentation from the Kauai IEEE Plenary meeting in Nov 1999 titled
> "Low overhead coding proposal for 10Gb/s serial links." Here is the URL: 
> On the slide labeled "Scrambling principle," (I think it's page 13)
> you show an example of the scrambler/descrambler parallel form.  I am
> having a hard time understanding how the serial form and the parallel
> form are related.  Can you explain how the parallel form is derived
> from the scrambler polynomial or give me a reference? 

Unfortunately, the parallel scrambler I showed in Kauai had a different
polynomial than the serial scrambler, so you really had no hope of
figuring it out.  Sorry about that.  I had just cut-and-pasted some
examples that were handy without checking that they were consistent.

So let's work out the parallel version of the 3-bit scrambler example
shown in the URL above.   The serial scrambler is given by:

      D(t) ----------(X)------------------------------+--> S(t)
                      ^                               |
                      |                               |
       +------------>(X)                              |
       |              ^                               |
       |              |                               |
       +--[ S(t-3) ]--+--[ S(t-2) ]----[ S(t-1) ]<----+

Where D(t) is input data, S(t) is the corresponding output data at
time t, where t=0,1,2,3... as time increases.  Brackets "[]" denote
latched signals, or a delay of one clock time.

A recursion equation describing the scrambler is:

    S(t) = S(t-3) + S(t-2) + D(t)

where "+" denotes modulo-2 addition (exclusive-or function).  In
the figure, I have used "(X)" for the same function.

I find it convenient to rearrange the terms and add an offset to make
all the indices positive:

(i)    S(3) = S(0) + S(1) + D(3)

(ii)   D(3) = S(3) + S(1) + S(0)  (subtraction and addition are equivalent)

This corresponds to a scrambler with a polynomial typically notated as

        X^3 + X^1 + X^0 

Now lets parallelize the function to make a 4-bit wide scrambler.  I'll
drop the parenthesis now so S(4) == S4.

We would like to take the four data bits D5-8 and generate the scrambled
bits S5-8 in a single clocked operation.

From the recursion relation (i):

        S5 = S2 + S3 + D5
        S6 = S3 + S4 + D6
        S7 = S4 + S5 + D7
        S8 = S5 + S6 + D8

This can implemented as:

    D5 ------(X)-+----> S5 output
	      ^  |      
	      |  |
    S2+S3 --->+  +---[ S1 ]--

    D6 ------(X)-+----> S6 output
	      ^  |      
	      |  |      
    S3+S4 --->+  +---[ S2 ]--

    D7 ------(X)-+----> S7 output
	      ^  |      
	      |  |      
    S4+S5 --->+  +---[ S3 ]--

    D8 ------(X)-+----> S8 output
	      ^  |      
	      |  |      
    S5+S6 --->+  +---[ S4 ]--

So, S5-S8 are generated in one clock cycle from D5-D8.  One bank of
latches are used to store the previously computed values of S1-S4 and
these are used in the recurrence relation.  The critical patch consists
of the S7 and S8 computations.  S7 for instance is defined as

        S7 = S5 + S4 + D7

S5 is not available in the storage latches, so must be generated by

        S5 = S3 + S2 + D6

So the true equation for S7 is:

        S7 = (S3 + S2 + D6) + S4 + D7

and can be readily implemented in the above diagram with a two-deep
cascade of 3-input XOR gates.

Extension of this example to X^58 + X^19 + 1 with a 64-wide data path
should be obvious.  Please let me know if you find any errors.

Best regards,
Rick Walker