
Re: 64/66 system benefits and ad-hoc agenda





Dear Kamran,

>     Please find below a list of comments/questions about the 64b/66b
> code. Since I could not attend the Kauai meeting, I downloaded your
> presentation and went through your slides.  Sorry if some of these
> questions have already been addressed at the Hawaii meeting. 

No problem.  Thanks for your interest.  

There have been several modifications to the code as presented in Kauai,
so it will be good to go over these points again.

> 1) why are control characters coded with 7bit as opposed to 8bit for
> data words?  Why not 8bit (what is the role of the empty spaces in
> 66b blocks?). 

This point was obscured by an error in the slides as published.  The frame
labelled "Pure Control frame with 10 preamble" should have shown 8 
control character fields with no empty spaces.

If you look at the slide titled "Building frames with proposed HARI
10GbE mapping", you will see that several frame types are needed. 

Before going on, let me explain my notation.  I will show 8 octet
aggregations of control and data words in the following way:

    XXXX/XXXX

    ^^^^ ^^^^
    |||| ||||
    |||| |||+---- fourth lane, second HARI transfer
    |||| ||+----- third lane,  second HARI transfer
    |||| |+------ second lane, second HARI transfer
    |||| +------- first lane,  second HARI transfer
    |||+--------- fourth lane, first HARI transfer
    ||+---------- third lane,  first HARI transfer
    |+----------- second lane, first HARI transfer
    +------------ first lane,  first HARI transfer

Or in time as:

        lane 1:   K R K R K R S D D D D D D R K R K R 
        lane 2:   K R K R K R D D D D D D D R K R K R 
        lane 3:   K R K R K R D D D D D D T R K R K R 
        lane 4:   K R K R K R D D D D D D K R K R K R 
                     increasing time --->

And will use the following shorthand for 8B/10B characters:

    8B/10B  Name        shorthand
    ------  --------    ---------   
    K28.5   idle/even       K
    K28.0   idle/odd        R
    K30.7   error           E
    K27.7   start           S
    K29.7   terminate       T

    ----    generic cntrl   Z

So, to handle all conceivable HARI transfers, blocked up two transfers
at a time into 8-octet chunks, we will need the following block types:

2 kinds of packet starts:

    a) SDDD/DDDD
    b) ZZZZ/SDDD

Pure data (inner packet):

    c) DDDD/DDDD

8 kinds of packet endings:

    d) TZZZ/ZZZZ
    e) DTZZ/ZZZZ
    f) DDTZ/ZZZZ
    g) DDDT/ZZZZ
    h) DDDD/TZZZ
    i) DDDD/DTZZ
    j) DDDD/DDTZ
    k) DDDD/DDDT

Pure control (Inter-packet gap):

    l) ZZZZ/ZZZZ

Going back to your question:

> 1) why are control characters coded with 7bit as opposed to 8bit for
> data words?  Why not 8bit

This is because 64b/66b frames are either pure data or mixed frames
which include one or more control octets.  Mixed frames have 8 bits of
their payload "used up" by the TYPE byte which is needed to distinguish
between the various frame structures enumerated above. 

Frame type "l" (ZZZZ/ZZZZ) encodes 8 control characters.  The payload is
64 bits in size, of which 8 are taken by the TYPE byte.  This leaves
only 56 bits free to specify the 8 different control codes.  8*7=56,
hence the 7-bit field limit on control characters. 

> (what is the role of the empty spaces in 66b blocks?).

All other mixed frame types have fewer than 8 control characters
specified.  For these frames, there are some unused bits.  The empty
spaces have no use.  Because they are not needed, they are left undefined. 
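The bit budget can be tallied with a short sketch.  The assumptions here
are mine and are not spelled out above: S and T are taken to be implied
by the TYPE byte and to consume no payload bits, each D takes 8 bits,
each Z takes a 7-bit field, and a pure data frame carries no TYPE byte.
The function name unused_bits is hypothetical.

```python
# Hypothetical bit budget for the block types (a)-(l) listed earlier.
# Assumptions (not stated in the original text): S and T are implied by
# the TYPE byte and use no payload bits; D = 8 bits; Z = 7 bits; a pure
# data block (8 D octets) has no TYPE byte.
PAYLOAD_BITS = 64
TYPE_BITS = 8

BLOCK_TYPES = [
    "SDDD/DDDD", "ZZZZ/SDDD", "DDDD/DDDD",
    "TZZZ/ZZZZ", "DTZZ/ZZZZ", "DDTZ/ZZZZ", "DDDT/ZZZZ",
    "DDDD/TZZZ", "DDDD/DTZZ", "DDDD/DDTZ", "DDDD/DDDT",
    "ZZZZ/ZZZZ",
]

def unused_bits(block):
    """Return the number of payload bits left undefined in one block."""
    octets = block.replace("/", "")
    n_data = octets.count("D")
    n_ctrl = octets.count("Z")
    type_bits = 0 if n_data == 8 else TYPE_BITS
    used = type_bits + 8 * n_data + 7 * n_ctrl
    return PAYLOAD_BITS - used

if __name__ == "__main__":
    for block in BLOCK_TYPES:
        print(f"{block}  unused bits: {unused_bits(block)}")
```

Under these assumptions only ZZZZ/ZZZZ, the pure data frame, and the
frames with seven data octets fill the payload exactly; the rest leave
between 1 and 7 bits undefined.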

One other tricky bit:  I've defined the frame compositions to always pack
the 7-bit control fields to the right.  I believe this simplifies the IC
layout.  The alternative would be to always put the control fields in
time-order, sometimes packed right and sometimes packed left.  I think it
is easier to do the implementation in two stages:  First, the 8-bit fields
are picked off starting at the left of the frame while also picking off
the proper number of 7-bit fields starting from the right. Second, the 
decoded byte outputs are then swapped based on the TYPE byte.

> 2) isn't there an issue with DC wander (due to very short scrambling
> polynomial)?  I thought that was a problem in 100BASE-TX (DC wander
> problem). 

If the scrambled data is statistically random (a very good assumption
for long polynomials), then baseline wander (BLW) can be analyzed or
simulated quite easily.  Here's a quick run-down on the results:

The following table was built with an awk script that generates 1
million random bits, integrates the running baseline error through a
single coupling pole, and analyzes the wander in the form of a
histogram.  (program and plots available by request). 

Different coupling capacitor time constants can be easily simulated.  Here
is a summary of a few runs at 10.3125 Gb/s:

    C        RC (100 Ohm)   BLW (sigma)
    ----    -----------    ------------
    .01uF      1000.0 ns     .00186
    .001uF      100.0 ns     .0079
    .0001uF      10.0 ns     .0243
    .00001uF      1.0 ns     .0775

So, for scrambled data, the BLW varies from 0.2% RMS for a .01 uF coupling
capacitor to 7.7% with a 10 pF coupling capacitor.  Notice that a 10x
increase in capacitor size reduces the wander by about 3x.
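For anyone who wants to experiment, here is a minimal Python sketch of
this kind of simulation.  It is not the awk script mentioned above: the
function names, the discrete one-pole update, and the normalization
(wander relative to a +/-1 bit amplitude) are my own assumptions, so the
absolute numbers need not match the table.  It is only meant to show the
sqrt() trend.

```python
import math
import random

def simulate_blw(rc_bits, n_bits=200_000, seed=1):
    """RMS baseline wander of random +/-1 bits through a single RC pole.

    rc_bits is the time constant expressed in bit periods (RC / Tbit).
    """
    rng = random.Random(seed)
    alpha = 1.0 / rc_bits      # per-bit update weight of the pole
    baseline = 0.0             # low-passed bit stream (capacitor voltage)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_bits):
        bit = 1.0 if rng.getrandbits(1) else -1.0
        baseline += alpha * (bit - baseline)
        total += baseline
        total_sq += baseline * baseline
    mean = total / n_bits
    return math.sqrt(total_sq / n_bits - mean * mean)

def predicted_blw(rc_bits):
    """Closed-form RMS of the same one-pole filter driven by +/-1 bits."""
    alpha = 1.0 / rc_bits
    return math.sqrt(alpha / (2.0 - alpha))

if __name__ == "__main__":
    for rc_bits in (10, 100, 1000):
        print(rc_bits, simulate_blw(rc_bits), predicted_blw(rc_bits))
```

Each 10x step in rc_bits should lower the simulated sigma by roughly
sqrt(10).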

If the BLW is small then, to a good approximation, the BLW is linear
with the number of poles.  Eg: two poles at 10ns time-constant will
give a BLW of 2 * 0.0243 RMS or about 4.9 % RMS.

A way to understand this is to think of the voltage across the coupling
capacitor as an LPF function of the bit stream, or approximately a
moving average across K bits where

    K ~= 2*PI*RC / Tbit

So a .01uF capacitor has a time constant of 1us * 10.3125G = 10312 bit
periods, which gives K ~= 2*PI*10312 = 64795 bits.  Scrambled bits obey
random walk statistics so that the expected imbalance in K bits is
sqrt(K), giving an expected fractional error of sqrt(K)/K or 1/sqrt(K)
normalized to the total eye opening.

Plugging in, we get an expected BLW of 1/sqrt(64795), or 0.0039 which,
for a rule-of-thumb, is acceptably close to the simulated value of
0.00186. 

Here's the above table augmented with our rule-of-thumb values:

    C        RC (100 Ohm)   BLW (sigma)   R.o.t
    ----    -----------    -----------   ------
    .01uF      1000.0 ns     .00186       .004
    .001uF      100.0 ns     .0079        .012
    .0001uF      10.0 ns     .0243        .039
    .00001uF      1.0 ns     .0775        .124

Notice that the rule of thumb captures the sqrt() dependency of BLW
on capacitor size.  The rule could be made much more accurate if we
took into account the exponential weighting of bits, but in its simple
form it gives an intuitive understanding of the effect.
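
The rule-of-thumb figures can be recomputed with a few lines of Python.
This is only a sketch of the arithmetic K ~= 2*PI*RC/Tbit and
BLW ~= 1/sqrt(K); the function name rule_of_thumb_blw is made up for
illustration.

```python
import math

BIT_RATE = 10.3125e9   # bits per second, as used in the text
R_OHMS = 100.0         # coupling resistance assumed in the table

def rule_of_thumb_blw(c_farads, r_ohms=R_OHMS, bit_rate=BIT_RATE):
    """Approximate RMS wander as 1/sqrt(K), with K = 2*pi*RC/Tbit."""
    k = 2.0 * math.pi * r_ohms * c_farads * bit_rate
    return 1.0 / math.sqrt(k)

if __name__ == "__main__":
    for c_uf in (0.01, 0.001, 0.0001, 0.00001):
        blw = rule_of_thumb_blw(c_uf * 1e-6)
        print(f"{c_uf} uF  ->  {blw:.4f}")
```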
    
So, how practical is this system?

If we believe that 0.01uF chip capacitors are readily available with
suitable bandwidth for 10.3Gbaud transmission, then we can compute the
6 sigma BLW for two poles to be .00186*2*6, or about 2.2%.  Assuming
Gaussian wander, excursions beyond six sigma occur with a probability of
about 1e-9, so the BLW will affect the BER by no more than about 1e-9
even in a system that completely fails with an offset of 2.2%.

Practical systems can be easily built to tolerate 10% tilt, which
corresponds to a 27 sigma BLW for two poles with a .01uf capacitor at
10.3Gbaud.  In such a case, the effect of wander on BER is statistically
insignificant. 
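
The sigma arithmetic can be checked with a short sketch.  It assumes,
as the text does, that wander adds linearly across poles, and it further
assumes Gaussian wander statistics when converting a sigma count into a
probability.  The helper names are hypothetical.

```python
import math

SIGMA_ONE_POLE = 0.00186   # simulated BLW for a .01uF capacitor (from the table)

def offset_at(n_sigma, poles=2, sigma=SIGMA_ONE_POLE):
    """Baseline offset reached at n_sigma, with wander linear in pole count."""
    return sigma * poles * n_sigma

def sigmas_for_offset(offset, poles=2, sigma=SIGMA_ONE_POLE):
    """How many sigma a given tolerated offset corresponds to."""
    return offset / (sigma * poles)

def gaussian_tail(n_sigma):
    """One-sided probability that a Gaussian variable exceeds n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

if __name__ == "__main__":
    print("offset at 6 sigma, two poles:", offset_at(6))
    print("tail probability at 6 sigma: ", gaussian_tail(6))
    print("sigma count for 10% tilt:    ", sigmas_for_offset(0.10))
```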

How does this compare with systems like 8B/10B?  In block coded systems
there are fixed limits on disparity.  This means that BLW drops linearly
with increases in capacitor size rather than as the sqrt() of capacitor
size.  However, this is irrelevant as long as the BLW of a scrambled
system can be made small enough with a practical capacitor value. 

As the bit rate gets higher, the scrambling system works better with
smaller capacitors. 

As SONET has proven, scrambling works quite nicely from 0.155 - 10Gbaud
with several 100ns poles. 

> 3) it seems to me, just looking at the polynomial, that the proposed
> scrambling polynomial may have poor spectral properties (EMC issues). 

It was not my intention to propose a polynomial in Kauai.  I had only
hoped to show a concept and to describe a general strategy.  

The 3 bit scrambler is shown in the slide titled "Scrambling Principle". 
It is NOT the proposed scrambler.  Please look to the third bullet
point: "Recommend long pattern length to reduce possibility of jamming
(eg: x^31+x^3+1)".  I only used a 3-bit scrambler to show the simplified
architecture of a general self-synchronizing scrambler.
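
For readers unfamiliar with the architecture, here is a minimal sketch
of a self-synchronizing scrambler and its descrambler in Python.  It
uses the example polynomial x^31+x^3+1 from the bullet point; the
function names and the integer shift-register representation are my own
choices for illustration, not a proposed implementation.

```python
def scramble(bits, taps=(31, 3), state=0):
    """Self-synchronizing scrambler: out[n] = in[n] ^ out[n-a] ^ out[n-b]."""
    a, b = taps
    mask = (1 << a) - 1
    out = []
    for bit in bits:
        s = bit ^ ((state >> (a - 1)) & 1) ^ ((state >> (b - 1)) & 1)
        out.append(s)
        state = ((state << 1) | s) & mask    # register holds past line bits
    return out

def descramble(bits, taps=(31, 3), state=0):
    """Matching descrambler: out[n] = in[n] ^ in[n-a] ^ in[n-b]."""
    a, b = taps
    mask = (1 << a) - 1
    out = []
    for bit in bits:
        out.append(bit ^ ((state >> (a - 1)) & 1) ^ ((state >> (b - 1)) & 1))
        state = ((state << 1) | bit) & mask  # register holds past line bits
    return out

if __name__ == "__main__":
    data = [(i * 7 + (i >> 3)) & 1 for i in range(200)]
    line = scramble(data, state=0x1234567)
    # A receiver starting from the wrong state recovers after 31 bits.
    rx = descramble(line, state=0)
    print(rx[31:] == data[31:])
```

Because the descrambler register is filled only from received line
bits, it needs no seed exchange: after as many bits as the polynomial's
degree, it holds the same history as the transmitter.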

> Have you looked at the transmit spectrum when assuming a packet
> filled with zeros?  The registers 0,1,2 will go in the following
> cyclic states: (111) (110) (100) (001) (010) (101) (011), (111) (110)
> ...  (cycle of 7).  This would mean the output spectrum would
> represent discrete tones(?) Is that acceptable? 

In the light of the above clarification, this question should no longer be
relevant.

> 4) errors are replicated three times at the output of the
> self-synchronous descrambler.  It seems to me that there is an issue
> with Ethernet CRC with the current choice of polynomial and proposed
> implementation. 

You are absolutely correct here.   There is certainly an issue to be
investigated. 

The scrambling polynomial is currently x^63+x^1+1.  This is a primitive
trinomial (see E.J. Watson, "Primitive polynomials (mod 2)", Mathematics
of computation 16, No. 79, pp. 368-369, July 1962).
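
The three-fold error replication is easy to see in a small sketch.  The
code below assumes the trinomial x^63+x^1+1 and a line history of zeros
before the first bit; the function names are hypothetical.

```python
def scramble(data, taps=(63, 1)):
    """line[n] = data[n] ^ line[n-a] ^ line[n-b], zeros before the start."""
    a, b = taps
    line = []
    for n, bit in enumerate(data):
        prev_a = line[n - a] if n >= a else 0
        prev_b = line[n - b] if n >= b else 0
        line.append(bit ^ prev_a ^ prev_b)
    return line

def descramble(line, taps=(63, 1)):
    """out[n] = line[n] ^ line[n-a] ^ line[n-b], zeros before the start."""
    a, b = taps
    out = []
    for n, bit in enumerate(line):
        prev_a = line[n - a] if n >= a else 0
        prev_b = line[n - b] if n >= b else 0
        out.append(bit ^ prev_a ^ prev_b)
    return out

def error_positions(data, flip_at, taps=(63, 1)):
    """Flip one line bit and list where the descrambled output is wrong."""
    line = scramble(data, taps)
    line[flip_at] ^= 1
    received = descramble(line, taps)
    return [n for n, (x, y) in enumerate(zip(data, received)) if x != y]

if __name__ == "__main__":
    data = [(i * 5 + (i >> 2)) & 1 for i in range(300)]
    print(error_positions(data, 100))
```

A single line error at bit n shows up at n, n+1 and n+63 after
descrambling, one copy per term of the polynomial.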

Much work was done on the mathematics of error-multiplication during the
PPP-over-SONET investigation.  The results are quite interesting.

Boudreau, et al. (Boudreau, Bergman, Irvin, "Performance of a cyclic
redundancy check and its interaction with a data scrambler", IBM J. Res.
Develop. Vol. 38, No. 6, November 1994, pp 651-658) make the following
statement:

    "The full capability of the CRC should be maintained when the
    scrambler and the [CRC] generator have no common factors; for
    example, an irreducible polynomial could be chosen as the scrambler."

The proposed scrambler x^63+x^1+1 is irreducible and, because it is of
higher degree than the CRC32 generator, cannot be a factor of it, so the
two share no common factors. 

Boudreau then cautions:

    "If the scrambler is allowed to run continuously, however, the
    effects of spilled errors must be considered in any analysis of the
    CRC's performance.  Moreover, no simple rules can be given for
    understanding the scrambler's impact on the capability of the CRC -
    each case must be examined on its own."

To this end, I have written a C-program to exhaustively check the
interaction of CRC32 with x^63+x^1+1 under the presence of multiple bit
errors, both "spilled into" and "contained within" the Ethernet packet.

Subject to peer review, I have demonstrated that all 1 and 2 bit
multiplied errors are caught for all valid Ethernet packet sizes.  In
addition, I have tested all contiguous error-multiplied burst errors up
to a length of 200 bits.  The proposed scrambler has shown no degradation
of CRC32 for all tests made to date.

If anyone would like to check my work, I'm happy to make my C-program
available upon request. 
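
In the meantime, here is a small Python spot check in the same spirit.
It is not the C program described above and is far from exhaustive: it
flips every single line bit around one short frame and reports any case
where the damaged frame still passes the CRC.  The LSB-first bit order,
the 64 zero bits ahead of the frame, and all function names are
assumptions made for this sketch.

```python
import zlib

TAPS = (63, 1)   # x^63 + x^1 + 1

def to_bits(data):
    """Octets to bits, LSB first within each octet (assumed wire order)."""
    return [(byte >> i) & 1 for byte in data for i in range(8)]

def to_bytes(bits):
    return bytes(sum(bits[i + j] << j for j in range(8))
                 for i in range(0, len(bits), 8))

def run_scrambler(bits, feedback):
    """feedback=True scrambles, feedback=False descrambles."""
    a, b = TAPS
    out = []
    for n, bit in enumerate(bits):
        hist = out if feedback else bits   # both use the line-side history
        prev_a = hist[n - a] if n >= a else 0
        prev_b = hist[n - b] if n >= b else 0
        out.append(bit ^ prev_a ^ prev_b)
    return out

def add_fcs(payload):
    return payload + zlib.crc32(payload).to_bytes(4, "little")

def fcs_ok(frame):
    return zlib.crc32(frame[:-4]) == int.from_bytes(frame[-4:], "little")

def undetected_errors(payload, pad_bits=64):
    """Return line-bit positions whose multiplied error passes the CRC."""
    frame = add_fcs(payload)
    stream = [0] * pad_bits + to_bits(frame)   # zeros stand in for idle
    line = run_scrambler(stream, feedback=True)
    missed = []
    for pos in range(len(line)):
        hit = line.copy()
        hit[pos] ^= 1
        rx = to_bytes(run_scrambler(hit, feedback=False)[pad_bits:])
        if rx != frame and fcs_ok(rx):         # damaged yet accepted
            missed.append(pos)
    return missed

if __name__ == "__main__":
    print(undetected_errors(bytes(range(60))))
```

Errors in the padding exercise the "spilled into" case, since their
delayed copies land inside the frame.  Extending the loop to pairs of
line errors and to longer frames is straightforward but slow in Python.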

> 5) rate conversion 64 to 66: puts some burden on the 10GHz PLL design,
> and requires rate conversion + FIFO. 

This is always the case in a code that expands the space.  8B/10B needs
a rate conversion.  SONET also needs similar circuitry.

> 1) is a clarification question.  I have a suggestion for 2) 3) and 4):
> to use a descrambling scheme similar to 100BASE-TX (not
> self-synchronized) but with a longer polynomial (perhaps one of the
> polynomials used in 1000BASE-T).  And I guess if Hari sticks to 8b/10b
> there is nothing to do about 5)... 

It is good advice to use a longer polynomial.  However, I think the
non-self-synchronized scrambler is unacceptably complex to implement
unless absolutely needed.  Since a solid mathematical basis exists for
choosing self-synchronizing scramblers, I propose to take the simple
path. 

kind regards,
--
Rick Walker