
- *To*: stds-802-3-hssg-64b66b@xxxxxxxx
- *Subject*: Re: 64/66 system benefits and ad-hoc agenda
- *From*: Rick Walker <walker@xxxxxxxxxxxxxxxxx>
- *Date*: Wed, 22 Dec 1999 14:50:03 -0800
- *Sender*: owner-stds-802-3-hssg-64b66b@xxxxxxxx

Dear Kamran,

> Please find below a list of comments/questions about the 64b/66b
> code. Since I could not attend the Kauai meeting, I downloaded your
> presentation and went through your slides. Sorry if some of these
> questions have already been addressed at the Hawaii meeting.

No problem. Thanks for your interest. There have been several
modifications to the code as presented in Kauai, so it will be good to
go over these points again.

> 1) why are control characters coded with 7bit as opposed to 8bit for
> data words? Why not 8bit (what is the role of the empty spaces in
> 66b blocks?).

This point was obscured by an error in the slides as published. The
frame labelled "Pure Control frame with 10 preamble" should have shown 8
control character fields with no empty spaces.

If you look at the slide titled "Building frames with proposed HARI
10GbE mapping", you will see that several frame types are needed.

Before going on, let me explain my notation. I will show 8-octet
aggregations of control and data words in the following way:

    XXXX/XXXX
    ^^^^ ^^^^
    |||| ||||
    |||| |||+---- fourth lane, second HARI transfer
    |||| ||+----- third lane, second HARI transfer
    |||| |+------ second lane, second HARI transfer
    |||| +------- first lane, second HARI transfer
    |||+--------- fourth lane, first HARI transfer
    ||+---------- third lane, first HARI transfer
    |+----------- second lane, first HARI transfer
    +------------ first lane, first HARI transfer

Or in time as:

    lane 1:  K R K R K R S D D D D D D R K R K R
    lane 2:  K R K R K R D D D D D D D R K R K R
    lane 3:  K R K R K R D D D D D D T R K R K R
    lane 4:  K R K R K R D D D D D D K R K R K R

             increasing time --->

And will use the following shorthand for 8B/10B characters:

    8B/10B   Name            shorthand
    ------   --------        ---------
    K28.5    idle/even       K
    K28.0    idle/odd        R
    K30.7    error           E
    K27.7    start           S
    K29.7    terminate       T
    ----     generic cntrl   Z

So, to handle all conceivable HARI transfers, blocked up two transfers
at a time into 8-octet chunks, we will need the following block types:

2 kinds of packet starts:

    a) SDDD/DDDD
    b) ZZZZ/SDDD

Pure data (inner packet):

    c) DDDD/DDDD

8 kinds of packet endings:

    d) TZZZ/ZZZZ
    e) DTZZ/ZZZZ
    f) DDTZ/ZZZZ
    g) DDDT/ZZZZ
    h) DDDD/TZZZ
    i) DDDD/DTZZ
    j) DDDD/DDTZ
    k) DDDD/DDDT

Pure control (Inter-packet gap):

    l) ZZZZ/ZZZZ

Going back to your question:

> 1) why are control characters coded with 7bit as opposed to 8bit for
> data words? Why not 8bit

This is because 64b/66b frames are either pure data or mixed frames
which include one or more control octets. Mixed frames have 8 bits of
their payload "used up" by the TYPE byte, which is needed to distinguish
between the various frame structures enumerated above.

Frame type "l" (ZZZZ/ZZZZ) encodes 8 control characters. The payload is
64 bits in size, of which 8 are taken by the TYPE byte. This leaves only
56 bits free to specify the 8 different control codes. 8*7=56, hence the
7-bit field limit on control characters.

> (what is the role of the empty spaces in 66b blocks?).

All other mixed frame types have fewer than 8 control characters
specified. For these frames, there are some unused bits. The empty
spaces have no use. Because they are not needed, they are left
undefined.

One other tricky bit: I've defined the frame compositions to always pack
the 7-bit control fields to the right. I believe this simplifies the IC
layout. The alternative would be to always put the control fields in
time-order, sometimes packed right and sometimes packed left. I think it
is easier to do the implementation in two stages: First, the 8-bit
fields are picked off starting at the left of the frame while also
picking off the proper number of 7-bit fields starting from the right.
Second, the decoded byte outputs are then swapped based on the TYPE
byte.

> 2) isn't there an issue with DC wander (due to very short scrambling
> polynomial)? I thought that was a problem in 100BASE-TX (DC wander
> problem).

If the scrambled data is statistically random (a very good assumption
for long polynomials), then baseline wander (BLW) can be analyzed or
simulated quite easily. Here's a quick run-down on the results:

The following table was built with an awk script that generates 1
million random bits, integrates the running baseline error through a
single coupling pole, and analyzes the wander in the form of a
histogram. (Program and plots available by request.) Different coupling
capacitor time constants can be easily simulated. Here is a summary of a
few runs at 10.3125 Gb/s:

    C          RC (100 Ohm)   BLW (sigma)
    ----       ------------   -----------
    .01uF      1000.0 ns      .00186
    .001uF      100.0 ns      .0079
    .0001uF      10.0 ns      .0243
    .00001uF      1.0 ns      .0775

So, for scrambled data, the BLW varies from 0.2% RMS for a .01 uF
coupling capacitor to 7.7% with a 10 pF coupling capacitor. Notice that
a 10x increase in capacitor size reduces the wander by about 3x.

If the BLW is small then, to a good approximation, the BLW is linear
with the number of poles. Eg: two poles at 10 ns time-constant will give
a BLW of 2 * 0.0243 RMS, or about 4.9% RMS.

A way to understand this is to think of the voltage across the coupling
capacitor as an LPF function of the bit stream, or approximately a
moving average across K bits where

    K ~= 2*PI*RC / Tbit

So a .01uF capacitor has RC = 1us, which is 1us * 10.3125G = 10312 bit
times, giving K = 2*PI * 10312 ~= 64795 bits. Scrambled bits obey random
walk statistics so that the expected imbalance in K bits is sqrt(K),
giving an expected fractional error of sqrt(K)/K or 1/sqrt(K) normalized
to the total eye opening. Plugging in, we get an expected BLW of
1/sqrt(64795), or 0.0039 which, for a rule-of-thumb, is acceptably close
to the simulated value of 0.00186.
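
The experiment and the rule of thumb described above can be sketched in a few lines of Python. This is not the awk script mentioned in the email; it is a minimal re-creation under stated assumptions: bits are modelled as +/-0.5 (eye opening normalized to 1), the coupling pole is a first-order discrete-time filter with per-bit weight Tbit/RC, and the function names `simulate_blw` and `rule_of_thumb` are made up for illustration. Absolute sigma values depend on these normalization choices and need not match the table, but the sqrt() dependence on RC should show up.

```python
import math
import random

BIT_RATE = 10.3125e9  # line rate quoted in the email, in bits per second


def simulate_blw(rc_seconds, n_bits=200_000, seed=1):
    """RMS baseline wander through one coupling pole, relative to the eye opening.

    Assumption: the baseline is a first-order low-pass of the bit stream,
    updated once per bit with weight alpha = Tbit / RC.
    """
    rng = random.Random(seed)
    alpha = (1.0 / BIT_RATE) / rc_seconds
    baseline = 0.0
    sum_sq = 0.0
    for _ in range(n_bits):
        level = 0.5 if rng.getrandbits(1) else -0.5  # random scrambled bit
        baseline += alpha * (level - baseline)       # single-pole integration
        sum_sq += baseline * baseline
    return math.sqrt(sum_sq / n_bits)


def rule_of_thumb(rc_seconds):
    """1/sqrt(K) with K = 2*pi*RC/Tbit, the rule of thumb from the email."""
    k = 2.0 * math.pi * rc_seconds * BIT_RATE
    return 1.0 / math.sqrt(k)


if __name__ == "__main__":
    for rc in (1e-6, 100e-9, 10e-9, 1e-9):
        print(f"RC = {rc * 1e9:7.1f} ns   simulated = {simulate_blw(rc):.5f}"
              f"   rule of thumb = {rule_of_thumb(rc):.4f}")
```

With a 100 Ohm termination, `rule_of_thumb(1e-6)` corresponds to the .01uF case (K ~= 64795). Longer time constants need more bits for a stable estimate, because fewer independent samples of the baseline fit into a run of fixed length.
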
Here's the above table augmented with our rule-of-thumb values:

    C          RC (100 Ohm)   BLW (sigma)   R.o.t.
    ----       ------------   -----------   ------
    .01uF      1000.0 ns      .00186        .004
    .001uF      100.0 ns      .0079         .012
    .0001uF      10.0 ns      .0243         .039
    .00001uF      1.0 ns      .0775         .124

Notice that the rule of thumb captures the sqrt() dependency of BLW on
capacitor size. The rule could be made much more accurate if we took
into account the exponential weighting of bits, but in its simple form
it gives an intuitive understanding of the effect.

So, how practical is this system? If we believe that 0.01uF chip
capacitors are readily available with suitable bandwidth for 10.3Gbaud
transmission, then we can compute the 6 sigma BLW for two poles to be
.00186*2*6, or about 2.2%. The BLW at six sigma will affect the BER by
less than 1e-15 even in a system that completely fails with an offset of
2.2%. Practical systems can be easily built to tolerate 10% tilt, which
corresponds to an 18 sigma BLW for a .01uF capacitor at 10.3Gbaud. In
such a case, the effect of wander on BER is statistically insignificant.

How does this compare with systems like 8B/10B? In block coded systems
there are fixed limits on disparity. This means that BLW drops linearly
with increases in capacitor size rather than as the sqrt() of capacitor
size. However, this is irrelevant as long as the BLW of a scrambled
system can be made small enough with a practical capacitor value. As the
bit rate gets higher, the scrambling system works better with smaller
capacitors. As SONET has proven, scrambling works quite nicely from
0.155 - 10Gbaud with several 100ns poles.

> 3) it seems to me, just looking at the polynomial, that the proposed
> scrambling polynomial may have poor spectral properties (EMC issues).

It was not my intention to propose a polynomial in Kauai. I had only
hoped to show a concept and to describe a general strategy. The 3 bit
scrambler shown in the slide is titled "Scrambling Principle". It is NOT
the proposed scrambler.
Please look to the third bullet point: "Recommend long pattern length to
reduce possibility of jamming (eg: x^31+x^3+1)". I only used a 3-bit
scrambler to show the simplified architecture of a general
self-synchronizing scrambler.

> Have you looked at the transmit spectrum when assuming a packet
> filled with zeros? The registers 0,1,2 will go in the following
> cyclic states: (111) (110) (100) (001) (010) (101) (011), (111) (110)
> ... (cycle of 7). This would mean the output spectrum would
> represent discrete tones(?) Is that acceptable?

In the light of the above clarification, this question should no longer
be relevant.

> 4) errors are replicated three times at the output of the
> self-synchronous descrambler. It seems to me that there is an issue
> with Ethernet CRC with the current choice of polynomial and proposed
> implementation.

You are absolutely correct here. There is certainly an issue to be
investigated.

The scrambling polynomial is currently x^63+x^1+1. This is a primitive
trinomial (see E.J. Watson, "Primitive polynomials (mod 2)", Mathematics
of Computation 16, No. 79, pp. 368-369, July 1962).

Much work was done on the mathematics of error multiplication during the
PPP-over-SONET investigation. The results are quite interesting.
Boudreau, et al. (Boudreau, Bergman, Irvin, "Performance of a cyclic
redundancy check and its interaction with a data scrambler", IBM J. Res.
Develop. Vol. 38, No. 6, November 1994, pp 651-658) make the following
statement:

"The full capability of the CRC should be maintained when the scrambler
and the [CRC] generator have no common factors; for example, an
irreducible polynomial could be chosen as the scrambler."

The proposed scrambler x^63+x^1+1 is irreducible and, because it is of
higher degree than CRC32, is not itself a factor of CRC32.

Boudreau then cautions:

"If the scrambler is allowed to run continuously, however, the effects
of spilled errors must be considered in any analysis of the CRC's
performance. Moreover, no simple rules can be given for understanding
the scrambler's impact on the capability of the CRC - each case must be
examined on its own."

To this end, I have written a C program to exhaustively check the
interaction of CRC32 with x^63+x^1+1 in the presence of multiple bit
errors, both "spilled into" and "contained within" the Ethernet packet.
Subject to peer review, I have demonstrated that all 1 and 2 bit
multiplied errors are caught for all valid Ethernet packet sizes. In
addition, I have tested all contiguous error-multiplied burst errors up
to a length of 200 bits. The proposed scrambler has shown no degradation
of CRC32 for all tests made to date.

If anyone would like to check my work, I'm happy to make my C program
available upon request.

> 5) rate conversion 64 to 66: puts some burden on the 10GHz PLL design,
> and requires rate conversion + FIFO.

This is always the case in a code that expands the space. 8B/10B needs a
rate conversion. SONET also needs similar circuitry.

> 1) is a clarification question. I have a suggestion for 2) 3) and 4):
> to use a descrambling scheme similar to 100BASE-TX (not
> self-synchronized) but with a longer polynomial (perhaps one of the
> polynomials used in 1000BASE-T). And I guess if Hari sticks to 8b/10b
> there is nothing to do about 5)...

It is good advice to use a longer polynomial. However, I think the
non-self-synchronized scrambler is unacceptably complex to implement
unless absolutely needed. Since a solid mathematical basis exists for
choosing self-synchronizing scramblers, I propose to take the simple
path.

kind regards,

-- Rick Walker
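
The threefold error replication raised in point 4 can be illustrated with a small Python sketch of a self-synchronizing scrambler and descrambler. This is not the C program mentioned in the email and it does not check CRC32. It assumes that x^63+x^1+1 maps to feedback taps at delays of 1 and 63 bits and that both shift registers start at zero; the names `scramble` and `descramble` are hypothetical.

```python
import random
from collections import deque


def scramble(bits, taps=(1, 63)):
    """out[n] = in[n] XOR out[n-1] XOR out[n-63] (tap delays are an assumption)."""
    hist = deque([0] * max(taps), maxlen=max(taps))  # hist[0] = latest output bit
    out = []
    for b in bits:
        s = b
        for t in taps:
            s ^= hist[t - 1]
        hist.appendleft(s)  # feedback uses the scrambled (transmitted) bit
        out.append(s)
    return out


def descramble(bits, taps=(1, 63)):
    """out[n] = rx[n] XOR rx[n-1] XOR rx[n-63]; the state is only received bits."""
    hist = deque([0] * max(taps), maxlen=max(taps))  # hist[0] = latest received bit
    out = []
    for r in bits:
        d = r
        for t in taps:
            d ^= hist[t - 1]
        hist.appendleft(r)  # feed-forward only, so errors cannot circulate
        out.append(d)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.getrandbits(1) for _ in range(300)]
    line = scramble(data)
    line[100] ^= 1  # one bit error on the line
    received = descramble(line)
    print([i for i in range(len(data)) if received[i] != data[i]])
```

A single line error at bit 100 corrupts descrambled bits 100, 101 and 163: once directly and once for each tap as the bad bit passes through the descrambler's shift register. Because the descrambler state consists only of received bits, a receiver that starts in the middle of the stream produces correct output after 63 bits, which is the self-synchronizing property.
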

**Follow-Ups**:

**Re: 64/66 system benefits and ad-hoc agenda** *From:* Kamran Azadet

**Re: 64/66 system benefits and ad-hoc agenda***From:*Kamran Azadet
