
Re: [802.3BA] 802.3ba XR ad hoc next step concern


     I am noticing a thread (one among several) being discussed
in this multi-threaded discussion that I consider disturbing given
our history of link standardization.

    Normally we define a link by developing a transceiver specification
for a defined data rate running over a specified media (or a referenced
media specification) at a maximum link distance under worst case conditions.

     Now we all know that a slightly longer link distance will probably
work, sometimes significantly longer! But we DO NOT discuss it! Why?
Because we are NOT prepared to guarantee it worst case, and we normally
limit ourselves to the world of worst case analyses.

Thomas Dineen

Geoff Thompson wrote:

While your scenario is quite plausible, even likely, it is nothing we as a standards group can prevent. It's gonna happen. Now in a practical sense (and OUTSIDE of the standards arena) how does a vendor or end user keep it from happening?

The answer is:
        1) Good operational practices
        2) Good record keeping
If it were my network, there would have to be tags at each end of the fiber AND a notation in the link record database (Remember, BICSI has best practices for how to handle all of this stuff. It's a bad idea not to follow them, especially in this sort of case.) that noted that this was an "out of spec" link and had a pointer to a procedure for the appropriate "exceptional handling" required. Do I think this is a good idea for twisted-pair links handled by casual users? No. Do I think that a data center manager would think it is reasonable and something that can be handled within an appropriately and professionally managed cabling system? Yes.

Given that the above is the case, it WILL eat into the market for any different product for a longer reach, especially a marginally longer reach. The other question then becomes how much cost handicap this customer will be willing to endure across his entire transceiver population to duck this minor problem. I believe that the consensus is that for distances over 100 meters the answer is "not much".


At 08:07 AM 8/22/2008 , Scott Kipp wrote:
I disagree with some premises that have been put forward and feel that we need to proceed on the present course of defining how to extend the reach beyond 100 meters.  The time to throw in the towel has not come.
Let's run through a little scenario.  Joe the installer needs to go 130 meters and begins plugging in the ends of the fiber to different modules and a combination finally works and the link comes up.  The link works fine for a few months or even years and then the module degrades a little (but is still within the 100 meter specification) and the link begins to have bit errors.  Joe the installer is long gone (or should hide after the link fails) and nobody knows what happened.  Both modules still pass any tests based on the standard, but the link is failing and no one is responsible. 
The reason we have standards (like Geoff says) is because we want reliable links to work all of the time over the lifetime of the parts.  When a link fails, we can determine the cause and have responsibility.  I agree that the links are defined conservatively and will work longer most of the time, but I disagree with John's premise that there is no value in the XR solution that goes significantly farther.  Responsible installers will rely on standards based parts to go the rated distance.  We have several ways we can make OM3 links work for longer than 100 meters. 
Do we have the difficult task of choosing only one way to extend the link or can we talk about several ways to do this in an informative annex?  The hard part is choosing one solution because we have so many camps with different solutions.  The usual problem is that we have no clear winner - no Usain Bolt - in the XR race beyond 100 meters.

From: Geoff Thompson [mailto:gthompso@xxxxxxxxxx]
Sent: Friday, August 22, 2008 9:15 AM
To: STDS-802-3-HSSG@xxxxxxxxxxxxxxxxx
Subject: Re: [802.3BA] 802.3ba XR ad hoc next step concern


In answer to your question

Because it has been the tradition of 802.3 (and I strongly believe a foundation of the success of 802.3 in general) that we offer more to the market than just "overwhelming odds" (quantified in the earlier message as 92% or only 11 times out of 12 tries). What we have worked to in 802.3 is significantly closer to "worst case design" than that. We have argued over the years about what that has meant but we have certainly never dipped that low.

What I believe that John has argued for (and not unreasonably) is the following.
We provide assured operation at 100 meters
If you want to go to 150 meters, the odds are very strong that you can succeed as long as you are willing to lower your expectations from plug and chug to trying your way through a half dozen parts at each end in order to get a set that works.

His thesis, as I understand it, is:
  1) It is not worth the extra investment (time and money both) to get a different standard with the extra reach.
  2) Even if we do #1, the market won't pay anything for it. They will just go through the select and try route in order to save the extra money and separate inventory hassle.

Given the odds that it will work over 90% of the time, I would agree.

Am I willing to reduce our customers' overall chance of success to 92%?
No !!


Geoff Thompson

At 08:55 AM 8/21/2008 , Swanson, Steven E wrote:
Thanks for all your work on this; I have to study it more and would like to see the actual presentation but I would offer the following comment:
If the following statement is true, why do we have an objective of 100m rather than 150m?
"Do nothing to the standard and when 150 m of OM3 or 250 m of OM4 is desired just plug in the fiber.  The odds are overwhelming that it will work."

From: PETRILLA,JOHN [mailto:john.petrilla@xxxxxxxxxxxxx]
Sent: Wednesday, August 20, 2008 11:23 PM
To: STDS-802-3-HSSG@xxxxxxxxxxxxxxxxx
Subject: [802.3BA] 802.3ba XR ad hoc next step concern



I'm concerned that the proposal of creating a new objective is leading us into a train wreck.  This is due to my belief that it's very unlikely that 75% of the project members will find this acceptable.  This will be very frustrating for various reasons.  One of them, discussed below, is that almost all the modules expected to be developed will easily support the desired extended link reaches.


I don't want to wait until our next phone conference to share this in the hope that we can make use of that time to prepare a proposal for the September interim.  I'll try to capture my thoughts in text in order to save some time and avoid distributing a presentation file to such a large distribution.  I may have a presentation by the phone conference.


Optical modules are expected to have either an XLAUI/CLAUI interface or a PMD service interface, PPI.  Both are considered.


A previous presentation, petrilla_xr_02_0708, has shown that modules with XLAUI/CLAUI interfaces will support 150 m of OM3 and 250 m of OM4.  These modules will be selected by equipment implementers primarily because of the commonality of their form factor with other variants, especially LR, and/or because of the flexibility the XLAUI/CLAUI interface offers the PCB designer.  Here the extended fiber reach comes for no additional cost or effort.  This is also true in PPI modules where FEC is available in the host.


Everyone is welcome to express their forecast of the timing and adoption of XLAUI/CLAUI MMF modules vs baseline MMF modules.


To evaluate the baseline proposal for its extended reach capability, a set of Monte Carlo (MC) analyses was run.  The first MC evaluates just a Tx distribution against an aggregate Tx metric.  This is to estimate the percentage removed by the aggregate Tx test.  The second MC evaluates the same Tx distribution in combination with an Rx distribution and 150 m of worst case OM3.  The third MC repeats the second but replaces the 150 m of OM3 with 250 m of worst case OM4.  Worst case fiber plant characteristics were used in all link simulations.


The Tx distribution characteristics follow.  All distributions are Gaussian.

  Min OMA, mean = -2.50 dBm, std dev = 0.50 dB (Baseline value = -3.0 dBm)

  Tx tr tf, mean = 33.0 ps, std dev = 2.0 ps (Example value = 35 ps)

  RIN(oma), mean = -132.0 dB/Hz, std dev = 2.0 dB (Baseline value = -128 to -132 dB/Hz, Example value = -130 dB/Hz)

  Tx Contributed DJ, mean = 11.0 ps, std dev = 2.0 ps (Example value = 13.0 ps)

  Spectral Width, mean = 0.45 nm, std dev = 0.05 nm (Baseline value = 0.65 nm).

  Baseline values are from Pepeljugoski_01_0508 and where no baseline value is available Example values from petrilla_02_0508 are used.


All of the above, except spectral width, can be included in an aggregate Tx test permitting less restrictive individual parameter distributions than if each parameter is tested individually.  In this example, distributions are chosen such that the target value in the link budget spreadsheet lies one std dev from the mean.  If each individual parameter were tested directly to this value, the yield loss would be approximately 16%.
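As a rough check of the 16% figure, the one-sided Gaussian tail beyond each target can be computed directly. This is only a sketch: the function name and the parameter table are illustrative, the values are copied from the Tx list above, and it assumes each parameter fails only on the far side of its target.

```python
import math

def tail_beyond(mean, std_dev, limit):
    """Fraction of a Gaussian population lying beyond `limit`, measured away from the mean."""
    z = abs(limit - mean) / std_dev
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# (mean, std dev, target) copied from the Tx list above; spectral width is excluded
tx_params = {
    "Min OMA (dBm)": (-2.50, 0.50, -3.0),
    "Tx tr/tf (ps)": (33.0, 2.0, 35.0),
    "RIN(oma) (dB/Hz)": (-132.0, 2.0, -130.0),
    "Tx contributed DJ (ps)": (11.0, 2.0, 13.0),
}

yield_loss = {name: tail_beyond(*p) for name, p in tx_params.items()}
for name, loss in yield_loss.items():
    print(f"{name}: {loss:.1%}")
```

Every one of these targets sits exactly one std dev from its mean, so each individual test would reject a little under 16% of parts.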


The Rx distribution characteristics follow.  Again, all distributions are Gaussian.

  Unstressed sensitivity, mean = -12.0 dBm, std dev = 0.75 dB (Baseline value = -11.3 dBm)

  Rx Contributed DJ, mean = 11.0 ps, std dev = 2.0 ps (Baseline value = 13.0 ps)

  Rx bandwidth, mean = 10000 MHz, std dev = 850 MHz (Baseline value = 7500 MHz).


For the Tx MC, only 2% of the combinations would fail the aggregate Tx test.


For the 150 m OM3 MC, only 2% of the combinations would have negative link margin and fail to support the 150 m reach.  This is less than the percentage of modules that would have been rejected by the Tx aggregate test and a stressed Rx sensitivity test and very few would actually be seen in the field.


For the 250 m OM4 MC, only 8% of the combinations would have negative link margin.  Here approximately half of these would be due to transmitters and receivers that should have been caught at their respective tests.


The above analysis is for a single lane.  In the case of multiple lane modules, the module yield loss will increase depending on how tightly the lanes are correlated.  Where module yield loss is high, module vendors will adjust the individual parameter distributions such that more than one std dev separates the mean from the spreadsheet target value.  This will reduce the proportion of modules failing the extended link criteria.  Also, any correlation between lanes results in a module distribution of units that are shipped having fewer marginal lanes than where the lanes are independent.
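To illustrate how single-lane figures scale to a module, here is a minimal sketch assuming statistically independent lanes, where a module fails if any lane fails. The lane counts of 4 and 10 are assumptions for 40G and 100G multimode modules, and the 2% per-lane figure is borrowed from the Tx MC result above purely as an example input.

```python
def module_yield_loss(lane_fail_prob, lanes):
    """Probability that at least one of `lanes` independent lanes fails."""
    return 1.0 - (1.0 - lane_fail_prob) ** lanes

PER_LANE_FAIL = 0.02  # example input: single-lane Tx MC failure rate quoted above

# Lane counts are assumptions (4 lanes for 40G, 10 lanes for 100G)
for lanes in (1, 4, 10):
    loss = module_yield_loss(PER_LANE_FAIL, lanes)
    print(f"{lanes:2d} lanes: module yield loss {loss:.1%}")
```

If the lanes were fully correlated, the module loss would stay at the per-lane figure, so for positively correlated lanes the independent case is the pessimistic end of the range.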


So while there's a finite probability that a PPI interface module doesn't support the desired extended reaches, the odds are overwhelming that it does.


Then with all of one form factor and more than 92% of the other form factor supporting the desired extended reach, the question becomes, what's a rational and acceptable means to take advantage of what is already available?   A new objective would enable this but, as stated above, getting a new objective for this is at best questionable.  Further, it's expected that one would test to see that modules meet the criteria for the new objective, set up part numbers, create inventory, etc. and that adds cost.  Finally, users, installers, etc. are intelligent and will soon find this out and will no longer accept any cost premium for modules that were developed to support extended reach - they will just use a standard module.  There's little incentive to invest in an extended reach module development.


I'll make a modest proposal: Do nothing; just hook up the link.  Do nothing to the standard and when 150 m of OM3 or 250 m of OM4 is desired just plug in the fiber.  The odds are overwhelming that it will work.  If something is really needed in the standard, then generate a white paper and/or an informative annex describing the statistical solution.


Background/Additional thoughts:


Even with all the survey results provided to this project, it's not easy to grasp what to expect for a distribution of optical fiber lengths within a data center and what is gained by extending the reach of the MMF baseline beyond 100 m.  Here's another attempt.


In flatman_01_0108, page 11, there's a projection for 2012.  There for 40G, the expected adoption percentage of links in Client-to-Access (C-A) applications of 40G is 30%, for Access-to-Distribution (A-D) links, it is 30%, and for Distribution-to-Core (D-C) links it is 20%.  While Flatman does not explicitly provide a relative breakout of link quantities between the segments, C-A, A-D & D-C, perhaps one can use his sample sizes as an estimate.  This yields for C-A 250000, for A-D 16000 and for D-C 3000.  Combining with the above adoption percentages yields an expected link ratio of C-A:A-D:D-C = 750:48:6.


Perhaps Alan Flatman can comment on how outrageous this appears.


This has D-C, responsible for 1% of all 40G links, looking like a niche.  Arguments over covering the last 10% or 20% or 50% of D-C reaches do not seem like time well spent.  Even A-D combined with D-C, AD+DC, provides only 7% of the total.


Similarly for 100G:  the 2012 projected percentage adoption for C-A:A-D:D-C is 10:40:60 and link ratio is 250:64:18.  Here D-C is responsible for 5% of the links and combined with A-D generates 25% of the links.  Now the last 20% of AD+DC represents 5% of the market.
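The link-ratio arithmetic for both rates can be reproduced with a short sketch. The dictionary and function names are illustrative; the sample sizes and adoption percentages are the ones quoted above, and using sample sizes as a proxy for relative link counts is the same assumption made in the text.

```python
# Sample sizes quoted above, used as a stand-in for relative link counts
SAMPLE_LINKS = {"C-A": 250_000, "A-D": 16_000, "D-C": 3_000}

# 2012 projected adoption percentages quoted above
ADOPTION_2012 = {
    "40G": {"C-A": 0.30, "A-D": 0.30, "D-C": 0.20},
    "100G": {"C-A": 0.10, "A-D": 0.40, "D-C": 0.60},
}

def segment_shares(rate):
    """Return each segment's share of the expected links at the given rate."""
    links = {seg: SAMPLE_LINKS[seg] * ADOPTION_2012[rate][seg] for seg in SAMPLE_LINKS}
    total = sum(links.values())
    return {seg: n / total for seg, n in links.items()}

for rate in ADOPTION_2012:
    shares = segment_shares(rate)
    ad_dc = shares["A-D"] + shares["D-C"]
    print(f"{rate}: D-C {shares['D-C']:.1%}, AD+DC {ad_dc:.1%}, "
          f"last 20% of AD+DC {0.2 * ad_dc:.1%}")
```

The unrounded shares come out slightly different from the rounded percentages in the text (for example, D-C is closer to 0.7% than 1% of 40G links), but they support the same conclusion.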


Since the computer architecture trend leads to the expectation of shorter link lengths and there are multiple other solutions that can support longer lengths (activating FEC, active cross-connects, SM, which telecom centric users prefer anyway, point-to-point connections, etc.), there is no apparent valid business case supporting resource allocation for development of an extended reach solution.