
Re: [802.3BA] 802.3ba XR ad hoc next step concern

Thanks for offering this analysis.  It allows us to examine the situation from a new perspective: a statistical approach rather than a total worst-case approach.  This type of approach has precedent, as I am sure you are aware, in 10GBASE-LRM development, where the receiver stress test conditions were selected to cover 99% of links at the specified maximum link length.  This coverage was determined by Monte Carlo simulation, cross-checked and aligned to the traditional 802.3 link model, performed in large part by John Ewen.  See  

Here John Ewen modeled the MMF cabling using distributions supplied by fiber manufacturers.  Below you state that you are using worst-case fiber models, not fiber distributions.  Had you taken the approach used for LRM, the coverage of your simulations would have increased, almost assuredly to above 99% coverage for 150m of OM3, and probably to 99% coverage on OM4 near 250m.  For either OM3 or OM4, if hitting a 99% threshold at rated link length means adjusting the link length (up or down) a bit or making small adjustments to the transceiver parameters, but retaining the benefit of offering a significantly longer reach to address the complete data center, then I would support it.  When 99% coverage levels at max length are combined with actual link length distributions, the total coverage rate is nearly 100%.

I note the spectral width model you chose leaves four standard deviations to the baseline spec limit, which is unlike any other modeled parameter, where the baseline (or the example value) is selected usually at one standard deviation.  Presumably this is because you don't envision a Tx test that lumps the spectral impairments (of chromatic dispersion, mode partition noise, and their cross-products) in with the other parameters.  I think there may be a way of encompassing all of these together, further simplifying the tests and improving yields, using fiber with near-perfect DMD at 850nm as a spectrally dispersive element in the test bed.  I'd like to explore this more with the optics experts in the Task Force.

Next I examine the interface and evolutionary alternatives you suggest in lieu of a new XR objective.

XLAUI/CAUI - the power dissipation added by this interface drives the form factor to larger size than what you proposed for 100G-SR10 in pepeljugoski_01_0108 (to quote: Module + Cage Dimensions: ~22 mm W x 17 mm H x 79 mm D) as the estimated power dissipation of this module at 5W is nearly totally consumed by the 10 lanes of lasers, drivers, detectors and amps, leaving no dissipation capacity for the CAUI interface.  I arrive at this conclusion by examining 10G SFP+ data sheets, where the typical power dissipation is ~0.5W, which is a good surrogate for a single lane in a parallel PMD module.  You discussed the size of the 40/100G-LR module as the common form suitable for all PMD interfaces.  This is equivalent to XENPAK as the common footprint for all 10G.  The envisioned large size of the 100G-LR modules, at about 2x XENPAK size, drives up the cost of host line cards due to lower density.  A package that is 2x XENPAK in size would have a footprint 5x that of your proposed module for 100G-SR.  The extra cost factor due to the drop in density was not included in the cost analysis that I contributed to the XR ad-hoc's last teleconference, and would drive the cost disparity between the MM and SM alternatives even larger.  To impose this extra cost structure on a multimode module would largely be defeating the purpose of offering multimode solutions in the first place.  So while this is a technically-feasible possibility for implementation, it would not result in an economically viable solution relative to other alternatives.

FEC - this was the least favored XR technology in a poll of the XR ad-hoc despite the fact that FEC seems straightforward as a digital implementation in the host ASIC.  So implementation likelihood in the market may be small and/or offerings spotty.

SM - good for telecoms and possibly for customers with very large traffic management problems, but not economically feasible for general data center customers per my contribution to the XR ad-hoc a couple weeks ago.  

Active cross connects - by this I assume you are talking about using repeaters.  Repeaters eliminate switch fabric and simply "regenerate" the signal.  They would be deployed when the channel length exceeds 100m.  While removal of the fabric function is a large cost reduction relative to the cost of a switch, the user must purchase an additional pair of optics modules for each repeater in the channel and provide space and power for the repeaters.  Six optics modules would be deployed to serve channels exceeding 200m.  This may remain economically advantaged relative to the SM alternative, but would be strongly disadvantaged relative to a single pair of XR optics.
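The module count for the repeater case can be checked with a short sketch.  It assumes each repeated segment is limited to the 100m baseline reach and that every segment needs one optics module at each end; the function names are illustrative only.

```python
import math

BASELINE_REACH_M = 100  # assumed per-segment limit (the 100m baseline reach)

def segments_needed(channel_length_m):
    """Number of point-to-point segments needed to span the channel."""
    return max(1, math.ceil(channel_length_m / BASELINE_REACH_M))

def repeaters_needed(channel_length_m):
    """One repeater sits between each pair of adjacent segments."""
    return segments_needed(channel_length_m) - 1

def optics_modules(channel_length_m):
    """Each segment is terminated by a pair of optics modules."""
    return 2 * segments_needed(channel_length_m)

for length in (100, 150, 250):
    print(length, "m:", repeaters_needed(length), "repeaters,",
          optics_modules(length), "optics modules")
```

Under these assumptions a 150m channel needs four modules and a 250m channel needs six, against two for a single pair of XR optics.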

Shrinking data center footprint - this seems like a reasonable trend given the changes occurring in server form factors.  But it will play out over decades, not years.  Customers' data center buildings are what they are, and will be in service for many years to come.  So while it is wise to have an eye on this trend, it is not wise to turn a blind eye to the reality of the present, which will carry forward for many years to come.  All twenty customers that responded to Corning's survey asked for solutions that support channels longer than 100m for deployment within their data centers.  

In summary, the alternatives you outline each have large hurdles representing barriers to success in the market.  This leaves me doubtful that they will substantially satisfy the market and still in search of a better solution.

Lastly, I will look at the market size issues you raise.  

Relying on the survey from Alan Flatman once again, the number of links in the A-D and D-C subsystems (those where link lengths exceed 100m) totaled 19,000.  From his length distributions, the number of these exceeding 100m is about 2200.  Discounting this by the projected 40G conversion rate in 2012 yields 618 converted links.  Repeating this at 100G yields 974 converted links.  This is just within nine data centers of varying sizes.  That represents a combined total of almost 1600 instances where these nine customers will be faced with the fact that without an XR on MMF solution their choices are sub-optimal, and potentially prohibitively costly.  That's 1600 reasons why these nine customers could delay or decline purchase.  To solve this we need to look beyond the percentages and enable the market by providing an economically feasible solution for the whole data center.  


Taking a statistical approach similar to that which you have outlined seems like truly fertile ground.  It has real potential to resolve the issues.  And given the precedent, it can be handled normatively within the standard rather than by adding informative material.  By being smarter with the methodology, it simply allows what was to be specified as a 100m solution to be rated to support longer distances without incurring uncomfortable risk.  Taking this approach does not raise cost, and is in line with the spec levels that everyone who supported the MM baseline already agreed to.  

John, thanks for your insightful contribution.

Paul Kolesar
CommScope Inc.
Enterprise Solutions
1300 East Lookout Drive
Richardson, TX 75082
Phone:  972.792.3155
Fax:      972.792.3111
eMail:   pkolesar@xxxxxxxxxxxxx

"PETRILLA,JOHN" <john.petrilla@xxxxxxxxxxxxx>

08/20/2008 10:22 PM
Please respond to
"PETRILLA,JOHN" <john.petrilla@xxxxxxxxxxxxx>

[802.3BA] 802.3ba XR ad hoc next step concern

I’m concerned that the proposal of creating a new objective is leading us into a train wreck.  This is due to my belief that it’s very unlikely that 75% of the project members will find this acceptable.  This will be very frustrating for various reasons; one of them, discussed below, is that almost all the modules expected to be developed will easily support the desired extended link reaches.
I don’t want to wait until our next phone conference to share this in the hope that we can make use of that time to prepare a proposal for the September interim.  I’ll try to capture my thoughts in text in order to save some time and avoid distributing a presentation file to such a large distribution.  I may have a presentation by the phone conference.
Optical modules are expected to have either a XLAUI/CAUI interface or a PMD service interface, PPI.  Both are considered.
A previous presentation, petrilla_xr_02_0708, has shown that modules with XLAUI/CAUI interfaces will support 150 m of OM3 and 250 m of OM4.  These modules will be selected by equipment implementers primarily because of the commonality of their form factor with other variants, especially LR, and/or because of the flexibility the XLAUI/CAUI interface offers the PCB designer.  Here the extended fiber reach comes for no additional cost or effort.  This is also true in PPI modules where FEC is available in the host.
Everyone is welcome to express their forecast of the timing and adoption of XLAUI/CAUI MMF modules vs baseline MMF modules.
To evaluate the baseline proposal for its extended reach capability, a set of Monte Carlo (MC) analyses was run.  The first MC evaluates just a Tx distribution against an aggregate Tx metric.  This is to estimate the percentage removed by the aggregate Tx test.  The second MC evaluates the same Tx distribution in combination with an Rx distribution and 150 m of worst case OM3.  The third MC repeats the second but replaces the 150 m of OM3 with 250 m of worst case OM4.  Worst case fiber plant characteristics were used in all link simulations.
The Tx distribution characteristics follow.  All distributions are Gaussian.
  Min OMA, mean = -2.50 dBm, std dev = 0.50 dB (Baseline value = -3.0 dBm)
  Tx tr tf, mean = 33.0 ps, std dev = 2.0 ps (Example value = 35 ps)
  RIN(oma), mean = -132.0 dB/Hz, std dev = 2.0 dB (Baseline value = -128 to -132 dB/Hz, Example value = -130 dB/Hz)
  Tx Contributed DJ, mean = 11.0 ps, std dev = 2.0 ps (Example value = 13.0 ps)
  Spectral Width, mean = 0.45 nm, std dev = 0.05 nm (Baseline value = 0.65 nm).
  Baseline values are from Pepeljugoski_01_0508 and where no baseline value is available Example values from petrilla_02_0508 are used.
All of the above, except spectral width, can be included in an aggregate Tx test, permitting less restrictive individual parameter distributions than if each parameter is tested individually.  In this example the distributions are chosen such that the target value in the link budget spreadsheet lies just one std dev from the mean.  If an individual parameter is tested directly to this value, the yield loss would be approximately 16%.
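As a quick check of the 16% figure, the one-sided Gaussian tail beyond each limit can be computed directly from the means, std devs and limits listed above.  The sketch below is only an illustration: the combined figure assumes the parameters are statistically independent and each is tested individually to its limit, which is not a claim made in the analysis itself, and the function names are hypothetical.

```python
import math

def tail_beyond(k_sigma):
    """One-sided Gaussian tail probability beyond k standard deviations."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))

def sigma_margin(mean, std, limit):
    """Distance from the mean to the limit, in standard deviations."""
    return abs(limit - mean) / std

# (name, mean, std dev, baseline or example limit) as listed in the message
TX_PARAMS = [
    ("Min OMA (dBm)", -2.50, 0.50, -3.0),
    ("Tx tr/tf (ps)", 33.0, 2.0, 35.0),
    ("RIN(oma) (dB/Hz)", -132.0, 2.0, -130.0),
    ("Tx contributed DJ (ps)", 11.0, 2.0, 13.0),
    ("Spectral width (nm)", 0.45, 0.05, 0.65),
]

per_param_loss = {
    name: tail_beyond(sigma_margin(mean, std, limit))
    for name, mean, std, limit in TX_PARAMS
}

# Assumption: independent parameters, each tested individually to its limit.
combined_yield = 1.0
for loss in per_param_loss.values():
    combined_yield *= 1.0 - loss
combined_loss = 1.0 - combined_yield

for name, loss in per_param_loss.items():
    print(f"{name}: {100 * loss:.3f}% beyond limit")
print(f"combined loss if tested individually: {100 * combined_loss:.1f}%")
```

Each parameter whose limit sits one std dev from the mean loses about 16% on its own, while spectral width, with four std devs of margin, loses almost nothing.  Under the independence assumption, testing all of them individually would reject roughly half of the population, which is why an aggregate test is attractive.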
The Rx distribution characteristics follow.  Again, all distributions are Gaussian.
  Unstressed sensitivity, mean = -12.0 dBm, std dev = 0.75 dB (Baseline value = -11.3 dBm)
  Rx Contributed DJ, mean = 11.0 ps, std dev = 2.0 ps (Baseline value = 13.0 ps)
  Rx bandwidth, mean = 10000 MHz, std dev = 850 MHz (Baseline value = 7500 MHz).
For the Tx MC, only 2% of the combinations would fail the aggregate Tx test.
For the 150 m OM3 MC, only 2% of the combinations would have negative link margin and fail to support the 150 m reach.  This is less than the percentage of modules that would have been rejected by the Tx aggregate test and a stressed Rx sensitivity test and very few would actually be seen in the field.
For the 250 m OM4 MC, only 8% of the combinations would have negative link margin.  Here approximately half of these would be due to transmitters and receivers that should have been caught at their respective tests.
The above analysis is for a single lane.  In the case of multiple lane modules, the module yield loss will increase depending on how tightly the lanes are correlated.  Where module yield loss is high, module vendors will adjust the individual parameter distributions such that more than one std dev separates the mean from the spreadsheet target value.  This will reduce the proportion of modules failing the extended link criteria.  Also, any correlation between lanes results in a distribution of shipped modules having fewer marginal lanes than where the lanes are independent.
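The two extremes of lane correlation can be bounded with a short sketch.  It assumes a module fails if any one lane fails, and uses lane counts of 4 and 10 (as in 40G-SR4 and 100G-SR10) purely for illustration; the per-lane probabilities are the unscreened single-lane figures quoted above, so the results are upper bounds on what would be seen after Tx and Rx testing.

```python
def module_fail_independent(p_lane, lanes):
    """Module failure probability when lanes fail independently."""
    return 1.0 - (1.0 - p_lane) ** lanes

def module_fail_fully_correlated(p_lane, lanes):
    """Module failure probability when all lanes pass or fail together."""
    return p_lane

CASES = [
    ("150 m OM3", 0.02),  # single-lane figure from the second MC
    ("250 m OM4", 0.08),  # single-lane figure from the third MC
]

for label, p_lane in CASES:
    for lanes in (4, 10):
        low = module_fail_fully_correlated(p_lane, lanes)
        high = module_fail_independent(p_lane, lanes)
        print(f"{label}, {lanes} lanes: {100 * low:.1f}% to {100 * high:.1f}%")
```

Real modules would fall somewhere between the fully correlated and independent cases, and vendors tightening their parameter distributions would pull both bounds down.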
So while there’s a finite probability that a PPI interface module doesn’t support the desired extended reaches, the odds are overwhelming that it does.
Then with all of one form factor and more than 92% of the other form factor supporting the desired extended reach, the question becomes, ‘what’s a rational and acceptable means to take advantage of what is already available?’  A new objective would enable this but, as stated above, getting a new objective for this is at best questionable.  Further, it’s expected that one would test to see that modules meet the criteria for the new objective, set up part numbers, create inventory, etc., and that adds cost.  Finally, users, installers, etc. are intelligent and will soon find this out and will no longer accept any cost premium for modules that were developed to support extended reach - they will just use a standard module.  There’s little incentive to invest in an extended reach module development.
I’ll make a modest proposal: Do nothing – just hook up the link.  Do nothing to the standard and when 150 m of OM3 or 250 m of OM4 is desired – just plug in the fiber.  The odds are overwhelming that it will work.  If something is really needed in the standard, then generate a white paper and/or an informative annex describing the statistical solution.
Background/Additional thoughts:
Even with all the survey results provided to this project, it’s not easy to grasp what to expect for a distribution of optical fiber lengths within a data center and what is gained by extending the reach of the MMF baseline beyond 100 m.  Here’s another attempt.
In flatman_01_0108, page 11, there’s a projection for 2012.  For 40G, the expected adoption is 30% of Client-to-Access (C-A) links, 30% of Access-to-Distribution (A-D) links and 20% of Distribution-to-Core (D-C) links.  While Flatman does not explicitly provide a relative breakout of link quantities between the segments, C-A, A-D & D-C, perhaps one can use his sample sizes as an estimate.  This yields for C-A 250000, for A-D 16000 and for D-C 3000.  Combining with the above adoption percentages yields an expected link ratio of C-A:A-D:D-C = 750:48:6.
Perhaps Alan Flatman can comment on how outrageous this appears.
This has D-C, responsible for 1% of all 40G links, looking like a niche.  Arguments over covering the last 10% or 20% or 50% of D-C reaches do not seem like time well spent.  Even A-D combined with D-C, AD+DC, provides only 7% of the total.
Similarly for 100G:  the 2012 projected percentage adoption for C-A:A-D:D-C is 10:40:60 and link ratio is 250:64:18.  Here D-C is responsible for 5% of the links and combined with A-D generates 25% of the links.  Now the last 20% of AD+DC represents 5% of the market.
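The ratios and percentages above can be reproduced from the sample sizes and the 2012 adoption percentages.  The sketch below uses only the numbers quoted in this message; using the survey sample sizes as a proxy for the relative link quantities is the same assumption made above, and the names are illustrative.

```python
# Survey sample sizes used as a proxy for relative link quantities (assumption).
SAMPLE_LINKS = {"C-A": 250_000, "A-D": 16_000, "D-C": 3_000}

# Projected 2012 adoption percentages quoted from flatman_01_0108, page 11.
ADOPTION_2012 = {
    "40G": {"C-A": 0.30, "A-D": 0.30, "D-C": 0.20},
    "100G": {"C-A": 0.10, "A-D": 0.40, "D-C": 0.60},
}

def expected_links(rate):
    """Expected number of converted links per segment for a given rate."""
    return {seg: round(SAMPLE_LINKS[seg] * ADOPTION_2012[rate][seg])
            for seg in SAMPLE_LINKS}

def shares(links):
    """Fraction of all converted links contributed by each segment."""
    total = sum(links.values())
    return {seg: count / total for seg, count in links.items()}

for rate in ("40G", "100G"):
    links = expected_links(rate)
    frac = shares(links)
    ad_dc = frac["A-D"] + frac["D-C"]
    print(rate, links,
          f"D-C = {100 * frac['D-C']:.1f}%, AD+DC = {100 * ad_dc:.1f}%")
```

Dividing the expected link counts by 100 gives the 750:48:6 and 250:64:18 ratios, and the shares round to the 1%, 7%, 5% and 25% figures quoted.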
Since the computer architecture trend leads to the expectation of shorter link lengths and there are multiple other solutions that can support longer lengths (activating FEC, active cross-connects, SM, which telecom centric users prefer anyway, point-to-point connections, etc.), there is no apparent valid business case supporting resource allocation for development of an extended reach solution.