Thanks for offering this analysis. It
allows us to examine the situation from a new perspective, that of a statistical
approach rather than a total worst-case one. This type of approach
has precedent, as I am sure you are aware, in 10GBASE-LRM development,
where the receiver stress test conditions were selected to cover 99% of
links at the specified maximum link length. This coverage was determined
by Monte Carlo simulation, cross-checked and aligned to the traditional
802.3 link model, performed in large part by John Ewen. See http://grouper.ieee.org/groups/802/3/aq/public/sep05/ewen_1_0905.pdf.
Here John Ewen modeled the MMF cabling
using distributions supplied by fiber manufacturers. Below you state
that you are using worst-case fiber models, not fiber distributions. Had
you taken the approach used for LRM, the coverage of your simulations would
have increased, almost assuredly to above 99% coverage for 150m of OM3,
and probably to 99% coverage on OM4 near 250m. For either OM3 or
OM4, if hitting a 99% threshold at the rated link length means adjusting the
link length (up or down) a bit or making small adjustments to the transceiver
parameters, while retaining the benefit of offering a significantly longer
reach to address the complete data center, then I would support it. When
99% coverage levels at max length are combined with actual link length
distributions, the total coverage rate is nearly 100%.
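To make the coverage idea concrete, here is a minimal Monte Carlo sketch in the spirit of the LRM procedure. The bandwidth distribution and pass threshold are placeholders of my own, not values from ewen_1_0905; the point is only the mechanics of estimating coverage and picking a percentile-based stressor.

import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Hypothetical effective-bandwidth distribution for installed MMF at the
# maximum link length (placeholder numbers, not manufacturer data).
ebw_mhz_km = rng.lognormal(mean=np.log(2000.0), sigma=0.25, size=N)

# A link "passes" if its bandwidth meets the minimum needed for positive
# margin at the rated length (placeholder threshold).
min_ebw = 1200.0
print(f"coverage at max length: {np.mean(ebw_mhz_km >= min_ebw):.1%}")

# LRM-style stressor selection: set receiver stress conditions at the
# fiber percentile you intend to cover, e.g. the worst 1% of links.
print(f"1st-percentile bandwidth: {np.percentile(ebw_mhz_km, 1):.0f} MHz*km")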
I note the spectral width model you
chose leaves four standard deviations to the baseline spec limit, unlike
any other modeled parameter, where the baseline (or the example
value) is usually selected at one standard deviation. Presumably
this is because you don't envision a Tx test that lumps the spectral impairments
(of chromatic dispersion, mode partition noise, and their cross-products)
in with the other parameters. I think there may be a way of encompassing
all of these together, further simplifying the tests and improving yields,
using fiber with near-perfect DMD at 850 nm as a spectrally dispersive element
in the test bed. I'd like to explore this more with the optics experts
in the Task Force.
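For reference, the asymmetry can be checked with the numbers already on the table (0.45 nm mean, 0.05 nm std dev, 0.65 nm baseline limit); a quick sketch:

from scipy.stats import norm

# Spectral width: mean 0.45 nm, std dev 0.05 nm, baseline limit 0.65 nm.
sigmas_to_limit = (0.65 - 0.45) / 0.05       # = 4.0 standard deviations
print(norm.sf(sigmas_to_limit))              # ~3.2e-5 of units exceed the limit

# A typical parameter placed one std dev from its target, by contrast:
print(norm.sf(1.0))                          # ~15.9% would fail a direct test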
Next I examine the interface and evolutionary
alternatives you suggest in lieu of a new XR objective.
XLAUI/CAUI - the power dissipation added
by this interface drives the form factor to a larger size than what you proposed
for 100G-SR10 in pepeljugoski_01_0108 (to quote: Module + Cage Dimensions:
~22 mm W x 17 mm H x 79 mm D), since the estimated power dissipation of that
module, 5 W, is nearly totally consumed by the 10 lanes of lasers, drivers,
detectors and amps, leaving no dissipation capacity for the CAUI interface.
I arrive at this conclusion by examining 10G SFP+ data sheets, where
the typical power dissipation is ~0.5W, which is a good surrogate for a
single lane in a parallel PMD module. You discussed the size of the
40/100G-LR module as the common form suitable for all PMD interfaces. This
is equivalent to XENPAK as the common footprint for all 10G. The
envisioned large size of the 100G-LR modules, at about 2x XENPAK size,
drives up the cost of host line cards due to lower density. A package
that is 2x XENPAK in size would have a footprint 5x that of your proposed
module for 100G-SR. The extra cost factor due to the drop in density
was not included in the cost analysis that I contributed to the XR ad-hoc's
last teleconference, and would drive the cost disparity between the MM
and SM alternatives even larger. To impose this extra cost structure
on a multimode module would largely be defeating the purpose of offering
multimode solutions in the first place. So while this is a technically
feasible implementation possibility, it would not result in an economically
viable solution relative to other alternatives.
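A back-of-envelope check of the power and density figures above; the SR10 dimensions are quoted from pepeljugoski_01_0108, while the XENPAK footprint of roughly 36 mm x 121 mm is my approximation:

lanes = 10
w_per_lane = 0.5                # W, surrogate from 10G SFP+ data sheets
print(f"10-lane optics power: ~{lanes * w_per_lane:.0f} W")  # ~5 W, no headroom for CAUI

sr10_footprint = 22 * 79        # mm^2, proposed 100G-SR10 module + cage
xenpak_footprint = 36 * 121     # mm^2, assumed approximate XENPAK size
print(f"2x XENPAK vs proposed SR10: ~{2 * xenpak_footprint / sr10_footprint:.1f}x footprint")  # ~5x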
FEC - this was the least favored XR
technology in a poll of the XR ad-hoc, despite the fact that FEC seems straightforward
as a digital implementation in the host ASIC. So implementation
likelihood in the market may be small and/or offerings spotty.
SM - good for telecoms and possibly
for customers with very large traffic management problems, but not economically
feasible for general data center customers per my contribution to the XR
ad-hoc a couple weeks ago.
Active cross connects - by this I assume
you are talking about using repeaters. Repeaters eliminate switch
fabric and simply "regenerate" the signal. They would be
deployed when the channel length exceeds 100m. While removal of the
fabric function is a large cost reduction relative to the cost of a switch,
the user must purchase an additional pair of optics modules for each repeater
in the channel and provide space and power for the repeaters. Six
optics modules would be deployed to serve channels exceeding 200m (one
pair at the channel endpoints plus a pair at each of two repeaters). This
may remain economically advantaged relative to the SM alternative, but
would be strongly disadvantaged relative to a single pair of XR optics.
Shrinking data center footprint - this
seems like a reasonable trend given the changes occurring in server form
factors. But it will play out over decades, not years. Customers'
data center buildings are what they are, and will be in service for many
years to come. So while it is wise to have an eye on this trend,
it is not wise to turn a blind eye to the reality of the present, which will
carry forward for many years to come. All twenty customers that responded
to Corning's survey asked for solutions that support channels longer than
100m for deployment within their data centers.
In summary, the alternatives you outline
each have large hurdles representing barriers to success in the market.
This leaves me doubtful that they will substantially satisfy the
market, and still in search of a better solution.
Lastly, I will look at the market size
issues you raise.
Relying on the survey from Alan Flatman
once again, the number of links in the A-D and D-C subsystems (those where
link lengths exceed 100m) totaled 19,000. From his length distributions,
the number of these exceeding 100m is about 2200. Discounting this
by the projected 40G conversion rate in 2012 yields 618 converted links.
Repeating this at 100G yields 974 converted links. This is
just within nine data centers of varying sizes. That represents a
combined total of almost 1600 instances where these nine customers will
be faced with the fact that without an XR on MMF solution their choices
are sub-optimal, and potentially prohibitively costly. That's 1600
reasons why these nine customers could delay or decline purchase. To
solve this we need to look beyond the percentages and enable the market
by providing an economically feasible solution for the whole data center.
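A rough reconstruction of this discounting, with one assumption of mine: that the ~2200 links longer than 100m split between A-D and D-C in proportion to the segment totals (16,000:3,000). The 618 and 974 figures come from the actual length data, so this sketch only approximates them:

links_over_100m = 2200
ad = links_over_100m * 16000 / 19000    # ~1853 A-D links over 100m (assumed split)
dc = links_over_100m * 3000 / 19000     # ~347 D-C links over 100m (assumed split)

conv_40g = 0.30 * ad + 0.20 * dc        # 2012 adoption: A-D 30%, D-C 20%
conv_100g = 0.40 * ad + 0.60 * dc       # 2012 adoption: A-D 40%, D-C 60%
print(round(conv_40g), round(conv_100g), round(conv_40g + conv_100g))
# ~625, ~949, ~1575 -- in the neighborhood of the quoted 618, 974 and ~1600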
Taking a statistical approach similar
to that which you have outlined seems like truly fertile ground. It
has real potential to resolve the issues. And given the precedent,
it can be handled normatively within the standard rather than by adding
informative material. By being smarter with the methodology, it simply
allows what was to be specified as a 100m solution to be rated to support
longer distances without incurring uncomfortable risk. Taking
this approach does not raise cost, and is in line with the spec levels
that everyone who supported the MM baseline already agreed to.
John, thanks for your insightful contribution.
Date: 08/20/2008 10:22 PM
Subject: [802.3BA] 802.3ba XR ad hoc next step
I’m concerned that the proposal of
creating a new objective is leading us into a train wreck. This is
due to my belief that it’s very unlikely that 75% of the project members
will find this acceptable. This will be very frustrating for various
reasons, one of which, that almost all the modules expected to be developed
will easily support the desired extended link reaches, will be discussed below.
I don’t want to wait until our next
phone conference to share this in the hope that we can make use of that
time to prepare a proposal for the September interim. I’ll try to
capture my thoughts in text in order to save some time and avoid distributing
a presentation file to such a large distribution. I may have a presentation
by the phone conference.
Optical modules are expected to have either
an XLAUI/CAUI interface or a PMD service interface, PPI. Both
cases are considered below.
A previous presentation, petrilla_xr_02_0708,
has shown that modules with XLAUI/CAUI interfaces will support 150 m of
OM3 and 250 m of OM4. These modules will be selected by equipment
implementers primarily because of the commonality of their form factor
with other variants, especially LR, and/or because of the flexibility the
XLAUI/CAUI interface offers the PCB designer. Here the extended
fiber reach comes for no additional cost or effort. This is also
true in PPI modules where FEC is available in the host.
Everyone is welcome to express their
forecast of the timing and adoption of XLAUI/CAUI MMF modules vs baseline
PPI modules.
To evaluate the baseline proposal
for its extended reach capability, a set of Monte Carlo (MC) analyses were
run. The first MC evaluates just a Tx distribution against an aggregate
Tx metric. This is to estimate the percentage removed by the aggregate
Tx test. The second MC evaluates the same Tx distribution in combination
with an Rx distribution and 150 m of worst case OM3. The third MC
repeats the second but replaces the 150 m of OM3 with 250 m of worst case
OM4. Worst case fiber plant characteristics were used in all link simulations.
The Tx distribution characteristics follow. All distributions are Gaussian.
Min OMA: mean = -2.50 dBm, std dev = 0.50 dBm (Baseline value = -3.0 dBm)
Tx tr/tf: mean = 33.0 ps, std dev = 2.0 ps (Example value = 35 ps)
RIN(OMA): mean = -132.0 dB/Hz, std dev = 2.0 dB (Baseline value = -128 to -132 dB/Hz, Example value = )
Tx Contributed DJ: mean = 11.0 ps, std dev = 2.0 ps (Example value = 13.0 ps)
Spectral Width: mean = 0.45 nm, std dev = 0.05 nm (Baseline value = 0.65 nm)
Baseline values are from Pepeljugoski_01_0508,
and where no baseline value is available, Example values from petrilla_02_0508
are used.
All of the above, except spectral width,
can be included in an aggregate Tx test, permitting less restrictive individual
parameter distributions than if each parameter is tested individually.
In this example, distributions are chosen such that the mean is only
one std dev inside the target value in the link budget spreadsheet.
If the individual parameter is tested directly to this value, the yield
loss would be approximately 16%.
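To illustrate the benefit of the aggregate test, here is a stand-in sketch: four normalized Tx parameters, each with its individual limit one std dev above the mean, tested either one-by-one or against a shared aggregate budget. The unit weights and the budget of 4 are illustrative, not the actual aggregate Tx metric:

import numpy as np

rng = np.random.default_rng(1)
params = rng.normal(0.0, 1.0, size=(200_000, 4))   # 4 normalized Tx penalties

indiv_pass = np.all(params <= 1.0, axis=1)   # every parameter within its own limit
agg_pass = params.sum(axis=1) <= 4.0         # same total budget, shared freely

print(f"individual-test yield: {indiv_pass.mean():.1%}")   # ~50% (0.841^4)
print(f"aggregate-test yield:  {agg_pass.mean():.1%}")     # ~97.7% (Phi(2))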
The Rx distribution characteristics follow. Again, all distributions are Gaussian.
Unstressed sensitivity: mean = -12.0 dBm, std dev = 0.75 dB (Baseline value = -11.3 dBm)
Rx Contributed DJ: mean = 11.0 ps, std dev = 2.0 ps (Baseline value = 13.0 ps)
Rx bandwidth: mean = 10000 MHz, std dev = 850 MHz (Baseline value = 7500 MHz)
For the Tx MC, only 2% of the combinations
would fail the aggregate Tx test.
For the 150 m OM3 MC, only 2% of the
combinations would have negative link margin and fail to support the 150
m reach. This is less than the percentage of modules that would have
been rejected by the Tx aggregate test and a stressed Rx sensitivity test,
so very few would actually be seen in the field.
For the 250 m OM4 MC, only 8% of the
combinations would have negative link margin. Here approximately
half of these would be due to transmitters and receivers that should have
been caught at their respective tests.
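For readers who want to reproduce the structure of these MC runs, a heavily simplified sketch follows. It draws from the stated Gaussians, but the margin model (the starting margin and all penalty coefficients) is a placeholder of mine; the actual analysis pushed each draw through the 802.3 link-model spreadsheet, so the printed fraction will not match the 2% and 8% results above:

import numpy as np

rng = np.random.default_rng(2)
N = 100_000

# Draws from the Tx and Rx distributions listed above.
oma  = rng.normal(-2.50, 0.50, N)     # dBm
trtf = rng.normal(33.0, 2.0, N)       # ps
rin  = rng.normal(-132.0, 2.0, N)     # dB/Hz
txdj = rng.normal(11.0, 2.0, N)       # ps
sw   = rng.normal(0.45, 0.05, N)      # nm
sens = rng.normal(-12.0, 0.75, N)     # dBm
rxdj = rng.normal(11.0, 2.0, N)       # ps
rxbw = rng.normal(10000.0, 850.0, N)  # MHz

# Placeholder margin model: an assumed margin at the baseline/example
# values, plus penalties linear in each parameter's deviation from them.
margin = (0.5                              # assumed dB margin at baseline values
          + (oma - (-3.0))                 # dB per dB of extra OMA
          - 0.05 * (trtf - 35.0)           # placeholder ps-to-dB weights
          - 0.10 * (rin - (-132.0))
          - 0.05 * (txdj - 13.0) - 0.05 * (rxdj - 13.0)
          - 4.0 * (sw - 0.65)
          + ((-11.3) - sens)               # dB of extra Rx sensitivity
          + 0.0002 * (rxbw - 7500.0))
print(f"fraction with negative margin: {np.mean(margin < 0):.2%}")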
The above analysis is for a single
lane. In the case of multiple lane modules, the module yield loss
will increase depending on how tightly the lanes are correlated. Where
module yield loss is high, module vendors will adjust the individual parameter
distributions such that more than one std dev separates the mean from the
spreadsheet target value. This will reduce the proportion of modules
failing the extended link criteria. Also, any correlation between lanes
results in shipped modules having fewer marginal lanes than when the
lanes are independent.
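The two limiting cases bound the module yield; a minimal sketch using the ~98% per-lane pass rate implied by the 150 m OM3 MC above:

p_lane = 0.98   # per-lane pass probability (from the ~2% fail rate above)
lanes = 10

print(f"independent lanes: {p_lane ** lanes:.1%}")   # ~81.7% module yield
print(f"fully correlated:  {p_lane:.1%}")            # 98.0% module yield
# Real modules fall between these bounds, which is why lane correlation
# reduces the number of shipped units with marginal lanes.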
So while there’s a finite probability
that a PPI interface module doesn’t support the desired extended reaches,
the odds are overwhelming that it does.
Then with all of one form factor and
more than 92% of the other form factor supporting the desired extended
reach, the question becomes, ‘what’s a rational and acceptable means
to take advantage of what is already available?’ A new objective
would enable this but, as stated above, getting a new objective for this
is at best questionable. Further, it’s expected that one would test
to see that modules meet the criteria for the new objective, set up part
numbers, create inventory, etc., and that adds cost. Finally, users,
installers, etc. are intelligent and will soon find this out and will no
longer accept any cost premium for modules that were developed to support
extended reach - they will just use a standard module. There’s little
incentive to invest in an extended reach module development.
I’ll make a modest proposal: Do nothing
– just hook up the link. Do nothing to the standard and when 150
m of OM3 or 250 m of OM4 is desired – just plug in the fiber. The
odds are overwhelming that it will work. If something is really needed
in the standard, then generate a white paper and/or an informative annex
describing the statistical solution.
Even with all the survey results provided
to this project, it’s not easy to grasp what to expect for a distribution
of optical fiber lengths within a data center and what is gained by extending
the reach of the MMF baseline beyond 100 m. Here’s another attempt.
In flatman_01_0108, page 11, there’s
a projection for 2012. There, for 40G, the expected adoption percentage
for Client-to-Access (C-A) links is 30%, for Access-to-Distribution
(A-D) links it is 30%, and for Distribution-to-Core (D-C) links it is 20%.
While Flatman does not explicitly provide a relative breakout of
link quantities between the segments, C-A, A-D & D-C, perhaps one can
use his sample sizes as an estimate. This yields 250,000 for C-A,
16,000 for A-D and 3,000 for D-C. Combining these with the above adoption
percentages yields an expected link ratio of C-A:A-D:D-C = 750:48:6.
Perhaps Alan Flatman can comment on
how outrageous this appears.
This has D-C, responsible for 1% of
all 40G links, looking like a niche. Arguments over covering the
last 10% or 20% or 50% of D-C reaches do not seem like time well spent.
Even A-D combined with D-C, AD+DC, provides only 7% of the total.
Similarly for 100G: the 2012
projected percentage adoption for C-A:A-D:D-C is 10:40:60 and the link ratio
is 250:64:18. Here D-C is responsible for 5% of the links and, combined
with A-D, generates 25% of the links. Now the last 20% of AD+DC represents
5% of the market.
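Both sets of figures can be checked directly from the quoted sample sizes and adoption percentages; a quick sketch:

samples = {"C-A": 250_000, "A-D": 16_000, "D-C": 3_000}
adoption = {"40G":  {"C-A": 0.30, "A-D": 0.30, "D-C": 0.20},
            "100G": {"C-A": 0.10, "A-D": 0.40, "D-C": 0.60}}

for gen, rates in adoption.items():
    links = {seg: samples[seg] * rates[seg] for seg in samples}
    total = sum(links.values())
    # Dividing by 100 reproduces the per-100-links ratios used in the text.
    ratio = ":".join(f"{links[seg] / 100:.0f}" for seg in samples)
    print(gen, "C-A:A-D:D-C =", ratio,
          f"| D-C share = {links['D-C'] / total:.1%}",
          f"| A-D + D-C share = {(links['A-D'] + links['D-C']) / total:.1%}")
# 40G:  750:48:6  | D-C ~0.7% | A-D+D-C ~6.7%
# 100G: 250:64:18 | D-C ~5.4% | A-D+D-C ~24.7%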
Since the computer architecture trend
leads to the expectation of shorter link lengths, and there are multiple
other solutions that can support longer lengths (activating FEC, active
cross-connects, point-to-point connections, etc., while telecom-centric
users prefer SM anyway), there is no apparent valid business case supporting
resource allocation for development of an extended reach solution.