
Re: [802.3BA] XR ad hoc Phone Conference Notice


I have an issue with the following statement:

 Indeed, given that latency is a major performance concern for HPC, the vendors of such machines may prefer to use InfiniBand.  This could mean that one of the primary customers to which we have tuned our present objective will actually not use Ethernet, but will benefit anyway by driving InfiniBand to adopt the same 100m PMD specs that 802.3ba defines.

I reviewed the top500.org website for its June 2008 report on the top 500 supercomputers.  Ethernet currently has a 56.8% share of the interconnects, while InfiniBand has 24.20%.  I believe this demonstrates that this area will use Ethernet.  As for latency, the Data Center Bridging Task Group in 802.1 is working on this.

Therefore, I do not agree with the statement that “one of the primary customers to which we have tuned our present objective will actually not use Ethernet.” 


Paul Kolesar wrote:

Thanks for furthering the discussion.  Your views make sense to me.

I'd like to examine in a bit more detail the supercomputer cabling distance distribution that Petar shared with us yesterday.  I've plotted it so folks can see it in graphical form.

This data has several features that are remarkably similar to that of general data center cabling.  
1) The distribution is highly skewed toward shorter lengths.
2) The distribution has a very long tail relative to the position of the mode, the most frequent length, at 20m.
3) The mode is at a distance that is one fifth of the maximum length.

The white dot on the graph marks the point of equivalent coverage, that is, the length on this distribution that covers the same share of channels as the 100m objective covers on the data center cabling distribution.  Speaking to Steve's point questioning the correctness of the 100m objective for HPC environments, I would venture to say that a 25m objective, which is roughly equivalent in coverage to the 100m objective we are attempting to apply to data centers, would not be satisfactory for the HPC environment, as it would leave a significant portion of the channels without a low-cost solution.

It is clear that the 100m objective is a near-perfect match to the needs of HPC.  Yet I do not believe that HPC should be the primary focus of our development.  We must be developing a solution that properly satisfies a much larger market than this or we are wasting our time.  Indeed, given that latency is a major performance concern for HPC, the vendors of such machines may prefer to use InfiniBand.  This could mean that one of the primary customers to which we have tuned our present objective will actually not use Ethernet, but will benefit anyway by driving InfiniBand to adopt the same 100m PMD specs that 802.3ba defines.  This possibility reinforces my perspective that we need to properly address a broader set of customers - those that operate in the general data center environment.  It is clear from all of the data and surveys that remaining only with a 100m solution misses the mark for this broader market.  Continuing under this condition will mean that the more attractive solution for links longer than 100m in the general data center will be to deploy link aggregated 10GBASE-SR.  Its cost will be on par and it will reach the distances the customers need in their data centers.  

Is this the future you want for all our efforts, or do you want to face the facts and address the issue head on with a solution that gives data center customers what they need?  

Next week these decisions will be placed before the Task Force.  I hope we choose wisely.

Paul Kolesar
CommScope Inc.
Enterprise Solutions
1300 East Lookout Drive
Richardson, TX 75082
Phone:  972.792.3155
Fax:      972.792.3111
eMail:   pkolesar@xxxxxxxxxxxxx

"Swanson, Steven E" <SwansonSE@xxxxxxxxxxx>

07/11/2008 07:32 AM

PKOLESAR@xxxxxxxxxxxx, STDS-802-3-HSSG@xxxxxxxxxxxxxxxxx

RE: [802.3BA] XR ad hoc Phone Conference Notice

I think Paul's suggestion is a good one; I would like to add some other input (in the form of questions) from my point of view:
1. Do we have the right MMF objective (support at least 100m on OM3 fiber)?
My data suggests that we don't; we have come at this from two different directions, trying to be as unbiased as possible in assessing the situation. I presented Corning sales data in November 2006. This data showed a need to support a link length longer than 100m, and I recommended at that time that we support 200m.
We also polled our customers, offering three options: a low-cost single PMD at 100m on OM3; a slightly higher-cost single PMD at 150-200m on OM3; and a third option specifying two PMDs, consisting of both Option 1 and Option 2. The results were overwhelmingly in favor of Option 2, a single PMD at longer length. A small number supported Option 3 (two PMDs), but NONE supported Option 1. While it is true that many of our customers have a substantial portion of their link lengths under 100m, they all have link lengths longer than 100m. One customer noted that more than half of his data center had link lengths longer than 100m.
Kolesar presented his company's sales data in September 2006. His data also suggested that longer link lengths were needed, and he recommended 150m at that time.
All the data for data centers seems to suggest that 100m is TOO SHORT to cover a significant portion of data center applications.
Pepeljugoski presented new data yesterday on HPC link lengths that shows 85% being less than 20m and 98% less than 50m. This might suggest that 100m is TOO LONG for HPC applications. This leads to another question: is there any economic or technical advantage to a shorter MMF objective for HPC?
2. Is there consensus on supporting a longer reach objective for MMF?
I think there is; others on the call yesterday did not. I base my opinion on the straw poll conducted in Munich:
Straw Poll #15: Should we continue to work on a proposal for an annex to extend the reach of 40GBASE-SR4 and 100GBASE-SR10, in addition to the proposal (“pepeljugoski_01_0508.pdf”), as in “jewell_01_0508.pdf”?

Yes: 55

No: 3

3. Could we achieve 75% support for adding a new MMF objective?

I don't know but if we could not, I would be forced to vote against adopting the current MMF baseline proposal (which I don't want to do) and I think others may also. This may or may not lead to an impasse similar to what we experienced in 802.3ae.

I understand the concern that adding the objective without a clear consensus on how to support the new objective could lead to delay but I have found this committee to be very resourceful in driving to a solution after we have made a decision to go forward. 40G is one recent example of a situation where no consensus turned very quickly to consensus.

I think adding a new objective is the right approach and in the long run will save the task force valuable development time.

4. Can we agree on the right assumptions on the 10G model to evaluate the various proposals?
Everyone seems to be using slightly different variations of the model to evaluate the capability of the proposal; we need to agree on a common approach of analysis.
5. Can we not let the discussion on OM4 cloud the decision?
We can get extended link lengths on OM3. By achieving longer lengths on OM3, even longer lengths will be possible on OM4 with the same specification. What I don't want people to think is that OM4 is required to get longer lengths.
6. Summary
John D'Ambrosia has provided advice that if we want to move forward with a new MMF objective, July is the time to do it - if we delay the decision, it is guaranteed to delay the overall process. Some might think if we make the decision, it will delay the overall process but we don't know that yet. I don't think adding an informative specification on a PMD is the right way to go - let's get the MMF objective(s) right - we owe it to ourselves and to our customers. To do anything less is just avoiding the issue. Let's get the objectives set, get the assumptions correct and utilize the process set up by Petrilla and Barbieri to drive toward the hard decisions that we are all very capable of making.
Steve Swanson

From: Paul Kolesar [mailto:PKOLESAR@xxxxxxxxxxxx]
Sent: Thursday, July 10, 2008 7:19 PM
Subject: Re: [802.3BA] XR ad hoc Phone Conference Notice


I'd like to continue your thread with some observations that have driven me to certain conclusions, and to follow that with a suggestion about how to parse the approach and drive to a consensus position.

First let's consider what various customers are telling us.  The Corning survey of their customers, which has been presented to the Ethernet Alliance and the XR ad hoc and will be presented next week to 802.3ba, shows that the large majority of customers want a single-PMD solution that can provide 150m on OM3 and 250m on OM4.  A minority were willing to accept a two-PMD solution set that delivers the lowest-cost PMD to serve up to 100m and a second PMD to serve the extended distances as above.  Not a single response indicated a preference for a solution limited to 100m.  We also hear strongly expressed opinions from various system vendors that a longer-distance solution is not acceptable if it raises the cost or power consumption of the currently adopted 100m PMD.  Under these conditions, and given the options presented and debated within the XR ad hoc, I believe you are justified in concluding that a single PMD cannot satisfy all these constraints.  Yet it is clear to me that the market will demand a low-cost PMD that can support more than 100m to fulfill the distance needs of data centers.  Therefore I conclude that the correct compromise position is to develop a two-PMD solution.  If the committee does not undertake this development, it is likely that several different proprietary solutions will be brought to the market, with the net result of higher overall cost structures.
So let's consider how to choose from among the various proposals for an extended reach PMD and let the determination of how to document it within the standard be addressed after that.

I would propose a series of polls at next week's meeting designed to gauge the preferences of the Task Force.  I do not think that any XR proposal will garner >75% at the outset, so I would propose the use of Chicago rules wherein members may vote for all the proposals they find acceptable.  From this we can see which of the solutions is least acceptable.  Then through a process of elimination from the bottom, and repeated application of Chicago rules for the remainder, finally determine the most acceptable solution.  
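The elimination procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the proposal names and ballots are hypothetical, each member's set of acceptable proposals is held fixed across rounds (in a real meeting members would re-vote each round), and ties for last place are broken by listing order.

```python
def chicago_rules_elimination(ballots, proposals):
    """Repeatedly tally approval votes and drop the least-approved proposal.

    ballots   -- list of sets; each set holds the proposals one member accepts
    proposals -- list of proposal names (hypothetical labels)
    Returns (winner, rounds), where rounds is a list of per-round tallies.
    """
    remaining = list(proposals)
    rounds = []
    while len(remaining) > 1:
        # Chicago rules: a member may vote for every proposal they find acceptable.
        tally = {p: 0 for p in remaining}
        for ballot in ballots:
            for p in ballot:
                if p in tally:
                    tally[p] += 1
        rounds.append(tally)
        # Eliminate from the bottom; ties go to the earliest-listed proposal
        # (an assumption, the email does not say how ties are handled).
        loser = min(remaining, key=lambda p: tally[p])
        remaining.remove(loser)
    return remaining[0], rounds


# Hypothetical ballots from five members over three XR proposals.
ballots = [
    {"XR-A", "XR-B"},
    {"XR-B"},
    {"XR-B", "XR-C"},
    {"XR-A"},
    {"XR-B", "XR-C"},
]
winner, rounds = chicago_rules_elimination(ballots, ["XR-A", "XR-B", "XR-C"])
support = rounds[-1][winner] / len(ballots)  # share of members accepting the winner
print(winner, rounds, support)
```

Comparing the final share against 75% shows whether the surviving proposal would also clear the threshold needed for adoption.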

Depending on the degree of maturity of the specifications or other considerations for the chosen solution, the Task Force will be better able to determine how it should be handled within the standard.  For example, a proposal with a maturity on par with the adopted baseline could be put forth under a new objective without undue concern of becoming a drag on the timeline, while a proposal of lesser maturity could be placed in an annex without an additional objective.

Paul Kolesar
CommScope Inc.
Enterprise Solutions
1300 East Lookout Drive
Richardson, TX 75082
Phone:  972.792.3155
Fax:      972.792.3111
eMail:   pkolesar@xxxxxxxxxxxxx

"Alessandro Barbieri (abarbier)" <abarbier@xxxxxxxxx>

07/10/2008 04:43 PM
Please respond to
"Alessandro Barbieri (abarbier)" <abarbier@xxxxxxxxx>


Re: [802.3BA] XR ad hoc Phone Conference Notice


Here is my *personal* read of the situation in the XR ad hoc:


a) I think there could be consensus on supporting XR, as long as we pick a solution that does not impact the cost structure of the 100m PMD. Because of that I also don't feel a single PMD is realistic at this point.


b) The trouble, however, is that there is no consensus (>75%) on any of the technical proposals. No one proposal has a clear lead over the others.


Of the three options you list below, I think adding an objective for a ribbon XR PMD could have a major impact on the project schedule, because it seems we are nowhere near technical consensus. We could drag the discussion out over several TF meetings... I am not sure delaying the project over this specific topic is worth it.

We can always resort to non-standard solutions to fulfill market requirements we can't address within IEEE, or come back in the future with another CFI.


At the end of the conference call earlier today I requested that we get together after hours next week to see if we can accelerate consensus building.
All the data is on the table now, so if we don't show any material progress, I am not sure we should extend this ad hoc.



From: Matt Traverso [mailto:matt.traverso@xxxxxxxxx]
Sent: Thursday, July 10, 2008 10:07 AM
Subject: Re: [802.3BA] XR ad hoc Phone Conference Notice


I feel that we are coming to a situation similar to the impasse at 40G vs. 100G where different participants call different segments of the networking industry their customer.  

For MMF, I'd like to see an optimized solution at 100m per all of the work that has been done.  

I'd like to understand if folks feel that a different status for the extended reach
a) Informative
b) Normative
c) New objective

would significantly alter the technically proposed solution from the Ad Hoc.  Opinions?


The case of slow market/industry transition from LX4 to LRM is one of the reasons why I would like to see the industry adopt 40G serial from the launch.  The slow adoption of LRM has primarily been limited by end customer knowledge of the solution. 40G serial technology is available.


Hi Gourgen,

Some numbers might help clarify what close to 0 means.

For 2008, Lightcounting gives a shipment number of approximately 30,000 for 10GE-LRM (and for 10GE-LX4 it's about 60,000.) So close to 0 would apply if we were rounding to the nearest 100K. As an aside, 10GE-LRM supports 220m of MMF, not 300m.

300m of OM3 is supported by 10GE-SR, which Lightcounting gives as approximately 400,000 in 2008, so that would be close to 0 if we were rounding to the nearest 1M.

Another interesting sideline in looking at these numbers is that 2 years after the 10GE-LRM standard was adopted in 2006, despite the huge investment being made in 10GE-LRM development, and despite very little new investment being made in 10GE-LX4, the 10GE CWDM equivalent (i.e. 10GE-LX4, 4x3G) is chugging along at 2x the volume of the 10GE Serial solution that was adopted to replace it.
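The rounding argument and the 2x volume ratio above are simple arithmetic. A minimal sketch, using only the approximate Lightcounting 2008 figures quoted in this message (the helper name `round_to_nearest` is illustrative, not from any library):

```python
# Approximate 2008 unit shipments as quoted above (Lightcounting figures).
shipments = {"10GE-LRM": 30_000, "10GE-LX4": 60_000, "10GE-SR": 400_000}


def round_to_nearest(value, unit):
    """Round a non-negative integer to the nearest multiple of unit (halves round up)."""
    return (value + unit // 2) // unit * unit


# "Close to 0" for LRM only holds when rounding to the nearest 100K.
lrm_rounded = round_to_nearest(shipments["10GE-LRM"], 100_000)
# "Close to 0" for SR only holds when rounding to the nearest 1M.
sr_rounded = round_to_nearest(shipments["10GE-SR"], 1_000_000)
# LX4 volume relative to LRM, the serial solution adopted to replace it.
lx4_to_lrm = shipments["10GE-LX4"] / shipments["10GE-LRM"]

print(lrm_rounded, sr_rounded, lx4_to_lrm)
```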

This should dim hopes that very low cost 40GE Serial technology can be developed from scratch in two years and ship in volume when the 40GE standard is adopted in 2010.


From: Gourgen Oganessyan [mailto:gourgen@xxxxxxxxxxx]
Sent: Wednesday, July 09, 2008 8:02 PM
To: STDS-802-3-HSSG@xxxxxxxxxxxxxxxxx
Subject: Re: [802.3BA] XR ad hoc Phone Conference Notice


Well, sadly that's what has been happening in the 10G world, people are forced to amortize the cost of 300m reach (LRM), while in reality the number of people who need 300m is close to 0.

That's why I am strongly in support of your approach of keeping the 100m objective as primary goal.

Frank, OM4 can add as much cost as it wants to, the beauty is the added cost goes directly where it's needed, which is the longer links. Alternatives force higher cost/higher power consumption on all ports regardless of whether it's needed there or not.

Gourgen Oganessyan

Quellan Inc.

Phone: (630)-802-0574 (cell)

Fax:     (630)-364-5724

e-mail: gourgen@xxxxxxxxxxx

From: Petar Pepeljugoski [mailto:petarp@xxxxxxxxxx]
Sent: Wednesday, July 09, 2008 7:51 PM
Subject: Re: [802.3BA] XR ad hoc Phone Conference Notice


If I interpret correctly, you are saying that all users should amortize the cost of very few who need extended reach.
We need to be careful how we proceed here - we should not repeat the mistakes of the past if we want a successful standard.



Petar Pepeljugoski
IBM Research
P.O.Box 218 (mail)
1101 Kitchawan Road, Rte. 134 (shipping)
Yorktown Heights, NY 10598

phone: (914)-945-3761
fax:        (914)-945-4134

From: Frank Chang <ychang@xxxxxxxxxxx>
To: STDS-802-3-HSSG@xxxxxxxxxxxxxxxxx
Date: 07/09/2008 10:29 PM
Subject: Re: [802.3BA] XR ad hoc Phone Conference Notice

Hi Jeff;

Thanks for your comment. You missed one critical point: there is a cost increase from OM3 to OM4. If you take ribbon cable cost into account, the OM4 option is possibly the most expensive of the 4 options.

Besides, the use of OM4 requires tightening the TX specs, which impacts TX yield, so you are actually compromising the primary goal.


From: Jeff Maki [mailto:jmaki@xxxxxxxxxxx]
Sent: Wednesday, July 09, 2008 7:02 PM
Subject: Re: [802.3BA] XR ad hoc Phone Conference Notice

Dear MMF XR Ad Hoc Committee Members,

I believe our current objective of "at least 100 meters on OM3 MMF" should remain the primary goal, the baseline.  Support for any form of extended reach should be considered only if it does not compromise this primary goal.  A single PMD for all reach objectives is indeed a good starting premise; however, it should not be paramount.  The following lists give the factors, enhancements, or approaches I would like to put forward as acceptable and not acceptable for obtaining extended reach.

Not Acceptable:

1. Cost increase for the baseline PMD (optic) in order to obtain greater than 100-meter reach

2. EDC on the system/host board in any case

3. CDR on the system/host board as part of the baseline solution

4. EDC in the baseline PMD (optic)

5. CDR in the baseline PMD (optic)

Acceptable:

1. Use of OM4 fiber

2. Process maturity that yields longer reach with no cost increase

In summary, we should not burden the baseline solution with cost increases to meet the needs of an extended-reach solution.


Jeffery Maki


Jeffery J. Maki, Ph.D.

Principal Optical Engineer

Juniper Networks, Inc.

1194 North Mathilda Avenue

Sunnyvale, CA  94089-1206

Voice +1-408-936-8575

FAX +1-408-936-3025