RE: Data rate standards vs internal switching standards


I obviously didn't convey my point sufficiently well.  I have usually worked
on high-end, multi-protocol switches that have the characteristics you
mention.  The point I was attempting to make is that what you now label a
low-end switch was not that long ago a high-end switch.  It is wrong to
brush off the issue on the grounds that 10GbE won't be a low-end market
anytime soon.
The rate we select will be with us for decades, and I bet low-cost
implementations will arrive sooner than you think if we make decisions that
enable them.

--Bob Grow

There is such a thing as over-simplifying an issue. Years ago when
CPUs were processing at a byte rate higher than the serial bit rate of
the link data streams, a single system clock may have been a valid
reality. This may still apply to "cut-through" L2 only switches today.
It is a very "cheap" way to build a low end data hub or switch. 

Data transfer rates are becoming so high that multiple parallel ASICs
and parallel RISC processors are required just to do the internal
frame routing. This does not include the processing that is being
added for L3 and L4 decision functionality. With today's high-end
switches, which support multiple types of interfaces, presenting a
single system clock as a requirement is an extreme
over-simplification. I have had too many discussions, with too many
vendors of high-end switches, to buy into the scenario of a single
common clock for the internal processing as well as the I/O link
streams. While I cannot discuss the technologies of the different
vendors, I can very safely say that I have not seen one high-end
switch technology disclosure lately that had a single common clock.

I find it hard to believe that a switch with 10GbE interfaces will be
in a low end market anytime soon. Please stop applying low end
technology paradigm limitations to high end requirements. 

Thank you,
Roy Bynum
MCI WorldCom

In your second paragraph below, you make some invalid assumptions about
switches, and I find your arguments without substance.  Many switches are
not shared memory. Shared memory has basically reached its limits for high
performance fabrics.  (Memory speeds are not tracking the increase in
throughput requirements, and you don't benchmark well if the memory is wider
than a minimum Ethernet frame.)  Alternate architectures do not share the data
buffering for the "overall switch".  A small FIFO or even a PHY frame buffer
is insignificant in monitoring the "overall 'health'" of the switch (just
look at the per-port buffering in gigabit switch specifications).  One of
the things you try to avoid in a switch fabric design is allowing one port
to consume too much of the buffer, because otherwise it will eventually
affect uncongested ports.
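
The parenthetical about memory width can be illustrated with a little
arithmetic.  The sketch below is only an illustration under assumed numbers:
the word widths and the helper name `bus_efficiency` are hypothetical, and it
assumes each frame occupies a whole number of memory words.  The only fixed
input is the 64-byte minimum Ethernet frame.

```python
import math

MIN_FRAME_BYTES = 64  # minimum Ethernet frame, excluding preamble and IPG


def bus_efficiency(frame_bytes: int, word_bytes: int) -> float:
    """Fraction of memory bandwidth carrying frame data.

    Assumption (hypothetical): a frame always occupies a whole number of
    memory words, so the unused tail of the last word is wasted.
    """
    words = math.ceil(frame_bytes / word_bytes)
    return frame_bytes / (words * word_bytes)


# Illustrative word widths only; not taken from any real product.
for word_bytes in (16, 32, 64, 128, 256):
    eff = bus_efficiency(MIN_FRAME_BYTES, word_bytes)
    print(f"{word_bytes:4d}-byte word: {eff:.0%} useful on minimum frames")
```

Under that assumption, any word wider than 64 bytes spends part of every
access on padding when the traffic is minimum-size frames, which is where
the benchmark numbers suffer.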

I also think you are wrong in assuming that because a factor of 10 is less
optimal than 2^n, it doesn't matter if the data rate is not an integer
multiple.  I was using the same techniques you advocate for managing the
different clock domains more than 20 years ago, and have also designed
switches with both integer multiple clock and non-integer multiple clock
domains, including some with clock domains similar to what you describe.
It is because of this experience that I believe a 10.000 Gb/s MAC is the
better system solution: I know, and appreciate, the simplification that an
integer multiple gives in building a switch.

--Bob Grow

Actually, I am not in favor of a programmable IPG. I think that the IPG
should be set to minimum for all frames in full duplex 10GbE. With 400
bytes as the current average size of Internet 802.3 frames, I don't
think that there will be enough "slop" to make up the difference
between a 10.0 Gb/s MAC and a 9.584 Gb/s PHY. In the future, with more
and more video based applications, the average size of the data frame
will be increasing. This will only cause the MAC buffer discard rate
to increase if the MAC and PHY are not data rate matched. I would much
rather see the data rate be defined at the MAC, not the PHY. 
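
To put rough numbers on the "slop" question, here is a
back-of-the-envelope sketch. It assumes the usual 8 bytes of
preamble/SFD and a 12-byte minimum IPG per frame, assumes the PHY has
to carry every byte time the MAC emits (idle included), and takes the
10.0 and 9.584 figures above at face value. The helper name
`extra_idle_bytes` is made up for the illustration.

```python
MAC_RATE_GBPS = 10.0    # MAC rate under discussion
PHY_RATE_GBPS = 9.584   # PHY payload rate quoted above
PREAMBLE_BYTES = 8      # preamble + SFD (assumed standard value)
MIN_IPG_BYTES = 12      # minimum inter-packet gap (assumed standard value)


def extra_idle_bytes(frame_bytes: int) -> float:
    """Extra idle bytes per frame, beyond the minimum IPG, needed so that
    back-to-back frames from the MAC average out to the PHY rate."""
    # Byte times one frame occupies at the MAC rate with minimum IPG.
    slot = frame_bytes + PREAMBLE_BYTES + MIN_IPG_BYTES
    return slot * (MAC_RATE_GBPS / PHY_RATE_GBPS - 1.0)


for size in (64, 400, 1518):
    print(f"{size:5d}-byte frame: about {extra_idle_bytes(size):.1f} extra idle bytes")
```

Under these assumptions the mismatch is a fixed fraction of line time
(about 4.3%), so the idle that would have to be added per frame scales
with the frame size.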

I would much rather see the data buffered internal to the switch
matrix than at the MAC. This will allow the overall switch to act as a
buffer instead of perhaps only one output link that is operationally
overloaded. It will also help network management to be able to monitor
the overall "health" of a switch or network architecture without
detrimental effect to the users. Since there are no standards,
requirements, or optimal internal data interchange clock rates other
than modulo 2, it should not matter that the higher data rate
interfaces are operating at a slightly different output clock. The
technology to move data at inconsistent transfer rates through a
system and between interfaces was invented almost 20 years ago. This
is not something new.
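
Moving data between mismatched rates comes down to a buffer that
absorbs the difference. A minimal sketch, assuming a sustained
10.0 Gb/s input feeding a 9.584 Gb/s output; the function name and
the buffer sizes are hypothetical and not taken from any product:

```python
def seconds_to_fill(buffer_bytes: int, in_gbps: float, out_gbps: float) -> float:
    """Time for a FIFO to fill when input runs faster than output.

    Returns infinity if the output keeps up with the input.
    """
    surplus_bps = (in_gbps - out_gbps) * 1e9
    if surplus_bps <= 0:
        return float("inf")
    return buffer_bytes * 8 / surplus_bps


for kib in (64, 1024, 16384):  # hypothetical buffer sizes in KiB
    t = seconds_to_fill(kib * 1024, 10.0, 9.584)
    print(f"{kib:6d} KiB absorbs a sustained full-rate input for {t * 1e3:.2f} ms")
```

This covers only the worst case of a sustained full-rate input; with
any idle time on the input side the buffer has a chance to drain.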

I am more concerned with how this technology will be implemented by
the customers. I am concerned with how relatively "lower tech"
implementation and support people will be able to use 10GbE. I am
concerned about designers and implementers that do not, cannot, or
will not understand how users and their applications will make use of
the extended LAN/MAN/WAN data networking environment. People will no
longer be building extended data networks using routed meshed and
semi-meshed virtual circuits, but will be using switched virtual
segments over meshed and semi-meshed 10GbE links. This is a very different
implementation environment from the simple switched LAN that 100BT
exists in. GbE has started to function in this realm. 10GbE will
definitely be used like this in a major way.

Thank you,
Roy Bynum
MCI WorldCom

Just a quick reappearance to address your follow-up questions.

> I would like an additional clarification. In recognizing that the data
> clocking standards are at the exposed interface of the data link, does
> this mean that the standard applies to the MII between the MAC layer
> and the PHY or does it apply more to the PHY?

You have examples of both. The MII/GMII case is obvious.

Other exposed interfaces were defined within the PHY to reflect real
life partitioning into:

- Clock recovery, the realm of exotic circuits designed by long-haired
gurus who seldom show up for work before 11 A.M.

- The coding layer, which can go into CMOS-based MAC ASICs and can be
designed at any time of the day by many mortals skilled in the art of
digital design.


> As for the OC rate standard, there are several standards for mapping
> data into SONET/SDH transports. From an 802.3 view point, the
> SONET/SDH standards can be treated as layer 1 functional processes.
> From the other side of that argument, the current packet over SONET
> (POS) standard for mapping required an additional standard for
> inserting a layer 2 functionality between the layer 3 IP protocol and
> the layer 1 SONET protocol. 802.3 does not have that requirement for
> an additional functionality. In many ways it is more of a question of
> how much of the SONET/SDH standards would not be used for 10GbE,
> depending on the implementation of the interfaces. 

My emphasis was that the optimization of the data rate is
specific to a particular mapping and cannot be adopted in isolation
unless such mapping is also embraced.

Finally, I hope that your lack of objection to a 10Gbps rate with a
statically programmable IPG is somehow a sign of agreement.

> Thank you,
> Roy Bynum
> MCI WorldCom


Ariel Hendel 
Sun Microsystems