
RE: Does Ten-Gigabit Ethernet need fault tolerance?


At 2:22 PM 99/7/16, Jonathan Thatcher wrote:
>A question and a suggestion:
>1. Are you suggesting that Fault Tolerance is a requirement for 10 Gig
>Ethernet or for all Ethernet?  Or, if FT is added to 10Gig Ethernet, is it of
>any particular value if 10, 100, and 1000 BASE-* don't have it?

I am suggesting fault tolerance as an optional enhancement for 10GbE only,
mainly because it's early enough in 10GbE's standards development timeline
that FT could be included without pain, if the committee so desires.

Another reason is that I would like to be able to buy FT/DT 10GbE products
a few years from now, for use in military systems.  If you recall from the
London GbE meeting, I intended to suggest this FT technology to GbE, but
the technology couldn't be released in time, and so missed the GbE
standards train.

As for the other Ethernet standards, I propose nothing, although there is
no reason that they could not also take advantage of the offered FT
technology, should they so desire.

The RTFC technology allows some segments of an overall network to be FT,
and does not require all to be FT, so there is no reason for an all-or-none
approach.  In a network containing multiple FT segments, the segments react
to changes and roster independently of one another.
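
As a rough illustration of segments rostering independently, here is a
minimal Python sketch.  It is not taken from the RTFC documents: the
Segment class, its fields, and the "epoch" counter are all hypothetical
names chosen for the example, and the real rostering exchange is not
modeled at all.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical model of one FT segment (not RTFC's real data structure)."""
    name: str
    roster: set = field(default_factory=set)  # NICs, hubs, fibers currently in use
    epoch: int = 0                            # bumped each time this segment re-rosters

    def report_fault(self, member: str) -> bool:
        """Drop a failed member and re-roster this segment only."""
        if member not in self.roster:
            return False
        self.roster.discard(member)
        self.epoch += 1  # assumption: each re-roster starts a new epoch
        return True


# Two segments in the same network; a fault in A leaves B untouched.
seg_a = Segment("A", {"nic1", "nic2", "hub1"})
seg_b = Segment("B", {"nic3", "hub2"})
seg_a.report_fault("nic2")
```

After the fault, only segment A has a new roster and epoch; segment B's
state is exactly what it was before.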

>A1: If all Ethernet: you should ask for a call for interest in 802.3 and
>bring presentations supporting the requirement (5 criteria, etc).
>A2: If only 10 Gig Ethernet: you should bring a presentation supporting the
>requirement to the next 802.3 HSSG meeting.  Expect questions about how this
>supports the 5 criteria. Expect questions about why only 10 Gig Ethernet.

The famous 5 criteria, lifted from slide 15 of thatcher_1_0399.pdf:

3.4.1. Broad Market Potential  -- FT is already in ATM/SONET, 802.3ad,
Rapid Reconfiguration in 802.1, etc, so there seems to be preexisting wide
agreement that fault tolerance is desirable and has a sufficiently broad
market potential.

3.4.2. Compatibility with IEEE Standard 802.3 -- Based on my experience
with 802.3z, I believe the offered technology is compatible, but the
committee is the expert here.

Some facts:  The current RTFC implementations use standard TriQuint
Fibre-Channel parts and Finisar optical transceivers for the gigabit links,
plus some code in a standard-issue FPGA.  Only Fibre Channel layers FC-0
and part of FC-1 are used, just as in GbE (although the details of the use
of FC-1 differ).  The network segments (containing NICs, hubs, and fibers)
are either in "data mode" (carrying normal LAN traffic) or in "rostering
mode" (where the new roster of NICs, hubs, and fibers is configuring itself
into a working segment), and the protocols used in those two modes are
wholly independent of one another.  This is detailed in RTFC Principles of

3.4.3. Distinct Identity -- No problem.  No other fault and damage
tolerance algorithm works this way, so this one confers unique advantages.
For one, the technology is noticeably simpler than all other FT technologies
I am aware of, and is a whole lot more robust (in that it also supports

Perhaps the key difference between this and other message-based distributed
fault tolerance schemes is that all other schemes attempt to be
stingy with messages (because they are expensive in most distributed
systems), while RTFC is a flooding protocol with just enough population
control to prevent network saturation.  The use of flooding allowed a
radical simplification of the algorithm, and the implementation of true
damage tolerance rather than just fault tolerance.
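
To make the flooding idea concrete, here is a minimal Python sketch of
generic controlled flooding.  It is not the RTFC algorithm: the "population
control" shown (forwarding a message only on first receipt, plus a hop
limit) is an assumed stand-in, and the function name, parameters, and the
four-node ring are hypothetical.

```python
from collections import deque


def flood(adjacency, origin, hop_limit):
    """Flood one message from origin; return (nodes reached, copies sent)."""
    seen = {origin}                    # nodes that already hold the message
    queue = deque([(origin, hop_limit)])
    sent = 0
    while queue:
        node, hops = queue.popleft()
        if hops == 0:                  # hop limit: stop forwarding here
            continue
        for neighbor in adjacency.get(node, ()):
            sent += 1                  # every forwarded copy is counted
            if neighbor not in seen:   # only a first receipt triggers re-forwarding
                seen.add(neighbor)
                queue.append((neighbor, hops - 1))
    return seen, sent


# A four-node ring, then the same ring with the a-b link cut.
ring = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
damaged = {"a": ["d"], "b": ["c"], "c": ["b", "d"], "d": ["c", "a"]}

reached_ok, sent_ok = flood(ring, "a", hop_limit=4)
reached_cut, sent_cut = flood(damaged, "a", hop_limit=4)
```

In both cases every node is reached, because the message simply goes the
other way around the ring when a link is lost, and the number of copies is
bounded by the number of link directions rather than growing without limit.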

3.4.4. Technical Feasibility -- It has been implemented, and is in use in a
military application, with others under consideration.

3.4.5. Economic Feasibility -- It has been implemented, and the algorithm
is quite simple, as detailed in RTFC Principles of Operation.  We are
basically talking about making a gate array slightly larger in those hubs
supporting fault tolerance (for which one can charge extra).

I guess the only requirement, in the sense that all of 10GbE would have to
follow it, is for the NICs to do their part in rostering, a simple task
easily buried in the NIC's state machines.  The remainder falls to an
optional variety of hub, which performs the rest of the rostering algorithm.

If by "requirement" you mean only a one-liner like "10GbE shall support
Fault Tolerance", it wouldn't be much of a presentation.  I doubt that
anyone will argue that fault tolerance is undesirable; their question will
be "At what price?".  I claim the price is small, and the payoff large.  In
the final analysis, the matter will turn on how hard it is to implement the
algorithm, a matter of details.

I don't know that I will be able to attend many meetings, so I won't be a
very active proponent of my own technology.  As I said before, no salesman
will call.  But email is another matter.

More to the point, a few brave souls will no doubt read the RTFC Principles
of Operation, and if they think that there is something there that 10GbE
either wants or needs, and the rest of the committee comes to agree, the
technology will find its way into 10GbE.  Otherwise, it won't.  How else
could it be?

Basically, this technology is a gift, yours if you wish it.  I feel it is
of great value to 10GbE, and will be very interested to know what people
think after they have had time to absorb the core of the technology, and to
see the implications.


PS:  I'll be on travel, to an unrelated standards meeting, the week 19-23
July 1999.

>> -----Original Message-----
>> From: gwinn@xxxxxxxxxx [mailto:gwinn@xxxxxxxxxx]
>> Sent: Friday, July 16, 1999 2:15 PM
>> To: stds-802-3-hssg@xxxxxxxx
>> Subject: Does Ten-Gigabit Ethernet need fault tolerance?
>> The purpose of this note is to present a case for inclusion of fault
>> tolerance in 10GbE, and to offer a suitable proven technology for
>> consideration.  However, no salesman will call.
>The basic technical document, the RTFC Principles of Operation, is on the
>GbE website as "groups/802/3/10G_study/public/email_attach/gwinn_1_0699.pdf"
>and "groups/802/3/10G_study/public/email_attach/gwinn_2_0699.pdf".