
RE: Does Ten-Gigabit Ethernet need fault tolerance?




Unless I'm misunderstanding something, it sounds like 802.3ad link aggregation and/or normal switch/routing reconfiguration would provide the recovery mechanisms you need. 802.3ad provides redundant links between devices, and the upper-layer protocols handle reconfiguration if devices fail.

Walter Thirion
Level One Communications
512-407-2110


> -----Original Message-----
> From: owner-stds-802-3-hssg@xxxxxxxxxxxxxxxxxx
> [mailto:owner-stds-802-3-hssg@xxxxxxxxxxxxxxxxxx]On Behalf Of
> Joe Gwinn
> Sent: Friday, July 16, 1999 3:15 PM
> To: stds-802-3-hssg@xxxxxxxx
> Subject: Does Ten-Gigabit Ethernet need fault tolerance?
>
>
>
> The purpose of this note is to present a case for inclusion of fault
> tolerance in 10GbE, and to offer a suitable proven technology for
> consideration.  However, no salesman will call.
>
> Why Fault Tolerance?  Ten-Gigabit Ethernet is going to be a relatively
> expensive, high-performance technology intended for major backbones,
> perhaps even nibbling at the bottom end of the wide-area network (WAN)
> market.  In such applications, high availability is very much desired;
> loss of such a backbone or WAN is much too disruptive (and therefore
> expensive) to be much tolerated, and this kind of market will gladly
> pay a reasonable premium to achieve the needed fault tolerance.
>
> Why add Fault Tolerance now?  Because it's easiest (and thus cheapest)
> if done from the start, and because having FT built in and therefore
> becoming ubiquitous will be a competitive discriminator, neutralizing
> one of the remaining claimed advantages of ATM.
>
> Isn't Fault Tolerance difficult?  In hub-and-spoke (logical star,
> physical loop) topologies such as GbE and 10GbE, it's not hard to
> achieve both fault tolerance (FT) and military-level damage tolerance
> (DT).  In networks of unrestricted topology, it's a lot harder.  The
> presence of bridges does not affect this conclusion.
>
> How do I know that FT is so easily achieved?  Because it's already
> been done, may be bought commercially, is in use on one military
> system, and is proposed for others.  The FT/DT technology mentioned
> here was developed on a US Navy project, and is publicly available
> without intellectual property restrictions.  Why was the technology
> made public?  To encourage its adoption and use in COTS products, so
> that defense contractors can buy FT/DT LANs from catalogs, rather than
> having to develop them again and again, at great risk and expense.
>
> What is the difference between Fault Tolerance and Damage Tolerance?
> In fault tolerance, faults are rare and do not correlate in either
> time or place.  The classic example is the random failure of hardware
> components.  (Small acts of damage, such as somebody tripping over a
> wire or breaking a connector somewhere, are treated as faults here
> because they are also rare and uncorrelated.)  In damage tolerance,
> the individual faults are sharply correlated in time and place, and
> are often massive in number.  The classic military example is a
> weapon strike.  In the commercial world, a major power failure is a
> good example.  Damage tolerance is considered much harder to
> accomplish than fault tolerance.  If you have damage tolerance, you
> also have fault tolerance, but fault tolerance does not by itself
> confer damage tolerance.
>
> How is this Damage Tolerance achieved?  All changes in LAN segment
> topology (the loss or gain of nodes (NICs), hubs, or fibers) are
> detected in MAC hardware by the many link receivers, which report
> both loss and acquisition of modulated light.  This surveillance
> occurs all the time on all links, and is independent of data traffic.
> Any change in topology provokes the hardware into "rostering mode",
> the automatic exploration of the segment using a flood of special
> "roster" packets to find the best path, where "best" is defined as
> that path which includes the maximum number of nodes (NICs).
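[The selection criterion described above can be sketched in ordinary code. The Python model below is illustrative only — the function name and the graph representation are my assumptions, not the RTFC wire protocol. It treats the segment as a graph of NICs, hubs, and surviving fibers, and returns the set of NICs in the connected component containing the most NICs, which is the "best path" criterion the paragraph describes.]

```python
from collections import defaultdict, deque

def best_roster(nodes, hubs, fibers):
    """Illustrative sketch: after fiber failures, pick the connected
    component (over intact fibers) that contains the most nodes (NICs).
    nodes/hubs are iterables of labels; fibers is a list of (a, b)
    pairs, each an intact fiber linking two endpoints."""
    adj = defaultdict(set)
    for a, b in fibers:
        adj[a].add(b)
        adj[b].add(a)
    node_set = set(nodes)
    seen, best = set(), set()
    for start in list(nodes) + list(hubs):
        if start in seen:
            continue
        # Breadth-first search for one connected component.
        comp, queue = set(), deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            comp.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        if len(comp & node_set) > len(best & node_set):
            best = comp
    return best & node_set   # the roster: NICs in the winning component
```

For example, if a break isolates hub h2 (and the NIC behind it) from hub h1, the roster is the larger group of NICs still connected through h1.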
>
> Just how fault tolerant and damage tolerant is this scheme?  A
> segment will work properly with any number of nodes and hubs, if
> sufficient fibers survive to connect them together, and will
> automatically configure itself into a working segment within a
> millisecond of the last fault.  If the number of broken fibers is
> less than the number of hubs, all surviving nodes will remain
> accessible, regardless of the fault pattern.  If the number of fiber
> breaks is equal to or greater than the number of hubs, there is a
> simple equation to predict the probability of loss of access to a
> typical node due to loss of hubs and/or fibers, given only the number
> of hubs and the probability of any fiber breaking:
> Pnd[p, r] = ((2p)(1-p))^r, where p is the probability of fiber
> breakage and r is the number of surviving hubs (which ranges from
> zero to four in a quad system).  This equation is exact (to within
> 1%) for fiber breakage probabilities of 33% or less, and applies for
> any number of hubs.
>
> The simplicity of this equation is a consequence of the simplicity of
> this protocol, which is currently implemented in standard-issue FPGAs
> (not ASICs), and works without software intervention.  It can also be
> implemented in firmware.
>
> To give a numerical example, in a 33-node 4-hub segment, loss of 42
> fibers (16% of the segment's 264 fibers) would lead to only 0.5% of
> the nodes becoming inaccessible, on average.  Said another way, after
> 42 fiber breaks, there are only five chances out of a thousand that a
> node will become inaccessible.  This is very heavy damage, with one
> fiber in six broken.  To take a more likely example, with three
> broken fibers, all nodes will be accessible, and with four broken
> fibers, there is less than one chance in a million that a node will
> become inaccessible.  Recovery takes two ring tour times plus
> settling time (electrical plus mechanical), typically less than one
> millisecond in ship-size networks, measured from the last fault.
> Chattering and/or intermittent faults can be handled by a number of
> mechanisms, including delaying node entry by up to one second.  Few
> current LAN technologies approach this degree of resilience, or speed
> of recovery.
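[The worked figures above can be checked directly against the stated equation. A quick numeric sketch — the function name is mine; Pnd is used exactly as defined in the preceding paragraphs:]

```python
def p_node_denied(p, r):
    """Pnd[p, r] = ((2p)(1 - p))^r from the text: probability that a
    typical node becomes inaccessible, given per-fiber break
    probability p and r surviving hubs (stated to be exact to within
    1% for p <= 0.33)."""
    return (2 * p * (1 - p)) ** r

# 33-node, 4-hub segment: 42 of the 264 fibers broken, so p = 42/264.
heavy = p_node_denied(42 / 264, 4)
print(f"{heavy:.4f}")        # about 0.0051, i.e. roughly 0.5% of nodes

# Four broken fibers (p = 4/264): under one chance in a million.
light = p_node_denied(4 / 264, 4)
print(light < 1e-6)          # True
```

Both values agree with the text: roughly five chances in a thousand after 42 breaks, and less than one in a million after four.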
>
> In commercial systems and some military systems, a dual-ring solution
> is sufficient.  Up to quad-ring solutions are commercially available,
> needed for some military systems.  However, the ability to support up
> to quad redundant systems should be provided in 10GbE, for two
> reasons.  First, quad is needed for the military market, which may be
> economically significant in the early years of 10GbE.  Second, quad
> provides a clear growth path and a way to reassure non-military
> customers that their most stringent problems can be solved: one can
> ask them whether their needs really exceed those of warships duelling
> with supersonic missiles.
>
> The basic technical document, the RTFC Principles of Operation, is on
> the GbE website as
> "http://grouper.ieee.org/groups/802/3/10G_study/public/email_attach/gwinn_1_0699.pdf"
> and
> "http://grouper.ieee.org/groups/802/3/10G_study/public/email_attach/gwinn_2_0699.pdf".
> I was a member of the team that developed the technology, and am the
> author of these documents.
>
> Although these documents assume RTFC, a form of distributed shared
> memory, the basic rostering technology can easily be adapted for
> Gigabit and Ten-Gigabit Ethernet as well.  For nontechnical reasons,
> RTFC originally favored smart nodes connected via dumb hubs.
> However, the overall design can be somewhat simplified if one goes
> the other way, to dumb nodes and smart hubs.  This also allows the
> same dumb nodes to be used in both non-FT and FT networks, increasing
> node production volume, and does not force users to throw nodes away
> to upgrade to FT.
>
> I therefore would submit that 10GbE would greatly benefit from fault
> tolerance, and also that it's very easily achieved if included in the
> original design of 10GbE.
>
> Joe Gwinn
>