
Re: Does Ten-Gigabit Ethernet need fault tolerance? (nonredundant NICs)


At 7:53 PM 99/7/27, Roy Bynum wrote:
>You use the term "rostering algorithm".  Does this mean that using P802.3ad
>would not be a simple binary decision circuit built into the chip?  Very few
>fault tolerance systems have more than a binary structure, those are tertiary
>with simple "lockstep" hardware logic.

I don't understand 802.3ad well enough to answer the question.  Could you
expand the question a bit?

>Over the years, I have learned that the closer that you get to the level that
>is being "protected" the faster and more reliable fault tolerance is.  In this
>case, it is the optical transport that is being "protected".  I am all for the
>use of link aggregation for existing 802.3 interfaces, primarily because fault
>allowance technology does not exist for them otherwise.  Simple hardware fault
>tolerance technology does exist for 10gb interfaces today.

I agree that closer is better, and that RTFC protects the optical
transport.  RTFC also protects against the loss of nodes (containing NICs)
and hubs.   However, RTFC is not link aggregation;  with RTFC, the capacity
of the segment is identical to the capacity of a non-redundant segment of
the same topology, regardless of how many added hubs and links have been
provided.  The whole point of RTFC is fault and damage tolerance, not
capacity enhancement.

>10GbE will most likely not be implemented over BLSR rings in its early stages
>of deployment.  This is because of the massive amount of fiber transport that
>is being deployed today.  I do think that any WAN implementation will use the
>2km interface directly into what is called a "lite LTE".  This is an LTE that
>has line/segment SONET/SDH OAM&P functionality, without the TDM multiplexing
>of a standard LTE.  The 10GbE will have path overhead functionality only.
>This type of interface will need very simple fiber maintenance functionality,
>the kind that is resolved by a simple binary hardware solution.

What's a "BLSR ring"?   A SONET component?  If so, SONET already has its
own fault tolerance provisions.  One may debate its adequacy, but I would
doubt that anybody will wish to layer RTFC on top of SONET.  Nor do I
propose any such thing.  That leaves pure LAN implementations of 10GbE
needing fault and damage tolerance.

In any case, RTFC works in ring topologies.

>The 40km implementation will be used over metropolitan, leased fiber systems.
>These will be, for the most part, diverse path 1+1 systems.  This kind of
>deployment will need very robust, tightly coupled fault tolerance
>functionality.  Without the ability to control fiber breaks, fiber
>degradation, and other fiber related issues, the ability to switch to
>alternate receiver with minimum loss of data traffic will be paramount.  I
>have a hard time believing that any upper layer functionality can accomplish
>this with 100% reliability.

RTFC is designed to handle just such systems, although the recovery times
will be slowed by pesky speed-of-light delays.  Assuming that 40 kilometers
is the network diameter, and that there are 100 nodes in the 10 gigabit
backbone, with a hub in the center, a ring tour time would be 20
milliseconds, so recovery would be two or three times that, call it 50
milliseconds.  Nobody can do better, as this is 99% speed of light delay.
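
The 20 millisecond figure can be reproduced with a little arithmetic.  The
sketch below is one way to get there, not a statement of how RTFC itself
computes anything.  It assumes a propagation delay of about 5 microseconds
per kilometer of fiber (light in glass with a refractive index near 1.5) and
that every node sits at the full radius from the central hub, so one ring
tour travels out to each node and back.  The function and constant names are
made up for illustration.

```python
# Assumed propagation delay of light in glass fiber (refractive index ~1.5).
FIBER_DELAY_US_PER_KM = 5.0


def ring_tour_time_ms(diameter_km, nodes, delay_us_per_km=FIBER_DELAY_US_PER_KM):
    """Propagation-only time for one tour of a hub-centered ring.

    Assumes every node is one radius away from the hub, so the tour
    covers a hub-to-node-and-back round trip for each node.
    """
    radius_km = diameter_km / 2.0
    path_km = nodes * 2.0 * radius_km          # total fiber traversed
    return path_km * delay_us_per_km / 1000.0  # microseconds -> milliseconds


tour_ms = ring_tour_time_ms(diameter_km=40, nodes=100)
# Recovery is estimated above as two or three ring tours.
recovery_low_ms = 2 * tour_ms
recovery_high_ms = 3 * tour_ms
print(tour_ms, recovery_low_ms, recovery_high_ms)
```

With 40 kilometers and 100 nodes this works out to 4000 kilometers of fiber
per tour, hence 20 milliseconds, and a recovery window of 40 to 60
milliseconds, which brackets the 50 millisecond round number.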

If I understand your nomenclature, RTFC and rostering are lower-layer
functions, and I agree that recovery is easier the lower it's done.  And,
the faster it's done, the less impact on upper-layer software and users.


>Joe Gwinn wrote:
>> Roy,
>> At 9:12 PM 99/7/24, Roy Bynum wrote:
>> >
>> >Does RTFC allow a minimally trained individual to simply plug two fiber T/R
>> >pairs into the 10GbE interface to implement fault tolerance and if a second
>> >T/R pair, parallel to the first, is not plugged in the fault tolerance is
>> >not implemented?  This will be the simplest and most common implementation
>> Yes, this will work, by design.  The rostering algorithm will just treat
>> the missing path as broken, and press on.  There is no problem with parts
>> of the segment having non-redundant NICs, although those NICs will be cut
>> out of the segment if those NICs or their links fail.
>> Joe
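
The behavior described in the quoted exchange above can be illustrated with
a toy model.  This is only a sketch of the stated outcome (a missing path is
treated like a broken one, and a node drops out of the segment only when it
has no working path left); it is not the RTFC rostering algorithm, whose
details are not given here.  The names Node, paths, and build_roster are
hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    # One entry per fiber T/R pair slot on the NIC:
    # True = working, False = failed, None = nothing plugged in.
    paths: List[Optional[bool]]


def usable_paths(node):
    """Count working paths; unplugged (None) counts the same as broken."""
    return sum(1 for p in node.paths if p is True)


def build_roster(nodes):
    """Return the names of nodes that still have at least one working path."""
    return [n.name for n in nodes if usable_paths(n) > 0]


nodes = [
    Node("a", [True, True]),    # fully redundant
    Node("b", [True, None]),    # second pair never plugged in
    Node("c", [False, None]),   # non-redundant, and its only link failed
    Node("d", [True, False]),   # redundant, one path broken
]
print(build_roster(nodes))
```

In this model node "b" stays in the segment without redundancy, node "d"
survives a single failure, and only node "c", which had one path and lost
it, is cut out.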