
RE: Wide Area Networking for the Rest of US - the debate on BER and other issues




Roy:

For datacom applications, it is true that the retransmission mechanism of
TCP was created to recover data received incorrectly at the receiving
terminal, and was not intended for flow control.  However, retransmission
can occur for several reasons: bit errors, congestion, and excessive path
round-trip time.  To build a reliable system, as you mentioned, network
bandwidth and protocols have been vastly improved to avoid, or at least
minimize, congestion.

In our field data, the retransmission rate due to congestion is much
smaller than the 1-3% Bill reported for Internet links, as long as the
overall link throughput was designed correctly.  Nevertheless, it is
important to keep the loss caused by bit errors much smaller than the loss
caused by congestion, without tightening the BER beyond what is required.

Ed Chang
Unisys Corporation         
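
The comparison between BER-induced loss and congestion loss can be made
concrete with a short calculation.  The sketch below is only an
illustration: it assumes independent bit errors, a 1500-byte frame, and the
low end (1%) of the congestion loss range quoted in this thread.  The
function name and constants are hypothetical and do not come from any
message in the thread.

```python
import math


def frame_error_probability(ber: float, frame_bytes: int) -> float:
    """Probability that a frame contains at least one bit error.

    Assumes independent bit errors, so P = 1 - (1 - ber) ** bits.
    log1p/expm1 keep the result accurate for very small BER values.
    """
    bits = 8 * frame_bytes
    return -math.expm1(bits * math.log1p(-ber))


CONGESTION_LOSS = 0.01  # assumed: low end of the 1-3% figure in the thread
FRAME_BYTES = 1500      # assumed frame size

for ber in (1e-8, 1e-12, 1e-15):
    p = frame_error_probability(ber, FRAME_BYTES)
    print(f"BER {ber:.0e}: frame error probability {p:.3e}, "
          f"ratio to congestion loss {p / CONGESTION_LOSS:.3e}")
```

Under these assumptions even a BER of 10^-8 gives a frame error probability
of roughly 1.2e-4, which is still about two orders of magnitude below a 1%
congestion loss.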




-----Original Message-----
From: Roy Bynum [mailto:rabynum@airmail.net]
Sent: Saturday, June 05, 1999 11:59 AM
To: bill.st.arnaud@canarie.ca
Cc: Larry Miller; stds-802-3-hssg@ieee.org
Subject: Re: Wide Area Networking for the Rest of US - the debate on BER
and other issues



Bill,

It is this very mechanism of using dropped TCP packets as a mechanism to
signal flow control that is at issue.  This mechanism was designed into
TCP because of the use of unreliable transmission media.  It was
designed over twenty years ago, long before optical data networking came
into existence.  It was designed at a time when a BER of 10^-8 was
considered the best that could ever be achieved.  It was designed before
there was any congestion control at the lower layers.  It was designed
before Ethernet or 802.3 came into existence.

Today, the transmission media are much more reliable.  Congestion can
be controlled at the physical access point on the originating system.
This change started when the original Ethernet started using collision
detection to control congestion.  (A loss of a frame due to collision is
at the MAC layer, layer 2, not at the Transport layer, layer 4.)  A
congested Ethernet shared media LAN was measured by collisions.  Even
today, on switched 802.3 LANs congestion is measured by looking at
"runts".  On full duplex 100BaseT LANs flow control prevents even this
level of congestion.  If you lose packets on a LAN today, it is generally
in the receiving system, not the transmission system.  Today, on an 802.3
GbE or 100Mb LAN, TCP flow control is a reflection of the performance of
the end systems, not the transmission media.

In the past, it was not uncommon for WAN facilities to be
oversubscribed, causing congestion and loss of data.  In the past, this
was because higher bandwidth transmission systems were not available.
Today it only occurs because of economic issues.  I have been on customer
network projects where application transmission subscription
requirements were part of the SLA in the contract.  The "rules" for
designing and implementing enterprise intranets are changing.  Designing
WAN facilities to the application requirements takes a lot more
knowledge and skill than simply depending on TCP to handle any
oversubscription issues.

At present, GbE ( 1000BaseLX ) is already moving into the WAN
environment. You have made the statement that it is being used in the
Internet.  This means that the "rules" are changing for the Internet as
well as for private intranets.  This is a very "Darwinian" reality.
Whatever, and whoever, is unable to adapt will not survive.

In order for the Internet, and such things as E-Commerce to continue to
grow, the issue of stability and reliability must be addressed.  The
promise of Internet based abstracted services, such as VOI and IVC
(Internet Video Conferencing), can only be realized on a stable and
reliable Internet.  Bandwidth is increasing to residential systems as
well as other access facilities.  Cable modem systems are a shared media
access facility, not too much unlike the old shared media LANs.  The
stability and reliability of the transmission media is being addressed
in the access systems.  It must also be addressed in the core
transmission media as well.

The way that the TCP flow control process works over the Internet today
applies to today.  Why should it apply to the way the Internet works in
the future?  Should that also restrict the way that private intranets
work?  The implementation of 1000BaseLX in the WAN environment says that
the way the processes will be applied in the future is different.  This
door has been opened and cannot be closed.  Can you help me and others
figure out what and how those processes will be changing, and how to go
forward and improve them?

					Thank you,
					Roy Bynum



Bill St. Arnaud wrote:
> 
> Larry:
> 
> What you say may be true for other types of networks.  But dropped packets
> and re-transmissions are an essential feature of Internet networks.  The
> TCP/IP congestion control mechanisms use dropped packets as a mechanism to
> signal the source to throttle back the data flow.
>
> In fact many ISPs use a utility called RED (Random Early Discard) or WRED
> (Weighted Random Early Discard) to deliberately drop packets as a mechanism
> to throttle traffic on congested links.  Yes, this does cause a
> re-transmission, but TCP automatically drops down to a lower speed when
> this happens.  As a result, on most Internet links about 1-3% of the
> traffic is dropped packets and re-transmissions.  However, most of these
> dropped packets are not due to RED but to buffer overflow at the
> destination receiver.  SIGCOMM'98 has some excellent papers documenting
> this behaviour on the Internet.
>
> If I have to do packet discard in any event, I might as well do it at
> layer 1 just as well as at layer 3.  More importantly, if I am already
> dropping packets for other reasons, then as long as the number of dropped
> packets from BER is less than the number of dropped packets from TCP
> congestion control, the actual BER (whether it is 10^-15 or 10^-8) is
> irrelevant to me.
>
> I am assuming that if 10XGbE is used in the long haul the primary
> application will be to carry Internet traffic.  That is why it would be
> nice for those of us who are running Internet networks to have the option
> of a BER knob.  With a BER knob I may be able to extend my repeater
> distance, use lower cost lasers, etc.  However, as I said before, this
> still may not be practical because of other issues, particularly with
> respect to the non-linear factors that affect BER.  But it still might be
> worth a cursory investigation.
> 
> Bill
> 
> -------------------------------------------
> Bill St Arnaud
> Director Network Projects
> CANARIE
> bill.st.arnaud@canarie.ca
> http://tweetie.canarie.ca/~bstarn
> 
> 
> 
> 
> 
> > -----Original Message-----
> > From: owner-stds-802-3-hssg@majordomo.ieee.org
> > [mailto:owner-stds-802-3-hssg@majordomo.ieee.org]On Behalf Of Larry
> > Miller
> > Sent: Wednesday, June 02, 1999 11:38 AM
> > To: stds-802-3-hssg@ieee.org
> > Subject: Re: Wide Area Networking for the Rest of US - the debate on BER
> > and other issues
> >
> >
> >
> > I think the point is that when you report bad frames upward to higher
> > layers they have to do some work to re-request those frames, and that
> > takes much longer than the time actually burned by the dropped frames.
> > Hence, if the raw BER is too poor you spend all (or maybe more than all)
> > of your time with higher layer thrashing and never get through with the
> > (say) file transfer.
> >
> > This, I think, is the fallacy in Mr St. Arnaud's notion.
> >
> > Larry Miller
> > Nortel Networks
> >
> >
> > -----Original Message-----
> > From: Mike Dudek <mdudek@cieloinc.com>
> > To: Chang, Edward S <Edward.Chang@unisys.com>
> > Cc: bin.guo@amd.com <bin.guo@amd.com>; bill.st.arnaud@canarie.ca
> > <bill.st.arnaud@canarie.ca>; rtaborek@transcendata.com
> > <rtaborek@transcendata.com>; dwmartin@nortelnetworks.com
> > <dwmartin@nortelnetworks.com>; stds-802-3-hssg@ieee.org
> > <stds-802-3-hssg@ieee.org>; sachs@watson.ibm.com <sachs@watson.ibm.com>
> > Date: Tuesday, June 01, 1999 5:42 PM
> > Subject: Re: Wide Area Networking for the Rest of US - the debate on BER
> > and other issues
> >
> >
> > >
> > >Agreed, but the percentage of good frames stays the same, i.e. the
> > >percentage of bandwidth used for retransmissions is the same.
> > >
> > >"Chang, Edward S" wrote:
> > >
> > >> Mike:
> > >>
> > >> If the BER is kept the same for both GbE and 10xGbE, and everything
> > >> else is equal, errors from the PHY occur 10 times as often for 10GbE
> > >> as for GbE.  Of course, the whole system has other factors that must
> > >> be included to find the final throughput.  In other words, frame
> > >> errors will occur much more often for 10GbE than for GbE.
> > >>
> > >> I may present a mathematical analysis in July, if time allows.
> > >>
> > >> Ed Chang
> > >> Unisys Corporation
> > >>
> > >> -----Original Message-----
> > >> From: Mike Dudek [mailto:mdudek@cieloinc.com]
> > >> Sent: Tuesday, June 01, 1999 10:07 AM
> > >> To: Chang, Edward S
> > >> Cc: bin.guo@amd.com; bill.st.arnaud@canarie.ca;
> > >> rtaborek@transcendata.com; dwmartin@nortelnetworks.com;
> > >> stds-802-3-hssg@ieee.org; sachs@watson.ibm.com
> > >> Subject: Re: Wide Area Networking for the Rest of US - the debate on
> > >> BER and other issues
> > >>
> > >> I do not agree that the BER must be improved with data rate increase
> > >> in order to obtain the higher throughput.  At least for packet based
> > >> transmission with retransmission of errored packets, the throughput
> > >> increases in proportion to the data rate for the same BER, assuming
> > >> that the packet length (in bytes) remains fixed.  I do not think that
> > >> anyone has proposed changing the packet length, but if they did then
> > >> the BER might have to be improved.  The throughput is of course the
> > >> number of good packets in any interval of time.
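
The two positions in this exchange (errors occur ten times as often at ten
times the rate, yet the fraction of good frames is unchanged) can both be
checked with a few lines of arithmetic.  The following is a simplified
sketch, not an analysis presented by anyone in the thread: it assumes
independent bit errors, a fixed 1500-byte frame, and a fully loaded link,
and it ignores the higher-layer recovery time raised elsewhere in the
discussion.  All names are hypothetical.

```python
import math


def frame_ok_probability(ber: float, frame_bytes: int) -> float:
    """Probability that every bit of a frame arrives intact."""
    return math.exp(8 * frame_bytes * math.log1p(-ber))


def link_stats(bit_rate_bps: float, ber: float, frame_bytes: int = 1500) -> dict:
    """Good and errored frame rates for a fully loaded link."""
    frames_per_second = bit_rate_bps / (8 * frame_bytes)
    p_ok = frame_ok_probability(ber, frame_bytes)
    return {
        "good_frames_per_s": frames_per_second * p_ok,
        "errored_frames_per_s": frames_per_second * (1.0 - p_ok),
        "good_fraction": p_ok,
    }


gbe = link_stats(1e9, 1e-12)       # 1 Gbps at BER 10^-12
ten_gbe = link_stats(1e10, 1e-12)  # 10 Gbps at the same BER

for name, stats in (("GbE", gbe), ("10GbE", ten_gbe)):
    print(name, stats)
```

In this model the errored-frame rate and the good-frame rate both scale by
a factor of ten, while the good fraction depends only on the BER and the
frame length.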
> > >>
> > >> "Chang, Edward S" wrote:
> > >>
> > >> > Bin:
> > >> >
> > >> > Yes, I agree.  The BER should be improved as the data rate increases,
> > >> > if the throughput gained from the higher data rate is to be
> > >> > maintained.  In addition to the time wasted on retries, the external
> > >> > sources of noise remain the same, which further requires a lower BER.
> > >> > These are the correct design goals we should work on.  However, we
> > >> > should also keep cost-effectiveness in mind to maintain an optimum
> > >> > balance between performance and cost.
> > >> >
> > >> > Ed Chang
> > >> > Unisys Corporation
> > >> >
> > >> > -----Original Message-----
> > >> > From: bin.guo@amd.com [mailto:bin.guo@amd.com]
> > >> > Sent: Friday, May 28, 1999 4:57 PM
> > >> > To: Edward.Chang@unisys.com; bill.st.arnaud@canarie.ca;
> > >> > rtaborek@transcendata.com; dwmartin@nortelnetworks.com
> > >> > Cc: stds-802-3-hssg@ieee.org; sachs@watson.ibm.com;
> > "widmer@us.ibm.com
> > >> > widmer@us.ibm.com widmer"@us.ibm.com
> > >> > Subject: RE: Wide Area Networking for the Rest of US - the debate on
> > >> > BER and other issues
> > >> >
> > >> > Ed,
> > >> >
> > >> > If the specified BER for 1000BASE-X is 10^-12, then to have an equal
> > >> > error-free period the specified BER for 10G should be at least 10^-13.
> > >> > Based on Rich T and Rich S's BER numbers:
> > >> >
> > >> > A system BER of 10^-8  @  10 Mbps = a bit error every 10 seconds.  (10BASE-T)
> > >> > A system BER of 10^-12 @ 100 Mbps = a bit error every 166 minutes, 40 seconds.  (100BASE-X)
> > >> > A system BER of 10^-10 @   1 Gbps = a bit error every 10 seconds.  (1000BASE-T)
> > >> > A system BER of 10^-12 @   1 Gbps = a bit error every 16 minutes, 40 seconds.  (1000BASE-X)
> > >> > A system BER of 10^-12 @  10 Gbps = a bit error every 1 minute, 40 seconds.
> > >> > A system BER of 10^-13 @  10 Gbps = a bit error every 16 minutes, 40 seconds.
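
The intervals in the list above follow from one relation: the mean time
between bit errors is 1 / (BER x bit rate).  The sketch below recomputes
each row under that assumption (a fully utilized link with a constant error
rate); the function names are illustrative only.  Applied to the 1000BASE-T
row (10^-10 at 1 Gbps), the relation gives 10 seconds.

```python
def seconds_between_errors(ber: float, bit_rate_bps: float) -> float:
    """Mean time between bit errors on a fully utilized link."""
    return 1.0 / (ber * bit_rate_bps)


def format_interval(seconds: float) -> str:
    """Render a duration as minutes and seconds, rounded to whole seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s" if minutes else f"{secs} s"


ROWS = [
    ("10BASE-T", 1e-8, 10e6),
    ("100BASE-X", 1e-12, 100e6),
    ("1000BASE-T", 1e-10, 1e9),
    ("1000BASE-X", 1e-12, 1e9),
    ("10G at 10^-12", 1e-12, 10e9),
    ("10G at 10^-13", 1e-13, 10e9),
]

for label, ber, rate in ROWS:
    interval = seconds_between_errors(ber, rate)
    print(f"{label:14s} BER {ber:.0e} @ {rate:.0e} bps: {format_interval(interval)}")
```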
> > >> >
> > >> > If TCP/IP is the only protocol the 10G PHY needs to support, then the
> > >> > BER specified above may be more than enough.  Moving from 1G to 10G,
> > >> > the bit period is scaled 10X smaller while jitter and noise from some
> > >> > sources are not scaled the same way -- much tighter control must be
> > >> > applied to achieve even the same BER.
> > >> >
> > >> > Bin
> > >> >
> > >> > ADL,AMD
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > > -----Original Message-----
> > >> > > From: Chang, Edward S [SMTP:Edward.Chang@unisys.com]
> > >> > > Sent: Friday, May 28, 1999 12:44 PM
> > >> > > To:   bill.st.arnaud@canarie.ca; Guo, Bin;
> > rtaborek@transcendata.com;
> > >> > > dwmartin@nortelnetworks.com
> > >> > > Cc:   stds-802-3-hssg@ieee.org; sachs@watson.ibm.com;
> > "widmer@us.ibm.com
> > >> > > widmer@us.ibm.com          widmer"@us.ibm.com
> > >> > > Subject:      RE: Wide Area Networking for the Rest of US - the
> > >> > > debate on BER and other issues
> > >> > >
> > >> > > Bill:
> > >> > >
> > >> > > I like your idea of implementing native 10xGbE for intermediate long
> > >> > > haul and WAN, which is a good move.  The advantage you mention will
> > >> > > greatly reduce the cost to users.
> > >> > >
> > >> > > It is true that on TCP/IP links TCP flow control causes more
> > >> > > retransmissions than bit errors do.  Therefore an extremely low BER
> > >> > > of 10^-15 does not necessarily gain any advantage over the specified
> > >> > > BER of 10^-12.
> > >> > >
> > >> > >
> > >> > > Ed Chang
> > >> > >
> > >> > > -----Original Message-----
> > >> > > From: Bill St. Arnaud [mailto:bill.st.arnaud@canarie.ca]
> > >> > > Sent: Friday, May 28, 1999 8:52 AM
> > >> > > To: bin.guo@amd.com; rtaborek@transcendata.com;
> > >> > > dwmartin@nortelnetworks.com
> > >> > > Cc: stds-802-3-hssg@ieee.org; sachs@watson.ibm.com;
> > "widmer@us.ibm.com
> > >> > > widmer@us.ibm.com widmer"@us.ibm.com
> > >> > > Subject: Wide Area Networking for the Rest of US - the debate on BER
> > >> > > and other issues
> > >> > >
> > >> > >
> > >> > >
> > >> > > All:
> > >> > > I have been following the interesting debate about BER.  Let me bring
> > >> > > some further issues into the debate.
> > >> > >
> > >> > > I am assuming that on WAN and long haul GbE the upper layer protocol
> > >> > > will only be IP.
> > >> > >
> > >> > > On most IP links, even ones with BERs of 10^-15, there is about 1-3%
> > >> > > packet loss and retransmission.  This is due to a number of factors,
> > >> > > but most typically it relates to the TCP flow control mechanism
> > >> > > reacting to server-bound congestion (not network congestion) and to
> > >> > > the use of WRED in routers.
> > >> > >
> > >> > > So, on most IP links the packet loss due to BER is significantly less
> > >> > > than that due to normal TCP congestion.  As long as that ratio is
> > >> > > maintained it is largely irrelevant what the absolute BER value is.
> > >> > > There will be many more retransmissions from the IP layer than there
> > >> > > will be at the physical layer due to BER.
> > >> > >
> > >> > > Other protocols like Frame Relay and SNA are a lot more sensitive to
> > >> > > high BERs.  IP (in particular TCP/IP) is significantly more robust
> > >> > > and can work quite effectively in high BER environments, e.g. TCP/IP
> > >> > > over barbed wire.
> > >> > >
> > >> > > I would like to suggest that the 802.3 HSSG group consider two
> > >> > > solutions for 10xGbE WAN:
> > >> > > (1) native 10xGbE using 8b/10b; and
> > >> > > (2) 10xGbE mapped to a SONET STS OC-192 frame
> > >> > >
> > >> > > For extreme long haul solutions SONET makes a lot of sense as a
> > >> > > transport technology.  However, for intermediate long haul (up to
> > >> > > 1000 km) and WAN, native 10xGbE is more attractive.  Native GbE can
> > >> > > be either transported on a transparent optical network or carried
> > >> > > directly on a CWDM system with transceivers.  In medium range
> > >> > > networks coding efficiency is not as important as it is in long haul
> > >> > > networks.  If coding efficiency is important then, in my opinion, it
> > >> > > does not make sense to invent a new coding scheme for 10xGbE when it
> > >> > > would be just as easy to map it to a SONET frame.
> > >> > >
> > >> > > The attraction of native 10xGbE for the WAN is that it is a "wide
> > >> > > area networking solution for the rest of us".  You don't need to hire
> > >> > > specialized SONET engineers to run and manage your networks.  The 18
> > >> > > year old kid who is running your LAN can now easily learn to operate
> > >> > > and manage a WAN.
> > >> > >
> > >> > > In Canada and the US, there are several vendors who are willing to
> > >> > > sell dark fiber at a very reasonable cost.  Right now the cost of
> > >> > > building a WAN with 10xGbE and CWDM is substantially less (for
> > >> > > comparable data rates) than using SONET equipment.
> > >> > >
> > >> > > Bill
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > -------------------------------------------
> > >> > > Bill St Arnaud
> > >> > > Director Network Projects
> > >> > > CANARIE
> > >> > > bill.st.arnaud@canarie.ca
> > >> > > http://tweetie.canarie.ca/~bstarn
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > > -----Original Message-----
> > >> > > > From: owner-stds-802-3-hssg@majordomo.ieee.org
> > >> > > > [mailto:owner-stds-802-3-hssg@majordomo.ieee.org]On Behalf Of
> > >> > > > bin.guo@amd.com
> > >> > > > Sent: Thursday, May 27, 1999 7:28 PM
> > >> > > > To: rtaborek@transcendata.com; dwmartin@nortelnetworks.com
> > >> > > > Cc: stds-802-3-hssg@ieee.org; sachs@watson.ibm.com;
> > "widmer@us.ibm.com
> > >> > > > widmer@us.ibm.com widmer"@us.ibm.com
> > >> > > > Subject: RE: 1000BASE-T PCS question
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > Rich,
> > >> > > >
> > >> > > > The DC balance can be directly translated into jitter (when timing
> > >> > > > is concerned) and offset (when threshold slicing is concerned).
> > >> > > > You only need to deal with the former if the signal is 2-level
> > >> > > > NRZI, while you need to deal with both if multi-level signal
> > >> > > > modulation is used.
> > >> > > >
> > >> > > > Long term DC imbalance translates into low frequency jitter, and
> > >> > > > if the frequency is low enough (<1 kHz ?), it's called baseline
> > >> > > > wander.  Short term DC imbalance relates to Data Dependent Jitter,
> > >> > > > which is more difficult for timing recovery to handle since it's
> > >> > > > not from system or channel impairment, and therefore it's harder
> > >> > > > to compensate.
> > >> > > >
> > >> > > > When you have a lot of jitter margin, for example in lower speed
> > >> > > > clocking, the jitter translated from the DC drift that results
> > >> > > > from data imbalance coupled through an AC circuit is, percentage
> > >> > > > wise, a small portion of the clock period and therefore does not
> > >> > > > contribute much to the eye closing.  On the other hand, for high
> > >> > > > speed clocking at 10G (100 ps?), the jitter translated from the
> > >> > > > same amount of DC drift can be a significant portion of the clock
> > >> > > > period, so it contributes a much larger percentage of jitter,
> > >> > > > which results in reduced eye opening -- higher BER.
> > >> > > >
> > >> > > > Dave said in his mail that "The limiting factor is enough RX
> > >> > > > optical power to provide a sufficiently open eye." but you still
> > >> > > > have to deal with the data dependent jitter due to DC imbalance
> > >> > > > generated after O/E, which can close the eye further again.
> > >> > > >
> > >> > > > Bin
> > >> > > >
> > >> > > > ADL, AMD
> > >> > > >
> > >> > > > > -----Original Message-----
> > >> > > > > From:     Rich Taborek [SMTP:rtaborek@transcendata.com]
> > >> > > > > Sent:     Thursday, May 27, 1999 3:23 PM
> > >> > > > > To:       David Martin
> > >> > > > > Cc:       HSSG_reflector; Sachs,Marty; Widmer,Albert_X
> > >> > > > > Subject:  Re: 1000BASE-T PCS question
> > >> > > > >
> > >> > > > >
> > >> > > > > Dave,
> > >> > > > >
> > >> > > > > Do you know of any research or other proofs in this area? You
> > >> > > > > say that lower speed SONET links regularly achieve BERs of
> > >> > > > > < 10^-15. I have substantial experience with mainframe serial
> > >> > > > > links such as ESCON(tm) where the effective system BERs are in
> > >> > > > > the same ballpark. SONET uses scrambling with long term DC
> > >> > > > > balance and ESCON uses 8B/10B with short term DC balance. The
> > >> > > > > following questions come to mind:
> > >> > > > >
> > >> > > > > - How important is DC balance?
> > >> > > > > - How does this importance scale in going to 10 Gbps?
> > >> > > > >
> > >> > > > > I'll see if I can get some 8B/10B experts to chime in here if
> > >> > > > > you can get scrambling experts to bear down on the same problem.
> > >> > > > >
> > >> > > > > --
> > >> > > > >
> > >> > > > > >(text deleted)
> > >> > > > > >
> > >> > > > > >The point here is that the SONET scrambler is not the limiting
> > >> > > > > >issue in achieving low error rates. The issue is having enough
> > >> > > > > >photons/bit, or optical SNR (eye-Q) to accurately recover the
> > >> > > > > >data.
> > >> > > > > >
> > >> > > > > >...Dave
> > >> > > > > >
> > >> > > > > >David W. Martin
> > >> > > > > >Nortel Networks
> > >> > > > > >+1 613 765-2901
> > >> > > > > >+1 613 763-2388 (fax)
> > >> > > > > >dwmartin@nortelnetworks.com
> > >> > > > > >========================