RE: [10GBASE-T] latency

Current Gigabit Ethernet switch products use store-and-forward and have latencies on the order of 4us-6us from last bit in to first bit out (LIFO).  Conceivably, cut-through versions of these same switches could be designed with sub-2us latencies.  Other high-end IPC switches like InfiniBand, Myrinet, and Quadrics use cut-through routing, with FIFO (first bit in to first bit out) latencies through the switch on the order of a few hundred nsecs.

We expect RDMA-enabled 10Gb NICs to eventually become a common alternative to the specialized 10Gb IBx4 RNIC.  They would not be considered a viable alternative if they had an order of magnitude longer latencies.  Rather, the latencies should be within 2x or 3x of IBx4 to be viable.  So, I believe a 10GE cut-through switch's FIFO latency should be in the hundreds of nsecs, or <1 usec.
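
To make the LIFO/FIFO distinction concrete, here is a rough sketch of the arithmetic. It assumes a store-and-forward switch's FIFO latency is simply the frame's serialization time plus its LIFO latency, ignores preamble and inter-frame gap, and uses 64-byte and 1518-byte frames purely for illustration. The function names and the idea of carrying a 4us LIFO figure over to 10 Gb/s are assumptions for the example, not measurements.

```python
def serialization_delay_us(frame_bytes: int, rate_gbps: float) -> float:
    """Time to clock one frame through a port, in microseconds."""
    bits_per_us = rate_gbps * 1e3  # 1 Gb/s moves 1000 bits per microsecond
    return frame_bytes * 8 / bits_per_us


def store_and_forward_fifo_us(frame_bytes: int, rate_gbps: float,
                              lifo_us: float) -> float:
    """First-bit-in to first-bit-out latency of a store-and-forward hop.

    The switch must receive the whole frame before forwarding, so the
    frame's serialization time is added on top of the LIFO latency.
    """
    return serialization_delay_us(frame_bytes, rate_gbps) + lifo_us


if __name__ == "__main__":
    ASSUMED_LIFO_US = 4.0  # low end of the 4us-6us range quoted above
    for rate_gbps in (1.0, 10.0):
        for frame_bytes in (64, 1518):
            ser = serialization_delay_us(frame_bytes, rate_gbps)
            fifo = store_and_forward_fifo_us(frame_bytes, rate_gbps,
                                             ASSUMED_LIFO_US)
            print(f"{rate_gbps:4.0f} Gb/s, {frame_bytes:4d} B: "
                  f"serialization {ser:.4f} us, "
                  f"store-and-forward FIFO {fifo:.4f} us")
```

Under these assumptions a 1518-byte frame takes about 1.2us just to serialize at 10 Gb/s, so a store-and-forward hop cannot come in under 1 usec FIFO for full-size frames whatever its internal latency; only cut-through can.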

David J. Koenen                   email: 
Network H/W Development SE        phone: 512-432-8642 
HP, Austin                          fax: 512-432-8247

-----Original Message-----
[]On Behalf Of Jonathan
Sent: Tuesday, February 24, 2004 4:07 AM
Subject: RE: [10GBASE-T] latency

So, is there any gutsy person ready to throw out the latency estimates (and, ideally, the break outs) for the complex / high power vs the simple / low power versions of 10GBASE-T?

At this point, even "order of magnitude" estimates would be helpful. I can't tell if we are talking 10's of ns or 10's of ms. Depending on the outer limits of the range, this entire discussion may be moot (which is the essence, I think, of Pat's append below).


p.s. There do exist Ethernet switches that transparently support cut-through switching. RDMA over TOE over Ethernet is one option. There are others.

> -----Original Message-----
> From:
> []On Behalf Of
> Sent: Monday, February 23, 2004 6:05 PM
> To:
> Cc:
> Subject: RE: [10GBASE-T] latency
> Stephen,
> This is a tough question because latency is important for 
> some applications that might use RDMA NICs but there are also 
> constraints on the power available. (An RDMA NIC is an 
> interface card that includes the RDMA protocol plus a TCP/IP 
> offload engine and MAC/PHY. The MAC/PHY would usually be Ethernet.) 
> This is kind of long so here is the executive summary:
> Practically all of the 10GBASE-T market will require 
> reasonable power requirements. There is additional market 
> available if the PHY is very low latency. However BMP is very 
> dependent on reasonable power so if a trade-off of power for 
> latency pushes the power too high, one will lose more market 
> than one gains. Note that low latency in Infiniband and Fibre 
> Channel can be around 100 ns port to port through a switch. 
> --- The details ---
> The upper layers on these cards use plenty of power 
> themselves so I doubt there is more than 5 W available for 
> the 10GBASE-T PHY given the power that a card slot can 
> provide. That number would probably be workable though 
> painful. Less would be better. Much more and it will probably 
> be hard to find slots that can provide the power and remove 
> the heat. There may be early bleeding edge products made with 
> higher power but for broader use the technology should be 
> able to get here.
> Hopefully other NIC vendors will chime in if they disagree 
> about the power.
> Like Ethernet NICs, RDMA NICs are intended to support a wide 
> variety of applications. Some of these applications are 
> pretty traditional networking applications and aren't 
> especially latency sensitive. Other potential applications 
> such as storage and clustering are currently served by more 
> specialized networks (e.g. Fibre Channel and the proprietary 
> predecessors to Infiniband) and are latency sensitive.
> What do clustering (Infiniband) and storage (Fibre Channel) 
> customers consider low latency?
> In Infiniband, the systems vendors generally wanted less than 
> 100 ns port to port through the switch. Fibre Channel 
> switches are about the same. In both technologies they 
> typically are using cut through switching to get this speed. 
> Ethernet switches moved away from their early cut-through 
> operation and generally have much higher latency. If Ethernet 
> wants to serve the very latency sensitive applications, then 
> more than PHYs has to be low latency.
> Neither of these technologies is planning on a 10GBASE-T 
> type PHY. They have PHYs similar to CX4 and the optical PHYs. 
> Infiniband is working on a quad speed version of their 
> existing 2.5 Gig signaling (as 802.3 may end up doing if the 
> backplane study group is chartered). It could be argued that 
> for these very latency-sensitive applications, Ethernet can 
> also use the CX4 and optical PHYs. 
> I'm not sure what the latency range of the proposals under 
> consideration currently is. It seems likely that even the 
> fastest of them doesn't achieve the ultra low latency that 
> the systems vendors want for this class of application.
> Given this, it makes sense to accept some extra delay in 
> return for lower power.
> Regards,
> Pat 
> -----Original Message-----
> From: Stephen Bates []
> Sent: Friday, February 20, 2004 12:34 PM
> To: THALER,PAT (A-Roseville,ex1)
> Cc:
> Subject: RE: [10GBASE-T] latency
> Hi All
> My thanks to everyone who has responded to my email.
> The responses I've been getting tend to suggest that PAUSE should not be
> (and is not) enabled in most Ethernet systems. If flow control is
> required it should be handled higher up the stack. This obviously
> increases the latency and if TCP/IP is implemented in software no exact
> bound on latency can be given since it will be architecture specific.
> However, although latency is not an issue for PAUSE it is a major issue
> for certain applications 10G may be targeting. I believe this brings us
> back to Brad's original request for some figures on end to end latency
> for applications such as cluster computing and RDMA.
> Serag mentioned that we should stick to the low latency solutions if the
> power remains comparable with that of a 10G optical transponder. My
> concern is that the balance between digital and analog power will be
> totally biased to the analog. This implies power consumption will not
> drop as much with technology scaling. In this case we will not see the
> same kind of power consumption drop over time that we saw for
> 1000BASE-T.
> Regards
> Stephen
> --------------------------------------------------------------------------
> Dr. Stephen Bates
> Dept. of Electrical and Computer Engineering      Phone: +1 780 492 2691
> The University of Alberta                         Fax:   +1 780 492 1811
> Edmonton
> Canada, T6G 2V4