Re: [HSSG] 40G MAC Rate Discussion

There is a 32 lane version.  The question is how viable that is from an implementation standpoint, given that a 32 lane interface is 128 PCB traces.

There is also overhead associated with moving data from the adapter over the PCIe host bus and into the host memory. 

Therefore, the bandwidth capability of PCIe is not just what can be calculated.

Looking at when 10GbE first showed up, the only host bus available was PCI-X 133.  It is okay for the network to exceed the capability of the host bus, because over time, the host bus and the servers will catch up, just as they are starting to do now.


-----Original Message-----
From: sanjeev mahalawat <sanjeevmahalawat@GMAIL.COM>
To: <>
Sent: Tue Apr 10 10:39:20 2007
Subject: Re: [HSSG] 40G MAC Rate Discussion


Peter is right here. PCIe 2.0 goes up to 160 Gbps.

Brad "only mentioned" x8 and x16 lane configurations. There is an x32 lane
configuration too, and at 5 Gbps per lane it is 160 Gbps.

Regarding overhead, even with 25% overhead you get 120 Gbps throughput. This is by no means a 100GE bottleneck. Whether a processor/memory system could fill such a pipe is a separate discussion, but then I have my doubts about 40GE too.
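
The arithmetic behind those figures can be sketched as follows. This is only an illustration using the numbers quoted in this message (5 Gbps per lane, 32 lanes, 25% overhead); the function name and parameters are hypothetical, and the overhead is treated as a single lump fraction rather than a model of the PCIe protocol.

```python
def pcie_throughput_gbps(lane_rate_gbps, lanes, overhead_fraction):
    """Return (raw, usable) bandwidth in Gbps for one direction.

    overhead_fraction is an assumed lump-sum loss (0.25 means 25%),
    not a breakdown of encoding or packet overhead.
    """
    raw = lane_rate_gbps * lanes
    usable = raw * (1.0 - overhead_fraction)
    return raw, usable


# Figures quoted in the message: PCIe 2.0, x32, 25% overhead.
raw, usable = pcie_throughput_gbps(5.0, 32, 0.25)
print(f"raw = {raw:.0f} Gbps, usable = {usable:.0f} Gbps")
```

With the quoted inputs this gives 160 Gbps raw and 120 Gbps after overhead, which is where the comparison against a 100GE link comes from.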

Since the PCIe spec is subscriber-only, I can't post it here.


On 4/10/07, Vandoorn, Schelto <> wrote:



        I don't believe your PCIe Gen2 statement is correct. See Brad Booth's reply on an earlier thread regarding the even faster following Gen3.



        PCIe gen 3 is expected to be 10 Gbps.  The calculation would be 8 Gbps (unencoded data) * 8 (more typical lane count) * 75% (PCIe efficiency) = 48 Gbps.  A 16 lane PCIe host bus would be able to handle about 96 Gbps which would be close to the maximum line rate of 100 GbE.
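
The estimate above can be reproduced with a short sketch. It assumes only the figures stated in the paragraph (8 Gbps of unencoded data per lane and 75% PCIe efficiency); the function name and the comparison against 40 and 100 Gbps line rates are illustrative additions, not part of any specification.

```python
def usable_gbps(data_rate_per_lane_gbps, lanes, efficiency):
    """Estimated host-bus throughput: per-lane data rate x lanes x efficiency."""
    return data_rate_per_lane_gbps * lanes * efficiency


# Expected Gen3 figures from the paragraph: 8 Gbps unencoded, 75% efficient.
for lanes in (8, 16):
    bw = usable_gbps(8.0, lanes, 0.75)
    print(f"x{lanes}: {bw:.0f} Gbps "
          f"(covers 40 GbE: {bw >= 40}, covers 100 GbE: {bw >= 100})")
```

Under these assumptions an x8 bus covers 40 GbE with some margin, while an x16 bus falls just short of the 100 GbE line rate.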


        While the host bus may be able to handle that bandwidth, the CPU and memory will lag that bandwidth capability.  Therefore, 40 GbE is probably sufficient for most servers over the next 5-10 years.






        -----Original Message-----
        From: Peter Harrison [mailto: pharrison@NETFLIX.COM <mailto:pharrison@NETFLIX.COM> ]
        Sent: Tuesday, April 10, 2007 9:26 AM
        To: <>
        Subject: Re: [HSSG] 40G MAC Rate Discussion


        I also agree with Donn's comments.


        I additionally don't see the overwhelming need for 40Gbps to the server

        a. The newly created PCI Express v2.0 standard already has a max data
        rate in excess of 100Gbps

        b. There is a trend in the thread that is concerned about the additional
        complexities of dual purposed hardware and software, and the distraction
        of a dual standard.

        c. Web video on demand applications have NIC input/output ratios in
        excess of 50:1 versus approximately 10:1 for traditional HTML. On the
        surface 40Gbps would seem sufficient, but it doesn't account for the
        viral proliferation of video.

        My primary concern is the backbone capacities of the ISPs closely
        followed by the aggregate Web capacity of server farms. LAG for servers
        is an acceptable compromise.

        If a dual standard is to be pursued for servers, instead of 40Gbps, I'd
        rather see 100Gbps interfaces clocked down to 50 Gbps (using inverse mux
        hardware?) as a driver configuration option.


        Peter Harrison
        Netflix Networking
        100 Winchester Circle, Los Gatos, CA 95032


        -----Original Message-----
        From: donnlee [mailto:donnlee@GMAIL.COM]
        Sent: Tuesday, April 10, 2007 12:47 AM
        To: <>
        Subject: Re: [HSSG] 40G MAC Rate Discussion


        As an end-user who presented to the HSSG along with other end-users
        that 100GE is too late, I feel like our urgency and pain have fallen on
        deaf ears when I see messages like those below. Does the IEEE want
        end-user input or not?

        To reiterate for those who did not hear the end-user presentations:

        a. 10GE pipe is too small. We have hit the LAG & ECMP ceilings of 10GE
        implementations today.

        b. We have to use multiple 10GE LAGs and build a Clos network to keep
        up with traffic demands. This results in a ridiculous number of cables
        and an operational nightmare. 100GE links would greatly exorcize and
        scale our networks. See "A Web Company's View on Ethernet", HSSG,

        c. If 10GE LAGs have grown to nightmare-ish size today, imagine what
        additional pent-up demand will be added between now and 2010?

        d. The largest 10GE switch commercially available today is too small.
        We would like much larger switches but because of (b), we really
        require 100GE switches. See "Saturating 100G and 1T Pipes", HSSG,

        e. When 100GE is available in 2010, we will have to LAG them on Day
        One because a single 100GE will be too small.

        f. I had no idea 10GE was a "failure" or "too early" until I visited
        an IEEE meeting. As far as we're concerned, we can't buy enough of it.
        Problem we have is the 10GE boxes do not have enough 10GE interfaces.
        We need more; a lot more.

        g. As 100GE is late, many of us are working with vendors who have
        PRE-STANDARD 100GE plans. Because the need is so great, I have no
        problems building a fabric of proprietary links as long as the links
        on the outer edges of the fabric are standard.


        Network Architecture Team
        Google Inc.