
Re: Question about Link Fault Signalling




Ben, Bob,

I also like the idea of not requiring some # of IDLEs to exit a fault
condition. If the fault no longer occurs that should cause the exit.

I do have a bit of a simple question, but I'd like to ask it just for
clarity:

When we say that "the reception of four status messages of the same type
shall indicate that the corresponding fault condition has occurred" I
interpret this to be within a time period of 4 clocks - that is 8
received groups. This seems to make sense if the received message is
alternated with IDLE characters (if IDLEs are deleted for timing, it
doesn't matter since we'll get 4 fault messages sooner). Since there was
no wording describing how long we have to detect the four fault messages
I thought I'd mention it just to be certain. Please correct me if my
assumption is wrong. Thanks.
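
To make the assumption concrete, here is a rough Python sketch of the
detection rule as I read it. It is only an illustration, not text from the
draft: the class name FaultDetector, the 8-column window, and the exit rule
(clear once no fault message of either type remains in the window, with no
requirement for IDLEs) are all assumptions.

```python
from collections import deque

# Column types as seen by the RS on the XGMII receive side. These names are
# illustrative only; they are not taken from the draft.
IDLE, LF, RF, DATA = "IDLE", "LF", "RF", "DATA"


class FaultDetector:
    """Hypothetical RS fault detector using a sliding window of columns."""

    def __init__(self, threshold=4, window=8):
        self.threshold = threshold           # fault messages needed to assert
        self.recent = deque(maxlen=window)   # last `window` received columns
        self.fault = None                    # None, LF, or RF

    def receive(self, column):
        """Record one received column and return the current fault state."""
        self.recent.append(column)
        # LF is checked first so that it takes precedence over RF.
        for kind in (LF, RF):
            if self.recent.count(kind) >= self.threshold:
                self.fault = kind            # also replaces a previous fault
                return self.fault
        # Assumed exit rule: clear once no fault message remains in the
        # window, whether the window holds IDLEs or data.
        if LF not in self.recent and RF not in self.recent:
            self.fault = None
        return self.fault
```

Feeding this alternating LF and IDLE columns asserts LF on the fourth LF
message; switching the input to alternating RF and IDLE then moves the state
directly from LF to RF without passing through a no-fault state.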

Dave

Ben Brown wrote:
> 
> Bob,
> 
> I'll go one step further, in line with a recent note from Rich.
> The presence or absence of LF or RF ordered sets should decide
> what state the RS is in. Requiring some number of IDLEs to
> exit a state is undesirable (in my opinion). The case where
> a MAC is sending data when an RS leaves the LF state results
> in the link partner seeing RF followed directly by packets.
> Though this first packet will likely be trashed (and I have
> no problem with that), it should not require 8 words of IDLEs
> to exit the RF state, resulting in the loss of perhaps many
> packets if they all have minimum IPGs.
> 
> Ben
> 
> David Gross wrote:
> >
> > Hi Bob,
> >
> > I think I might have been a bit unclear in my question judging from the
> > range of responses, but I think Ben saw what I was getting at. The
> > question arose because of the wording of a paragraph, namely:
> >
> > "The reception of four status messages of the same type shall indicate
> > that the corresponding fault condition has occurred. The reception of
> > four Idle control characters on successive RX_CLK edges (eight
> > consecutive Idle control characters) shall clear all fault conditions."
> >
> > The point is you make no mention that one particular fault condition can
> > be cleared if another one is being detected (or likewise if it stops
> > being detected, but the RS is NOT receiving IDLEs).
> >
> > This is the case we seem to be talking about when we come out of an LF
> > condition and immediately start receiving RF at the RS. There is no
> > continuous IDLE stream to remove fault conditions as seems to be
> > signified. There should be a sentence adding that since faults are
> > exclusive when received at the RS (it can get LF or RF or neither, but
> > not both), if a different fault condition is detected while a previous
> > condition has not yet been cleared by IDLEs, then that fault shall be
> > asserted and the previous one cleared.
> >
> > I hope this is a bit clearer than my last attempt at this :)
> >
> > cheers,
> > Dave Gross
> >
> > Grow, Bob wrote:
> > >
> > > See responses in line <RMG>
> > >
> > > --Bob Grow
> > >
> > > -----Original Message-----
> > > From: Ben Brown [mailto:bbrown@xxxxxxxx]
> > > Sent: Wednesday, January 03, 2001 5:23 PM
> > > To: 802.3ae
> > > Subject: Re: Question about Link Fault Signalling
> > >
> > > Bob,
> > >
> > > Please explain how LF and RF can be simultaneously detected
> > > or rather simultaneously exist on a link? When an RS is
> > > sending RF, if the link is up and working fine, the RF will
> > > make it all the way to the link partner's RS. If the link
> > > is broken somewhere, the RF is lost due to the broken link
> > > and a device somewhere in the link will generate LF to
> > > report the broken link.
> > >
> > > <RMG>  I think we agree on the important things, we are differing on
> > > "detect".  The remote DTE is transmitting RF, and an LF is generated because
> > > XAUI can't deskew.  The RF could be detected below the fault (it exists
> > > within the DTE), but the RS can't see it because of the LF.  Thus an LF
> > > naturally has precedence over an RF for signaling.  (In this case you're
> > > saying the RF is lost.)
> > >
> > > Also, I think David's point was this:
> > >
> > >           Device A                     Device B
> > > Time X    PCS not in sync              PCS not in sync
> > >           PCS sending LF               PCS sending LF
> > >           RS detecting LF              RS detecting LF
> > >           RS sending RF                RS sending RF
> > >
> > > Time Y    PCS in sync                  PCS not in sync
> > >           PCS forwarding RF            PCS sending LF
> > >           RS detecting RF              RS detecting LF
> > >           RS sending IDLE              RS sending RF
> > >
> > > In the transition from Time X to Time Y, the RS doesn't
> > > see enough IDLEs to transition from LF to IDLE so it must
> > > have a means of transitioning directly from LF to RF.
> > >
> > > <RMG>I don't see anything in Shimon's rules that requires Idle to change
> > > from LF to RF.  The Idle clears the LinkStatus variable, not which type of
> > > fault is received.  The fault type is determined by receiving the specified
> > > number of messages of the same type.
> > >
> > > It must be clear that detection of RF clears the detection
> > > of LF and vice versa. You can't always expect IDLEs between
> > > LF and RF detection.
> > >
> > > <RMG>I agree, I think it is clear in Shimon's rules.
> > >
> > > Regards,
> > > Ben
> > >
> > > "Grow, Bob" wrote:
> > > >
> > > > Your proposed change is too subtle for me while I am trying to finish my
> > > > ballot comments, but I think you erred in thinking RF is repeated; it
> > > > isn't.
> > > >
> > > > I don't think the existing text has a fatal embrace nor a race condition,
> > > > so I can't see any reason to change it.  RF is not repeated, so once LF
> > > > goes away, only Idle is transmitted until RF is no longer received.
> > > >
> > > > BTW, RF and LF can be simultaneously detected though the RS will only see
> > > > the higher priority LF.
> > > >
> > > > (FYI, someone else has pointed out that I may have misread Shimon's
> > > > SuggestedRemedy that the reset to LinkStatus = OK occurs on 8 * 4 bytes of
> > > > Idle, not 2 * 4 bytes as my edited version records.)
> > > >
> > > > --Bob
> > > >
> > > > -----Original Message-----
> > > > From: David Gross [mailto:dgross@xxxxxxxxxxxxxxxxxx]
> > > > Sent: Wednesday, January 03, 2001 2:35 PM
> > > > To: Grow, Bob
> > > > Cc: rtaborek@xxxxxxxxxxxxx; HSSG
> > > > Subject: Re: Question about Link Fault Signalling
> > > >
> > > > Thanks for the response Bob,
> > > >
> > > > I'd just like to make one clarification which I think might be
> > > > necessary. I'd like to see "In the case of a Local Fault condition..."
> > > > rather than "Upon detection of a Local Fault condition..." (and likewise
> > > > for Remote Fault). The reason for this is that upon start-up, one
> > > > can assume that both devices will be in LF and transmitting RF. This
> > > > implies that once a device can start receiving data (i.e., it no longer
> > > > has an LF) it will be receiving RF. As a result, as the definition seems
> > > > to imply, the Fault conditions won't be cleared (the IDLE control words
> > > > won't be detected for 2 clock edges), but instead Remote Fault will be
> > > > detected. Since RF and LF cannot be detected at the same time (LF
> > > > prevents the transmission of received RF), it is logical that LF will be
> > > > cleared while RF will be asserted. There should be something in there
> > > > which allows for the clearing of LF in such a case, and jumping from the
> > > > LF condition immediately to the RF condition. Let me know what you
> > > > think.
> > > >
> > > > -Dave Gross
> > > >
> > > > Grow, Bob wrote:
> > > > >
> > > > > Shimon submitted a comment proposing changing the entry to link down
> > > > > (either RF or LF) from 3 to 4 status messages, with exit on 8
> > > > > consecutive idle bytes.  While I am open to discussion on the numbers, I
> > > > > think his proposed text with improved description of the protocol is a
> > > > > great starting point for discussion.  Since this has come up again, here
> > > > > is a slightly edited version of his proposed text.
> > > > >
> > > > > "46.2.6 Link fault signaling
> > > > >
> > > > > "Two link fault conditions are specified for 10Gb/s operation: Local
> > > > > Fault and Remote Fault. The Local Fault condition at the Reconciliation
> > > > > Sublayer indicates that a link failure has been detected on the receive
> > > > > path by a local DTE sublayer. The source of the failure could be at the
> > > > > remote transmitter, the interconnect between the two DTEs, at one of
> > > > > the local DTE's devices or the interconnect between the local DTE's
> > > > > devices. The Remote Fault condition is generated by the Reconciliation
> > > > > Sublayer, and when received at a Reconciliation sublayer indicates that
> > > > > a link failure has been detected by the remote DTE. The source of the
> > > > > failure could be at the local transmitter, the interconnect between the
> > > > > two DTEs, at one of the remote DTE's devices or the interconnect
> > > > > between the remote DTE's devices.
> > > > >
> > > > > "Fault conditions are conveyed over the XGMII using status messages.
> > > > > All status messages are four bytes in length, and are sent on a single
> > > > > XGMII clock edge. A status message is indicated by a Pulse control
> > > > > character aligned to lane 0, with the status condition encoded in the
> > > > > three data bytes of lanes 1, 2 and 3. The status encodings are shown in
> > > > > Table 46-4."
> > > > >
> > > > >                               <Table 46-4>
> > > > >          <For the sake of completeness, also show Lane 0 encoding>
> > > > >
> > > > > "A PHY indicates Local Fault conditions to the Reconciliation sublayer
> > > > > by alternating the corresponding status message with Idle control
> > > > > characters on RXC<3:0> and RXD<31:0>.  The Reconciliation sublayer sends
> > > > > the Remote Fault indication to the remote DTE by alternating the Remote
> > > > > Fault message with Idle control characters on TXC<3:0> and TXD<31:0>.
> > > > >
> > > > > "The PHY repeats a Remote Fault indication received from the remote DTE
> > > > > unless a Local Fault condition is detected resulting in the PHY
> > > > > overwriting the received data with the Local Fault indication.
> > > > >
> > > > > "The Reconciliation sublayer continuously monitors RXC<3:0> and
> > > > > RXD<31:0> for status messages. The reception of four status messages of
> > > > > the same type shall indicate that the corresponding fault condition has
> > > > > occurred. The reception of four Idle control characters on successive
> > > > > RX_CLK edges (eight consecutive Idle control characters) shall clear all
> > > > > fault conditions.
> > > > >
> > > > > " Upon detection of a Local Fault condition, the Reconciliation sublayer
> > > > > shall:
> > > > >  1) Set the link_fail status indication.
> > > > >  2) Inhibit the transmission of MAC frames.
> > > > >  3) Continuously send alternating Remote Fault messages and Idle control
> > > > > characters.
> > > > >
> > > > >  "Upon detection of a Remote Fault condition, the Reconciliation
> > > > > sublayer shall:
> > > > >  1) Set the link_fail status indication.
> > > > >  2) Inhibit the transmission of MAC frames.
> > > > >  3) Continuously send Idle characters.
> > > > >
> > > > > "After detecting that the Fault condition has cleared (both Local and
> > > > > Remote), the Reconciliation sublayer shall:
> > > > >  1) Clear the link_fail status indication.
> > > > >  2) Enable the transmission of MAC frames."
> > > > >
> > > > > --Bob Grow
> > > > >
> > > > > -----Original Message-----
> > > > > From: David Gross [mailto:dgross@xxxxxxxxxxxxxxxxxx]
> > > > > Sent: Wednesday, January 03, 2001 8:48 AM
> > > > > To: rtaborek@xxxxxxxxxxxxx
> > > > > Cc: HSSG
> > > > > Subject: Re: Question about Link Fault Signalling
> > > > >
> > > > > Hi Rich,
> > > > >
> > > > > I have a quick question about Remote Fault I was hoping you could
> > > > > answer. In 46.2.6, it says: "Reception of multiple local fault messages
> > > > > causes the Reconciliation Sublayer to inhibit the transmission of
> > > > > frames by MAC, and to encode remote fault status messages on TXC<3:0>
> > > > > and TXD<31:0>". It goes on to specify that reception of three LF messages
> > > > > sets link_fail to 1, and none in 6 clock periods clears link_fail.
> > > > >
> > > > > My question is this: I believe we said that upon receiving RF, the RS
> > > > > will output an IDLE stream until it no longer receives RF. If this is
> > > > > so, how many RF messages set this condition to be true, and in how many
> > > > > clocks do we say that this condition is cleared if no RFs are detected?
> > > > > Is it similar to LF, or do we only require that one RF be detected (and
> > > > > then for how long before we reset this IDLE output condition of the RS
> > > > > Tx)?
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > > > -Dave Gross
> > >
> > > --
> > > -----------------------------------------
> > > Benjamin Brown
> > > AMCC
> > > 2 Commerce Park West
> > > Suite 104
> > > Bedford NH 03110
> > > 603-641-9837 - Work
> > > 603-491-0296 - Cell
> > > 603-626-7455 - Fax
> > > 603-798-4115 - Home Office
> > > bbrown@xxxxxxxx
> > > -----------------------------------------