
RE: Link Status thoughts


I appreciate your recent notes on the reflector, which helped me to 
understand how my LSS proposal sent out for Tampa has been mixing up two 
issues: the link status mechanism by remote_fault(RF)/local_fault(LF)
(in your terms) and that by the channel_reset triggered by the STA.

I am mixing these two issues up since I believe that both of them 
ultimately serve a single status register bit: overall link status. 
I see no reason to separate these two issues since we can use MDIO for 
fault debugging within the Local Device.  (Here I will not discuss 
fault debugging out to the Link Partner, since this is nothing other than 
OAM&P, which the community seems to prefer not to mix up with BL/RF.)

In this message I will discuss these two issues separately. 

1) link status mechanism by RF/LF

I would like to point out that we are in good agreement if you read my 
Break Link(BL) as your Local Fault(LF).  Note that the original link 
status mechanism proposed with the LSS has defined link_status with a 
simple Boolean equation:

  link_status = !remote_fault * !RF_detect

This implies that the link is up when (1) there is no fault on the entire 
receive path from Link Partner to Local Device (and hence the Local 
Device does not send RF), and (2) there is no fault on the entire transmit 
path from Local Device to Link Partner (and hence RF is not received via 
the receive path).  I agree with you that we may need to change the 
signal names, for example to Alarm Indication Signal (AIS) or Link Alarm 
(LA) instead of Local Fault (or Break Link in my terms).
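
The Boolean equation above can be sketched as follows.  This is only an 
illustration: Python is used for convenience, the function and argument 
names are hypothetical, and '*' in the equation is read as logical AND.

```python
def link_status(remote_fault: bool, rf_detect: bool) -> bool:
    """link_status = !remote_fault * !RF_detect, with '*' read as AND.

    remote_fault: the Local Device sees a receive-path fault (and so sends RF).
    rf_detect:    the Local Device is receiving RF from the Link Partner.
    """
    return (not remote_fault) and (not rf_detect)


# Print the full truth table: the link is up only in the first row.
for remote_fault in (False, True):
    for rf_detect in (False, True):
        print(remote_fault, rf_detect, link_status(remote_fault, rf_detect))
```

Only the case with no fault in either direction yields link_status true.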

Therefore, I think we could agree that the Local Fault signal (or Break 
Link in my terms) on the receive path will be answered by the Remote 
Fault signal on the transmit path at the sublayer where the RF 
mechanism is implemented. 

I think the remaining differences in preference between us on this 
RF/LF(BL) issue are as follows (the first list is my understanding of 
your preference, the second is mine):

- Relay the RF&LF status, sublayer by sublayer, by defining each 
  (RS/)XGXS/PCS-specific RF/LF signals.
- In Local Fault condition,  LF must be sent up the receive path.
  (NoLF means 'link is fine')
- Use some variety of /Z/Z/Z/Z/ Column for signaling identification.

- Define the RF&LF(BL) mechanism only in the RS.  RF&LF(BL) signals should 
  be transparent in intermediate sublayers (XGXS/PCS) in the standard.
  Leave the practical instantiation point (chip) to the implementation.
- In Local Fault condition, LF(BL) need not be sent up the receive path.
  NoLF means another Local Fault: 'Link Signaling is NOT fine'.
  Instead, when 'link is fine', NoRF&NoLF should be passed at regular 
  intervals as a heartbeat.
- Use some variety of /Z/D/D/D/ Column for signaling identification.

I would like to add a comment on the heartbeat.  In New Orleans the 
main objection to this aspect of LSS seemed to come from the argument 'we 
already have Idles for a heartbeat', which resulted in Y:5, N:29 (A:>40) 
in the straw poll.  However, I am not yet convinced that many of the 
voters recognized that 802.3ae will have multiple intermediate 
links such as XGXS-to-XGXS, PCS-to-PCS, and XGXS-to-XGXS.  I could 
agree that Idles are best used as a heartbeat for each intermediate 
link.  My argument here is that we had better adopt another heartbeat 
for overall link status, since this minimizes what the standard requires 
of the intermediate PCS/XGXS: they just produce Idles if they do not have 
input sync.  Neither Idle Equivalent nor BL/RF relay at each PCS/XGXS 
is required.  Note that this might work well even without a pin from 
PMD/PMA to PCS for out-of-band LF signaling, although I myself have no 
preference as to whether or not to use a pin here.
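
The heartbeat idea above can be sketched roughly as below.  This is a 
minimal illustration under stated assumptions, not part of any proposal: 
the class name, the method names, and the timeout value are all 
hypothetical, and time is passed in explicitly so that the example is 
deterministic.

```python
class OverallLinkMonitor:
    """Hypothetical RS-level monitor for overall link status.

    The link counts as up only while NoRF&NoLF heartbeats keep arriving.
    Plain Idles from an intermediate PCS/XGXS carry no heartbeat, so the
    status drops once the (assumed) timeout expires.
    """

    def __init__(self, timeout: float):
        self.timeout = timeout   # assumed heartbeat timeout, arbitrary units
        self.last_ok = None      # time of the last NoRF&NoLF heartbeat

    def on_status_message(self, rf: bool, lf: bool, now: float) -> None:
        if not rf and not lf:
            self.last_ok = now   # heartbeat: 'link is fine'
        else:
            self.last_ok = None  # explicit RF or LF(BL) clears the status

    def link_status(self, now: float) -> bool:
        return self.last_ok is not None and (now - self.last_ok) <= self.timeout


mon = OverallLinkMonitor(timeout=1.0)
mon.on_status_message(rf=False, lf=False, now=0.0)
print(mon.link_status(now=0.5))  # heartbeat is recent
print(mon.link_status(now=2.0))  # heartbeat missing, only Idles seen
```

In this sketch the intermediate sublayers need no fault relay logic: the 
absence of the heartbeat is what reports the fault.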

2) link status mechanism by the channel_reset

I have assumed that the 802.3ae MAC requires a control register bit 
for resetting the Layer-1 channel; i.e. clearing the link status bit in 
the Link Partner by the Local Device's STA.  I am 
using the term link_reset for this control register bit.  Furthermore, 
I have also assumed another control bit, power_down, that notifies the 
Link Partner that the Local Device is going to shut down.  If either of 
these control register bits is asserted, my design has the Local 
Device send Break Link on the transmit path to the Link Partner.

Looking from the Link Partner's side, this is exactly the same 
as Local Fault signaling since both come on the receive path from the 
Local Device to the Link Partner.  Both are Alarm Indication Signals 
(AIS) received at the Link Partner.  Both should clear the overall 
link status bit in the Link Partner.  

That's why I am mixing up these two mechanisms: RF/LF and BL.

Furthermore, this Local Fault signal (or Break Link in my terms) will 
eventually be answered by the Remote Fault signal from the RF mechanism 
in the Link Partner.  This implies that the channel_reset issued by 
the Local Device's STA can be acknowledged by receiving this RF.  

So, if we design the control register bit channel_reset to be 
latched high until RF_rcvd, we can perform a complete reset of 
the duplex channel without employing Shimon's state synchronization 
process between the Local Device and the Link Partner.  I am not 
yet convinced that we should employ such status synchronization, which 
requires longer waiting time (-300ms) or unnecessary link-distance 

I believe that BL answered by RF is better than BL answered by BL.
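
The 'BL answered by RF' handshake described above can be sketched as 
follows.  This is a simplified illustration only: the class and method 
names and the string signal labels are hypothetical, the power_down bit 
is omitted, and recovery of link status after the reset is not modeled.

```python
class LocalDevice:
    """Side whose STA issues channel_reset."""

    def __init__(self):
        self.channel_reset = False   # control register bit, latched

    def sta_write_channel_reset(self) -> None:
        self.channel_reset = True    # latched high until RF is received

    def tx_signal(self) -> str:
        # While the bit is latched, Break Link is sent on the transmit path.
        return "BL" if self.channel_reset else "NORMAL"

    def on_receive(self, signal: str) -> None:
        if signal == "RF":
            self.channel_reset = False   # RF_rcvd acknowledges the reset


class LinkPartner:
    """Side that receives BL and answers with RF."""

    def __init__(self):
        self.link_status = True

    def on_receive(self, signal: str) -> str:
        if signal == "BL":
            self.link_status = False     # BL clears overall link status
            return "RF"                  # answered by Remote Fault
        return "NORMAL"


local, partner = LocalDevice(), LinkPartner()
local.sta_write_channel_reset()
reply = partner.on_receive(local.tx_signal())   # BL goes out, RF comes back
local.on_receive(reply)                         # latch is released by RF
print(partner.link_status, local.channel_reset)
```

The point of the sketch is that the returning RF doubles as the 
acknowledgement, so no separate synchronization step is modeled.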

Best Regards,

At 12:55 00/11/01 -0700, pat_thaler@xxxxxxxxxxx wrote:
> Ben,
> Link_status is not complete at the PCS. The XAUI link
> is long enough that it is possible that a marginal 
> device has been attached and lock is not obtained. 
> We can have the case where an XGXS connects to a 
> daughter card or transceiver module and that device
> is not present. Also, as you point out, a DTE XGXS 
> can function as a PCS. 
> For indicating a local fault in the receive direction,
> this is pretty simple to cope with. For the sake of
> easing the discussion, let's say that the status line
> is a local fault line - asserted when there is loss
> of signal, loss of lock, or loss of sync. It may be
> an inband signal or an out of band signal.
> At each sublayer, the local fault out (going up the chain) 
> should then be the OR of the local fault in (from
> the direction toward the MDI) and the internal fault
> detection (loss of signal, loss of lock, loss of sync)
> of the sublayer.
> If the signal is out of band, this is just an OR gate.
> If the signal is in band, then if I'm getting valid
> signal from below I send it on - if the layer below
> has detected a local fault the signal I forward will
> indicate the local fault. If I'm not getting valid
> signal from below, I send the local fault in band signal.
> I would suggest that we either keep local fault out of
> band all the way up the chain or we convert it from
> out of band to in band at the PCS. My preference would
> be the latter though the one concern I have is it uses
> more of our fairly limited code space to signal it simply.
> I have thought of one way around the code space problem.
> Right now in the Muller proposal, RF is signaled on 
> all 4 lanes. One could send the RF code word on two lanes
> with K or R on the other two lanes and reuse the same
> code word for LF but put it on the other two lanes or
> across all 4 lanes (in case one is worried about lane
> mix-ups). This is a small complication of the Muller
> proposal since currently the same thing is sent on 
> all 4 lanes, but perhaps no more difficult to implement
> than having two kinds of code and it preserves our
> unused code space.
> The other question you raise is what do we do when the
> receive link is fine but there is a problem on the
> transmit side between the MAC and the MDI. For instance
> a PHY XGXS can't get sync on the XAUI signal. I think
> that that should also assert local fault toward the MAC.
> Identifying the exact nature of the local fault should 
> be left to the MDIO registers. That still leaves the
> question of what that PHY XGXS should transmit. Candidates
> are:
> Remote Fault 
> Local Fault
> Idle
> yet another signal.
> If we send Remote Fault then Remote Fault's meaning becomes
> the other end of the line detects a problem. The problem
> might be in the receive or transmit side, but something
> isn't in lock over there. You have to go over to the remote
> node to narrow it down.
> If we send Local Fault, then Local Fault becomes more like
> Receive fault - there is a problem somewhere between the
> transmit of the other MAC and me - and Remote Fault is 
> a Transmit fault - there is a problem between my MAC transmit
> and the other MAC's receive. To figure out if the "Local
> Fault" is from the other guy's Phy I have to query my
> sublayers via MDIO to see if any of them is out of lock
> and generating Local Fault. This seems better to me
> but it implies we should change the signal names.
> If we send Idle, then it seems a hole in our fault detection.
> Why do we bother to publish some faults to the remote node
> and leave out others?
> A new signal - I really want to keep this simple and our
> code space for simple signals is limited so I'd rather
> not do this one.
> I'll try to get a presentation ready for next week.
> Pat
> -----Original Message-----
> From: Ben Brown [mailto:bbrown@xxxxxxxx]
> Sent: Tuesday, October 31, 2000 5:46 PM
> To: 802.3ae
> Subject: Re: Link Status thoughts
> Pat,
> Would link_status be complete at the PCS? What if a
> XAUI link was used to extend the XGMII? Does this
> need any indication of the link from the PCS toward
> the MDI? What about the case of a DTE XGXS turned
> into a PCS because the PHY XGXS/PCS/PMA is really
> only a retimer? Does the DTE XGXS not care about
> signal_detect in one case but care about it in the
> other?
> All these cases show me a pretty confusing picture.
> I'm hoping someone can enlighten all of us next
> week :)
> Ben
> pat_thaler@xxxxxxxxxxx wrote:
> > 
> > Ben,
> > 
> > I've looked at your slides. I'm not sure if you intend all
> > the signals shown coming out of blocks (especially those on
> > slides 8 and 10) to be pins. If so, that is way too many
> > pins for link status.
> > 
> > What I propose is one pin from the PMD to the PMA indicating
> > whether the link is good (that is, signal detect is okay on
> > all inputs) or not. One pin from the PMA to the PCS/WIS
> > indicating whether the receive link at the PMA and below
> > is okay. That would be an AND of the signal it gets from
> > the PMD and the PMA's lock signal. The WIS to PCS interface
> > is only defined as a logical interface, so it would have
> > a message defined to pass whether the link below was up -
> > the AND of the signal it gets from the PMA and its frame
> > sync acquired status.
> > 
> > Pat
> > 
> > -----Original Message-----
> > From: Ben Brown [mailto:bbrown@xxxxxxxx]
> > Sent: Thursday, October 26, 2000 8:44 AM
> > To: 802.3ae
> > Subject: Link Status thoughts
> > 
> > Hey,
> > 
> > After a discussion some of us had during the editorial
> > meeting yesterday, I thought I'd put some of our thoughts
> > down on paper. There are still quite a few questions
> > buried in these slides. Anyone with comments, please
> > feel free to share them. I'd like to have these comments
> > before the November meeting so we can put a proposal on
> > the table.
> > 
> > Thanks,
> > Ben
> > 
> > --
> > -----------------------------------------
> > Benjamin Brown
> > AMCC
> > 2 Commerce Park West
> > Suite 104
> > Bedford NH 03110
> > 603-641-9837 - Work
> > 603-491-0296 - Cell
> > 603-647-2291 - Fax
> > 603-798-4115 - Home Office
> > bbrown@xxxxxxxx
> > -----------------------------------------

NTT Network Innovation Laboratories
TEL +81-468-59-3263  FAX +81-468-55-1282