[EFM-OAM] OAM and 802.3ae
Back in March, we opened up Clause 46 of 802.3ae to make some minor
modifications to allow unidirectional OAM to exist across a 10 gigabit link.
Previously, sending data across a unidirectional link was not allowed.
The proposed changes to clause 46 allow for frames to be sent that are
completely surrounded by FAULT codes.
As the editor's note on page 104 mentions, the Clause 49 64B/66B PCS
encoder does not allow a frame to be followed directly by a fault code. It
needs to see idle after the frame; otherwise the frame is errored and
discarded. There are four possible cases when a frame is terminated: since
the XGMII uses four lanes, the terminate can fall on any of lanes 0, 1, 2,
or 3.
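
As a rough illustration of the four cases, the lane that carries the
terminate can be computed from the frame length. This is only a sketch: it
assumes the usual XGMII layout in which the Start character sits in lane 0
and, together with the preamble and SFD, occupies 8 octets before the frame
proper. The function name and constant are made up for this example.

```python
# Assumed layout: Start (lane 0) + 6 preamble octets + SFD = 8 octets
# ahead of the first octet of the destination address.
PREAMBLE_SFD_OCTETS = 8
XGMII_LANES = 4


def terminate_lane(frame_octets: int) -> int:
    """Return the lane (0-3) that carries the Terminate character.

    frame_octets counts destination address through FCS.
    """
    if frame_octets <= 0:
        raise ValueError("frame length must be positive")
    # Terminate occupies the octet position right after the last FCS octet.
    return (PREAMBLE_SFD_OCTETS + frame_octets) % XGMII_LANES


for length in (64, 65, 66, 67):
    print(length, terminate_lane(length))
```

Under these assumptions a 64 byte frame terminates on lane 0, and each
additional octet moves the terminate one lane to the right.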
Now, before I get into the gory details of fixing 802.3ae to make OAM work,
I have a question on Figure 57-4. According to 57.3.2.1, at the bottom of
page 127, if local_link_status equals FAIL, the state machine resets to the
CHECK_MODE state and therefore sets local_stable <= UNSTABLE. However, it
maintains the previous value of local_tx. If the previous state was
SEND_ANY, then local_tx will still be ANY.
If local_link_status stays set to FAIL, the DUT will not enter the
ACTIVE_SEND_LOCAL or PASSIVE_WAIT states due to the global transition into
the CHECK_MODE state. Therefore, the DUT will not reset local_tx to INFO or
NONE. It will stay in the CHECK_MODE state until local_link_status changes
from FAIL and will not change local_tx until this happens. My first
question is whether this is the desired mode of operation, or whether you
want the DUT to reset local_tx to the appropriate value (INFO or NONE) when
it goes from a full link to a unidirectional link.
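
To make the two behaviours concrete, here is a toy model of just the part of
the state machine discussed above. It is not a transcription of Figure 57-4:
the class, the method, the reset_tx_to parameter, and the initial "STABLE"
value are all hypothetical, and only the global transition on
local_link_status = FAIL is modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiscoveryModel:
    """Toy model of the FAIL transition only, not the full state diagram."""

    state: str = "SEND_ANY"
    local_tx: str = "ANY"
    local_stable: str = "STABLE"  # placeholder initial value (assumption)

    def link_fail(self, reset_tx_to: Optional[str] = None) -> None:
        """Apply the global transition taken when local_link_status == FAIL.

        reset_tx_to=None models the reading described in the message:
        local_tx keeps its previous value. Passing "INFO" or "NONE" models
        the alternative being asked about.
        """
        self.state = "CHECK_MODE"
        self.local_stable = "UNSTABLE"
        if reset_tx_to is not None:
            self.local_tx = reset_tx_to


as_read = DiscoveryModel()
as_read.link_fail()
print(as_read.state, as_read.local_tx)          # CHECK_MODE ANY

alternative = DiscoveryModel()
alternative.link_fail(reset_tx_to="INFO")
print(alternative.state, alternative.local_tx)  # CHECK_MODE INFO
```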
Regardless, I'll describe a couple of options for modifications to 802.3ae.
One option, which probably won't make it past working group ballot, would be
to modify Clause 49 to allow for a frame to have a fault code following the
terminate. Of course, there is only one available code word that maintains
the necessary Hamming distance, and as mentioned above, 4 possible
terminations. If a device resets local_tx when local_link_status=FAIL, then
it will only be sending information OAMPDUs, which are a fixed 64 byte
length. The option would be to use the remaining Clause 49 code word to
cover the case where the terminate is on lane 0 (which happens when the
frame length is an exact multiple of 4, as it is for a 64 byte frame), and
the problem is solved. If a device doesn't reset local_tx, then frames of
any length could be sent from the CHECK_MODE state, and this really can't
be solved with a change to Clause 49.
The other solution is to keep all of the changes in Clause 46. We could
require that the RS transmits at least one full column of idle following the
column that has the terminate before transmitting any fault codes. This is
effectively done automatically in the Clause 48 PCS already. I think this
is really the only option we have open to us if we want to support
unidirectional OAM traffic for 802.3ae. This also future-proofs the
existence of other ordered_sets that may be defined to use a Clause 49 PCS.
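
The Clause 46 approach can be sketched as a small transformation on the
column stream the RS sends. This is a simplified model, not proposed
standard text: each XGMII column is reduced to a single label ("DATA",
"TERM", "IDLE", "FAULT"), the two-column block alignment of the 64B/66B
encoder is ignored, and both function names are invented for the example.

```python
def pad_terminate_with_idle(columns, min_idle=1):
    """Ensure every TERM column is followed by at least min_idle IDLE columns."""
    out = []
    for i, col in enumerate(columns):
        out.append(col)
        if col == "TERM":
            # Count the idle columns that already follow the terminate column.
            have = 0
            for nxt in columns[i + 1 : i + 1 + min_idle]:
                if nxt != "IDLE":
                    break
                have += 1
            # Insert only as many idle columns as are still missing.
            out.extend(["IDLE"] * (min_idle - have))
    return out


def fault_follows_terminate(columns):
    """True if any TERM column is immediately followed by a FAULT column."""
    return any(a == "TERM" and b == "FAULT"
               for a, b in zip(columns, columns[1:]))


# A frame completely surrounded by fault ordered sets.
stream = ["FAULT", "DATA", "DATA", "TERM", "FAULT", "FAULT"]
padded = pad_terminate_with_idle(stream)
print(padded)
print(fault_follows_terminate(stream), fault_follows_terminate(padded))
```

In this model the padded stream never places a fault column directly after
the terminate column, whichever lane the terminate falls on.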
I forgot to submit a comment on this, but I'm going to put together some
text to be included in Clause 46 anyway. Does anyone have any comments on
this?
- Eric Lynskey