Re: [RPRWG] RE: Divergent Simulator Results Explained
- To: David V James <dvj@xxxxxxxxxxxx>
- Subject: Re: [RPRWG] RE: Divergent Simulator Results Explained
- From: Mike Takefman <tak@xxxxxxxxx>
- Date: Wed, 16 Jul 2003 09:26:11 -0400
- CC: Rpr GroupOf Ieee <stds-802-17@xxxxxxxx>,       Jon Schuringa<jon.schuringa@xxxxxxxxxxxx>, bjornfd@xxxxxxxxx,       bjoernal@xxxxxxxxxx, petterte@xxxxxxxxxx, nuzun@xxxxxxxxx,       leonb@xxxxxxxxxxxxx, kkrama@xxxxxxxxxxxxxxxx, JLemon@xxxxxxxxxxxx,       hpeng@xxxxxxxxxxxxxxxxxx, yan@xxxxxxxxxxxxxxx, huang@xxxxxxxxxxxxxxx,       mei@xxxxxxxxxxxxxxxx, SteinGjessing <steing@xxxxxxxxx>
- References: <FMEBLOEMFEFGGFLELMLNKEEDCLAA.dvj@xxxxxxxxxxxx>
- Sender: owner-stds-802-17@xxxxxxxxxxxxxxxxxx
- User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.1) Gecko/20020823
David,
I agree SRP != RPR, but you missed my point.
Real network deployments don't require solutions that have
zero corner cases. The test is whether something is good
enough to solve real-world problems.
mike
David V James wrote:
> Mike,
> 
> With regard to:
> 
>>>Frankly, certain
>>>people can complain about SRP fairness, but it has been
>>>deployed for years in REAL networks and we have not
>>>had any complaints of performance issues.
>>
> 
> Yes, but:
>   assert(RPR!=SRP);
>   if (RPR!=SRP)
>     RPR_Behavior != SRP_Behavior;
> 
> In particular, I believe:
> 1) classA/classB/classC behaviors of RPR are different from SRP.
> 2) The single-queue option is not supported in SRP.
> 
> In any case, RPR will have to stand up to sponsor-ballot
> analysis. I know several reviewers, including myself, who
> will not buy the "it's like SRP and (trust us) that works"
> type of arguments.
> 
> Comprehensible text, specific equations, and proofs of worst-case
> latencies are needed. I had to provide such things when writing
> a Master's thesis; I hope we apply similar quality controls to
> IEEE Standards.
> 
> DVJ
> 
> David V. James
> 3180 South Ct
> Palo Alto, CA 94306
> Home: +1.650.494.0926
>       +1.650.856.9801
> Cell: +1.650.954.6906
> Fax:  +1.360.242.5508
> Base: dvj@xxxxxxxxxxxx
> 
> 
>>>-----Original Message-----
>>>From: Mike Takefman [mailto:tak@xxxxxxxxx]
>>>Sent: Monday, July 14, 2003 8:53 AM
>>>To: Jon Schuringa
>>>Cc: Stein Gjessing; dvj@xxxxxxxxxxxx; mei@xxxxxxxxxxxxxxxx;
>>>huang@xxxxxxxxxxxxxxx; yan@xxxxxxxxxxxxxxx; hpeng@xxxxxxxxxxxxxxxxxx;
>>>JLemon@xxxxxxxxxxxx; kkrama@xxxxxxxxxxxxxxxx; leonb@xxxxxxxxxxxxx;
>>>nuzun@xxxxxxxxx; petterte@xxxxxxxxxx; bjoernal@xxxxxxxxxx;
>>>bjornfd@xxxxxxxxx
>>>Subject: Re: Divergent Simulator Results Explained
>>>
>>>
>>>All,
>>>
>>>I know that our implementation of SRP had fair
>>>sharing of the BW between STQ and stage buffer
>>>when the STQ threshold was below the low limit.
>>>Necdet had assured me that this behavior made
>>>it into the standard. If it did not, or got cut
>>>somehow, we may want to put it back in.
>>>
>>>I am not sure why some of the changes from D2.2 to D2.3
>>>were made, but they appear to have created this
>>>problem where it did not exist before.
>>>
>>>We also have to take a careful look at these scenarios
>>>and do a sanity check on them. Frankly, certain
>>>people can complain about SRP fairness, but it has been
>>>deployed for years in REAL networks and we have not
>>>had any complaints of performance issues. We have
>>>to be careful not to be optimizing for corner cases
>>>that cause worse problems in the general case.
>>>
>>>cheers,
>>>
>>>mike
>>>
>>>Jon Schuringa wrote:
>>>
>>>>All,
>>>>
>>>>I agree with Mike that the implementation of the shapers could be the
>>>>reason for the different simulation results. I also found some problems
>>>>with my implementation some time ago.
>>>>
>>>>BUT, it is now very clear to me why shaperD has a serious design flaw. I
>>>>will explain that here without "proof by simulation". Please take some
>>>>time to read it.
>>>>
>>>>
>>>>
>>>>Consider the following very simple situation (like the one Wang Chao
>>>>proposed some months ago):
>>>>
>>>>------(A)-------------(B)-------------(C)----
>>>>
>>>>       o-------------flow1------------->
>>>>                       o------flow2---->
>>>>
>>>>
>>>>Both flows are class C flows and have enough to send, say 100% of lineRate.
>>>>Furthermore we have 50% classA0, so unreservedRate is 50% of the lineRate.
>>>>It does not matter whether classA0 traffic is actually sent.
>>>>
>>>>What do we expect to happen?
>>>>1) STQ in station B grows
>>>>2) STQ in B reaches lowThreshold
>>>>3) Station B advertises a fair rate
>>>>4) Flow1 and flow2 both get unreservedRate/2 = 25% of the lineRate
>>>>
>>>>What will happen?
>>>>1) STQ in B will not fill up to lowThreshold:
>>>>Each time that a small number of packets is in the STQ, station B will
>>>>forward these OR add local packets. The output will not be idle until the
>>>>STQ is empty. This is *very* important to understand.
>>>>Now, all these packets decrement the shaperD credits, so:
>>>>    a) We are decrementing credits at lineRate, and
>>>>    b) Increasing at unreservedRate, which is less than lineRate.
>>>>
>>>>ShaperD will get below the loLimit, thus stopping add traffic. Station B now
>>>>drains the STQ at lineRate, keeping the shaperD below loLimit. Thus the STQ
>>>>cannot fill at all! Initial credits, or timing in simulation programs, are
>>>>actually not important in this case.
>>>>
>>>>2) Interestingly, both flows get 25% in the current RPR version. Station B
>>>>is reporting congestion, but not because of STQ thresholds! A station is
>>>>also reporting congestion if (lpNrXmitRate > unreservedRate). And that is
>>>>exactly what happens because station B is transmitting at lineRate as long as
>>>>there is something in its STQ. The fact that the outcome of this experiment
>>>>is right might be the reason why some people did not recognize this to be a
>>>>problem.
>>>>
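The credit dynamics described in points 1) and 2) above can be sketched as a small fluid model. This is only an illustration under stated assumptions, not code from any of the simulators discussed in this thread: the one-tick time step, the hi_limit value, the add_share split between add and transit traffic, and the name simulate_station_b are all hypothetical. Only lineRate, unreservedRate, loLimit and the (lpNrXmitRate > unreservedRate) congestion test come from the message. There is no fairness feedback in this sketch, so flow1 keeps arriving at lineRate.

```python
from dataclasses import dataclass


@dataclass
class Result:
    peak_stq: float         # largest STQ backlog seen, in ticks of lineRate
    final_credits: float    # shaperD credits at the end of the run
    add_blocked_ticks: int  # ticks in which add traffic was stopped
    congested_ticks: int    # ticks in which transmit rate > unreservedRate


def simulate_station_b(ticks=200, arrival=1.0, unreserved_rate=0.5,
                       hi_limit=4.0, lo_limit=0.0, add_share=0.5):
    """Fluid sketch of station B; all rates are fractions of lineRate."""
    line_rate = 1.0
    credits = hi_limit      # assumed initial credit level
    stq = 0.0
    peak_stq = 0.0
    add_blocked = 0
    congested = 0
    for _ in range(ticks):
        # Credits grow at unreservedRate, capped at the assumed hi_limit.
        credits = min(hi_limit, credits + unreserved_rate)
        stq += arrival      # flow1 traffic arriving from station A
        if credits > lo_limit:
            add = add_share * line_rate   # assumed share for local add traffic
        else:
            add = 0.0                     # shaperD at or below loLimit: no add
            add_blocked += 1
        # The output is never idle while the STQ holds data.
        transit = min(stq, line_rate - add)
        stq -= transit
        # Every transmitted byte (add or transit) decrements shaperD.
        credits -= add + transit
        peak_stq = max(peak_stq, stq)
        if add + transit > unreserved_rate:
            congested += 1
    return Result(peak_stq, credits, add_blocked, congested)


if __name__ == "__main__":
    print(simulate_station_b())
```

Under these assumptions the STQ backlog stops growing as soon as the credits reach loLimit, and a longer run does not raise the peak, while the congestion test is true on every tick because the output is never idle.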
>>>>
>>>>So, the externally observable behavior of RPR is actually ok in this case, but
>>>>it is easy to come up with other scenarios where it is not. Why do we need an
>>>>STQ of multiple MB if it cannot fill?
>>>>
>>>>
>>>>I hope this was understandable,
>>>>Jon
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>----- Original Message -----
>>>>From: "Stein Gjessing" <steing@xxxxxxxxx>
>>>>To: <tak@xxxxxxxxx>
>>>>Cc: <jon.schuringa@xxxxxxxxxxxx>; <dvj@xxxxxxxxxxxx>;
>>>><mei@xxxxxxxxxxxxxxxx>; <huang@xxxxxxxxxxxxxxx>; <yan@xxxxxxxxxxxxxxx>;
>>>><hpeng@xxxxxxxxxxxxxxxxxx>; <JLemon@xxxxxxxxxxxx>;
>>>><kkrama@xxxxxxxxxxxxxxxx>; <leonb@xxxxxxxxxxxxx>; <nuzun@xxxxxxxxx>;
>>>><petterte@xxxxxxxxxx>; <bjoernal@xxxxxxxxxx>; <bjornfd@xxxxxxxxx>;
>>>><steing@xxxxxxxxx>
>>>>Sent: Thursday, July 10, 2003 10:33 PM
>>>>Subject: Re: Divergent SImulator Results
>>>>
>>>>
>>>>
>>>>
>>>>>(This is written in a login window,  using a cell-phone as modem,
>>>>>please excuse spelling errors etc.)
>>>>>
>>>>>All,
>>>>>All our simulators contain errors, and will continue to do so
>>>>>forever. What we can hope to achieve is to reduce the number of
>>>>>errors that make our results very incorrect.
>>>>>During the implementation of our simulator we also found errors in the
>>>>>RPR draft. I believe one goal of the simulation activity is to remove
>>>>>such errors.
>>>>>
>>>>>When we run simulations another source of error is configuration. I
>>>>>recently found an error in our configuration of the 62-station scenario
>>>>>from David and Harry, and I want to correct that before going public with
>>>>>more results.
>>>>>
>>>>>Answering Mike's questions I also found an error in the Shaper class.
>>>>>I am sure we will find many more before the Simula-RPR simulator is
>>>>>relatively stable and accurate. Changes to the RPR draft and hence the
>>>>>Java code will introduce more errors, but also reveal some, I hope.
>>>>>
>>>>>Now to Mike's questions:
>>>>>1. The Simula-RPR simulator is exact to the level of transmitting one
>>>>>byte.
>>>>>2. The shapers are accurate to the level of transmitting one byte.
>>>>> What we do (what the code is intended to do) is that whenever a
>>>>>shaper is used, its value is calculated. This is done by saving the
>>>>>value, the time this value was computed and the increment. Hence it is
>>>>>easy to calculate a new value when the shaper is used the next time.
>>>>>As I said above, I just found a bug in the Shaper class, and I
>>>>>am sure there are more, but we have tested the simulator for a while now,
>>>>>and the shapers seem to work pretty well.
>>>>>I need to go into the code in more detail to answer all of a, b, c and
>>>>>d, but in a simulator I do not think simultaneous or near-simultaneous
>>>>>actions should be a problem (or more correctly: we must take that into
>>>>>consideration all the time).
>>>>>
>>>>>3. We decrement to max_credit when we go above,
>>>>>  and we handle
>>>>>  negative credits (as we showed in the 62 station example)
>>>>>
>>>>>4. We decrease the full amount of credits the moment we move a packet
>>>>>  to the stage buffer (i.e. 4b)
>>>>>
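The shaper bookkeeping described in items 2 to 4 above can be sketched as follows. This is a minimal sketch, not the Simula-RPR Shaper class: the class name LazyShaper, its method names and the units (bytes and abstract time units) are hypothetical. Only the ideas are taken from the message: store the value, the time it was computed and the increment; cap at max_credit; allow negative credits; charge the full packet at once.

```python
class LazyShaper:
    """Credit shaper whose value is recomputed only when it is used."""

    def __init__(self, rate, max_credit, now=0.0, credit=0.0):
        self.rate = rate              # credit added per time unit (the increment)
        self.max_credit = max_credit  # upper bound on accumulated credit
        self._credit = credit         # last computed value
        self._stamp = now             # time at which _credit was computed

    def value(self, now):
        """Bring the stored value up to time `now` and return it."""
        if now < self._stamp:
            raise ValueError("time must not go backwards")
        elapsed = now - self._stamp
        # Add the credit earned since the last use, capped at max_credit.
        self._credit = min(self.max_credit, self._credit + self.rate * elapsed)
        self._stamp = now
        return self._credit

    def consume(self, now, packet_bytes):
        """Charge a whole packet at once; the result may be negative."""
        self.value(now)
        self._credit -= packet_bytes
        return self._credit


if __name__ == "__main__":
    shaper = LazyShaper(rate=2.0, max_credit=100.0)
    print(shaper.value(10.0))          # credit earned over 10 time units
    print(shaper.consume(10.0, 50.0))  # full packet charged immediately
    print(shaper.value(20.0))          # recovering from negative credit
```

Because the value is derived from a timestamp rather than from periodic updates, the accuracy of the shaper does not depend on how often it is polled.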
>>>>>Finally I fully agree that we must have several simulators agree
>>>>>before we make changes to the draft based on simulations. And we cannot
>>>>>really make changes based on the simulations themselves, but on the new
>>>>>understanding we get from the results of the simulations.
>>>>>
>>>>>Stein
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>--
>>>Michael Takefman              tak@xxxxxxxxx
>>>Manager of Engineering,       Cisco Systems
>>>Chair IEEE 802.17 Stds WG
>>>2000 Innovation Dr, Ottawa, Canada, K2K 3E8
>>>voice: 613-254-3399       cell:613-220-6991
>>>
>>
> 
> 
-- 
Michael Takefman              tak@xxxxxxxxx
Manager of Engineering,       Cisco Systems
Chair IEEE 802.17 Stds WG
2000 Innovation Dr, Ottawa, Canada, K2K 3E8
voice: 613-254-3399       cell:613-220-6991