
fwd from Jim Demmel: More on repeatability



> ... same answer on different machines.  His motivation was a number
> of his customers (civil engineers) who had contractual obligations to
> their customers to get repeatable answers, i.e. "the bridge is safe"
> will not change to "the bridge is not safe" if you run the code again.

A more likely situation is the following (at least with a well-designed
bridge safety assessment tool):

  (a)  The safety score is 0.995  --  where > 0.95 is considered safe
  (b)  The safety score is 0.993  --  where > 0.95 is considered safe

I doubt this discrepancy would be found alarming.  And, regardless of
repeatability, if I got the answer:
  (c)  The safety score is 0.952  --  where > 0.95 is considered safe
I would not be happy; I would re-check my assumptions to find out why
I got such a marginal score.  Human experience with the methods used,
and with interpreting their results, cannot be entirely discarded.

Where interval methods can contribute is by giving results as follows:

  (d)  The safety score is 0.995 +- 0.002  where > 0.95 is considered safe
  (e)  The safety score is 0.993 +- 0.002  where > 0.95 is considered safe

A genuinely marginal result would then look like:
  (f)  The safety score is 0.951 +- 0.002  where > 0.95 is considered safe

As a human, I would probably also be unhappy with:
  (g)  The safety score is 0.953 +- 0.002  where > 0.95 is considered safe
(but this would be affected by my experiences with such scores, and the
distribution of good and bad scores).

Michel.
---Sent: 2011-08-04 22:20:09 UTC