fwd from Jim Demmel: More on repeatability
> ... same answer on different machines. His motivation was a number
> of his customers (civil engineers) who had contractual obligations to
> their customers to get repeatable answers, i.e. "the bridge is safe"
> will not change to "the bridge is not safe" if you run the code again.
A more likely situation is the following (at least with a well-designed
bridge safety assessment tool):
(a) The safety score is 0.995 -- where > 0.95 is considered safe
(b) The safety score is 0.993 -- where > 0.95 is considered safe
I doubt this discrepancy would be found alarming. And, regardless of
repeatability, if I got the answer:
(c) The safety score is 0.952 -- where > 0.95 is considered safe
I would not be happy; I would re-check my assumptions to find out why
I got such a marginal score. Human experience with the methods used,
and with the means of interpreting results, cannot be totally discarded.
Where interval methods can contribute is by giving results as follows:
(d) The safety score is 0.995 +- 0.002 where > 0.95 is considered safe
(e) The safety score is 0.993 +- 0.002 where > 0.95 is considered safe
A genuinely marginal result would then look like:
(f) The safety score is 0.951 +- 0.002 where > 0.95 is considered safe
As a human, I would probably also be unhappy with:
(g) The safety score is 0.953 +- 0.002 where > 0.95 is considered safe
(but this would be affected by my experiences with such scores, and the
distribution of good and bad scores).
Michel.
---Sent: 2011-08-04 22:20:09 UTC