Re: Fw: min / max and empty intervals - Entire and Missing Data

To: stds-1788@xxxxxxxxxxxxxxxxx
Subject: Re: Fw: min / max and empty intervals - Entire and Missing Data
From: Ian McIntosh <ianm@xxxxxxxxxx>
Date: Mon, 13 Jun 2011 17:29:42 -0400
Cc: Arnold Neumaier <Arnold.Neumaier@xxxxxxxxxxxx>
In-reply-to: <4DF1F106.2080903@xxxxxxxxxxxx>
References: <OF1E175E2E.70A07814-ON852578AA.00577CD1-852578AA.00686BA6@xxxxxxxxxx> <4DF1F106.2080903@xxxxxxxxxxxx>

Arnold Neumaier wrote:
> Ian McIntosh wrote:
>> Let's take a concrete example. For the set of all adults in the USA on
>> July 1st 2011, measure their height. Since there's some measurement error
>> and height can vary throughout the day, use intervals with reasonable
>> bounds.
>>
>> Now find out the minimum and maximum heights. No problem.
>>
>> But what if you only have data for 1% of the people? If you treat the
>> unknowns as Entire, the minimum height is -oo and the maximum +oo.
>
> And that's the only thing you can say with certainty. Interval analysis
> is about certainty, not about the estimation of probabilities. The
> latter is the domain of statistics, and will never be certain.

I disagree. You can say more than one thing with certainty:
1. The minimum height is -oo and the maximum +oo (your only certain result).
2. Using domain knowledge, you can make that 0 to +oo.
3. The minimum height in the known data is X and the maximum is Y, where 0 <= X <= Y << +oo.
(Obviously what you CAN'T say is that the minimum height of the whole population is X and the maximum is Y.)
4. The decoration would say that some unknown values had been omitted, so the results should not be confused with calculations that included all values.
(The program should detect that and print it in some meaningful application-specific way.)
5. If you bother to count, you can say that the known data is a specific fraction (in this example 1%) of the total data.

If the calculation used operations with names like minknown(a,b), maxknown(a,b) and countknown(a,previouscount), then only users who wanted the extra information would see any difference. Those who only want to know that the height is between 0 and +oo would not be affected.
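
A minimal sketch of how such operations might look, in Python. The names minknown, maxknown and countknown are the ones suggested above; their signatures, the (lower, upper) tuple representation of an interval, and the use of None for a missing measurement are assumptions made for illustration and are not defined by P1788.

```python
from functools import reduce
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound); assumed representation


def minknown(a: Optional[Interval], b: Optional[Interval]) -> Optional[Interval]:
    """Interval minimum that skips a missing (None) operand instead of widening to Entire."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]))


def maxknown(a: Optional[Interval], b: Optional[Interval]) -> Optional[Interval]:
    """Interval maximum that skips a missing (None) operand."""
    if a is None:
        return b
    if b is None:
        return a
    return (max(a[0], b[0]), max(a[1], b[1]))


def countknown(a: Optional[Interval], previouscount: int) -> int:
    """Running count of the values that were actually known."""
    return previouscount if a is None else previouscount + 1


# Hypothetical heights in metres; None marks a person with no measurement.
heights = [(1.70, 1.72), None, (1.55, 1.57), None, (1.88, 1.90)]

lowest = reduce(minknown, heights, None)    # (1.55, 1.57)
highest = reduce(maxknown, heights, None)   # (1.88, 1.90)
known = reduce(lambda count, h: countknown(h, count), heights, 0)  # 3

print(lowest, highest, f"{known} of {len(heights)} values known")
```

A caller who only wants the certain enclosure would keep using the ordinary min and max and never touch these functions.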

Where's the harm in providing better tools to deal with unknown data? Partial data is almost universal in the real world, and we have to deal with it, just as we have to deal with finite precision in measurements.

Of course any programmer could write the functions I suggested on top of the standard, so they don't have to be part of the standard. The main advantage is the proposed "unknown data omitted" decoration, to distinguish certainty from uncertainty. It's a lot like using ranges instead of single values, to make the uncertainty in the exact value of a result visible and quantifiable instead of hidden and possibly infinite.
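
To illustrate the distinction, here is a rough Python sketch that contrasts a strict minimum (missing values treated as Entire) with a minimum over the known data that carries an "unknown data omitted" flag. The Decorated class, the unknown_omitted field and both function names are hypothetical; this is not the decoration system under discussion in P1788, only one way the idea could be modelled.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

Interval = Tuple[float, float]          # (lower bound, upper bound); assumed representation
ENTIRE: Interval = (-math.inf, math.inf)


@dataclass(frozen=True)
class Decorated:
    interval: Optional[Interval]        # None until a known value has been seen
    unknown_omitted: bool = False       # hypothetical "unknown data omitted" decoration


def min_strict(values: Iterable[Optional[Interval]]) -> Interval:
    """Certain enclosure of the minimum: a missing value counts as Entire."""
    lo = hi = math.inf
    for v in values:
        v = ENTIRE if v is None else v
        lo, hi = min(lo, v[0]), min(hi, v[1])
    return (lo, hi)


def min_known_decorated(values: Iterable[Optional[Interval]]) -> Decorated:
    """Minimum over the known values only; the flag records that some were skipped."""
    result = Decorated(None)
    for v in values:
        if v is None:
            result = Decorated(result.interval, True)
        elif result.interval is None:
            result = Decorated(v, result.unknown_omitted)
        else:
            cur = result.interval
            result = Decorated((min(cur[0], v[0]), min(cur[1], v[1])),
                               result.unknown_omitted)
    return result


heights = [(1.70, 1.72), None, (1.55, 1.57)]
print(min_strict(heights))            # lower bound is -inf because of the missing value
print(min_known_decorated(heights))   # tight interval, but flagged as partial
```

The flag is what keeps the second result from being mistaken for a calculation over the complete data set.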

- Ian McIntosh
  IBM Canada Lab
  Compiler Back End Support and Development

