Fw: min / max and empty intervals - Entire and Missing Data

To: stds-1788@xxxxxxxxxxxxxxxxx
Subject: Fw: min / max and empty intervals - Entire and Missing Data
From: Ian McIntosh <ianm@xxxxxxxxxx>
Date: Thu, 9 Jun 2011 15:00:27 -0400

From:	Ralph Baker Kearfott <rbk@xxxxxxxxxxxx>
To:	Ian McIntosh/Toronto/IBM@IBMCA
Date:	06/09/2011 02:11 AM
Subject:	Re: min / max and empty intervals

On 6/9/2011, Ralph Baker Kearfott wrote:
>On 6/7/2011 7:07 AM, Arnold Neumaier wrote:
>> Dan Zuras Intervals wrote:
>>>> Date: Tue, 07 Jun 2011 08:43:03 +0200
>>>> From: Arnold Neumaier <Arnold.Neumaier@xxxxxxxxxxxx>
>>>> To: Dan Zuras Intervals <intervals08@xxxxxxxxxxxxxx>
>>>> CC: John Pryce <j.d.pryce@xxxxxxxxxxxx>, stds-1788@xxxxxxxxxxxxxxxxx
>>>> Subject: Re: min / max and empty intervals
>>>>
>>>> Dan Zuras Intervals wrote:
>
>>> Actually, John touched on the reasonable application for
>>> NaNs in a matrix. That of as yet unknown data.
>>
>> Unknown values represented in interval analysis by Entire, not by Empty!
>
>I personally agree with Arnold on this point. It is tied to the basic
>philosophy underlying interval analysis.
>
>Baker

I understand and agree with the reasons that unknown should normally be Entire, not Empty (and BTW, one of the multiple meanings of NaN is equivalent to Entire).

At the same time, a standard is better if it can apply to diverse situations, not just the usage that led to it, so we should think about other potential applications and their implications.

Suppose I had a large set of data and wanted to do some analysis on it, but many values were unknown. If unknown is represented as Entire and I use the obvious approach, then most of my answers will be Entire and I will know nothing except that I know nothing. If I skip unknown values then I can produce answers tighter than Entire, and I will know something, with the caveat that the answers are not guaranteed to be correct. I may consider that useful.

Let's take a concrete example. For the set of all adults in the USA on July 1st 2011, measure their height. Since there's some measurement error and height can vary throughout the day, use intervals with reasonable bounds.

Now find out the minimum and maximum heights. No problem.

But what if you only have data for 1% of the people? If you treat the unknowns as Entire, the minimum height is -oo and the maximum +oo. Ruling out negatives still doesn't give a useful answer. If you treat the unknowns as "Ignore this unknown value", then you get minimum and maximum heights for the people you have data for. You can't claim that the answers are exactly what you were asked for, but you can say they are correct for the 1% of cases you have data for, and if you know statistics you may say that the true range should not be much larger than the range for this subset.
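
A minimal sketch of the two treatments, in Python, under stated assumptions: the Interval type, the helper names and the sample heights (in cm) are all hypothetical, and a missing measurement is modelled as None. It uses the usual endpoint-wise definition min([a,b],[c,d]) = [min(a,c), min(b,d)], and likewise for max.

```python
import math
from typing import List, NamedTuple, Optional


class Interval(NamedTuple):
    lo: float
    hi: float


# Entire: the interval containing every real number.
ENTIRE = Interval(-math.inf, math.inf)


def interval_min(xs: List[Interval]) -> Interval:
    """Endpoint-wise minimum of a non-empty list of intervals."""
    return Interval(min(x.lo for x in xs), min(x.hi for x in xs))


def interval_max(xs: List[Interval]) -> Interval:
    """Endpoint-wise maximum of a non-empty list of intervals."""
    return Interval(max(x.lo for x in xs), max(x.hi for x in xs))


# Hypothetical sample: two measured heights, two people with no data.
heights: List[Optional[Interval]] = [
    Interval(169.5, 170.5),
    None,
    Interval(181.0, 182.0),
    None,
]

# Treatment 1: every unknown becomes Entire.
as_entire = [ENTIRE if h is None else h for h in heights]

# Treatment 2: unknowns are skipped.
known = [h for h in heights if h is not None]

print(interval_min(as_entire), interval_max(as_entire))
print(interval_min(known), interval_max(known))
```

With unknowns as Entire, the lower bound of the minimum is -oo and the upper bound of the maximum is +oo, so the enclosure is guaranteed but says little. With unknowns skipped, both results are tight, but they enclose only the measured subset.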

There are many real applications where intervals could be useful if missing data can be ignored and the limitations are understood as part of the results.

So here are my questions: Can we define a decoration for "missing data" or "unknown", and decoration operations that, on encountering it, produce "some data is missing" or "some data is unknown"? Is it better to define specific operations to be used in such cases (e.g., max_known_value)? Can either or both of those be done in a consistent way? Can they be done without damaging other things? Would that increase (or decrease?) the usefulness of the standard?
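
To make the second option concrete, here is a rough Python sketch of what an operation like max_known_value might look like if it also reported "some data is missing". Nothing here is taken from the draft standard: the Interval and Flagged types, the some_missing field and the use of None for a missing value are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Interval(NamedTuple):
    lo: float
    hi: float


class Flagged(NamedTuple):
    """A result plus a hypothetical 'some data is missing' marker."""
    value: Optional[Interval]  # None when no known value was supplied
    some_missing: bool


def max_known_value(xs: Sequence[Optional[Interval]]) -> Flagged:
    """Endpoint-wise maximum over the known values only.

    Missing entries (None) are skipped, and the flag records that
    the result does not cover the whole input.
    """
    known = [x for x in xs if x is not None]
    some_missing = len(known) < len(xs)
    if not known:
        return Flagged(None, some_missing)
    hull = Interval(max(x.lo for x in known), max(x.hi for x in known))
    return Flagged(hull, some_missing)


result = max_known_value([Interval(169.5, 170.5), None, Interval(181.0, 182.0)])
print(result)
```

The flag travels with the result, so a caller can tell a guaranteed enclosure from one computed over a subset. Whether such a marker belongs in a decoration or in a separate operation is exactly the open question.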

- Ian McIntosh IBM Canada Lab Compiler Back End Support and Development

