Re: Decorated intervals ARE intervals...
Baker,
In response to your question, here is some of my understanding and personal
experience on the issue:
Modern computer architectures usually have several levels of cache. This is
because processing data once it arrives at the CPU is usually much, much
faster than fetching that data from main memory (sometimes by an order of
magnitude or more, I believe). So the performance of most HPC software
applications these days is determined almost entirely by how efficiently
data can be moved from memory into the CPU.
Memory buses (to my knowledge) usually carry a number of bits that is a
power of two. Moving a datatype whose size is not a power of two (like a
17-byte decorated interval) over such a bus requires either a) changing the
hardware bus so it can carry more bits, or b) breaking the datatype into
smaller pieces that can be sent over the bus in multiple transfers, which of
course takes more time.
Buffering a 17-byte datatype into a 32-byte window within the various levels
of cache means less cache will be available to store additional data. This
may cause excessive reading and writing of data to and from main memory, a
situation that is often referred to as "thrashing" the cache. Depending on
the memory-access patterns of the HPC application, this can starve the CPU
of data as memory latencies increase and the efficiency of the cache
decreases.
The semiconductor industry is seeing a trend away from single-core
processors that seemed to double in speed every year. Instead, the trend is
towards multi-core processors and parallel computing. In our own laboratory,
we have a 48-core machine, and Intel has predicted that by 2015 desktop
computers may have more than a hundred processors.
Both the memory-latency and cache-"thrashing" issues mentioned above are a
death-knell for massively parallel software. The problem is that moving
data into and out of the processor (via the various cache levels and
hardware buses) is not a parallel activity, i.e., it is mostly sequential.
The maximum theoretical speedup of a parallel program on n processors is
bounded by Amdahl's Law:
http://en.wikipedia.org/wiki/Amdahl's_law
namely S(n) = 1 / ((1 - p) + p/n), where p is the parallel fraction of the
program. Because this bound depends on the ratio of the parallel and
sequential portions, even a tiny increase in the sequential portion of a
program decreases the overall performance of an HPC application in a
nonlinear fashion.
For example, a program that is 99% parallel and only 1% sequential has a
maximum theoretical speedup potential of 100X, even if the number of
available processors is unlimited. If the sequential portion of the program
increases even by a very small amount to, say, 2% (so the program is now
98% parallel), the maximum theoretical speedup potential drops to 50X!
In other words, it takes only a very small increase in the sequential
portion of a program to quickly lose an order of magnitude (or more) of
theoretical speedup potential. So if overheads on the order of 12% or 34%
end up in the sequential portion, this could easily translate into a severe
loss of actual performance.
The good news is that interval branch-and-bound algorithms are naturally
parallel. This is one of the reasons I believe so strongly that interval
arithmetic is the future of the HPC industry.
However, keep in mind another very important fact about branch-and-bound:
decorated intervals are *useless* to these algorithms. The reason is that
as long as the computation is "safe", only the bare interval is required.
On the other hand, once an "unsafe" operation occurs in the middle of a
lengthy computation, only the bare decoration needs to propagate. At no
point are both the interval and the decoration portions of a decorated
interval needed simultaneously.
Taking all the above into consideration, these are a few reasons why I
believe computing with bare intervals and bare decorations will be very
important for a future IEEE 1788 standard.
Sincerely,
Nate
BTW, attached is a neat article about Amdahl's Law in the multi-core era:
http://www.cs.wisc.edu/multifacet/papers/ieeecomputer08_amdahl_multicore.pdf
----- Original Message -----
From: "Ralph Baker Kearfott" <rbk@xxxxxxxxxxxx>
To: "John Pryce" <j.d.pryce@xxxxxxxxxxxx>
Cc: "stds-1788" <stds-1788@xxxxxxxxxxxxxxxxx>
Sent: Sunday, July 03, 2011 3:20 PM
Subject: Re: Decorated intervals ARE intervals...
John, P-1788,
Can someone clarify what "severe storage and communication penalties"
means? For example, my naive software implementation of decorations
would be to tack on an integer (in whatever format is most efficient
for the machine) to the two doubles. If it is a 2-byte integer, the
storage penalty would be 2/(16+2) < 12 percent. Tacking on an extra
8 bytes (wasteful from the point of view of storage, but conceivably
logical to do in some circumstances), the storage penalty would be
< 34 percent.
In my own work, reliability and predictability of the interval
operations have usually (but not necessarily always)
trumped speed of the raw arithmetic. What is the experience of
others?
Baker
On 07/02/2011 08:49 AM, John Pryce wrote:
Dan & all
On 2 Jul 2011, at 01:04, Dan Zuras Intervals wrote:
I have stated this before & the current discussion only
strengthens my opinion that decorated intervals should
BE intervals. Full stop. Certainly at level 1 but as
far down as we can push it as well.
It would be nice if the standard could take that view, but what about the
"17-byte problem"? I.e., decorating the most popular kind of interval (a
pair of doubles) looks as if it incurs severe storage and communication
penalties. That's why the decoration system includes the Hayes/Neumaier
"bare object arithmetic" scheme, which otherwise would be pointless.
As someone who has long experience of how computer systems evolve, will you
hazard a prediction: will the 17-byte problem be solved by hardware or
software advances (so we needn't worry about it much), or will it continue
to be a problem?
John
--
---------------------------------------------------------------
R. Baker Kearfott, rbk@xxxxxxxxxxxxx (337) 482-5346 (fax)
(337) 482-5270 (work) (337) 993-1827 (home)
URL: http://interval.louisiana.edu/kearfott.html
Department of Mathematics, University of Louisiana at Lafayette
(Room 217 Maxim D. Doucet Hall, 1403 Johnston Street)
Box 4-1010, Lafayette, LA 70504-1010, USA
---------------------------------------------------------------