
Re: Will someone make a formal motion? Re: mid-rad, inf-sup, a caution...



Baker,

That's a fair summary on the cache issue. Please excuse me, but in this context I found John's comment very shocking!

The recently discussed conversions for mid-rad (à la Vienna and Arnold's comment about the x+-r constructor) are good enough for P1788; I gave the computer graphics and non-754 mid-rad encoding examples to support this idea!!!

My only lingering concern is that we should not prevent (whenever possible) internal implementations of mid-rad. I've continued to be a little confused on this point, because I hear what I think are conflicting messages.

Nate



----- Original Message -----
From: "Ralph Baker Kearfott" <rbk@xxxxxxxxxxxx>
To: "Nate Hayes" <nh@xxxxxxxxxxxxxxxxx>
Cc: "R. Baker Kearfott" <rbk@xxxxxxxxxxxxx>; "John Pryce" <j.d.pryce@xxxxxxxxxxxx>; "P1788" <stds-1788@xxxxxxxxxxxxxxxxx>
Sent: Wednesday, May 12, 2010 6:57 PM
Subject: Re: Will someone make a formal motion? Re: mid-rad, inf-sup, a caution...


Nate et al,

Ah, so may I summarize (subject to your correction)?:

The program is somehow parallelizable because it fits into cache,
and it may not fit into cache if the data elements are too large.
I'll need to trust you on that, since I have not worked on that level with
cache and do not see the exact situation or context.

The question remains about whether or not this is a compelling
reason to standardize mid-rad (in view of arguments that we are
not preventing implementation of mid-rad internally).

Baker

On 5/12/2010 16:10, Nate Hayes wrote:
R. Baker Kearfott wrote:
Nate,

On 5/12/2010 8:07 AM, Nate Hayes wrote:
John Pryce wrote:
.
.
.

Nate, you point out mid-rad can save memory, hence bandwidth, on
large interval calculations. But the saving is always less than 50%,
n'est-ce pas? because one has to store the "mid" value to full
precision. If you had a method that saves an order or two of
magnitude, it would be more convincing. But that usually comes from
an improved algorithm.
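
For a sense of scale on the "less than 50%" bound, here is a minimal Python sketch comparing per-interval storage. The encodings are assumptions chosen for illustration only (an inf-sup pair of binary64 values versus a binary64 midpoint with a binary32 or binary16 radius); nothing in this thread prescribes them, and the helper name storage_saving is hypothetical.

```python
import struct

# Hypothetical encodings, chosen only for illustration.
# The "<" prefix selects standard sizes with no alignment padding.
INF_SUP = "<dd"      # two binary64 bounds (inf, sup)
MID_RAD_32 = "<df"   # binary64 midpoint, binary32 radius
MID_RAD_16 = "<de"   # binary64 midpoint, binary16 radius


def storage_saving(compact_fmt, baseline_fmt=INF_SUP):
    """Fraction of bytes saved per interval relative to the baseline format."""
    return 1.0 - struct.calcsize(compact_fmt) / struct.calcsize(baseline_fmt)


if __name__ == "__main__":
    for name, fmt in [("32-bit radius", MID_RAD_32), ("16-bit radius", MID_RAD_16)]:
        size = struct.calcsize(fmt)
        print(f"mid-rad with {name}: {size} bytes, saving {storage_saving(fmt):.1%}")
```

Under these assumed encodings the per-interval saving is 25% or 37.5%. As long as the midpoint keeps full precision, the saving stays below 50%, which is the per-element bound John describes; it says nothing by itself about whether a working set fits in cache.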

John,

I think you don't know what you're talking about. Amdahl's Law is
nonlinear. When you're already above 99% parallel, reducing the
sequential portion of a program by a very tiny amount can mean the
difference between 1,000X and 10,000X speedups.


Nate, I'm very puzzled by this. John's point was that the "mid" in
mid-rad usually needs to be stored in full precision, while only
the "rad" part could be economized. What does that have to do
with Amdahl's law? Please explain or give an example. Where was
John incorrect?

Baker,

Reading and writing RAM is NOT a parallel activity, so any memory
spillage of processor cache increases the sequential part of the
program! Worse, if memory is spilled, it must later be read back in so
this can double the hit to the sequential portion. Worse again if an
algorithm "almost" fits into processor cache but repeatedly spills to
RAM because it is just "slightly over" the cache limit. This can be
HUGELY detrimental.

If a program is already 99.99% parallel (because it's using compact
mid-rad intervals, for example, and fits snugly into cache), but
repeated memory spills during a long computation with larger inf-sup
intervals increase the sequential portion of the program to 0.06%, the
original 10,000X speedup is reduced to below 2,000X (almost an order of
magnitude).
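
As a rough check on these figures, here is a minimal Python sketch of Amdahl's Law. It assumes the speedups quoted above are the asymptotic limits (unbounded processor count), which the message does not state explicitly; the function name amdahl_speedup and its parameters are hypothetical.

```python
def amdahl_speedup(serial_fraction, n_processors=None):
    """Speedup predicted by Amdahl's Law.

    serial_fraction: share of the run time that cannot be parallelized (0 < s <= 1).
    n_processors: processor count, or None for the limit as the count grows
                  without bound, where the speedup tends to 1 / s.
    """
    if not 0.0 < serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in (0, 1]")
    if n_processors is None:
        return 1.0 / serial_fraction
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors)


if __name__ == "__main__":
    # Serial fractions given as percentages, converted to fractions below.
    for percent in (0.1, 0.06, 0.05, 0.01):
        limit = amdahl_speedup(percent / 100.0)
        print(f"serial {percent}% -> limiting speedup {limit:,.0f}X")
```

Under that assumption, a sequential portion of 0.01% bounds the speedup at 10,000X and 0.1% bounds it at 1,000X. A sequential portion of 0.06% gives a bound of roughly 1,667X, while 0.05% gives exactly 2,000X.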

John says it will always be less than a 50% difference. How can he
justify this!? I would like to know. It seems to me he is just drawing a
linear correlation between the speedup and the size, in bits, of a
single compact mid-rad vs. inf-sup interval!!!

Nate



--

---------------------------------------------------------------
R. Baker Kearfott,    rbk@xxxxxxxxxxxxx   (337) 482-5346 (fax)
(337) 482-5270 (work)                     (337) 993-1827 (home)
URL: http://interval.louisiana.edu/kearfott.html
Department of Mathematics, University of Louisiana at Lafayette
(Room 217 Maxim D. Doucet Hall, 1403 Johnston Street)
Box 4-1010, Lafayette, LA 70504-1010, USA
---------------------------------------------------------------