Re: Revised Motion 26 decoration scheme
Vincent Lefevre wrote:
On 2011-07-20 16:45:06 -0500, Nate Hayes wrote:
Vincent Lefevre wrote:
>That's the goal of the compiler to ensure that data are aligned. For
>instance, if a 64-bit alignment is OK, then 8 decorated intervals
>(I1,D1), ..., (I8,D8) could be stored in the following way:
>
>I1 I2 ... I8 [D1 D2 ... D8]
>
>where the 8 decorations are packed in a 64-bit word (4 would be
>sufficient for a 32-bit alignment).
I understand.
But think about the problem now of "putting back together", say, (I1,D1) and
(I7,D7) at the processor. You have to do the following:
-- Load I1
-- Load I7
-- Load D1
-- Load D7 (likely an unaligned memory access)
That's a total of 4 memory moves, and one of them will likely be
unaligned.
Actually only 3 possibly slow ones, without unaligned memory accesses.
Yes, thanks for the correction. But still 4 moves if the decorations are far
enough apart in memory, e.g., (I1,D1) and (I70,D70).
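
The packed layout and the loads counted above can be sketched in C as follows.
The type and function names (interval, decorated_block8, block_get) are
hypothetical, and each decoration is assumed to fit in one byte; this only
illustrates the layout and is not a proposed implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bare interval: two binary64 endpoints (16 bytes). */
typedef struct { double lo, hi; } interval;

/* Eight intervals followed by their eight one-byte decorations, which
   together occupy a single 64-bit word: I1 I2 ... I8 [D1 D2 ... D8]. */
typedef struct {
    interval iv[8];
    uint8_t  dec[8];
} decorated_block8;

/* Unpacked alternative: each decoration stored next to its interval,
   which normally costs padding after the decoration byte. */
typedef struct { interval iv; uint8_t dec; } decorated_interval;

/* Reassemble (Ik, Dk) for k = 0..7: one interval load plus one byte load. */
static decorated_interval block_get(const decorated_block8 *b, int k)
{
    assert(k >= 0 && k < 8);
    decorated_interval d;
    d.iv  = b->iv[k];
    d.dec = b->dec[k];
    return d;
}
```

Fetching (I1,D1) and (I7,D7) from one such block is then two interval loads
and two byte loads, with both decoration bytes in the same 64-bit word. For
(I1,D1) and (I70,D70) the two decoration bytes come from different blocks.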
* Decorations can be loaded individually with load-byte instructions
(by definition, bytes are always aligned, but how they are grouped
can be important for performance). Internally, this is probably
decomposed by the processor into a load-word and byte extraction.
One would seek to group decorations in a single L1 cache line, so
that a second decoration load would generally be very fast.
Only if the decorations being accessed are packed close to each other (this
may depend greatly on the application).
If the data is accessed in any other pattern, then actual
performance can be worse, of course.
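
As a rough illustration of the "load-word and byte extraction" mentioned
above, the same thing can be written by hand in C. The helper names and the
convention that D1 sits in the least significant byte are assumptions made
for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* Extract decoration k (0..7) from a 64-bit word holding eight one-byte
   decorations. Slot 0 is the least significant byte (assumed convention). */
static uint8_t dec_extract(uint64_t word, unsigned k)
{
    assert(k < 8);
    return (uint8_t)(word >> (8u * k));
}

/* Store decoration d into slot k, leaving the other seven unchanged. */
static uint64_t dec_insert(uint64_t word, unsigned k, uint8_t d)
{
    assert(k < 8);
    uint64_t mask = (uint64_t)0xFF << (8u * k);
    return (word & ~mask) | ((uint64_t)d << (8u * k));
}
```

Once the word is in a register, a second decoration from the same word costs
only a shift and a truncation, with no further memory access.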
Yes, but I still think that packing would improve cache-related
performance.
Oh, I don't doubt that. If decorated intervals *are* being used in an
application, all of these strategies might be helpful. Whether people will
make the development effort necessary for such implementations could
be another question, though.
Anyhow, it's all still going to be worse than just using bare intervals and
bare decorations. So if the nature of the application doesn't require
decorated intervals, then programmers (not just compilers) should be able to
specify this by some mechanism.
>Do you mean dropping decorations when an interval needs to be returned?
Yes, if it is safe to do so.
If it is safe (e.g. the compiler can detect that decorations are
not used), I agree. There should also be an interval datatype so
that the user explicitly says that decorations are not used.
Yes.
Still, the standard should also specify decorated intervals, with
performance in mind even in this case.
Yes, I agree.
Since Motion 8, I've always believed this.
There is an algebraic structure for operations involving bare objects (see
the appropriate section in my Nov. 14 DRAFT paper, e.g.). Very briefly:
operations on bare intervals give the usual bare interval result, unless an
exception occurs and then a bare decoration is given as result instead.
[...]
I partly disagree on that. There is a second possibility: if an
exception occurs, then one may still want a bare interval (simply
because the exception may have occurred only because computed ranges
are generally larger than the real ones, due to rounding and variable
duplications).
Yes (we don't disagree on this). In my paper, one can specify a threshold to
determine what constitutes an "exception". If one sets the threshold
accordingly, then no operation on bare intervals can generate an exception
at all. In this case, Ulrich's Motion 5 arithmetic on bare intervals is the
result; in other words, Motion 5 is the special case of arithmetic on bare
intervals when no exceptions (and hence no bare decorations) are introduced
into the computation.
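
To make the threshold idea concrete, here is a minimal sketch in C. It is one
possible reading, not the definition from the paper: the two decoration
levels, the THRESHOLD_NEVER value, the bare_object type and the bare_recip
function are all hypothetical, and outward rounding is ignored for brevity.

```c
#include <assert.h>
#include <math.h>   /* INFINITY macro only */

typedef struct { double lo, hi; } interval;

/* Hypothetical severity levels, not the decoration set of any motion.
   THRESHOLD_NEVER is above every level an operation can produce. */
typedef enum {
    DEC_OK = 0,
    DEC_POSSIBLY_UNDEFINED = 1,
    THRESHOLD_NEVER = 2
} decoration;

/* A bare object is either a bare interval or a bare decoration. */
typedef struct {
    int        is_decoration;  /* 0: iv is valid, 1: dec is valid */
    interval   iv;
    decoration dec;
} bare_object;

/* Reciprocal of a bare interval. If x contains 0, the operation is
   possibly undefined; that counts as an exception only when it reaches
   the threshold, in which case a bare decoration is returned. */
static bare_object bare_recip(interval x, decoration threshold)
{
    assert(x.lo <= x.hi);
    bare_object r = { 0, { 0.0, 0.0 }, DEC_OK };
    if (x.lo <= 0.0 && 0.0 <= x.hi) {
        if (DEC_POSSIBLY_UNDEFINED >= threshold) {
            r.is_decoration = 1;
            r.dec = DEC_POSSIBLY_UNDEFINED;
        } else {
            r.iv.lo = -INFINITY;   /* enclose with the whole real line */
            r.iv.hi = INFINITY;
        }
        return r;
    }
    r.iv.lo = 1.0 / x.hi;          /* a real version would round down */
    r.iv.hi = 1.0 / x.lo;          /* a real version would round up */
    return r;
}
```

With the threshold set to THRESHOLD_NEVER, every call returns a bare
interval, which corresponds to the case where no bare decorations enter the
computation.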
However the standard could support both. The second
possibility shouldn't be a problem as it can semantically be
decomposed into a normal decorated operation and a decorated
interval -> bare interval conversion.
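
The decomposition described above might look like this in C. Again, all
names and the two decoration levels are hypothetical, and outward rounding
is left out; the point is only that the "still return a bare interval"
behaviour falls out of a decorated operation followed by a conversion.

```c
#include <assert.h>
#include <math.h>   /* INFINITY macro only */

typedef struct { double lo, hi; } interval;
typedef enum { DEC_OK = 0, DEC_POSSIBLY_UNDEFINED = 1 } decoration;
typedef struct { interval iv; decoration dec; } decorated_interval;

/* Normal decorated operation: always yields an interval plus a decoration. */
static decorated_interval decorated_recip(decorated_interval x)
{
    assert(x.iv.lo <= x.iv.hi);
    decorated_interval r;
    r.dec = x.dec;                        /* carry the input decoration */
    if (x.iv.lo <= 0.0 && 0.0 <= x.iv.hi) {
        r.iv.lo = -INFINITY;
        r.iv.hi = INFINITY;
        r.dec = DEC_POSSIBLY_UNDEFINED;   /* worst level in this sketch */
    } else {
        r.iv.lo = 1.0 / x.iv.hi;
        r.iv.hi = 1.0 / x.iv.lo;
    }
    return r;
}

/* Decorated interval -> bare interval conversion: drop the decoration. */
static interval to_bare(decorated_interval x)
{
    return x.iv;
}

/* Bare operation that keeps an interval even when an exception occurs. */
static interval recip_keep_interval(interval x)
{
    decorated_interval dx;
    dx.iv = x;
    dx.dec = DEC_OK;
    return to_bare(decorated_recip(dx));
}
```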
Yes.
Nate