Re: BLAS-3 in interval computations


Dear colleagues,

I completely agree with the remarks of Siegfried Rump.

Because of the cost of memory access, it is not possible to develop efficient functions for BLAS 1 and BLAS 2. The ratio of arithmetic operations to data accessed is too low, so the peak performance of processors cannot be reached with these functions.
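To make the ratio argument concrete, here is a rough count in Python. It assumes a simplified model in which every operand is moved between memory and processor exactly once and n is the vector length or matrix dimension; the function name flops_per_word is only illustrative.

```python
def flops_per_word(n):
    """Floating-point operations per word moved, for problem size n.

    Simplified model (an assumption): each operand is read once and
    each result is written once.
    """
    return {
        # axpy, y = a*x + y: 2n flops; read x, read y, write y -> 3n words
        "BLAS 1 (axpy)": (2 * n) / (3 * n),
        # gemv, y = A*x + y: 2n^2 flops; A, x, y read, y written -> n^2 + 3n words
        "BLAS 2 (gemv)": (2 * n * n) / (n * n + 3 * n),
        # gemm, C = A*B + C: 2n^3 flops; A, B, C read, C written -> 4n^2 words
        "BLAS 3 (gemm)": (2 * n ** 3) / (4 * n * n),
    }


if __name__ == "__main__":
    for name, ratio in flops_per_word(1000).items():
        print(f"{name}: {ratio:.2f} flops per word")
```

Under this model the ratio stays bounded by a small constant for BLAS 1 and BLAS 2, and grows like n/2 only for BLAS 3.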

Concerning BLAS 3, the matrix product algorithm X*Y = sum_j X(:,j)*Y(j,:) is very inefficient. Newer algorithms are based on block operations, which reduce the number of memory accesses.
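The difference between the two loop orders can be sketched in plain Python. This is only an illustration of the access pattern, not a tuned kernel; the names matmul_outer and matmul_blocked and the block size bs are assumptions made for the example.

```python
def matmul_outer(X, Y):
    """X*Y as a sum of rank-1 updates: sum_j X(:,j)*Y(j,:)."""
    n, k, m = len(X), len(Y), len(Y[0])
    C = [[0.0] * m for _ in range(n)]
    for j in range(k):              # every j sweeps over the whole of C
        for i in range(n):
            xij = X[i][j]
            for c in range(m):
                C[i][c] += xij * Y[j][c]
    return C


def matmul_blocked(X, Y, bs=2):
    """Same product, computed tile by tile so a tile of C is reused."""
    n, k, m = len(X), len(Y), len(Y[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, bs):
        for c0 in range(0, m, bs):
            for j0 in range(0, k, bs):
                # Work on one bs-by-bs tile of C with matching tiles of X and Y.
                for i in range(i0, min(i0 + bs, n)):
                    for j in range(j0, min(j0 + bs, k)):
                        xij = X[i][j]
                        for c in range(c0, min(c0 + bs, m)):
                            C[i][c] += xij * Y[j][c]
    return C
```

In a real implementation the block size would be chosen so that the tiles fit in cache or registers; here it only shows how the same arithmetic is reordered.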

Over the last 8-9 years, multimedia and scientific computing have been converging. New processor architectures embed SIMD units for vectorising computations. To reach a high level of performance with them, you need to avoid comparisons. From this point of view, only the mid-rad form of intervals allows a high level of performance. If we want to make interval arithmetic more popular, we have to propose an interval form that achieves competitive computing times.
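The point about comparisons can be seen on a single scalar product. The sketch below deliberately ignores directed rounding, so it is not a rigorous implementation; the function names and the tuple representations (lo, hi) and (mid, rad) are assumptions for the example.

```python
def infsup_mul(a, b):
    """Inf-sup product of a = (lo, hi) and b = (lo, hi).

    The bounds are the min and max of four candidate products, which
    means comparisons (or a case distinction on the signs).
    """
    (al, au), (bl, bu) = a, b
    products = (al * bl, al * bu, au * bl, au * bu)
    return (min(products), max(products))


def midrad_mul(a, b):
    """Mid-rad product of a = (mid, rad) and b = (mid, rad).

    Straight-line arithmetic: multiplications, additions and absolute
    values only, so the same formula applies to whole vectors or matrices.
    """
    (am, ar), (bm, br) = a, b
    return (am * bm, abs(am) * br + ar * abs(bm) + ar * br)


if __name__ == "__main__":
    # [1, 3] * [-1, 2] in both representations
    print(infsup_mul((1.0, 3.0), (-1.0, 2.0)))
    print(midrad_mul((2.0, 1.0), (0.5, 1.5)))
```

The mid-rad result can be somewhat wider than the inf-sup one (here [-4, 6] instead of [-3, 6]); that is the price paid for the comparison-free formula.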

If the standard proposes only the infsup form, developers will have to perform 2 conversions to implement efficient functions on intervals: infsup -> midrad, then the function, then midrad -> infsup. I think that a standard without a mid-rad form would be a big mistake.
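For illustration, the two conversions might look as follows in Python. This is a sketch under assumptions: outward rounding is emulated by stepping one floating-point number outward with math.nextafter (Python 3.9 or later) instead of switching the rounding mode, overflow is not handled, and the function names are hypothetical.

```python
import math


def infsup_to_midrad(lo, hi):
    """Return (mid, rad) such that [mid - rad, mid + rad] contains [lo, hi]."""
    mid = (lo + hi) / 2.0
    # Radius must cover both endpoints; step one float up to absorb
    # the rounding error of the subtraction.
    rad = max(mid - lo, hi - mid)
    return mid, math.nextafter(rad, math.inf)


def midrad_to_infsup(mid, rad):
    """Return (lo, hi) such that [lo, hi] contains [mid - rad, mid + rad]."""
    lo = math.nextafter(mid - rad, -math.inf)   # round the lower bound down
    hi = math.nextafter(mid + rad, math.inf)    # round the upper bound up
    return lo, hi
```

Each conversion can widen the interval slightly, so a round trip infsup -> midrad -> infsup returns an enclosure of the original interval rather than the interval itself.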

The standard must take this point into account.

Regards,

On 13 Feb 2009, at 01:22, Siegfried M. Rump wrote:

On Thu, 12 Feb 2009 16:03:28 -0100, Michel Hack <hack@xxxxxxxxxxxxxx> wrote:

When comparing MidRad methods with InfSup, Siegfried Rump wrote:
The reason is first that matrix mid-rad IV-multiplication can use
BLAS routines. Second it needs only 2 times to switch the rounding
mode compared to (at best) n^2 times for inf-sup multiplication.

A good InfSup implementation would avoid the cost of rounding mode
changes; the method used may be platform-specific. When directed
operations (or pipes) are available, there is no switching. Otherwise
one would most likely use sign reversal for one bound in the internal
representation. Efficient multiplication might still be a challenge
when intervals contain zero; I have not worked out the details...

On most of today's architectures there are no directed operations
available. And the occurrence of zero intervals is the whole point.

Then mid-rad interval matrix multiplication is the only known way
to use BLAS-3 routines. And this is unbeatably fast.

Siegfried

--

Jean-Luc Lamotte

Laboratoire LIP6

Universite P. et M. Curie-Paris 6

http://anp.lip6.fr/~lamotte