Re: differences between implementations of IEEE-754 basic operators
> getting reproducible results ...
> is effectively incompatible with most forms of parallelism.
Finding ANY useful form of parallelism in most programs is pretty hard.
How many laptop users can keep even three threads active?
But the kind that can most often be profitably exploited is doing
the same or similar things at the same time, e.g. computing two or more elements
of a matrix product at once. That is perfectly reproducible
with any number of processors, since each element is still computed
by the same sequence of operations.
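
To illustrate the point about matrix products, here is a minimal Python sketch, not anything from the original discussion. The names dot and matmul and the sample matrices are made up for illustration. It assumes each element is computed by one fixed sequential loop, with only the assignment of elements to workers varying.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(row, col):
    # One fixed, sequential order of operations per output element.
    acc = 0.0
    for a, b in zip(row, col):
        acc += a * b
    return acc


def matmul(A, B, workers):
    # Hypothetical helper: distribute whole output elements across workers.
    cols = list(zip(*B))
    tasks = [(row, col) for row in A for col in cols]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in task order, whichever worker ran them.
        flat = list(pool.map(lambda t: dot(*t), tasks))
    n = len(cols)
    return [flat[i:i + n] for i in range(0, len(flat), n)]


A = [[1e16, 1.0, -1e16], [0.1, 0.2, 0.3]]
B = [[1.0, 0.1], [1.0, 0.2], [1.0, 0.3]]

# Same product computed with 1, 2 and 4 workers.
results = [matmul(A, B, w) for w in (1, 2, 4)]
print(results[0] == results[1] == results[2])
```

The worker count changes only which thread computes a given element, not the order of the additions inside it, so the three results compare equal.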
What is NOT reproducible is reduction operations - using a varying number
of processors to divide up the work of computing one element of a matrix
product, for instance. In many of the big simulations that dominate big
technical computing, there is at present no economical alternative to parallelizing
reductions - so reproducibility is indeed not economical there. However, most
users of floating-point arithmetic are not doing those.
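
The reduction case can be sketched the same way. This is an illustrative example under stated assumptions, not code from the thread: chunked_sum is a hypothetical helper that models "p processors" by splitting one sum into p contiguous chunks, and the data values are chosen so that rounding in IEEE-754 double precision is easy to follow. The underlying reason for the difference is that floating-point addition is not associative.

```python
def sequential_sum(values):
    # Explicit left-to-right loop; the built-in sum() is avoided on purpose
    # because newer Python versions use compensated summation for floats.
    acc = 0.0
    for v in values:
        acc += v
    return acc


def chunked_sum(values, parts):
    # Hypothetical model of a parallel reduction: each "processor" sums one
    # contiguous chunk, then the partial sums are combined in order.
    size = -(-len(values) // parts)  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    partials = [sequential_sum(chunk) for chunk in chunks]
    return sequential_sum(partials)


data = [1e16, 1.0, -1e16, 1.0]

one_way = chunked_sum(data, 1)   # ((1e16 + 1.0) + -1e16) + 1.0
two_way = chunked_sum(data, 2)   # (1e16 + 1.0) + (-1e16 + 1.0)
print(one_way, two_way, one_way == two_way)
```

With one chunk the final 1.0 survives; with two chunks both 1.0 terms are absorbed by rounding before the partial sums cancel, so the two results differ even though the inputs are identical.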
And in an era when memory latency may be addressed more by large numbers of
active threads than by cache blocking, reduction optimizations
may matter less often, even in big technical computing.