Re: Question on performance
Dear Arnold,
> I have a question about performance on current 754-conforming hardware:
>
> Suppose I write code consisting only of 754 floating-point operations
> and calls to simple customized additional functions such as
> nan2zero(x), which returns 0 if isnan(x), and x otherwise.
>
>
> Will the code generated by a standard, good compiler run --
>
> (i) essentially as efficient as without these function calls?
>
> (ii) essentially as efficient as if it contained explicit case
> distinctions?
>
> (iii) intermediate but still efficient?
>
> (iv) intermediate but still inefficient?
>
> In case of (ii) or (iv), could a special purpose compiler do
> significantly better?
>
> Arnold Neumaier
Here is a test case. On a Core-2 under Fedora Core 12 with gcc 4.4.4 and
glibc 2.11.2, I get a slowdown by a factor of 3 when the loop calls
nan2zero (second run, compiled with -DTEST):
tarte% gcc -O3 -g neumaier.c -lm ; time ./a.out
s=2.7182818284590455e+00
2.129u 0.000s 0:02.15 98.6% 0+0k 0+0io 0pf+0w
tarte% gcc -DTEST -O3 -g neumaier.c -lm ; time ./a.out
s=2.7182818284590455e+00
6.368u 0.000s 0:06.39 99.5% 0+0k 0+0io 0pf+0w
I'll let you decide whether it corresponds to (i), (ii), (iii) or (iv).
Do other compilers give better results?
Paul Zimmermann
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double
nan2zero (double x)
{
  return isnan (x) ? 0.0 : x;
}

int
main()
{
  double t, s, i, N = 1000000000.0;

  /* compute s = sum(1/k!, k=0..N) */
  for (t = 1.0, s = t, i = 1; i <= N; i++)
    {
      t /= i;
#ifdef TEST
      s += nan2zero (t);
#else
      s += t;
#endif
    }
  printf ("s=%.16e\n", s);
}
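
For anyone who wants to try the "explicit case distinction" from the
question, here is a minimal sketch of a variant that might be worth timing
against the test case above. It is not part of the original test and no
timings are claimed for it. The names nan2zero_cmp and sum_inv_factorials,
and the parameters n and filter, are made up for this illustration. The
variant relies on the IEEE 754 rule that a NaN compares unequal to
everything, including itself, so x == x is false exactly when x is a NaN.
This only holds if the compiler is not told to assume NaN-free arithmetic
(with gcc, -ffast-math implies -ffinite-math-only, which permits removing
the comparison).

```c
#include <math.h>   /* NAN, for trying the function on a NaN input */

/* Hypothetical variant of nan2zero: x == x is false only for a NaN,
   so no call to isnan is needed. */
static inline double
nan2zero_cmp (double x)
{
  return x == x ? x : 0.0;
}

/* Same summation as in the test case, but with the upper bound n and
   the choice of filtering passed as parameters (assumptions of this
   sketch, so that both paths can be compared in one binary). */
static double
sum_inv_factorials (double n, int filter)
{
  double t = 1.0, s = 1.0, i;

  for (i = 1; i <= n; i++)
    {
      t /= i;
      s += filter ? nan2zero_cmp (t) : t;
    }
  return s;
}
```

Calling sum_inv_factorials (1000000000.0, 0) and
sum_inv_factorials (1000000000.0, 1) under "time" would give a comparison
analogous to the two runs above. Whether the comparison form is compiled to
a branch or to branch-free code depends on the compiler and its options.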