In indirect support of Ian's statement: when we worked on instruments and systems for measuring dynamically changing quantities, we had to live with the fact that the resulting time delays (and hence the computation results) differed
from measurement to measurement. For example, an operating system would periodically go into a garbage-collection-type mode or a checking-whether-everything-works-OK mode, in both cases delaying the procedures it was asked to perform.
Those were large differences, and we also saw smaller ones due to the other factors Ian mentions.
From: stds-1788@xxxxxxxx [stds-1788@xxxxxxxx] On Behalf Of Ian McIntosh [ianm@xxxxxxxxxx]
Sent: Friday, August 05, 2011 9:03 AM
Repeatability on the same machine is both widely expected and sometimes not achieved. Several things can interfere with it:
- Uninitialized variables.
- Race conditions, where it's partly a matter of luck whether a load in one thread is before or after a store in another. Race conditions are usually bugs, but are sometimes deliberately allowed and considered a valid part of the algorithm. (A minimal sketch follows this list.)
- Less obviously, you can get race conditions without multithreading if the operating system and hardware let you access the same physical memory via different virtual addresses; e.g., they let you use mmap to map the same file to two different blocks of memory. (Also sketched after the list.)
- Adaptive multithreading, where the number of threads used depends on how many are available, which may change due to workload, upgrades, scheduled maintenance, or partial system failures. Changing the number of threads may affect how the work is divided, which
can affect things like parallelized reductions. (The last sketch after this list shows why the division matters.)
- Running the program in a debugger versus directly. It's not obvious, but some instructions in some architectures (including some used for atomic operations, thread synchronization or transactional memory)
cannot function exactly as normal when a debugger does a context switch to stop at breakpoints, to step through instructions one at a time, or to trace instructions.
- Dependencies on external data or events, like data files, network input, or using the time of day to set a random seed.
- A hardware design bug (most CPUs ever built have had at least one).
- A hardware malfunction.
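
A minimal sketch of the first kind of race, assuming POSIX threads (the counter and loop bound are purely illustrative):

    /* Two threads increment a shared counter with no synchronization.
       Each "counter++" is a load, add, store; the interleaving differs
       from run to run, so the final total does too.
       Build with: cc -pthread race.c */
    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;            /* shared, deliberately not atomic */

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 1000000; i++)
            counter++;                  /* the data race */
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld (expected 2000000)\n", counter);
        return 0;
    }

On most machines the printed total varies between runs and is almost never 2000000.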
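
A sketch of the single-threaded aliasing case, again POSIX (the file name is made up):

    /* Map the same file twice: two virtual addresses, one physical page.
       A store through one mapping is visible through the other, so code
       that treats "a" and "b" as independent buffers is wrong, and with
       any concurrency this becomes a race at a single physical location. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/tmp/alias.dat", O_RDWR | O_CREAT, 0600);
        ftruncate(fd, 4096);                       /* one page long */
        char *a = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        char *b = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        strcpy(a, "written via a");                /* store through one alias */
        printf("read via b: %s\n", b);             /* seen through the other */
        munmap(a, 4096);
        munmap(b, 4096);
        close(fd);
        return 0;
    }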
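
Finally, to see why re-dividing a reduction changes answers, threads are not even needed: floating-point addition is not associative, so combining the same data in a different order (as a different thread count would) can round differently. The values here are chosen to make the effect obvious:

    /* Sum the same array left to right ("one thread") and in two
       chunks whose partial sums are then combined ("two threads").
       The two orders round differently. */
    #include <stdio.h>

    int main(void) {
        float x[4] = {1e8f, 1.0f, -1e8f, 1.0f};

        float seq = 0.0f;                  /* sequential, left to right */
        for (int i = 0; i < 4; i++)
            seq += x[i];

        float c0 = x[0] + x[1];            /* "thread 0" sums x[0..1] */
        float c1 = x[2] + x[3];            /* "thread 1" sums x[2..3] */
        float par = c0 + c1;               /* combine the partial sums */

        printf("sequential: %g  two-chunk: %g\n", seq, par);
        return 0;
    }

With these values the sequential sum is 1 and the two-chunk sum is 0, because adding 1.0f to 1e8f (or to -1e8f) is absorbed by rounding.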
Some of those are rare, but I've experienced every one of them.
Some of them would manifest as slightly wider intervals. Others would produce outright wrong answers that violate containment. Generally they should not affect our standard, except that we need to be careful with the wording. A requirement of "always reproducible results"
is impossible to satisfy, even when running the same executable program on the same system. Requiring "reproducible results under identical conditions" is practical, whether or not it is part of the standard.
- Ian McIntosh
  IBM Canada Lab
  Compiler Back End Support and Development