Comparison of Iai-Callgrind with Criterion-rs
This is a comparison with Criterion-rs but some of the points in Pros and Cons also apply to other wall-clock time based benchmarking frameworks.
Iai-Callgrind Pros:
-
Iai-Callgrind can give answers that are repeatable to 7 or more significant digits. In comparison, actual (wall-clock) run times are scarcely repeatable beyond one significant digit.
This allows to implement and measure "microoptimizations". Typical microoptimizations reduce the number of CPU cycles by
0.1%
or0.05%
or even less. Such improvements are impossible to measure with real-world timings. But hundreds or thousands of microoptimizations add up, resulting in measurable real-world performance gains.1 -
Iai-Callgrind can work reliably in noisy environments especially in CI environments from providers like GitHub Actions or Travis-CI, where Criterion-rs cannot.
-
The benchmark api of Iai-Callgrind is simple, intuitive and allows for a much more concise and clearer structure of benchmarks.
-
Iai-Callgrind can benchmark functions in binary crates.
-
Iai-Callgrind can benchmark private functions.
-
Although Callgrind adds runtime overhead, running each benchmark exactly once is still usually much faster than Criterion-rs' statistical measurements.
-
Criterion-rs creates plots and graphs about the averages, median etc. which adds considerable execution time to the execution time for each benchmark. Iai-Callgrind doesn't need any of these plots, since it can collect all its metrics in a single run.
-
Iai-Callgrind generates profile output from the benchmark without further effort.
-
With Iai-Callgrind you have native access to all the possibilities of all Valgrind tools, including Valgrind Client Requests.
Iai-Callgrind/Criterion-rs Mixed:
- Although it is usually not significant, due to the high precision of the
Iai-Callgrind measurements changes in the benchmarks themselves like adding a
benchmark case can have an effect on the other benchmarks. Iai-Callgrind can only try to reduce these effects to a minimum but never completely eliminate them. Criterion-rs does not have this problem because it cannot detect such small changes.
Iai-Callgrind Cons:
- Iai-Callgrind's measurements merely correlate with wall-clock time. Wall-clock time is an obvious choice in many cases because it corresponds to what users perceive and Criterion-rs measures it directly.
- Iai-Callgrind can only be used on platforms supported by Valgrind. Notably, this does not include Windows.
- Iai-Callgrind needs additional binaries,
valgrind
and theiai-callgrind-runner
. The version of the runner needs to be in sync with theiai-callgrind
library. Criterion-rs is only a library and the installation is usually simpler.
Especially, due to the first point in the Cons
, I think it is still required
to run wall-clock time benchmarks and use Criterion-rs
in conjunction with
Iai-Callgrind. But in the CI and for performance regression checks, you
shouldn't use Criterion-rs
or other wall-clock time based benchmarks at all.