Introduction

This is the guide for Iai-Callgrind, a benchmarking framework/harness which uses Valgrind's Callgrind and other Valgrind tools like DHAT, Massif, ... to provide extremely accurate and consistent measurements of Rust code, making it perfectly suited to run in environments like a CI.

Iai_Callgrind is fully documented in this guide and in the api documentation at docs.rs.

Iai-Callgrind is

  • Precise: High-precision measurements of Instruction counts and many other metrics allow you to reliably detect very small optimizations and regressions of your code.
  • Consistent: Iai-Callgrind can take accurate measurements even in virtualized CI environments and make them comparable between different systems completely negating the noise of the environment.
  • Fast: Each benchmark is only run once, which is usually much faster than benchmarks which measure execution and wall-clock time. Benchmarks measuring the wall-clock time have to be run many times to increase their accuracy, detect outliers, filter out noise, etc.
  • Visualizable: Iai-Callgrind generates a Callgrind (DHAT, ...) profile of the benchmarked code and can be configured to create flamegraph-like charts from Callgrind metrics. In general, all Valgrind-compatible tools like callgrind_annotate, kcachegrind or dh_view.html and others to analyze the results in detail are fully supported.
  • Easy: The API for setting up benchmarks is easy to use and allows you to quickly create concise and clear benchmarks. Focus more on profiling and your code than on the framework.

Design philosophy and goals

Iai-Callgrind benchmarks are designed to be runnable with cargo bench. The benchmark files are expanded to a benchmarking harness which replaces the native benchmark harness of rust. Iai-Callgrind is a profiling framework that can quickly and reliably detect performance regressions and optimizations even in noisy environments with a precision that is impossible to achieve with wall-clock time based benchmarks. At the same time, we want to abstract the complicated parts and repetitive tasks away and provide an easy to use and intuitive api. Iai-Callgrind tries to stay out of your way so you can focus more on profiling and your code!

When not to use Iai-Callgrind

Although Iai-Callgrind is useful in many projects, there are cases where Iai-Callgrind is not a good fit.

  • If you need wall-clock times, Iai-Callgrind cannot help you much. The estimation of cpu cycles merely correlates to wall-clock times but is not a replacement for wall-clock times. The cycles estimation is primarily designed to be a relative metric to be used for comparison.
  • Iai-Callgrind cannot be run on Windows and platforms not supported by Valgrind.

Improving Iai-Callgrind

No one's perfect!

You want to share your experience with Iai-Callgrind and have a recipe that might be useful for others and fits into this guide? You have an idea for a new feature, are missing a functionality or have found a bug? We would love to here about it. You want to contribute and hack on Iai-Callgrind?

Please don't hesitate to open an issue.

You want to hack on this guide? The source code of this book lives in the docs subdirectory.