Software Performance Benchmarking with JMH

Measuring, predicting, and optimizing software performance in Java using JMH

Performance is one of the most central non-functional properties of modern software. And yet we all see the applications we use daily become steadily slower, less reliable, and more bloated.

One reason for this is that testing performance is much harder than testing functional correctness, and is therefore done far less often.

For the last 10 years, ICET-lab has studied how Java developers can use the Java Microbenchmark Harness (JMH) to continuously benchmark their systems, for example as part of their CI pipeline.

A JMH benchmark example from the Protobuf project

Concrete research results include detecting anti-patterns in JMH benchmarks that can lead to misleading measurement results (Costa et al., 2021), demonstrating that statistical methods can significantly reduce the number of benchmark repetitions required (Laaber et al., 2020), and experimenting with coverage-based benchmark selection (Laaber et al., 2021).
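To make the idea of statistically reducing repetitions more concrete, the following is a minimal sketch of one possible stopping rule: stop measuring once the coefficient of variation (CV) of the most recent iterations falls below a threshold. This is an illustration under stated assumptions, not the implementation from Laaber et al. (2020). The class name, method names, window size, threshold, and the measurement values are all hypothetical, and the sketch works on a plain array of timings rather than hooking into JMH itself.

```java
import java.util.Arrays;

// Hypothetical illustration: decide when benchmark measurements are "stable enough".
class StabilityCheck {

    /** Coefficient of variation: sample standard deviation divided by the mean. */
    static double coefficientOfVariation(double[] values) {
        double mean = Arrays.stream(values).average().orElse(Double.NaN);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - mean) * (v - mean);
        }
        double stdDev = Math.sqrt(sumOfSquares / (values.length - 1));
        return stdDev / mean;
    }

    /**
     * Returns the number of measurements after which the last `window` values
     * first have a CV at or below `threshold`. If that never happens, returns
     * measurements.length (i.e., all iterations were needed).
     */
    static int iterationsNeeded(double[] measurements, int window, double threshold) {
        for (int end = window; end <= measurements.length; end++) {
            double[] recent = Arrays.copyOfRange(measurements, end - window, end);
            if (coefficientOfVariation(recent) <= threshold) {
                return end;
            }
        }
        return measurements.length;
    }

    public static void main(String[] args) {
        // Made-up per-iteration times in nanoseconds: a warm-up phase, then a steady state.
        double[] timesNs = {150, 120, 108, 100, 101, 99, 100, 100, 101, 100};
        int needed = iterationsNeeded(timesNs, 4, 0.02);
        System.out.println("Stop after " + needed + " of " + timesNs.length + " iterations");
    }
}
```

With the made-up data above, the rule would stop after 7 of the 10 iterations. In practice, the window size and threshold are tuning choices, and other stability measures (for example, confidence-interval width) could be substituted for the CV.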

In this line of research, we have also developed multiple open source tools that can support benchmarking research and practice, including Junit-to-JMH, a tool to generate performance benchmark suites from unit tests (Jangali et al., 2022), and Bencher, a tool to analyse static and dynamic coverage of JMH benchmarks.
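As an illustration of how coverage information could drive benchmark selection, the sketch below applies the classic greedy "additional coverage" strategy known from test case prioritization: repeatedly pick the benchmark that covers the most methods not yet covered. It is a simplified sketch, not Bencher's API or the exact approach evaluated in Laaber et al. (2021). The class name, benchmark names, and covered method names are hypothetical, and the coverage map is assumed to have been collected beforehand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration: greedy benchmark selection by additional coverage.
class CoverageSelection {

    /**
     * Orders benchmarks greedily by how many not-yet-covered methods each adds.
     * Benchmarks that add no new coverage are left out of the result.
     */
    static List<String> selectByAdditionalCoverage(Map<String, Set<String>> coverage) {
        List<String> selected = new ArrayList<>();
        Set<String> covered = new HashSet<>();
        // A TreeMap makes tie-breaking deterministic (alphabetical by benchmark name).
        Map<String, Set<String>> remaining = new TreeMap<>(coverage);

        while (!remaining.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<String>> entry : remaining.entrySet()) {
                Set<String> gain = new HashSet<>(entry.getValue());
                gain.removeAll(covered);
                if (gain.size() > bestGain) {
                    best = entry.getKey();
                    bestGain = gain.size();
                }
            }
            if (best == null) {
                break; // no remaining benchmark covers anything new
            }
            selected.add(best);
            covered.addAll(remaining.remove(best));
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up coverage data: benchmark name -> methods it executes.
        Map<String, Set<String>> coverage = new HashMap<>();
        coverage.put("benchParse", Set.of("parse", "readTag", "readVarint"));
        coverage.put("benchSerialize", Set.of("serialize", "writeTag"));
        coverage.put("benchReadTag", Set.of("readTag"));
        coverage.put("benchRoundTrip", Set.of("parse", "serialize", "readTag", "writeTag"));
        System.out.println(selectByAdditionalCoverage(coverage));
    }
}
```

For the made-up data, the selection is `[benchRoundTrip, benchParse]`: the other two benchmarks exercise only methods that are already covered. A real selection strategy would likely also weigh factors such as benchmark runtime or the code changed in a commit.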

The impact of bad JMH practices on benchmark results
Dynamically reconfiguring JMH to reduce benchmark execution time

In our ongoing work in this research theme, we are particularly interested in:

  • How to bootstrap performance testing in a project by generating (initial) performance test suites. Junit-to-JMH (Jangali et al., 2022) is a first step in this direction.

  • How to predict the execution time of benchmarks (and, hence, performance) prior to execution. We have already achieved initial success predicting the execution time of small pieces of code using graph-based neural networks (Samoaa et al., 2022). The ultimate vision, of course, is to be able to warn developers before committing slow code, without the need for expensive performance testing.

  • How to make performance testing easier, through performance assessment bots (Markusse et al., 2022) or good visualizations (Cito et al., 2019).


Dr. Christoph Laaber (probably the world’s foremost expert on academic research about JMH benchmarking)

Dr. Philipp Leitner

  1. What’s Wrong with My Benchmark Results? Studying Bad Practices in JMH Benchmarks
    Diego Costa, Cor-Paul Bezemer, Philipp Leitner, and Artur Andrzejak
    IEEE Transactions on Software Engineering, Apr 2021
  2. Dynamically Reconfiguring Software Microbenchmarks: Reducing Execution Time without Sacrificing Result Quality
    Christoph Laaber, Stefan Würsten, Harald C. Gall, and Philipp Leitner
    In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Virtual Event, USA, Apr 2020
  3. Applying test case prioritization to software microbenchmarks
    Christoph Laaber, Harald C. Gall, and Philipp Leitner
    Empirical Software Engineering, Apr 2021
  4. Automated Generation and Evaluation of JMH Microbenchmark Suites from Unit Tests
    Mostafa Jangali, Yiming Tang, Niclas Alexandersson, Philipp Leitner, Jinqiu Yang, and Weiyi Shang
    IEEE Transactions on Software Engineering (TSE), Apr 2022
  5. TEP-GNN: Accurate Execution Time Prediction of Functional Tests using Graph Neural Networks
    Hazem Peter Samoaa, Antonio Longa, Mazen Mohamad, Morteza Haghir Chehreghani, and Philipp Leitner
    In Proceedings of the 23rd International Conference on Product-Focused Software Process Improvement (PROFES), Apr 2022
  6. Using Benchmarking Bots for Continuous Performance Assessment
    Florian Markusse, Philipp Leitner, and Alexander Serebrenik
    IEEE Software, Apr 2022
    To appear.
  7. Interactive Production Performance Feedback in the IDE
    Jürgen Cito, Philipp Leitner, Martin Rinard, and Harald Gall
    In Proceedings of the 41st International Conference on Software Engineering (ICSE), Apr 2019