In “Turbo-Charging Your PHPUnit Suite” I made a small promise: Before you change a single line of a slow test suite, you should measure, and a small custom script is enough to parse PHPUnit's Open Test Reporting logfile and surface the slowest tests.
I wrote that script. Then it grew up. It is now a tool called otr-report, and this article is about what it does and why I think it is worth installing.
The data is already there
Every time PHPUnit runs your tests, it knows a great deal that it never shows you. It knows how long each individual test took, how much CPU time it consumed, how much memory it allocated, how many assertions it made, and exactly why it failed when it did. By default, almost all of that knowledge evaporates the moment the run finishes and the summary line scrolls past.
Open Test Reporting (OTR) is how you keep it. OTR is a structured XML format for test results, designed to be language-agnostic and tool-agnostic. PHPUnit can write an OTR logfile for any run with the --log-otr option:
$ phpunit --log-otr /tmp/otr/run.xml
PHPUnit has been able to write OTR logfiles since version 12.2. What changed with PHPUnit 13.2 is how much it records. The format now carries even more information that is specific to PHP and PHPUnit: per-test and per-test-suite resource usage (wall-clock time, CPU time, and peak memory usage), the TestDox names of your classes and methods, the number of assertions each test performed, and structured failure details with expected value, actual value, and diff.
OTR itself is generic, but the logfile that PHPUnit 13.2 produces includes more details than the generic schema requires. otr-report is the tool that reads that detail back out and turns it into something you can act on. It does not replace PHPUnit's own output. It answers the questions that output was never meant to.
Which tests are slow
The first question, and the one that started the whole project, is simple. Which tests are eating my time?
The slowest command reads the logfile of a single run and prints the slowest tests, ordered from slowest to fastest:
$ otr-report slowest /tmp/otr/run.xml
otr-report 1.0.1 by Sebastian Bergmann.
Time(s)   Test
-------   ----
4.441529  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_8
3.771325  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_6
1.375265  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_5
0.845473  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_10
0.039408  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_4
...
This is almost always more honest than intuition. As I argued in the earlier article: intuition lies, and the slowest test is rarely the one the team expects.
A flat list of the ten slowest tests is a good start, but it does not tell you whether those ten are outliers or simply the top of a uniformly slow suite. The --above-mean option answers that. It calculates the mean runtime across all tests and then lists only the tests that are slower than the mean, each annotated with how many times slower than the mean it is:
$ otr-report slowest --above-mean /tmp/otr/run.xml
otr-report 1.0.1 by Sebastian Bergmann.
Mean test runtime: 0.059520 s (177 tests, 4 slower than mean)
Time(s)   x mean  Test
-------   ------  ----
4.441529  74.62x  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_8
3.771325  63.36x  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_6
1.375265  23.11x  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_5
0.845473  14.20x  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_10
Four tests out of 177, each between fourteen and seventy-five times slower than the average, is a very different diagnosis from “the suite is slow”. It tells you exactly where half a day of focused work will pay off, and which 173 tests you can safely leave alone.
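The arithmetic behind that table is small enough to sketch. What follows is a minimal Python illustration of the idea, not otr-report's actual implementation: the function name and the sample runtimes are hypothetical, and it assumes the per-test runtimes have already been read from the logfile into a dictionary.

```python
def slower_than_mean(runtimes):
    """Return the mean runtime and the tests slower than it, slowest first.

    `runtimes` maps a test name to its wall-clock time in seconds.
    Each result row is (name, seconds, multiple_of_mean).
    """
    if not runtimes:
        raise ValueError("no tests to analyse")

    average = sum(runtimes.values()) / len(runtimes)
    rows = [
        (name, seconds, seconds / average)
        for name, seconds in runtimes.items()
        if seconds > average
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return average, rows


# Hypothetical runtimes, not taken from a real logfile.
sample = {"test_render": 4.0, "test_parse": 1.0, "test_add": 0.5, "test_sub": 0.5}

average, rows = slower_than_mean(sample)
print(f"Mean test runtime: {average:.6f} s "
      f"({len(sample)} tests, {len(rows)} slower than mean)")
for name, seconds, multiple in rows:
    print(f"{seconds:.6f}  {multiple:.2f}x  {name}")
```

The multiple is simply the test's runtime divided by the mean, which is why a single very slow test both raises the mean and stands far above it.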
Wall-clock time is not the only thing that matters. A test can be fast and still allocate hundreds of megabytes, and memory pressure is its own kind of slowness once a parallel runner is fighting for it. The --sort option lets you rank by time (the default), cpu (user plus system CPU time), or memory (peak memory usage):
$ otr-report slowest --sort memory --limit 3 /tmp/otr/run.xml
otr-report 1.0.1 by Sebastian Bergmann.
Memory    Test
-------   ----
23489616  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_6
22907872  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_4
22044384  SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_5
The --limit option, shown above, controls how many tests are listed; it defaults to ten and combines with every other option.
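Conceptually, sorting and limiting amount to ranking the same records by a different column and cutting the list off. A rough Python sketch of that idea follows; the `Usage` record, the `rank` function, and the sample values are all hypothetical and only mirror the three sort keys named above, not how otr-report is written.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    name: str
    time: float   # wall-clock seconds
    cpu: float    # user plus system CPU seconds
    memory: int   # peak memory usage


def rank(tests, sort="time", limit=10):
    """Return at most `limit` tests, highest value of `sort` first."""
    if sort not in ("time", "cpu", "memory"):
        raise ValueError(f"unknown sort key: {sort}")
    return sorted(tests, key=lambda t: getattr(t, sort), reverse=True)[:limit]


# Hypothetical measurements for illustration only.
sample = [
    Usage("test_fast_but_hungry", time=0.02, cpu=0.02, memory=300_000_000),
    Usage("test_slow_but_lean", time=4.50, cpu=4.10, memory=20_000_000),
    Usage("test_ordinary", time=0.05, cpu=0.04, memory=22_000_000),
]

for usage in rank(sample, sort="memory", limit=2):
    print(usage.memory, usage.name)
```

Ranked by time, `test_slow_but_lean` leads; ranked by memory, it drops to last place, which is the point of looking at more than one metric.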
From which to why
otr-report slowest tells you which tests are slow. It does not tell you why, and that boundary matters.
A ranked list is a starting point for an investigation, not the end of one. When you know which test dominates the runtime but not what it is doing with that time, that is the moment to reach for the disciplines I describe in “Debugging Performance in PHP”: tracing to see what happened, profiling to see what cost the most, and benchmarking to prove that your fix actually helped.
The two tools complement each other. otr-report narrows a suite of thousands of tests down to the handful worth profiling, so that you point your profiler at the right code instead of the whole run. The profiler then tells you whether that slow test is rebuilding a database schema, booting a framework, or sleeping in a helper three layers deep. Measure broadly first, then zoom in.
Getting better or worse
A single run is a snapshot. Over the life of a project, the more interesting question is the trend. Is the suite getting slower, one innocuous-looking commit at a time, or is your optimisation work actually holding?
The trends command reads every OTR logfile in a directory and generates a single self-contained HTML report. The natural way to feed it is to archive one logfile per continuous-integration run into a shared directory:
$ phpunit --log-otr /tmp/otr/2026-02-02.xml
$ phpunit --log-otr /tmp/otr/2026-02-09.xml
$ phpunit --log-otr /tmp/otr/2026-02-16.xml
$ otr-report trends /tmp/otr /tmp/trends.html
otr-report 1.0.1 by Sebastian Bergmann.
Wrote trends report to /tmp/trends.html
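If a script drives your continuous-integration runs, the only moving part is the logfile name. The following Python sketch builds the PHPUnit command for a given run date; the helper names are hypothetical, and naming files by date is just the convention used in the example above, not something otr-report requires.

```python
from datetime import date
from pathlib import PurePosixPath


def otr_logfile(directory, run_date):
    """Path of the logfile for one run, named after its ISO date."""
    return PurePosixPath(directory) / f"{run_date.isoformat()}.xml"


def phpunit_command(directory, run_date):
    """Argument list that runs PHPUnit and archives the OTR logfile."""
    return ["phpunit", "--log-otr", str(otr_logfile(directory, run_date))]


# The resulting list can be handed to subprocess.run() by a CI script.
print(phpunit_command("/tmp/otr", date(2026, 2, 2)))
```

A date-based name allows only one logfile per day; if you run the suite more often than that, a timestamp or build number in the name would be one way to avoid overwriting earlier runs.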
The report charts the total runtime of the suite and the number of tests across all runs, ordered by when each run started. It also lists the ten slowest tests of the most recent run, each with a sparkline of its runtime across every recorded run and the change relative to the first time that test was measured. A test that has slowly tripled in runtime over three months is the kind of drift no single run will ever reveal, but that a trend report shows you immediately.
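The change relative to the first measurement is plain arithmetic once the runs are in order. Here is a minimal Python sketch of that calculation, with hypothetical function names and sample data; it assumes each measurement is a pair of run start date and runtime in seconds, and it expresses the change as a fraction, which may differ from how the report presents it.

```python
from datetime import date


def runtime_series(measurements):
    """Order (run_started, seconds) pairs by start and return the seconds."""
    ordered = sorted(measurements, key=lambda m: m[0])
    return [seconds for _, seconds in ordered]


def relative_change(series):
    """Change of the latest value relative to the first, as a fraction."""
    first, latest = series[0], series[-1]
    return (latest - first) / first


# Hypothetical measurements of one test, deliberately out of order.
measurements = [
    (date(2026, 2, 16), 1.5),
    (date(2026, 2, 2), 0.5),
    (date(2026, 2, 9), 1.0),
]

series = runtime_series(measurements)
print(series, f"{relative_change(series):+.0%}")
```

A test whose runtime has tripled shows a change of 2.0 here, that is, an increase of 200 percent over its first measurement.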
The whole picture
Performance is the reason I started this tool, but the OTR logfile records the outcome of every test, not just its timing. The results command turns a single run into a self-contained HTML report of the whole picture:
$ otr-report results /tmp/otr/run.xml /tmp/results.html
otr-report 1.0.1 by Sebastian Bergmann.
Wrote test results report to /tmp/results.html
The report opens with a summary of the run and gives you a sticky sidebar with a collapsible tree that groups tests by namespace, then class, then method, with a status indicator at every level. Each test shows its status, its runtime, and, where relevant, its reason, its throwable, and any issues it triggered, such as being marked risky. Tests that did not pass are expanded by default; the ones that passed stay out of your way. With the --testdox option, the report identifies classes and methods by their prettified TestDox names rather than their PHP names, which turns a results report into something you can hand to someone who does not read PHP for a living.
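The grouping behind that tree follows directly from how PHP test names are written. As a rough Python sketch, assuming test identifiers of the form shown in the listings above (namespace and class separated by backslashes, method after `::`), the nesting could be built like this; the function names are hypothetical and the sketch leaves out statuses entirely.

```python
def split_test_id(test_id):
    """Split 'Namespace\\Class::method' into (namespace, class, method)."""
    class_name, _, method = test_id.partition("::")
    namespace, _, short_name = class_name.rpartition("\\")
    return namespace, short_name, method


def build_tree(test_ids):
    """Group test methods by namespace, then by class."""
    tree = {}
    for test_id in test_ids:
        namespace, class_name, method = split_test_id(test_id)
        tree.setdefault(namespace, {}).setdefault(class_name, []).append(method)
    return tree


test_ids = [
    r"SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_8",
    r"SebastianBergmann\Raytracer\PuttingItTogetherTest::test_chapter_6",
]

print(build_tree(test_ids))
```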
Installing otr-report
otr-report is distributed as a PHP Archive (PHAR). The recommended way to manage it as a project tool dependency is Phive:
$ phive install otr-report
$ ./tools/otr-report --version
You can also download the PHAR directly from phar.phpunit.de. As with PHPUnit itself, I do not recommend installing it with Composer.
It needs nothing more than an OTR logfile, which means it adds nothing to your test run except a single command-line option. The data has been there all along, on every run, waiting for someone to read it.
Now you can.