This article is based on a presentation I gave for the first time today at the International PHP Conference in Berlin.
It has been fourteen minutes. The tests are still running. Your hot beverage has gone cold. Somewhere in the back of your mind, you have already accepted that you will not finish what you started before lunch.
Slow tests are not a virtue. They are a bug. A test suite that nobody runs is worse than no test suite at all, because it gives you the comforting illusion of safety while quietly drifting out of step with the code it is supposed to protect.
In my presentation “Turbo-Charging Your PHPUnit Suite”, I took a detailed look at why test suites get slow, what you can actually do about it, and where the easy wins are.
This article distils the central ideas without the code examples and the live profiling output. If you would like more information, please look at the presentation material or join me when I give this presentation again.
The real cost of slow tests
The arithmetic of waiting is unforgiving. A team of eight developers, each running a ten-minute suite twenty times a day, burns more than twenty-six hours every working day. That is more than three full-time engineers, doing nothing.
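To make the arithmetic concrete, here is a minimal sketch of the calculation. The function name is made up for illustration, and the eight-hour working day used to convert hours into engineers is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: total hours a team spends waiting for the suite per day.
 */
function dailyWaitingHours(int $developers, int $runsPerDay, float $minutesPerRun): float
{
    return $developers * $runsPerDay * $minutesPerRun / 60;
}

$hours = dailyWaitingHours(8, 20, 10.0);

// Assumption: one full-time engineer works eight hours a day.
printf("%.1f hours per day, or %.1f engineers\n", $hours, $hours / 8);
// prints "26.7 hours per day, or 3.3 engineers"
```

Plugging in your own team size and suite duration is usually enough to start the conversation.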
The wall-clock cost is only the visible part. Slow tests destroy flow state. They quietly kill test-driven development, because nobody iterates on a fix when each iteration takes ten minutes. Worst of all, they erode trust in the suite, until developers stop running it and start guessing whether their changes broke something.
A suite people no longer trust is no longer a safety net.
Why tests get slow
Most slow test suites are slow because the tests are doing more than they need to, not because PHPUnit is slow. The same patterns turn up over and over again: Integration tests in disguise that quietly hit a database and call themselves unit tests. Fixtures that rebuild the world before every single test method. Tests that touch the filesystem or the network without realising it. Elaborate mock object setups that do more work than the code they are testing. Bootstraps that boot an entire framework just to assert on a pure function.
Each of these has its place; the problem is that they accumulate, often invisibly, until your unit tests no longer behave like unit tests at all.
The test pyramid is usually presented as a quality heuristic. It is just as useful as a performance budget. Unit tests should be measured in milliseconds, integration tests in seconds, and end-to-end tests in tens of seconds. When the pyramid inverts, performance dies, and feedback dies with it.
Measure before you optimise
Intuition lies. Every team I have worked with that knew exactly which tests were slow turned out to be wrong about at least one of them. The autoloader configuration that nobody suspected. The listener that ran on every test and rebuilt the same cache every time. The lonely sleep() call hidden three layers deep in a helper.
Before you change anything, measure.
My preferred starting point is Open Test Reporting: the --log-otr option writes a logfile in a structured XML format that contains precise timing information for every test. A small custom script is enough to parse that file and surface the slowest tests, ranked however you like.
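A minimal sketch of such a script follows, with one swap that should be stated plainly: it reads the JUnit XML logfile written by --log-junit rather than the Open Test Reporting logfile, because the exact OTR element names are not spelled out in this article. The function name slowestTests() is hypothetical; the sketch assumes the usual JUnit layout in which each testcase element carries class, name, and time attributes.

```php
<?php
declare(strict_types=1);

/**
 * Returns the slowest test cases found in a JUnit XML logfile, slowest first.
 *
 * @return list<array{test: string, seconds: float}>
 */
function slowestTests(string $junitXml, int $limit = 10): array
{
    $document = new SimpleXMLElement($junitXml);
    $timings  = [];

    // <testsuite> elements can be nested, so look for <testcase> at any depth.
    foreach ($document->xpath('//testcase') as $testCase) {
        $timings[] = [
            'test'    => $testCase['class'] . '::' . $testCase['name'],
            'seconds' => (float) $testCase['time'],
        ];
    }

    // Sort by duration, longest first.
    usort($timings, static fn (array $a, array $b): int => $b['seconds'] <=> $a['seconds']);

    return array_slice($timings, 0, $limit);
}
```

After a run with --log-junit junit.xml, calling slowestTests(file_get_contents('junit.xml'), 20) would give you the twenty slowest tests. The same ranking idea carries over to the OTR logfile once you map its timing information onto test names.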
If you would rather not write your own tooling, sebastianbergmann/phpunit-otr-report does the job out of the box.
When measurement has shown you which tests are slow but not why, reach for profiling and tracing tools.
Four tiers of fixes
Once you know where time is being spent, the work falls into four tiers, and the order matters: each tier is cheaper and more effective when the previous one is already in place.
Tier 1: Configuration wins
These are free. You change configuration, you save minutes. The most common offender is global process isolation, often switched on years ago to work around a single misbehaving test that has long since been refactored. Code coverage enabled by default for every local run is another classic, easily moved to an opt-in flag and a separate continuous integration job. Using --order-by=defects does not make the suite faster on a clean run, but it transforms the iteration loop: with the test result cache enabled, tests that failed in the previous run are executed first, so you get feedback in seconds rather than minutes. Half a day of work in this tier routinely saves several minutes per run.
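As a rough illustration, a phpunit.xml along these lines covers the three points above. The bootstrap path, the suite name, and the tests/unit directory are assumptions about the project layout, not recommendations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- processIsolation="false": no separate PHP process per test by default -->
<!-- cacheResult="true": remember results so that "defects" ordering can work -->
<!-- executionOrder="defects": run previously failing tests first -->
<phpunit bootstrap="vendor/autoload.php"
         processIsolation="false"
         cacheResult="true"
         executionOrder="defects">
    <testsuites>
        <testsuite name="unit">
            <directory>tests/unit</directory>
        </testsuite>
    </testsuites>
    <!-- No coverage report is configured here, so local runs skip coverage. -->
</phpunit>
```

The continuous integration job that needs coverage can then request it explicitly on the command line, for example with --coverage-clover, while a test that really does need its own process can opt in individually instead of slowing down every other test.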
Tier 2: Test design
This is where the largest wins live, and it is where the central lesson of the talk sits. If a test claims to be a unit test but rebuilds the database schema before every method, it is not a unit test, and the cost of pretending it is one will dominate everything else. In-memory databases are often a viable substitute for tests that exercise your own logic rather than database-specific features. Lazy fixtures that build only what each test needs replace fixtures that prepare the world in setUp(), often with a five- to ten-fold speedup. Data providers let PHPUnit handle iteration efficiently and give you per-case reporting in return. None of this is exotic. All of it adds up.
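The lazy fixture idea can be sketched in a few lines of plain PHP. LazyFixture is a hypothetical helper, not part of PHPUnit: it wraps an expensive factory and only runs it the first time a test actually asks for the value.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds an expensive fixture on first use only.
 */
final class LazyFixture
{
    private Closure $factory;
    private mixed $value = null;
    private bool $built  = false;

    public function __construct(Closure $factory)
    {
        $this->factory = $factory;
    }

    public function get(): mixed
    {
        if (!$this->built) {
            // The factory runs at most once per fixture instance.
            $this->value = ($this->factory)();
            $this->built = true;
        }

        return $this->value;
    }
}

// Example use: wrapping the fixture in setUp() is cheap, because nothing is built yet.
$customers = new LazyFixture(static fn (): array => [['id' => 1, 'name' => 'Alice']]);

// Only a test that calls get() pays for the construction.
$first = $customers->get()[0]['name'];
```

In a test case class, setUp() would create the LazyFixture objects and each test method would call get() only on the ones it needs, so a test of a pure function never pays for a customer list or a database connection it does not touch.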
Tier 3: Parallelisation
Parallel runners such as ParaTest sound like the answer, and sometimes they are. The trap is that parallelising a slow suite gives you slowness in parallel, plus a brand-new flakiness factory if your tests share state with each other. Fix correctness first. Then parallelise. For tests that need a database, the right isolation strategy depends on what you are testing: transactional rollback is cheap and works for most cases; PostgreSQL template databases give you full isolation almost for free if you happen to be on PostgreSQL; one container per worker is the universal answer when you need real isolation across any database, at the cost of a second or two of startup per worker. Start simple. Graduate when the constraints bite.
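Transactional rollback, the simplest of these strategies, can be sketched with plain PDO. The function name runInRolledBackTransaction() is hypothetical; the sketch assumes the database and the statements under test support transactions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: runs a test body inside a transaction and always
 * rolls it back, so no test sees rows written by another.
 */
function runInRolledBackTransaction(PDO $pdo, callable $test): void
{
    $pdo->beginTransaction();

    try {
        $test($pdo);
    } finally {
        // Roll back even when the test body throws, unless it already ended the transaction.
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
    }
}
```

In a PHPUnit test case the same pattern usually lives in setUp() and tearDown(): begin the transaction before each test, roll it back afterwards. It stops working as soon as the code under test commits on its own or opens a second connection, which is the point at which template databases or one container per worker become worth their cost.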
Tier 4: Infrastructure
RAM disks for SQLite and temporary files, layer caching for Docker, dependency caching for Composer, splitting fast and slow groups so that the slow groups only run on merge to the main branch, and sharding the suite across runners in continuous integration: all of them are real wins, but rarely the ones worth chasing first.
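Sharding, for instance, needs little more than a deterministic way to split the list of test files. The following is a minimal sketch under stated assumptions: filesForShard() is a hypothetical helper, the split is round-robin by position rather than balanced by duration, and how each runner passes its share to PHPUnit is left open.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the test files that one runner should execute.
 *
 * @param list<string> $testFiles
 * @return list<string>
 */
function filesForShard(array $testFiles, int $shardIndex, int $shardCount): array
{
    if ($shardCount < 1 || $shardIndex < 0 || $shardIndex >= $shardCount) {
        throw new InvalidArgumentException('Shard index must be between 0 and shard count - 1.');
    }

    // Sort first so that every runner computes the same split,
    // regardless of the order in which the files were discovered.
    sort($testFiles);

    $selected = [];

    foreach ($testFiles as $position => $file) {
        if ($position % $shardCount === $shardIndex) {
            $selected[] = $file;
        }
    }

    return $selected;
}
```

Once you have timing data from your measurements, replacing the round-robin split with one that balances total duration per shard is the obvious next step.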
Test design beats infrastructure
The pattern that emerges, almost without exception, is that test design beats infrastructure. A team that goes straight to parallelisation without doing the design work ends up with a suite that is faster but no more trustworthy. A team that does the design work first finds that parallelisation, when they get to it, is a polish step rather than the main event.
Some tests should be slow
Not every slow test is a problem. End-to-end tests, real-browser tests, and real-database integration tests are slow for good reasons. The fix is not to make them fast. The fix is to run them at the right cadence: on merge to the main branch, nightly, before a release. But not on every save. Fast feedback for the things you change often. Thorough coverage for the things you ship.
Three things to take away
If you remember nothing else from this article or the presentation, remember three things. Measure first, because intuition lies. Test design beats infrastructure, every time. And fast tests are a feature, not a coincidence. Treat them like one.