Who has not encountered this situation: an old PHP project with outdated documentation, classes without tests, global variables, magic methods and confusing dependencies. No one knows exactly how the code actually behaves. And when a PHP upgrade is due, the fear becomes palpable. How can we make safe changes in this situation?
In a previous article, I wrote about how test oracles know the truth and can tell us whether our tests are successful. But there is a small problem: what do we do when the oracle does not speak to us? When it has no idea what the system is actually supposed to do? Then we need the ability to document the past in order to find the way to the future. And that is exactly what this article is about.
The central thesis is simple but powerful: we must introduce tests into our code in order to be able to change it. However, in order to introduce tests, the code often has to be changed first. This is the classic dilemma of modernising legacy code. And it can be solved.
Why we change software
Before we look at solutions, it is important to understand why the code needs to be changed:
- Add features
- Modify or remove existing features
- Fix bugs
- Improve design (refactoring)
- Optimise performance
- Adapt to changes in the technology stack (e.g. PHP upgrade)
Each of these changes carries the risk of undesirable side effects. Without testing, this risk is virtually impossible to control.
Characterisation tests
The solution is called characterisation testing. This approach documents the actual behaviour of the code without judging whether this behaviour is "correct". This is the pragmatic way to achieve test coverage in legacy code in the first place:
- Select the part of the code you want to test
- Write a test with an assertion that you know will fail
- Run the test: the error message will show you the actual behaviour
- Change the assertion so that it contains the observed value
- Repeat this with different inputs until you have sufficient coverage
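The loop above can be sketched in a few lines of plain PHP. The legacy function here is hypothetical, standing in for any untested production code; in a real project the assertions would live in a PHPUnit test, but the principle is the same:

```php
<?php
// Hypothetical legacy function standing in for untested production code.
// Its formatting behaviour is undocumented -- we characterise it instead of judging it.
function legacyPriceWithTax(float $net): string
{
    // Quirk: the original author formats the amount German-style, with a comma.
    return number_format($net * 1.19, 2, ',', '.');
}

// Step 2: start with an assertion we know will fail, e.g. expecting '0'.
// Step 3: the failure message reveals the actual value: '118,99'.
// Step 4: pin the observed value in the assertion.
assert(legacyPriceWithTax(99.99) === '118,99');

// Step 5: repeat with further inputs until coverage is sufficient.
assert(legacyPriceWithTax(0.0) === '0,00');
assert(legacyPriceWithTax(1000.0) === '1.190,00');
```

Note that the test pins the comma-formatted string as-is: whether that formatting is a feature or a bug is a question for later, once the safety net exists.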
This method is particularly valuable because it:
- Documents the actual behaviour of the code through testing
- Creates a safety net for future changes
- Exposes bugs and quirks in the code, which can then be deliberately corrected later
- Practically "interviews" the code – we learn about its behaviour without having to interpret it
A practical example is a simple API that processes GET and POST requests. With the help of characterisation tests, we can first capture its behaviour: we send a GET request to the application, which has previously been brought into a clearly defined state, and then check the response, both the HTTP headers and the JSON body.
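Such a test might look like the following sketch. The `handleRequest` function and the `/users/42` endpoint are assumptions standing in for the legacy front controller; in the real project this would be an actual HTTP round trip against a seeded database:

```php
<?php
// Hypothetical stand-in for the legacy API front controller.
function handleRequest(string $method, string $uri): array
{
    if ($method === 'GET' && $uri === '/users/42') {
        return [
            'status'  => 200,
            'headers' => ['Content-Type' => 'application/json'],
            'body'    => json_encode(['id' => 42, 'name' => 'Alice']),
        ];
    }
    return ['status' => 404, 'headers' => [], 'body' => '{}'];
}

// Bring the system into a defined state, send the request, pin the response.
$response = handleRequest('GET', '/users/42');

assert($response['status'] === 200);
assert($response['headers']['Content-Type'] === 'application/json');

// Characterise the JSON body field by field, not as an opaque string.
$body = json_decode($response['body'], true);
assert($body['id'] === 42);
assert($body['name'] === 'Alice');
```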
We can go one step further and partially automate the creation of our characterisation tests by implementing a request-recorder middleware that logs all HTTP interactions. This middleware is loaded when the application starts and records, for each request, the URI, HTTP method, parameters and payload, along with the complete response including body and headers. All of this data is stored, for example in one JSON file per request/response cycle.
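A minimal sketch of such a recorder, assuming a simple callable-based pipeline (a PSR-15 `MiddlewareInterface` implementation would follow the same idea, but with request and response objects instead of arrays):

```php
<?php
// Sketch: wraps an application handler and writes one JSON file per
// request/response cycle. All names here are assumptions for illustration.
function recorderMiddleware(callable $handler, string $logDir): callable
{
    return function (array $request) use ($handler, $logDir): array {
        $response = $handler($request);

        $record = [
            'request' => [
                'method'  => $request['method'],
                'uri'     => $request['uri'],
                'params'  => $request['params'] ?? [],
                'payload' => $request['payload'] ?? null,
            ],
            'response' => $response, // status, headers and body
        ];
        $file = $logDir . '/' . uniqid('rec_', true) . '.json';
        file_put_contents($file, json_encode($record, JSON_PRETTY_PRINT));

        return $response;
    };
}

// Usage: wrap the real application handler once, at bootstrap time.
$app = fn (array $r): array => ['status' => 200, 'headers' => [], 'body' => 'ok'];
$recorded = recorderMiddleware($app, sys_get_temp_dir());
$recorded(['method' => 'GET', 'uri' => '/health']);
```

Because the middleware only observes and passes the response through unchanged, it can run in a staging environment without affecting behaviour.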
These recorded request/response data pairs can then be used in tests via data providers. This is an elegant method for quickly achieving comprehensive test coverage based on real interactions with the software. The tests run through all recorded requests, resend them to the application and check whether the new response matches the recorded one. In this way, hundreds of real-world scenarios can be tested automatically with minimal effort. In practice, of course, this is not quite so simple, as side effects and the initial state of the application must be taken into account.
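The replay side can be sketched like this. In PHPUnit, `loadRecordings` would back a data provider feeding one test method per recorded pair; the file naming and the comparison of only status and body are simplifying assumptions:

```php
<?php
// Load all recorded request/response pairs from a directory of JSON files.
function loadRecordings(string $dir): array
{
    $pairs = [];
    foreach (glob($dir . '/rec_*.json') as $file) {
        $pairs[] = json_decode(file_get_contents($file), true);
    }
    return $pairs;
}

// Resend a recorded request and check the current response against the recording.
function replay(callable $app, array $recording): bool
{
    $actual = $app($recording['request']);
    // Compare status and body; headers often need normalisation in practice
    // (dates, request IDs and similar volatile values must be filtered out).
    return $actual['status'] === $recording['response']['status']
        && $actual['body']   === $recording['response']['body'];
}

// Demo with an inline recording instead of files:
$recording = [
    'request'  => ['method' => 'GET', 'uri' => '/health'],
    'response' => ['status' => 200, 'body' => 'ok'],
];
$app = fn (array $r): array => ['status' => 200, 'body' => 'ok'];
assert(replay($app, $recording) === true);
```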
The path to the future
The key to successfully modernising legacy code is an iterative process. Characterisation tests provide a vital safety net for the existing code, enabling it to be systematically refactored into smaller, testable units. Once this safety net has been established, the characterisation tests can be retained as acceptance tests and supplemented gradually with specific, meaningful unit tests.
This is a well-planned, step-by-step path. An aviation metaphor illustrates the difference:
- Tests before the flight (TDD): Planned, safe landing – optimal condition
- Tests after take-off: Emergency landing – pragmatic, but slightly hectic
- No tests: Crash landing – high risk
Legacy code is not your enemy. It is valuable and often holds business-critical functions together. With the right tools and strategies, modernisation becomes a manageable task.