Seeing the Truth: Test Oracles


Snakes and Cookies

For thousands of years, oracles have exerted a magical attraction on the human imagination. In ancient Greece, kings, generals and ordinary citizens made pilgrimages to Delphi to ask the famous Pythia, the priestess of the Oracle of Delphi, their most pressing questions. The Pythia, also known as the “serpent priestess”, sat above a mystical crevice in the earth and uttered cryptic prophecies that would determine the fate of entire empires. Kings based their war strategies on her words, merchants planned their ventures, and no one dared make an important decision without first consulting the oracle.

More than 2,000 years later, we encounter a new oracle. Not in an ancient temple, but in a dilapidated flat in the heart of a simulated reality. In “The Matrix”, the Oracle is an old woman who bakes cookies and possesses a surprising amount of wisdom. She does not sit atop a crevice in the earth, but at the kitchen table, offering cookies and prophesying to Neo that he will not be “The One” – in which she is, of course, mistaken (or not, depending on how we want to see it). The old woman in the film is no less mysterious than her ancient counterpart, but instead of being cryptic, she is comforting; instead of being inaccessible, she is motherly; and instead of relying on spiritual powers, she relies on her impressive ability to see through human emotions.

These two oracles have something fascinating in common. They need to know what will happen next. They need to foresee what is right and what is wrong. In short, they need to know the truth, even if it is complex, ambiguous, or even paradoxical.

Test Oracles

And that is exactly what this article is about. In software development, there is a modern equivalent to the Oracle of Delphi and the Oracle from “The Matrix”: the test oracle. This is not a mystical being or a wise programme, but a concept we use to address one of the most difficult questions in quality assurance: “Is the result of this test correct or not?”

The term “test oracle” was coined by William E. Howden in his 1978 paper “Theoretical and empirical studies of program testing”:

In order to use testing to validate a program it is necessary to assume the existence of a test oracle which can be used to check the correctness of test output. The most common kind of test oracle is one which can be used to check the correctness of output values for a given set of input values.

At its core, a test oracle is just like its mythological counterpart: a procedure or source that knows the “truth”. It is the authority that decides whether the test has been passed or not. And like the ancient priestess of Delphi or the wise woman in “The Matrix”, it sometimes has to find creative, indirect, or surprisingly simple ways to reveal this truth. Especially when it is anything but obvious.
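To make this concrete, here is a minimal sketch in Python of an oracle in Howden's sense: a procedure that, given input values and output values, decides whether the output is correct. The names `sorting_oracle` and `check` are hypothetical and exist only for this illustration. Note that this oracle does not need a precomputed expected value; it derives the verdict from properties the output must have.

```python
from collections import Counter
from typing import Callable, Sequence


def sorting_oracle(given: Sequence[int], actual: Sequence[int]) -> bool:
    """Decide whether `actual` is a correctly sorted version of `given`."""
    # Property 1: every element is less than or equal to its successor.
    is_ordered = all(a <= b for a, b in zip(actual, actual[1:]))
    # Property 2: nothing was lost, added, or changed along the way.
    same_elements = Counter(given) == Counter(actual)
    return is_ordered and same_elements


def check(program: Callable[[Sequence[int]], Sequence[int]],
          given: Sequence[int]) -> bool:
    """Run the program under test and let the oracle judge its output."""
    return sorting_oracle(given, program(given))
```

For example, `check(sorted, [3, 1, 2])` passes, while a "program" that returns its input unchanged fails for the same input.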

Unit Tests

In unit tests, test oracles cover direct outputs (return values), local side effects (state changes), and indirect outputs (messages sent to collaborators). Good unit tests make explicit which of these is being asserted.

Return values

For functions and methods without side effects, the oracle compares the actual return value to an expected one.
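A minimal sketch in Python of such a return-value oracle. The function `net_to_gross` is a hypothetical example of a side-effect-free function; the test is written as a plain function that a runner such as pytest would collect.

```python
def net_to_gross(net_cents: int, vat_percent: int) -> int:
    """Hypothetical pure function: add VAT to a net amount given in cents."""
    return net_cents + (net_cents * vat_percent) // 100


def test_gross_price_includes_vat() -> None:
    # The oracle: compare the actual return value with the expected one.
    assert net_to_gross(10_000, 19) == 11_900
```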

Local side effects

Many objects modify their own internal state or the state of objects passed as arguments. Here, the oracle checks that the state after execution matches expectations.
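A minimal sketch in Python of a state-based oracle. `ShoppingCart` is a hypothetical class invented for this illustration; the point is that `add()` returns nothing, so the only way to judge correctness is to inspect the object's state afterwards.

```python
class ShoppingCart:
    """Hypothetical object whose methods change internal state."""

    def __init__(self) -> None:
        self._items: dict[str, int] = {}

    def add(self, sku: str, quantity: int = 1) -> None:
        self._items[sku] = self._items.get(sku, 0) + quantity

    def quantity_of(self, sku: str) -> int:
        return self._items.get(sku, 0)

    def is_empty(self) -> bool:
        return not self._items


def test_adding_the_same_item_twice_accumulates_quantity() -> None:
    cart = ShoppingCart()

    cart.add("book")
    cart.add("book", 2)

    # The oracle: the state after execution matches expectations.
    assert cart.quantity_of("book") == 3
    assert not cart.is_empty()
```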

Indirect outputs

When an object collaborates with other objects (services, repositories, event buses, loggers, etc.), the primary observable behavior may be how it talks to its collaborators rather than what it returns. In this case, the oracle is about communication: did the object under test send the right messages (method calls, arguments) to its collaborators? This is where mock objects come into play.

The distinction between using a test stub (to control inputs from dependencies) and a mock object (to verify outputs to dependencies) is crucial for clarity. When you care about interactions, your mock expectations are the oracle for that unit's correctness with respect to its collaborators.
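The following sketch in Python, using the standard library's `unittest.mock`, illustrates the two roles side by side. `RegistrationService` and the method names on its collaborators (`exists`, `add`, `send_welcome`) are assumptions made up for this example. The `users` double is used as a stub to control what the object under test sees; the `mailer` double is used as a mock, and the expectation on it is the oracle.

```python
from unittest.mock import Mock


class RegistrationService:
    """Hypothetical object under test with two collaborators."""

    def __init__(self, users, mailer) -> None:
        self._users = users
        self._mailer = mailer

    def register(self, email: str) -> bool:
        if self._users.exists(email):
            return False
        self._users.add(email)
        self._mailer.send_welcome(email)
        return True


def test_welcome_mail_is_sent_to_new_user() -> None:
    users = Mock()
    users.exists.return_value = False  # stub: controls an indirect input
    mailer = Mock()                    # mock: verifies an indirect output

    service = RegistrationService(users, mailer)

    assert service.register("jane@example.com") is True
    # The oracle: the right message was sent to the collaborator.
    mailer.send_welcome.assert_called_once_with("jane@example.com")


def test_no_welcome_mail_for_existing_user() -> None:
    users = Mock()
    users.exists.return_value = True
    mailer = Mock()

    service = RegistrationService(users, mailer)

    assert service.register("jane@example.com") is False
    mailer.send_welcome.assert_not_called()
```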

Integration Tests

Integration test oracles validate that multiple real components cooperate correctly and that their technical side effects (database writes, messages, files, network calls, etc.) are exactly what the system promises at that integration boundary.

Combined outcome

When several components are wired together (e.g., a service, its repository, and an external API client), the oracle often checks the combined outcome. For example, it checks whether a service returns a DTO that reflects data loaded from a real database table or whether a query handler returns the right projection after a command handler has emitted events.
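A minimal sketch in Python of the first case. All class, table, and column names are hypothetical, and an in-memory SQLite database stands in for whatever real database the system uses. The repository and the service are both real objects here, so the oracle judges their combined outcome rather than either one alone.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerDto:
    id: int
    display_name: str


class CustomerRepository:
    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def find(self, customer_id: int):
        return self._connection.execute(
            "SELECT first_name, last_name FROM customer WHERE id = ?",
            (customer_id,),
        ).fetchone()


class CustomerService:
    def __init__(self, repository: CustomerRepository) -> None:
        self._repository = repository

    def customer(self, customer_id: int) -> CustomerDto:
        row = self._repository.find(customer_id)
        if row is None:
            raise LookupError(f"no customer with id {customer_id}")
        first_name, last_name = row
        return CustomerDto(customer_id, f"{first_name} {last_name}")


def test_service_returns_dto_built_from_database_row() -> None:
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE customer "
        "(id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    connection.execute("INSERT INTO customer VALUES (1, 'Ada', 'Lovelace')")

    service = CustomerService(CustomerRepository(connection))

    # The oracle: the DTO reflects the data loaded from the table.
    assert service.customer(1) == CustomerDto(1, "Ada Lovelace")
    connection.close()
```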

Infrastructure side effects

Integration tests frequently use side effects as their primary oracle:

Database: After executing a command, assert that the expected rows exist with correct values in the database.
File system: Verify that a file was created, updated, or deleted, and that its contents match expectations.
Message queues / event buses: Check that a specific message was sent or an event was emitted.

Here, side effects are not an implementation detail. They are the observable behavior the test cares about. The oracle is expressed as queries against those external systems.
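A minimal sketch in Python of oracles expressed as queries against external systems. The `place_order` function, the `orders` table, and the export file format are hypothetical; an in-memory SQLite database and a temporary directory stand in for the real infrastructure.

```python
import sqlite3
import tempfile
from pathlib import Path


def place_order(connection: sqlite3.Connection, export_dir: Path,
                order_id: int, total_cents: int) -> None:
    """Hypothetical command: persist the order and write an export file."""
    connection.execute(
        "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
        (order_id, total_cents),
    )
    connection.commit()
    export_file = export_dir / f"order-{order_id}.txt"
    export_file.write_text(f"{order_id};{total_cents}\n", encoding="utf-8")


def test_placing_an_order_writes_row_and_export_file() -> None:
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER)"
    )

    with tempfile.TemporaryDirectory() as directory:
        export_dir = Path(directory)

        place_order(connection, export_dir, 42, 11_900)

        # Database oracle: the expected row exists with the correct values.
        rows = connection.execute(
            "SELECT id, total_cents FROM orders"
        ).fetchall()
        assert rows == [(42, 11_900)]

        # File system oracle: the file exists and its contents match.
        export_file = export_dir / "order-42.txt"
        assert export_file.exists()
        assert export_file.read_text(encoding="utf-8") == "42;11900\n"

    connection.close()
```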

Interaction verification at higher level

While unit tests tend to use mock objects to verify fine-grained communication between collaborating objects, integration tests may still verify interactions at a coarser level: for example, that a service made an HTTP call to a test double of a third-party API with specific parameters.

Conceptually, this is similar to what I described in “Testing with(out) dependencies” for unit tests, just at a higher architectural boundary: using a stubbed or mocked gateway to control inputs and verify outputs across service boundaries.
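A minimal sketch in Python of such a gateway double. `RecordingHttpGateway`, `PaymentService`, the URL, and the payload fields are all hypothetical. The double returns a canned response (controlling the input) and records each request (so the test can verify the output that crossed the service boundary).

```python
from dataclasses import dataclass, field


@dataclass
class RecordingHttpGateway:
    """Hand-written test double standing in for a third-party HTTP API."""

    canned_response: dict
    requests: list = field(default_factory=list)

    def post(self, url: str, payload: dict) -> dict:
        self.requests.append((url, payload))
        return self.canned_response


class PaymentService:
    """Hypothetical service that talks to a payment provider."""

    def __init__(self, gateway, base_url: str) -> None:
        self._gateway = gateway
        self._base_url = base_url

    def charge(self, order_id: str, amount_cents: int) -> bool:
        response = self._gateway.post(
            f"{self._base_url}/charges",
            {"reference": order_id, "amount": amount_cents},
        )
        return response.get("status") == "succeeded"


def test_charge_calls_payment_api_with_expected_parameters() -> None:
    gateway = RecordingHttpGateway(canned_response={"status": "succeeded"})
    service = PaymentService(gateway, "https://payments.example")

    assert service.charge("order-42", 11_900) is True

    # The oracle: exactly one call, to the right URL, with the right payload.
    assert gateway.requests == [
        (
            "https://payments.example/charges",
            {"reference": "order-42", "amount": 11_900},
        )
    ]
```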

End-to-End Tests

End-to-End tests treat the system as a black box and validate its observable, end-to-end behavior. Side effects across user interfaces, databases, queues, and external integrations are central to the oracle: if any of these expected outcomes is missing or incorrect, the workflow is considered a failure.

User-visible outcomes

These oracles check what the user sees or receives:

After a checkout flow in a browser, the page shows “Order placed” and displays the correct order summary
A generated invoice PDF contains the right customer, line items, totals, and tax calculation
The API response body and status code match the contract for a complex workflow
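For the last of these, the oracle can be sketched in Python as a function that inspects a status code and a response body. The contract used here (status 201, fields `orderId`, `status`, and `totalCents`) is purely an assumption for illustration; a real end-to-end test would obtain the status code and body by calling the running system.

```python
import json


def check_order_response(status_code: int, body: str) -> list[str]:
    """Return every contract violation; an empty list means the oracle passes."""
    problems = []
    if status_code != 201:
        problems.append(f"expected status 201, got {status_code}")

    try:
        document = json.loads(body)
    except json.JSONDecodeError:
        document = None
    if not isinstance(document, dict):
        return problems + ["body is not a JSON object"]

    # Hypothetical contract: field names and types are assumptions.
    expected_fields = (("orderId", str), ("status", str), ("totalCents", int))
    for key, expected_type in expected_fields:
        if not isinstance(document.get(key), expected_type):
            problems.append(
                f"field {key!r} missing or not {expected_type.__name__}"
            )

    if document.get("status") != "placed":
        problems.append("order status is not 'placed'")
    return problems
```

Returning a list of problems instead of failing on the first one makes the failure message of a slow end-to-end test more useful: a single run reports everything that is wrong with the response.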

Business state across bounded contexts

End-to-End tests often confirm that a business action has propagated correctly through the system:

A newly registered user appears in the customer database, has an activated account, and is assigned default roles
Placing an order creates records in the order context, reserves stock in the inventory context, and records a payment transaction

Here, side effects are spread across multiple subsystems. The oracle is a collection of assertions over several data sources that together embody “business correctness”.
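A minimal sketch in Python of such a composite oracle for the order example. Table and column names are hypothetical, and three in-memory SQLite databases stand in for the data stores of the order, inventory, and payment contexts. In a real end-to-end test, the workflow would be driven through the user interface or the public API, and the connections would point at the deployed system's data stores.

```python
import sqlite3


def order_problems(orders, inventory, payments,
                   order_id: str, sku: str, quantity: int) -> list[str]:
    """Return every violated expectation; an empty list means success."""
    problems = []

    found = orders.execute(
        "SELECT 1 FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if found is None:
        problems.append("order record missing")

    reserved = inventory.execute(
        "SELECT COALESCE(SUM(quantity), 0) FROM reservation "
        "WHERE order_id = ? AND sku = ?",
        (order_id, sku),
    ).fetchone()[0]
    if reserved != quantity:
        problems.append(f"expected {quantity} reserved, found {reserved}")

    paid = payments.execute(
        "SELECT 1 FROM payment WHERE order_id = ?", (order_id,)
    ).fetchone()
    if paid is None:
        problems.append("payment transaction missing")

    return problems


def database(schema: str, *inserts: str) -> sqlite3.Connection:
    """Create an in-memory stand-in for one context's data store."""
    connection = sqlite3.connect(":memory:")
    connection.execute(schema)
    for statement in inserts:
        connection.execute(statement)
    return connection


def test_placed_order_is_visible_in_all_contexts() -> None:
    orders = database(
        "CREATE TABLE orders (id TEXT PRIMARY KEY)",
        "INSERT INTO orders VALUES ('o-1')",
    )
    inventory = database(
        "CREATE TABLE reservation (order_id TEXT, sku TEXT, quantity INTEGER)",
        "INSERT INTO reservation VALUES ('o-1', 'book', 2)",
    )
    payments = database(
        "CREATE TABLE payment (order_id TEXT, amount_cents INTEGER)",
        "INSERT INTO payment VALUES ('o-1', 2000)",
    )

    assert order_problems(orders, inventory, payments, "o-1", "book", 2) == []
```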

External side effects and integrations

Finally, end-to-end test oracles frequently include real-world side effects that matter to users or external partners:

An order confirmation email or SMS was actually sent (and has the correct content)
An audit log entry was written for a critical action
A webhook was triggered to a partner system with the correct payload

Conclusion

Understanding test oracles at all three levels (unit tests, integration tests, and end-to-end tests) is essential for writing effective automated tests. Each level answers a different question about correctness:

Unit tests verify the behaviour of individual objects and the collaboration of multiple objects
Integration tests confirm that multiple components cooperate and produce the right side effects
End-to-End tests validate that complete workflows deliver real business value

In practice, a comprehensive test suite uses all three, each with its own oracle strategy tailored to what matters at that level. Whether you're asserting a return value, verifying an object collaboration, checking database state, or confirming that an order confirmation email was sent, you're applying the same fundamental principle: use an oracle to make the expected behavior explicit, observable, and testable.

By being deliberate about which oracles you choose and what they measure, you build confidence that your system works. Not just in isolation, but end-to-end, in the ways that actually matter to your users.


About the Author

Sebastian Bergmann is the creator of PHPUnit and an internationally recognised expert in software quality and testing.