When to Mock

The use of mocks in unit testing is a controversial topic (maybe less so now than several years ago). I remember how, throughout my programming career, I went from mocking almost every dependency, to the "no-mocks" policy, and then to "only mock external dependencies".

None of these practices is good enough. In this article, I’ll show you which dependencies to mock, and which to use as is in your tests.

What is a mock?

Before jumping to the topic of when to mock, let’s discuss what a mock is.

Mock vs test double

People often use the terms test double and mock as synonyms, but technically, they are not:

  • A test double is an overarching term that describes all kinds of non-production-ready, fake dependencies in tests. Such a dependency looks and behaves like its release-intended counterpart but is actually a simplified version that reduces complexity and facilitates testing.

    This term was introduced by Gerard Meszaros in his book xUnit Test Patterns: Refactoring Test Code. The name itself comes from the notion of a stunt double in movies.

  • A mock is just one kind of such a dependency.

According to Gerard Meszaros, there are five types of test doubles:

  • Dummy

  • Stub

  • Spy

  • Mock

  • Fake

Such a variety may look intimidating, but in reality, they can all be grouped together into just two types: mocks and stubs.

Figure: All variations of test doubles can be categorized into two types: mocks and stubs

The difference between these two types boils down to the following:

  • Mocks help to emulate and examine outcoming interactions. These interactions are calls the system under test (SUT) makes to its dependencies to change their state.

  • Stubs help to emulate incoming interactions. These interactions are calls the SUT makes to its dependencies to get input data.

For example, sending an email is an outcoming interaction: that interaction results in a side effect in the SMTP server. A test double emulating such an interaction is a mock.

Retrieving data from the database is an incoming interaction — it doesn’t result in a side effect. The corresponding test double is a stub.

Figure: Mocks are for outcoming interactions; stubs are for incoming ones

All other differences between the five types of test doubles are insignificant implementation details:

  • Spies serve the same role as mocks. The distinction is that spies are written manually, whereas mocks are created with the help of a mocking framework. Sometimes people refer to spies as handwritten mocks.

On the other hand, the difference between stubs, dummies, and fakes is in how intelligent they are:

  • A dummy is a simple, hard-coded value such as a null value or a made-up string. It’s used to satisfy the SUT’s method signature and doesn’t participate in producing the final outcome.

  • A stub is more sophisticated. It’s a fully fledged dependency that you configure to return different values for different scenarios.

  • A fake is the same as a stub for most purposes. The difference is in the rationale for its creation: a fake is usually implemented to replace a dependency that doesn’t yet exist.

Notice the difference between mocks and stubs (aside from outcoming versus incoming interactions). Mocks help to emulate and examine interactions between the SUT and its dependencies, while stubs only help to emulate those interactions. This is an important distinction. You will see why shortly.
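To make the taxonomy above more tangible, here is a minimal sketch of a handwritten spy and a handwritten stub. It uses no mocking library and no test framework so that it runs on its own. The Controller class, its CountUsers() method, and the two test-double classes are my assumptions for illustration; only the interface and method names are borrowed from the examples in this article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interfaces modeled on the ones this article uses.
public interface IEmailGateway
{
    void SendGreetingsEmail(string email); // outcoming interaction
}

public interface IDatabase
{
    int GetNumberOfUsers(); // incoming interaction
}

// Spy (a handwritten mock): records calls so that a test can examine them.
public sealed class EmailGatewaySpy : IEmailGateway
{
    private readonly List<string> _recipients = new List<string>();

    public IReadOnlyList<string> Recipients => _recipients;

    public void SendGreetingsEmail(string email) => _recipients.Add(email);
}

// Stub: returns a canned answer and records nothing.
public sealed class DatabaseStub : IDatabase
{
    private readonly int _numberOfUsers;

    public DatabaseStub(int numberOfUsers) => _numberOfUsers = numberOfUsers;

    public int GetNumberOfUsers() => _numberOfUsers;
}

// Hypothetical SUT that issues one command and one query.
public sealed class Controller
{
    private readonly IEmailGateway _gateway;
    private readonly IDatabase _database;

    public Controller(IEmailGateway gateway, IDatabase database)
    {
        _gateway = gateway;
        _database = database;
    }

    public void GreetUser(string email) => _gateway.SendGreetingsEmail(email);

    public int CountUsers() => _database.GetNumberOfUsers();
}

public static class HandwrittenDoublesExample
{
    public static void Run()
    {
        var spy = new EmailGatewaySpy();
        var stub = new DatabaseStub(10);
        var sut = new Controller(spy, stub);

        sut.GreetUser("user@example.com");
        int users = sut.CountUsers();

        // Examine the outcoming interaction recorded by the spy
        if (spy.Recipients.Count != 1 || spy.Recipients[0] != "user@example.com")
            throw new InvalidOperationException("Expected exactly one greeting email");

        // Check only the value derived from the stub, never the call itself
        if (users != 10)
            throw new InvalidOperationException("Expected 10 users");
    }
}
```

The spy exposes what it recorded, which is what lets a test examine the interaction. The stub exposes nothing to examine; it only feeds data into the SUT.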

Mock-the-tool vs. mock-the-test-double

The term mock is overloaded and can mean different things in different circumstances. I mentioned already that people often use this term to mean any test double, whereas mocks are only a subset of test doubles.

But there’s another meaning for the term mock. You can refer to the classes from mocking libraries as mocks, too. These classes help you create actual mocks, but they themselves are not mocks per se:

[Fact]
public void Sending_a_greetings_email()
{
    // Using a mock-the-tool to create a mock-the-test-double
    var mock = new Mock<IEmailGateway>();
    var sut = new Controller(mock.Object);

    sut.GreetUser("user@email.com");

    // Examining the call from the SUT to the test double
    mock.Verify(
        x => x.SendGreetingsEmail("user@email.com"),
        Times.Once);
}

This test uses the Mock class from the Moq mocking library. This class is a tool that enables you to create a test double — a mock. In other words, the class Mock (or Mock<IEmailGateway>) is a mock-the-tool, while the instance of that class, mock, is a mock-the-test-double.

It’s important not to conflate a mock-the-tool with a mock-the-test-double because you can use a mock-the-tool to create both types of test doubles: mocks and stubs.

Here’s another example of a test that uses the Mock class. The instance of that class is a stub, not a mock:

[Fact]
public void Creating_a_report()
{
    // Using a mock-the-tool to create a stub
    var stub = new Mock<IDatabase>();
    // Setting up a canned answer
    stub.Setup(x => x.GetNumberOfUsers()).Returns(10);
    var sut = new Controller(stub.Object);

    Report report = sut.CreateReport();

    Assert.Equal(10, report.NumberOfUsers);
}

This test double emulates an incoming interaction — a call that provides the SUT with input data.

On the other hand, in the previous example, the call to SendGreetingsEmail() is an outcoming interaction. Its sole purpose is to incur a side effect — send an email.

Don’t assert interactions with stubs

As I mentioned above, mocks help to emulate and examine outcoming interactions between the SUT and its dependencies, while stubs only help to emulate incoming interactions, not examine them.

The difference between the two stems from this guideline: you should never assert interactions with stubs. A call from the SUT to a stub is not part of the end result the SUT produces. Such a call is only a means to produce the end result; it’s an implementation detail. Asserting interactions with stubs is a common anti-pattern that leads to brittle tests.

The only way to avoid test brittleness is to make those tests verify the end result (which, ideally, should be meaningful to a non-programmer), not implementation details.

In the above examples, the check

mock.Verify(x => x.SendGreetingsEmail("user@email.com"))

corresponds to an actual outcome, and that outcome is meaningful to a domain expert: sending a greetings email is something business people would want the system to do.

At the same time, the call to GetNumberOfUsers() is not an outcome at all. It’s an internal implementation detail regarding how the SUT gathers data necessary for the report creation. Therefore, asserting this call would lead to test fragility. It doesn’t matter how the SUT generates the end result, as long as that result is correct.

Here’s an example of such a fragile test:

[Fact]
public void Creating_a_report()
{
    var stub = new Mock<IDatabase>();
    stub.Setup(x => x.GetNumberOfUsers()).Returns(10);
    var sut = new Controller(stub.Object);

    Report report = sut.CreateReport();

    Assert.Equal(10, report.NumberOfUsers);
    // Asserting an interaction with a stub
    stub.Verify(
        x => x.GetNumberOfUsers(),
        Times.Once);
}

This practice of verifying things that aren’t part of the end result is also called overspecification. Most commonly, overspecification takes place when examining interactions. Checking for interactions with stubs is a flaw that’s quite easy to spot because tests shouldn’t check for any interactions with stubs.

Mocks are a more complicated subject: not all uses of mocks lead to test fragility, but a lot of them do. You’ll see why shortly.

Using mocks and stubs together

Sometimes you need to create a test double that exhibits the properties of both a mock and a stub:

[Fact]
public void Purchase_fails_when_not_enough_inventory()
{
    var storeMock = new Mock<IStore>();
    // Setting up a canned answer
    storeMock
        .Setup(x => x.HasEnoughInventory(Product.Shampoo, 5))
        .Returns(false);
    var sut = new Customer();

    bool success = sut.Purchase(storeMock.Object, Product.Shampoo, 5);

    Assert.False(success);
    // Examining a call from the SUT to the mock
    storeMock.Verify(
        x => x.RemoveInventory(Product.Shampoo, 5),
        Times.Never);
}

This test uses storeMock for two purposes: it returns a canned answer and verifies a method call made by the SUT.

Notice, though, that these are two different methods: the test sets up the answer from HasEnoughInventory() but then verifies the call to RemoveInventory(). Thus, the rule of not asserting interactions with stubs is not violated here.

When a test double is both a mock and a stub, it’s still called a mock, not a stub. That’s mostly because you need to pick one name, but also because being a mock is a more important fact than being a stub.

Mocks vs. stubs and commands vs. queries

The notion of mocks and stubs ties to the command query separation (CQS) principle. The CQS principle states that every method should be either a command or a query, but not both:

  • Commands are methods that produce side effects and don’t return any value (return void). Examples of side effects include mutating an object’s state, changing a file in the file system, and so on.

  • Queries are the opposite of that — they are side-effect free and return a value.

In other words, asking a question should not change the answer. Code that maintains such a clear separation becomes easier to read.
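As a small illustration of the principle, here is a sketch of a class whose methods are each either a command or a query. The Inventory class and its members are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class that keeps commands and queries separate.
public sealed class Inventory
{
    private readonly Dictionary<string, int> _stock = new Dictionary<string, int>();

    // Command: changes state and returns nothing.
    public void Add(string product, int quantity)
    {
        _stock.TryGetValue(product, out int current); // current is 0 when the product is missing
        _stock[product] = current + quantity;
    }

    // Command: changes state and returns nothing.
    public void Remove(string product, int quantity)
    {
        if (!HasEnough(product, quantity))
            throw new InvalidOperationException("Not enough inventory");

        _stock[product] -= quantity;
    }

    // Query: returns a value and leaves the state untouched.
    public bool HasEnough(string product, int quantity)
    {
        return _stock.TryGetValue(product, out int current) && current >= quantity;
    }
}

public static class CqsExample
{
    public static void Run()
    {
        var inventory = new Inventory();
        inventory.Add("shampoo", 10); // command

        // Asking the same question twice gives the same answer
        bool first = inventory.HasEnough("shampoo", 5);
        bool second = inventory.HasEnough("shampoo", 5);
        if (!first || first != second)
            throw new InvalidOperationException("A query must not change the answer");

        inventory.Remove("shampoo", 7); // command: three items are left
        if (inventory.HasEnough("shampoo", 5))
            throw new InvalidOperationException("Only three items should remain");
    }
}
```

Only the commands, Add() and Remove(), change what later queries return. Calling HasEnough() any number of times leaves the inventory as it was.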

Test doubles that substitute commands become mocks. Similarly, test doubles that substitute queries are stubs:

Figure: Commands correspond to mocks; queries correspond to stubs

Look at the two tests from the previous examples again (I’m showing their relevant parts here):

var mock = new Mock<IEmailGateway>();
mock.Verify(x => x.SendGreetingsEmail("user@email.com"));

var stub = new Mock<IDatabase>();
stub.Setup(x => x.GetNumberOfUsers()).Returns(10);

SendGreetingsEmail() is a command whose side effect is sending an email. The test double that substitutes this command is a mock.

On the other hand, GetNumberOfUsers() is a query that returns a value and doesn’t mutate the database state. The corresponding test double is a stub.

When to mock

With all these definitions out of the way, let’s talk about when you should use mocks.

You obviously don’t want to mock the system under test (SUT) itself, so the question of "When to mock?" boils down to this: "Which types of dependencies should you replace with a mock, and which should you use as is in tests?"

Here are all types of unit testing dependencies I listed in the previous article:

Figure: Types of unit testing dependencies

To reiterate:

  • A shared dependency is a dependency that is shared between tests and provides means for those tests to affect each other’s outcome.

  • A private dependency is any dependency that is not shared.

A shared dependency corresponds to a mutable out-of-process dependency in the vast majority of cases; that’s why I’m using these two notions as synonyms here. (Check out my previous post for more details: Unit Testing Dependencies: The Complete Guide.)

There are two schools of unit testing with their own views on which types of dependencies to replace with mocks:

  • The London school (also known as the mockist school) advocates for replacing all mutable dependencies (collaborators) with mocks.

  • The classical school (also known as the Detroit school) advocates for the replacement of only shared (mutable out-of-process) dependencies.

Figure: Types of unit testing dependencies and the schools of unit testing

Both schools are wrong in their treatment of mocks, though the classical school is less so than the London school.

Mocks and immutable out-of-process dependencies

What about immutable out-of-process dependencies? Shouldn’t they be mocked out too, according to at least one of the schools?

Immutable out-of-process dependencies (such as a read-only API service) should be replaced with a test double, but that test double would be a stub, not a mock.

That’s, once again, due to the differences between mocks and stubs:

  • Mocks are for outcoming interactions (commands) — interactions that leave a side effect in the dependency-collaborator.

  • Stubs are for incoming interactions (queries) — interactions that don’t leave a side effect in the dependency.

Interactions with immutable out-of-process dependencies are, by definition, incoming and thus shouldn’t be checked for in tests, only stubbed out with canned answers (both schools are OK with that).
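Here is a rough sketch of what stubbing a read-only service can look like. The exchange-rate API, the PriceConverter class, and every member name below are hypothetical; they stand in for whatever immutable out-of-process dependency your application reads from.

```csharp
using System;

// Hypothetical read-only API: an immutable out-of-process dependency.
public interface IExchangeRateApi
{
    decimal GetRate(string fromCurrency, string toCurrency);
}

// Stub: supplies a canned rate and keeps no record of calls.
public sealed class ExchangeRateApiStub : IExchangeRateApi
{
    private readonly decimal _rate;

    public ExchangeRateApiStub(decimal rate) => _rate = rate;

    public decimal GetRate(string fromCurrency, string toCurrency) => _rate;
}

// Hypothetical SUT that uses the API only to obtain input data.
public sealed class PriceConverter
{
    private readonly IExchangeRateApi _api;

    public PriceConverter(IExchangeRateApi api) => _api = api;

    public decimal Convert(decimal amount, string fromCurrency, string toCurrency)
    {
        return amount * _api.GetRate(fromCurrency, toCurrency);
    }
}

public static class ReadOnlyApiStubExample
{
    public static void Run()
    {
        var sut = new PriceConverter(new ExchangeRateApiStub(2m));

        decimal result = sut.Convert(10m, "USD", "EUR");

        // Assert on the end result only; how the rate was fetched is not checked
        if (result != 20m)
            throw new InvalidOperationException("Expected 10 * 2 = 20");
    }
}
```

The stub has nothing a test could verify, which is the point: the only thing worth asserting is the converted amount.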

I’ll first describe why the London school is wrong, and then why the classical approach is wrong too.

Don’t mock all mutable dependencies

You shouldn’t mock all mutable dependencies. To see why, we need to look at two types of communications in a typical application: intra-system and inter-system.

  • Intra-system communications are communications between classes inside your application.

  • Inter-system communications are communications between your application and other applications.

Here they are on a diagram:

Figure: Intra-system and inter-system communications

There’s a huge difference between the two: intra-system communications are implementation details; inter-system communications are not.

Intra-system communications are implementation details because the collaborations your domain classes go through in order to perform an operation are not part of their observable behavior. These collaborations don’t have an immediate connection to the client’s goal. Thus, coupling to such collaborations leads to fragile tests.

Inter-system communications are a different matter. Unlike collaborations between classes inside your application, the way your system talks to the external world forms the observable behavior of that system as a whole. It’s part of the contract your application must hold at all times.

Figure: Intra-system communications are implementation details; inter-system communications form the observable behavior of your application as a whole

This attribute of inter-system communications stems from the way separate applications evolve together. One of the main principles of such an evolution is maintaining backward compatibility. Regardless of the refactorings you perform inside your system, the communication pattern it uses to talk to external applications should always stay in place, so that external applications can understand it. For example, messages your application emits on a bus should preserve their structure, the calls issued to an SMTP service should have the same number and type of parameters, and so on.

The use of mocks cements the communication pattern between the system under test and the dependency (makes that pattern harder to change). This is exactly what you want when verifying communications between your system and external applications. Conversely, using mocks to verify communications between classes inside your system couples your tests to implementation details, making them fragile.

Intra-system communications correspond to mutable in-process dependencies:

Figure: Intra-system communications are communications with mutable in-process dependencies

And so, the London school is wrong because it encourages the use of mocks for all mutable dependencies and doesn’t differentiate between intra-system (in-process) and inter-system (out-of-process) communications.

As a result, tests check communications between classes just as much as they check communications between your application and external systems. This indiscriminate use of mocks is why following the London school often results in fragile tests — tests that couple to implementation details.
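For contrast, here is a minimal sketch of a test that treats a mutable in-process dependency as part of the behavior under test instead of mocking it. It reuses the Customer and Store names from the earlier purchase example, but the implementations below, including AddInventory() and GetInventory(), are my assumptions, and Store is a concrete class here rather than an IStore interface.

```csharp
using System;
using System.Collections.Generic;

public enum Product { Shampoo, Book }

// Hypothetical in-process collaborator, used as is in the test.
public sealed class Store
{
    private readonly Dictionary<Product, int> _inventory = new Dictionary<Product, int>();

    public void AddInventory(Product product, int quantity)
    {
        _inventory[product] = GetInventory(product) + quantity;
    }

    public bool HasEnoughInventory(Product product, int quantity)
    {
        return GetInventory(product) >= quantity;
    }

    public void RemoveInventory(Product product, int quantity)
    {
        if (!HasEnoughInventory(product, quantity))
            throw new InvalidOperationException("Not enough inventory");

        _inventory[product] = GetInventory(product) - quantity;
    }

    public int GetInventory(Product product)
    {
        return _inventory.TryGetValue(product, out int quantity) ? quantity : 0;
    }
}

public sealed class Customer
{
    public bool Purchase(Store store, Product product, int quantity)
    {
        if (!store.HasEnoughInventory(product, quantity))
            return false;

        store.RemoveInventory(product, quantity);
        return true;
    }
}

public static class OutcomeBasedTestExample
{
    public static void Run()
    {
        var store = new Store();
        store.AddInventory(Product.Shampoo, 10);
        var sut = new Customer();

        bool success = sut.Purchase(store, Product.Shampoo, 5);

        // Verify the end state of the store, not the calls made to it
        if (!success || store.GetInventory(Product.Shampoo) != 5)
            throw new InvalidOperationException("Expected a purchase that leaves 5 items");

        bool tooMany = sut.Purchase(store, Product.Shampoo, 15);
        if (tooMany || store.GetInventory(Product.Shampoo) != 5)
            throw new InvalidOperationException("A failed purchase must not change the inventory");
    }
}
```

This test keeps passing if Customer starts talking to Store through different methods, as long as the inventory ends up correct. A test that verified the RemoveInventory() call on a mock would not survive that refactoring.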

Don’t mock all out-of-process dependencies

The classical school handles this issue better because it advocates for substituting only out-of-process dependencies such as an SMTP service, a message bus, and so on. But the classical school is not ideal in its treatment of inter-system communications, either. This school also encourages excessive use of mocks, albeit not as much as the London school.

Not all out-of-process dependencies should be mocked out. If an out-of-process dependency is only accessible through your application, then communications with such a dependency are not part of your system’s observable behavior. An out-of-process dependency that can’t be observed externally, in effect, acts as part of your application. Communications with such a dependency become implementation details: they don’t have to stay in place after refactoring and therefore shouldn’t be verified with mocks.

Figure: Some inter-system communications are implementation details too

Remember, the requirement to always preserve the communication pattern between your application and external systems stems from the necessity to maintain backward compatibility. You have to maintain the way your application talks to external systems. That’s because you can’t change those external systems simultaneously with your application; they may follow a different deployment cycle, or you might simply not have control over them.

But when your application acts as a proxy to an external system, and no client can access it directly, the backward-compatibility requirement vanishes. Now you can deploy your application together with this external system, and it won’t affect the clients. The communication pattern with such a system becomes an implementation detail.

A good example here is an application database: a database that is used only by your application. No external system has access to this database. Therefore, you can modify the communication pattern between your system and the application database in any way you like, as long as it doesn’t break existing functionality. Because that database is completely hidden from the eyes of the clients, you can even replace it with an entirely different storage mechanism, and no one will notice.

The use of mocks for out-of-process dependencies that you have full control over also leads to brittle tests. You don’t want your tests to turn red every time you split a table in the database or modify the type of one of the parameters in a stored procedure. The database and your application must be treated as one system.

This distinction splits out-of-process dependencies into two subcategories:

  • Managed dependencies — out-of-process dependencies you have full control over. These dependencies are only accessible through your application; interactions with them aren’t visible to the external world. A typical example is the application database. External systems don’t access your database directly; they do that through the API your application provides.

  • Unmanaged dependencies — out-of-process dependencies you don’t have full control over. Interactions with such dependencies are observable externally. Examples include an SMTP server and a message bus: both produce side effects visible to other applications.

Only unmanaged dependencies should be replaced with mocks. Use real instances of managed dependencies in tests.
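As a rough sketch of the unmanaged side of this rule, the example below replaces a message bus with a handwritten mock and verifies the exact message the application emits. IMessageBus, UserController, and the message format are all hypothetical. The managed dependency (the application database) is deliberately left out, because the guideline is to exercise it with a real instance, which a self-contained snippet can’t provide.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical unmanaged dependency: other applications read from this bus.
public interface IMessageBus
{
    void Send(string message);
}

// Handwritten mock (spy) that records every message sent to the bus.
public sealed class MessageBusSpy : IMessageBus
{
    private readonly List<string> _sentMessages = new List<string>();

    public IReadOnlyList<string> SentMessages => _sentMessages;

    public void Send(string message) => _sentMessages.Add(message);
}

// Hypothetical SUT. A real controller would also load and save the user
// through the application database (a managed dependency); that step is
// omitted to keep the sketch self-contained.
public sealed class UserController
{
    private readonly IMessageBus _bus;

    public UserController(IMessageBus bus) => _bus = bus;

    public void ChangeEmail(int userId, string newEmail)
    {
        _bus.Send($"Type: USER EMAIL CHANGED; Id: {userId}; NewEmail: {newEmail}");
    }
}

public static class UnmanagedDependencyExample
{
    public static void Run()
    {
        var busSpy = new MessageBusSpy();
        var sut = new UserController(busSpy);

        sut.ChangeEmail(1, "new@example.com");

        // The message structure is part of the contract with external systems,
        // so the test pins down both the number of messages and their content
        if (busSpy.SentMessages.Count != 1)
            throw new InvalidOperationException("Expected exactly one message");

        if (busSpy.SentMessages[0] != "Type: USER EMAIL CHANGED; Id: 1; NewEmail: new@example.com")
            throw new InvalidOperationException("Unexpected message structure");
    }
}
```

Pinning the message text down this tightly is intentional here: external consumers depend on it, so a test that breaks when the format changes is doing its job.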

Figure: Only unmanaged dependencies can be replaced with mocks

Further reading

Of course, using real instances of managed dependencies in tests poses an obvious issue: how do you test them such that your tests remain fast and reliable?

You’ll see this subject covered in depth in my book:

Unit Testing Principles, Practices, and Patterns

Summary

  • Test double is an overarching term that describes all kinds of non-production-ready, fake dependencies in tests.

    • There are five variations of test doubles — dummy, stub, spy, mock, and fake — that can be grouped in just two types: mocks and stubs.

    • Spies are functionally the same as mocks; dummies and fakes serve the same role as stubs.

  • The differences between mocks and stubs:

    • Mocks help emulate and examine outcoming interactions: calls from the SUT to its dependencies that change the state of those dependencies.

    • Stubs help emulate incoming interactions: calls the SUT makes to its dependencies to get input data.

    • Asserting interactions with stubs always leads to fragile tests.

    • Test doubles that substitute CQS commands are mocks. Test doubles that substitute CQS queries are stubs.

  • A mock-the-tool is a class from a mocking library that you can use to create a mock-the-test-double or a stub.

  • Out-of-process dependencies fall into two subcategories: managed and unmanaged dependencies.

    • Managed dependencies are out-of-process dependencies that are only accessible through your application. Interactions with managed dependencies aren’t observable externally. A typical example is the application database.

    • Unmanaged dependencies are out-of-process dependencies that other applications have access to. Interactions with unmanaged dependencies are observable externally. Typical examples include an SMTP server and a message bus.

    • Communications with managed dependencies are implementation details; communications with unmanaged dependencies are part of your system’s observable behavior.

    • Use real instances of managed dependencies in integration tests; replace unmanaged dependencies with mocks.
