Non-determinism in tests

April 16, 2018

This is the final post in my series about unit testing anti-patterns. This one is about non-determinism in tests.


Non-determinism in tests shows up as flickering tests: tests that pass most of the time, fail once in a while, and then turn green again when you rerun them.

Martin Fowler wrote a great article on this topic a while ago. The main takeaway is that non-determinism can have a devastating effect on your entire test suite. Once you get accustomed to such intermittent failures, legitimate test failures get ignored along with them, and the test suite is no longer perceived as a reliable source of feedback.

The article also outlines pretty much every source of non-determinism out there. In this post, I’d like to focus on the two that are the most common.

Non-determinism due to asynchronous execution

The first one is when the system runs asynchronously. Let’s take an example. Assume that you’ve got a fire-and-forget job that runs periodically:

public class SomeService
{
    public void FirePeriodicJob()
    {
        /* Runs asynchronously */
    }
}

It takes your request, puts it into an in-memory queue and returns the control back to the caller. A separate thread then picks the request up and does the actual processing of it. Pretty simple.

But how do you test such a job?

The first thing that comes to mind is to use Thread.Sleep():

[Fact]
public void Test()
{
    var sut = new SomeService();

    sut.FirePeriodicJob();

    // Give the job time to complete
    Thread.Sleep(10000);

    /* Verify the effects of the job */
}

Or the modern version of it, Task.Delay():

[Fact]
public void Test()
{
    var sut = new SomeService();

    sut.FirePeriodicJob();

    // Give the job time to complete
    Task.Delay(10000).Wait();

    /* Verify the effects of the job */
}

This way, you give the job some time to finish before verifying its outcome.

It looks good at first glance but that’s a horrible way to do this kind of verification.

The reason is that you cannot know for sure when exactly the job will complete, so you are essentially guessing the time interval in the test. The job can take a variable amount of time to run, and in most cases the tests may well pass. But when the processing time spikes for some reason, the exact same test fails because the delay turns out to be too short.

Using the taxonomy I described in the unit test value proposition article, such tests become less valuable because they have a high chance of producing a false positive, that is, a failure that doesn’t point to an actual bug. That affects the second of the four components that determine the value of a test:

1. High chance of catching a regression bug.
2. Low chance of producing a false positive.
3. Fast feedback.
4. Low maintenance cost.

One way to deal with this problem is to increase the cushion by putting, say, 20 seconds instead of 10. That would make the failures much less frequent. But having alleviated the first issue, you’d introduce another one. This time, the third component (fast feedback) would be affected, as the entire test suite would slow down.

That’s why the use of Thread.Sleep() or Task.Delay() in tests is almost always a bad idea. You are basically stuck between two bad options: having tests with a high chance of producing a false positive or slowing the feedback loop. The second option is slightly less horrible than the first but it’s still bad.

So how do you deal with asynchrony in tests then? There are several ways to do so. The first and most important one is to avoid testing asynchronous code entirely.

The idea here is the same as in the How to do painless unit testing article I wrote a while ago: separate the domain logic from the code that deals with asynchrony, and then test only the former part:

[Figure: Different Types of Code]

That’s the idea behind the humble object pattern, and it can be applied broadly in many similar situations.

So, instead of dealing with FirePeriodicJob(), you could introduce another method in SomeService that would do the processing synchronously and unit test that method instead. You can even split these two responsibilities - putting a request to a queue and the processing of it - into two separate classes. That would probably be an even better solution.
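Here is a minimal sketch of such a split. The names Request, RequestProcessor and RequestQueue are hypothetical and not part of the original SomeService, and the doubling logic is only a stand-in for real processing. The processor is plain synchronous code, while the queue does nothing but hand requests over to a background thread:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical request type, for illustration only.
public sealed class Request
{
    public Request(int amount) { Amount = amount; }
    public int Amount { get; }
}

// Domain logic: synchronous and free of threading concerns,
// so a unit test can call it directly.
public sealed class RequestProcessor
{
    public int Process(Request request)
    {
        if (request.Amount < 0)
            throw new ArgumentException("Amount cannot be negative.", nameof(request));

        return request.Amount * 2; // Stand-in for the real processing
    }
}

// Humble object: only queues requests and passes them to the processor
// on a separate thread. It contains no logic worth unit testing.
public sealed class RequestQueue
{
    private readonly BlockingCollection<Request> _queue = new BlockingCollection<Request>();

    public RequestQueue(RequestProcessor processor)
    {
        var worker = new Thread(() =>
        {
            // Blocks until a request arrives; ends after Complete() is called
            foreach (Request request in _queue.GetConsumingEnumerable())
                processor.Process(request);
        });
        worker.IsBackground = true; // Does not keep the process alive
        worker.Start();
    }

    public void Enqueue(Request request) => _queue.Add(request);

    public void Complete() => _queue.CompleteAdding();
}
```

A test can then call new RequestProcessor().Process(new Request(21)) and assert that the result is 42, with no delays involved. RequestQueue is thin enough to be left to a small number of integration tests, if it is tested at all.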

The other ways to handle asynchronous code relate to the use of various types of signaling. You could make the SUT return a task and then wait for its completion in the test. Or if that is impossible to do for some reason, you could probe the SUT once in some period of time. Here’s a good video on that topic: Unit testing patterns for concurrent code
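Both kinds of signaling could be sketched roughly as follows. SignalingService and the Probe helper are hypothetical names introduced for this illustration, and the job body is a stand-in:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical variant of the service that exposes the job as a Task.
public sealed class SignalingService
{
    private volatile bool _jobCompleted;

    public bool JobCompleted => _jobCompleted;

    public Task FireJobAsync()
    {
        // Stand-in for the real job; the returned Task completes when the job does
        return Task.Run(() => { _jobCompleted = true; });
    }
}

// Hypothetical polling helper for cases where the SUT cannot return a Task.
public static class Probe
{
    // Checks the condition every `interval` until it holds or `timeout` elapses.
    public static async Task<bool> UntilAsync(
        Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        while (stopwatch.Elapsed < timeout)
        {
            if (condition())
                return true;

            await Task.Delay(interval);
        }

        return condition(); // One last check after the timeout
    }
}
```

With the first option, the test simply does await sut.FireJobAsync() and waits exactly as long as the job takes. With the second one, await Probe.UntilAsync(() => sut.JobCompleted, TimeSpan.FromSeconds(5), TimeSpan.FromMilliseconds(50)) returns as soon as the condition holds, so the full timeout is only spent when the test is about to fail anyway.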

Non-determinism due to working with time

Another source of non-determinism in tests is working with time. If the SUT relies on the current date and time, for example, to set timestamps, you cannot know for sure what value it used at the time of execution. The time that was current then might not be current when you assess the results of the execution.

Here’s an example:

[Fact]
public void Test()
{
    var sut = new CustomerService();

    Customer customer = sut.Create("Name");

    Assert.Equal(DateTime.Now, customer.DateCreated);
}

As you can see, the test instantiates a customer service, asks it to create a new customer and then checks that the date created equals DateTime.Now. If the test runs quickly enough, it may pass, but if not, it will fail because the current time has already moved on.

There are two ways to mitigate this problem. The first one is to inject the current date and time into the SUT so that it doesn’t need to refer to the DateTime.Now property anymore:

[Fact]
public void Test()
{
    DateTime now = DateTime.Now;
    var sut = new CustomerService();

    Customer customer = sut.Create("Name", now);

    Assert.Equal(now, customer.DateCreated);
}

Here, we capture the time in the test, pass it to the SUT and then verify that the SUT sets it to the DateCreated property.
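The post doesn’t show the SUT itself, so here is a minimal sketch of what it could look like on the production side. The shape of Customer and CustomerService below is an assumption made for illustration:

```csharp
using System;

// Assumed shape of the Customer class; only the members used in the test are shown.
public sealed class Customer
{
    public Customer(string name, DateTime dateCreated)
    {
        Name = name;
        DateCreated = dateCreated;
    }

    public string Name { get; }
    public DateTime DateCreated { get; }
}

public sealed class CustomerService
{
    // The caller supplies the current time, so the service
    // never reads DateTime.Now on its own.
    public Customer Create(string name, DateTime now)
    {
        return new Customer(name, now);
    }
}
```

Since the service doesn’t touch the system clock, the test could just as well pass a fixed value such as new DateTime(2018, 4, 16) instead of DateTime.Now.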

This approach works in most cases but might become quite cumbersome if you have lots of layers of indirection in your code base. In such situations, you would need to pass that value from the composition root all the way down to the classes that use it, and there might be a lot of them.

One way to overcome this problem is to avoid having a deep hierarchy of communications between the classes. Keep it short and wide, not tall and narrow. In other words, have either classes that coordinate the work between multiple participants, or do the actual job, but not both.

Another way to deal with this kind of non-determinism is to introduce a new static class, for example, SystemDateTime, and agree upon using this class in place of the standard DateTime.Now property from the .NET framework:

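public static class SystemDateTime
{
    // The function that supplies the current time; must be set via Init() before Now is read
    private static Func<DateTime> _func;

    public static DateTime Now
    {
        get { return _func(); }
    }

    // Call once at startup (production) or in the test setup (tests)
    public static void Init(Func<DateTime> func)
    {
        _func = func;
    }
}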

The benefit of this class is that it allows you to initialize it with a method that returns the current date and time.

So, in production, you could use this function:

// Initialization code for production
SystemDateTime.Init(() => DateTime.UtcNow);

But in tests, you can use a hard-coded value and then compare against it:

// Initialization code for unit tests
SystemDateTime.Init(() => new DateTime(2016, 10, 3));

This way, you avoid having to pass the time around and are still able to verify it in your tests.
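To make this concrete, here is a small sketch of a class that reads the time from SystemDateTime, along with the check a test would perform. Invoice and InvoiceExample are hypothetical names; SystemDateTime is repeated in a compact form (same behavior) only so that the snippet compiles on its own:

```csharp
using System;

// Same behavior as the SystemDateTime class from the post.
public static class SystemDateTime
{
    private static Func<DateTime> _func;
    public static DateTime Now => _func();
    public static void Init(Func<DateTime> func) => _func = func;
}

// Hypothetical class that reads the time from SystemDateTime
// instead of DateTime.Now.
public sealed class Invoice
{
    public Invoice() { DateIssued = SystemDateTime.Now; }
    public DateTime DateIssued { get; }
}

public static class InvoiceExample
{
    public static bool IssuedAtFixedTime()
    {
        var fixedTime = new DateTime(2016, 10, 3);
        SystemDateTime.Init(() => fixedTime); // What the test setup would do

        var invoice = new Invoice();

        // Holds regardless of how long the test takes to run
        return invoice.DateIssued == fixedTime;
    }
}
```

Nothing is passed into the Invoice constructor, yet the test still fully controls the value it ends up with.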

Note that this second approach has a significant drawback. If you run tests in parallel and those tests set up different SystemDateTime values, they can interfere with each other. That’s because the _func field is static and shared across all threads in the application domain.

Because of that, prefer using dependency injection and passing the time around by default. Referring to global mutable state (be it DateTime.Now, SystemDateTime.Now or something else) is never the best option.

Summary

Avoid non-determinism in tests.
To overcome non-determinism due to asynchronous execution:
- Separate domain logic from the code that deals with asynchrony
- Use various types of signaling (returning Task or probing the SUT)
To overcome non-determinism due to working with time:
- Use dependency injection to pass the current time around
- Introduce a custom SystemDateTime that can be initialized with a value for testing purposes
- Prefer dependency injection over SystemDateTime as the latter doesn’t work with multiple tests running in parallel

Alright, this concludes the unit-testing anti-patterns series.

Next, I’m going to do something that I’ve been planning for a long time. I’ll do another comparison between Entity Framework and NHibernate from the Domain-Driven Design perspective. I’ll describe how EF Core 2.0 deals with encapsulation issues, handles Value Objects, and more.

If you enjoyed this article, check out my Pragmatic Unit Testing training course.