3 ways of achieving code correctness

February 16, 2016

Code correctness verification is not an easy task. At the same time, it’s one of the most important problems we have when we build software projects. In this article, I’ll compare 3 different ways of achieving code correctness.

3 ways of achieving code correctness

Generally, there are 3 kinds of tools we can use to do that: unit-testing, code contracts, and the type system. Let’s take an example. Let’s say we have the following class:

public class Customer
{
    public string Email { get; private set; }
    public Status Status { get; private set; }
    public int Discount { get; private set; }
 
    public Customer(string email, Status status, int discount)
    {
        if (discount > 5 && status == Status.Standard)
            throw new InvalidOperationException();
        if (discount > 10 && status == Status.Gold)
            throw new InvalidOperationException();
        if (!email.Contains("@")) // put a better email validation here
            throw new InvalidOperationException();
 
        Email = email;
        Status = status;
        Discount = discount;
    }
}
 
public enum Status
{
    Standard,
    Gold
}

And we need to verify its invariants:

Standard customers can’t have discounts that are larger than 5 percent
Gold customers can’t have discounts that are larger than 10 percent
Customers must have only valid emails

Here are the unit tests we can write to do that:

[Fact]
public void Standard_customers_cannot_have_discounts_more_than_5_percent()
{
    Action action = () => new Customer("[email protected]", Status.Standard, 6);
    action.ShouldThrow<InvalidOperationException>();
}
 
[Fact]
public void Gold_customers_cannot_have_discounts_more_than_10_percent()
{
    Action action = () => new Customer("[email protected]", Status.Gold, 11);
    action.ShouldThrow<InvalidOperationException>();
}
 
[Fact]
public void Customers_must_have_valid_emails()
{
    Action action = () => new Customer("invalid email", Status.Standard, 0);
    action.ShouldThrow<InvalidOperationException>();
}

Unit tests are essentially the default mode for validating code correctness. It works pretty well but has several drawbacks. First of all, the invariants of the class, while being shown explicitly in the constructor, aren’t evident to the consumers of that class. In order to get to know them, clients of this code must navigate to the source code or act in the hit-and-miss fashion: try to provide some values and see if that leads to any run-time exceptions.

That, in turn, results in extension of the feedback loop. More frictions needed in order to get the feedback. With unit tests, we attribute the correctness validation to the run-time.

Wouldn’t it be better to make the invariants more apparent somehow? That is the task code contracts try to solve. There’s no good support for them in C# at the moment, but we can get it at some point in the future.

With code contracts, we could re-write the class above using a special precondition statement, like this:

public class Customer
{
    public string Email { get; private set; }
    public Status Status { get; private set; }
    public int Discount { get; private set; }
 
    public Customer(string email, Status status, int discount)
        requires (status == Status.Standard && discount <= 5) || 
            (status == Status.Gold && discount <= 10)
        requires email.Contains("@")
    {
        Email = email;
        Status = status;
        Discount = discount;
    }
}

The difference between this version and the previous one might seem insignificant but in the reality, it’s pretty substantial.

First of all, the validations are now part of the constructor signature, they aren’t just buried inside of it. It means that the clients can get to know what to expect from that class right away, without trying it at the run-time.

Secondly, this approach opens a whole new world of building static analysers. They can process the information we specify as pre- and post-conditions and provide us with helpful tips and warning. For example, such code could be easily validated in the compile-time:

// Warning: cannot use status == Status.Standard with discount == 20
var customer = new Customer("[email protected]", Status.Standard, 20);

Code contracts help tighten the feedback loop. They raise the class’s invariants to the signature level where they can be easily seen by programmers even if they don’t have access to the source code. And of course, we still need to employ unit testing to make sure our code meets the preconditions.

Code contracts make a good job, but we can do better.

It turns out that the type system is a superior choice when it comes to defining invariants. Note that the customer’s emails are represented as strings in the Customer class. What we can do instead is we can define a separate class for it and free Customer up from the responsibility to validate emails:

public class Customer
{
    public Email Email { get; private set; }
    public Status Status { get; private set; }
    public int Discount { get; private set; }
 
    public Customer(Email email, Status status, int discount)
        requires (status == Status.Standard && discount <= 5) || 
            (status == Status.Gold && discount <= 10)
    {
        Email = email;
        Status = status;
        Discount = discount;
    }
}

Now, the compiler won’t allow us to write the following code anymore. We need to provide an email instance instead of a string here:

var customer = new Customer("[email protected]", Status.Standard, 0); // Compiler error
var customer = new Customer(email, Status.Standard, 0); // OK

Of course, we do need to validate emails even with this approach, but that validation appears in a single place and doesn’t get scattered across the application code. As a side-effect, we simplify the Customer’s invariants so that the class is now responsible only for the domain logic that is essential to customers. That is customer’s statuses and discounts.

That is pretty much as far as we can get with such languages as C# or Java. With languages with a stronger type system, however, we can do even better.

Note that we represent discounts as an integer. That way, we store the percentage by which we can rebate our prices for the customers. In reality, such flexibility is never the case. Instead of allowing managers to specify any discount they want, the business usually defines a strict set of discounts that are applicable to each type of customers.

Using F#, we can write this set down directly in the application’s source code:

type StandardDiscount =
    | Three
    | Five
 
type GoldDiscount =
    | Three
    | Five
    | Seven
    | Ten
 
type Email = Email of string
 
type StandardCustomer = {
    Email: Email;
    Discount: StandardDiscount
}
 
type GoldCustomer = {
    Email: Email;
    Discount: GoldDiscount
}

With this code, it’s impossible to even compile a function that tries to create a standard customer with a discount of seven percent or a gold customer with a discount of more than ten percent.

We are lifting the invariants up to the type system and that makes code contracts redundant altogether. With this approach, we have the shortest feedback loop possible which increases our pace of development substantially.

Comparison

You might have guessed already where this goes in terms of comparison. Every next way of achieving code correctness is superior to the previous one as it brings the invariants definitions and the actual validation of them closer to each other.

Type System >> Code Contracts >> Unit Tests

That leads us to the following guideline: always try to bring as many invariants as you can closer to the type system. If possible, bake them into your types. If not, try to define them as code contracts. Even in C#, we can do that to some extent (see the example with the Email class).

And of course, not every invariant can be represented using the type system and not every business rule can be shown as a precondition. We still need to use unit tests. However, with this approach, the number of unit tests required to cover the edge cases decreases to a large degree.

It also means that languages with dynamic type system lose a lot of the benefits strongly typed programming languages provide us with. They lack the most important element of the safety net and programmers using them have to fill this gap with excessive use of unit tests.

Summary

Writing maintainable code is all about being confident to change it. The quicker you get feedback about mistakes and inconsistencies, the better.

The type system is the most efficient tool to achieve that confidence. Use it to represent as many invariants in your application as possible. Code contracts is the second best choice here. Use unit tests as the final point of your safety net.