Making implicit assumptions explicit



Another software development principle I advocate you follow is making implicit assumptions explicit in your code. Let’s see what that means.

Assumptions and readability

Assumptions often make a big part of our day-to-day development life. As much as they help us simplify the code, they also may become a big source of frustration. Two programmers rarely have the exact same set of skills and experience, so an assumption made by one can be obscure and illogical to another.

That is why it’s important to always keep all the assumptions made in the code as explicit as possible. Not only will it help the developers in your team be sure what a certain piece of code does in practice, it will help yourself not to lose track of the decisions you made previously. It is also especially helpful for newcomers as it allows them to get to know your software quicker.

Essentially, assumptions in code are all about readability. The more explicit they are, the more readable your code base becomes.

Convention over configuration

There is another related practice – convention over configuration, which might seem contradicting to the principle we are discussing at first.

Basically, convention over configuration stands for using conventions instead of configurations where possible. It helps developers reduce the number of decisions made and eliminate the code required to define those decisions. This practice is widely employed in Ruby, ASP.NET MVC, and several JavaScript frameworks.

A good example is naming in ASP.NET MVC:

public class ProductController : Controller

{

    public ActionResult Buy()

    {

        return View();

    }

}

Here, instead of explicitly specifying the Buy action serves “/product/buy” URLs, we can rely on the ASP.NET MVC convention which states that any action in any controller can be accessed by the “/controller/action” URL.

Doesn’t the “convention over configuration” practice contradict the “make implicit assumptions explicit” principle?

Not at all. Preferring conventions to configuration doesn’t make the assumptions in code less explicit, it just reduces the number of boilerplate orchestration needed to express those assumptions.

We can depict it like this: Making implicit assumptions explicit: extracting a conventionAll repeatable assumptions here are just moved out of the equation, they become conventions. Only those assumptions which don’t fit the commonly used conventions remain as configurations.

Examples

A good example here is nullability. Take a look at the code below:

public class ProductRepository

{

    public Product GetById(int id)

    {

        /* … */

    }

}

An assumption made by one person while writing this method is totally obscure for another when it comes to using it. What would happen if the product with Id specified is not found? Will the method return null? Or will it throw an exception?

The ambiguity makes it hard for other developers to reason about the code. It becomes even harder when the assumptions are not consistent over the code base.

The best way to avoid the confusion in this particular case is to come up with a convention that all reference types in your domain model are deemed non-nullable by default. And if you want to indicate otherwise, you need to use the Maybe struct:

public class ProductRepository

{

    public Maybe<Product> GetById(int id)

    {

        /* … */

    }

}

We can do the same with the results of operations performed in code. If there is a chance a method might go wrong and we know how to handle the failure, the best decision would be to explicitly specify it in its signature instead of relying on exceptions:

public Result ProcessProduct(Product product)

{

    /* … */

}

Here, the Result class tells us the operation may fail, and if it does, we can know the reason by examining the returning object. I wrote about this approach in this post, check it out for more details.

More examples

Another telling example is Service Locator. It is considered an anti-pattern by many because it often gets misused. Here’s a sample of such a misuse:

public class ProductService

{

    private IProductRepository _repository;

    private IEmailGateway _emailGateway;

    private IPersonRepository _personRepository;

 

    public ProductService(IServiceLocator locator)

    {

        _repository = locator.Resolve<IProductRepository>();

        _emailGateway = locator.Resolve<IEmailGateway>();

        _personRepository = locator.Resolve<IPersonRepository>();

    }

}

Instead of explicitly indicating the dependencies for the service, we pass it a service locator, which is then used for retrieving those dependencies in the constructor.

The problem with this code is that it hides the actual state of affairs from the consumer of the class. The situation gets worse if we save the locator instance itself and use it later on whenever we need to locate a dependency. It’s really hard to trace all dependencies the class uses in such situations.

A justification people often give for this approach is that it reduces the number of constructor parameters. If there are, for example, 10 dependencies, it wouldn’t look particularly well if we specify all of them explicitly.

Well, that’s exactly what we want to achieve here. Instead of sweeping smelling code under the carpet, make it obvious there is a problem with it! Hiding an issue doesn’t help solve it, only prevents us from tackling it.

The same argument can be applied when dealing with temporary fields. This practice seems appealing at first as it allows us to cut off a corner and make our code less verbose:

public class ProductService

{

    private Product _product;

    private Manager _manager;

 

    public void AssignProductToManager(Product product, Manager manager)

    {

        _manager = manager;

        _product = product;

 

        DoAssignment();

        SendNotifiationEmail();

    }

 

    private void DoAssignment()

    {

        /* works with _manager and _product */

    }

 

    private void SendNotifiationEmail()

    {

        /* works with _manager */

    }

}

But of course, it has the same flaws we discussed previously – it hides the underlying complexity and thus reduces the readability. The DoAssignment and SendNotifiationEmail methods assume the _product and the _manager fields are initialized. What we should do instead is we should once again make the assumptions explicit:

public class ProductService

{

    public void AssignProductToManager(Product product, Manager manager)

    {

        DoAssignment(manager, product);

        SendNotifiationEmail(manager);

    }

 

    private void DoAssignment(Manager manager, Product product)

    {

    }

 

    private void SendNotifiationEmail(Manager manager)

    {

    }

}

That way, there will be no misunderstanding in what these methods depend upon as their prerequisites are now specified in the parameter list.

It’s interesting that functional languages emphasize the importance of this principle. Pure functions are all about making the dependencies in code explicit and disallowing the change of the global state.

Summary

Always state what the code is doing in each particular moment. That will help to fill the gap between different developers’ experiences. Making the assumptions explicit is an intrinsic part of improving the readability of our code.

Relates links

Other articles in the series

Share




  • jamesej67

    I find of useful insight in your writings but have to disagree with you on this one. I think generally the line between being explicit and using implicit assumptions is based on a judgement as to whether your implicit assumptions are likely to trip up a future developer.

    The question of constructor arguments (or indeed any function or method argument list) is about finding a place on a continuum of specificity of what the code using the arguments requires. The disadvantage of passing an object that contains amongst other things references to the objects you require is not that this is not explicit: it’s not the job of the argument list to tell you what a function is doing internally and making it too specific can harm encapsulation. The disadvantage is that you may not have the container object to hand and it is easier to use the constructor/function/method if it only takes what it needs. The advantage is that the interface of the class will change less if you make internal changes to what you depend on and use in the objects passed in: i.e. better encapsulation. So it’s a continuum.