Cohesion and Coupling: the difference



This is another post on the most valuable principles in software development.

You might have heard of a guideline saying that we should aim for low coupling and high cohesion when working on a code base. In this article, I’d like to discuss what this guideline actually means and take a look at some code samples illustrating it. I also want to draw a line between these two ideas and show the differences between them.

While coupling is a pretty intuitive concept, meaning that almost no one has difficulties understanding it, the notion of cohesion is harder to grasp. Moreover, the differences between the two often appear to be obscure. It’s not surprising: the ideas behind these terms are indeed similar. Nevertheless, they do differ.

Cohesion represents the degree to which a part of a code base forms a logically single, atomic unit.

It can also be put as the number of connections inside some code unit. If the number is low, then the boundaries for the unit are probably chosen badly: the code inside the unit is not logically related.

A unit here is not necessarily a class. It might be a method, a class, a group of classes, or even a module or an assembly: the notion of cohesion (as well as coupling) is applicable on different levels. We’ll talk about it in a minute.

Coupling, on the other hand, represents the degree to which a single unit depends on others. In other words, it is the number of connections between two or more units. The fewer the connections, the lower the coupling.

High cohesion, low coupling guideline

In essence, high cohesion means keeping parts of a code base that are related to each other in a single place. Low coupling, at the same time, is about separating unrelated parts of the code base as much as possible.
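To make “keeping related parts in a single place” a bit more tangible, here is a minimal sketch. The InvoiceNumber type and its "INV-" format are hypothetical and not part of the original example; the point is only that formatting and parsing depend on the same knowledge, so they sit in one unit, and the rest of the code base is connected to that unit through a small surface.

```csharp
using System;

// Hypothetical type: everything that knows the invoice number format
// lives here, so the format can change in a single place.
public sealed class InvoiceNumber
{
    private const string Prefix = "INV-";

    public int Sequence { get; }

    public InvoiceNumber(int sequence)
    {
        if (sequence <= 0)
            throw new ArgumentOutOfRangeException(nameof(sequence));
        Sequence = sequence;
    }

    // Formatting and parsing are related: both depend on the same format.
    public override string ToString()
    {
        return Prefix + Sequence.ToString("D6");
    }

    public static InvoiceNumber Parse(string text)
    {
        if (text == null || !text.StartsWith(Prefix, StringComparison.Ordinal))
            throw new FormatException("Expected a value starting with " + Prefix);
        return new InvoiceNumber(int.Parse(text.Substring(Prefix.Length)));
    }
}

public static class Program
{
    public static void Main()
    {
        var number = new InvoiceNumber(42);
        Console.WriteLine(number);                                      // INV-000042
        Console.WriteLine(InvoiceNumber.Parse("INV-000042").Sequence);  // 42
    }
}
```

Had the formatting gone into a general string helper and the parsing into a general parsing helper, each helper would be less cohesive, and both would be coupled to the same hidden format.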

In theory, the guideline looks pretty simple. In practice, however, you need to dive into the domain model of your software deep enough to understand which parts of your code base are actually related.

It means that, unlike such metrics as cyclomatic complexity, the degree to which your code is highly cohesive and loosely coupled cannot be measured directly. It strongly depends on the semantics of the code, which itself is an attribute of the domain model.

Perhaps the lack of objectivity in this guideline is the reason why it’s often so hard to follow.

There is a principle to which this guideline highly relates: Separation of Concerns. The two are pretty similar in terms of the best practices they propose. Check out this article to read more about the Separation of Concerns principle.

Types of code from a cohesion and coupling perspective

Besides the code which is both highly cohesive and loosely coupled, there are at least three types that fall into other parts of the spectrum. Here are all 4 types:

[Figure: Types of code from a cohesion and coupling perspective]

Let’s step into them, one by one.

1. Ideal is the code that follows the guideline. It is loosely coupled and highly cohesive. We can illustrate such code with this picture:

[Figure: Ideal]

Here, circles of the same color represent pieces of the code base related to each other.

2. God Object is a result of introducing high cohesion and high coupling. It is an anti-pattern and basically stands for a single piece of code that does all the work at once:

[Figure: God Object]

Another name for this kind of code would be Big Ball of Mud.

3. The third type takes place when the boundaries between different classes or modules are selected poorly:

[Figure: Poorly selected boundaries]

Unlike God Object, code of this type does have boundaries. The problem here is that they are selected improperly and often do not reflect the actual semantics of the domain. Such code quite often violates the Single Responsibility Principle.
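As a rough illustration of poorly selected boundaries, consider the sketch below. All names (OrderData, Calculations, the shipping formula) are made up for this example. The boundaries exist, but they follow technical roles (“data” versus “calculations”) rather than the meaning of the code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example: boundaries drawn by technical role, not by meaning.
public class OrderData
{
    public List<decimal> LinePrices = new List<decimal>();
    public decimal DiscountPercent;
}

public static class Calculations
{
    // Order logic lives far away from the order data it operates on...
    public static decimal OrderTotal(OrderData order)
    {
        decimal sum = 0m;
        foreach (decimal price in order.LinePrices)
            sum += price;
        return sum * (1m - order.DiscountPercent / 100m);
    }

    // ...right next to logic from an unrelated part of the domain.
    public static decimal ShippingCost(decimal weightKg)
    {
        return 5m + 2m * weightKg;
    }
}

public static class Program
{
    public static void Main()
    {
        var order = new OrderData { DiscountPercent = 10m };
        order.LinePrices.Add(10m);
        order.LinePrices.Add(20m);
        Console.WriteLine(Calculations.OrderTotal(order));
    }
}
```

Calculations has low cohesion because its two methods share nothing, and it is tightly coupled to the internals of OrderData. Any change to how an order stores its lines or its discount ripples into a class that also happens to deal with shipping.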

4. Destructive decoupling is the most interesting one. It sometimes occurs when a programmer tries to decouple a code base so much that the code completely loses its focus:

[Figure: Destructive Decoupling]

The last type deserves a more detailed discussion.

Cohesion and Coupling: pitfalls

Often, when a developer tries to implement the low coupling, high cohesion guideline, he or she puts too much effort into the coupling side of the guideline and forgets about the other one completely. It leads to a situation where the code is indeed decoupled but at the same time doesn’t have a clear focus. Its parts are separated from each other so much that it becomes hard or even impossible to grasp their meaning. I call this situation destructive decoupling.

Let’s look at an example:

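using System.Collections.Generic;

public class Order
{
    private readonly IOrderLineFactory _factory;
    private readonly IOrderPriceCalculator _calculator;

    // Assumption: the lines are held behind an interface too, since Order
    // is meant to be decoupled even from OrderLine.
    private readonly List<IOrderLine> _lines = new List<IOrderLine>();

    public Order(IOrderLineFactory factory, IOrderPriceCalculator calculator)
    {
        _factory = factory;
        _calculator = calculator;
    }

    // The calculation is delegated to an external calculator.
    public decimal Amount
    {
        get { return _calculator.CalculateAmount(_lines); }
    }

    // Line creation is delegated to a factory.
    public void AddLine(IProduct product, decimal price)
    {
        _lines.Add(_factory.CreateOrderLine(product, price));
    }
}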

This code is a result of destructive decoupling. You can see that on one hand, the Order class is completely decoupled from Product and even OrderLine. It delegates the calculation logic to a special IOrderPriceCalculator interface; the creation of lines is performed by a factory.

At the same time, this code completely lacks cohesion. The classes whose semantics are closely related are now separated from each other. This is a pretty simple example, so I’m sure you get the idea of what is going on here, but imagine how hard it would be to understand such code describing some unfamiliar domain model. In most cases, the lack of cohesion makes code unreadable.

Destructive decoupling often goes hand in hand with the “interfaces everywhere” attitude. That is, the temptation to substitute every concrete class with an interface, even if that interface does not represent an abstraction.

So how would we rewrite the code above? Like this:

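using System.Collections.Generic;
using System.Linq;

public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    // The total is computed right where the lines live.
    public decimal Amount
    {
        get { return _lines.Sum(x => x.Price); }
    }

    // The second argument is a price, matching OrderLine.Price used in Amount.
    public void AddLine(Product product, decimal price)
    {
        _lines.Add(new OrderLine(product, price));
    }
}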

That way, we restored the connections between Order, OrderLine, and Product. This code is concise and cohesive.
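For completeness, here is a self-contained sketch of the rewritten version that can be compiled and run. The article does not show Product or OrderLine, so their definitions below (a Name property, a Price property, the constructors) are assumptions made only to get a working example, and the second parameter of AddLine is named price here to match OrderLine.Price.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the article does not define Product.
public class Product
{
    public string Name { get; }

    public Product(string name)
    {
        Name = name;
    }
}

// Assumed shape: only the Price property is implied by the article's code.
public class OrderLine
{
    public Product Product { get; }
    public decimal Price { get; }

    public OrderLine(Product product, decimal price)
    {
        Product = product;
        Price = price;
    }
}

public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    // Order knows its lines directly, so the total is a one-liner.
    public decimal Amount
    {
        get { return _lines.Sum(x => x.Price); }
    }

    public void AddLine(Product product, decimal price)
    {
        _lines.Add(new OrderLine(product, price));
    }
}

public static class Program
{
    public static void Main()
    {
        var order = new Order();
        order.AddLine(new Product("Book"), 12.50m);
        order.AddLine(new Product("Pen"), 1.25m);
        Console.WriteLine(order.Amount); // 13.75
    }
}
```

Nothing needs to be injected to construct an Order, and the whole group of classes can be read top to bottom in one sitting.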

It is important to understand the relation between cohesion and coupling. It’s impossible to completely decouple a code base without damaging its cohesion. Similarly, it’s impossible to create fully cohesive code without introducing unnecessary coupling, but that mistake is rare in practice because, unlike cohesion, the concept of coupling is more or less intuitive.

The balance between the two is the key to creating a highly (but not fully) cohesive and loosely coupled (but not completely decoupled) code base.

Cohesion and coupling on different levels

As I mentioned earlier, cohesion and coupling can be applied on different levels. The class level is the most obvious, but it’s not the only one. An example here would be a folder structure inside a project:

[Figure: Poorly selected boundaries for a project]

At first glance, the project is well-organized: there are separate folders for entities, factories, and so on. However, it lacks cohesion.

It falls into the 3rd category in our diagram: poorly selected boundaries. While the internals of the project are indeed loosely coupled, their boundaries don’t reflect their semantics.

A highly cohesive (and loosely coupled) version would be the following:

[Figure: Better boundary choice]

That way, we keep the related classes together. Moreover, the folders in the project are now structured by the domain model semantics, not by utility purpose. This version falls into the first category, and I highly recommend maintaining this kind of partitioning in your solution.
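The same idea can be sketched with namespaces, which usually mirror folders in a C# project. The names below (Shop.Orders, Shop.Customers, OrderFactory, the in-memory OrderRepository) are hypothetical and only show the grouping: each namespace is named after a domain concept and holds the entity, the factory, and the repository that belong to it.

```csharp
using System;
using System.Collections.Generic;

// Everything related to orders sits in one place (hypothetical layout).
namespace Shop.Orders
{
    public class Order
    {
        public int Id { get; }

        public Order(int id)
        {
            Id = id;
        }
    }

    public class OrderFactory
    {
        private int _nextId = 1;

        public Order Create()
        {
            return new Order(_nextId++);
        }
    }

    // In-memory stand-in for a real repository, for illustration only.
    public class OrderRepository
    {
        private readonly Dictionary<int, Order> _orders = new Dictionary<int, Order>();

        public void Save(Order order)
        {
            _orders[order.Id] = order;
        }

        public Order GetById(int id)
        {
            return _orders[id];
        }
    }
}

// A second domain concept gets its own namespace in the same way.
namespace Shop.Customers
{
    public class Customer
    {
        public string Name { get; }

        public Customer(string name)
        {
            Name = name;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var factory = new Shop.Orders.OrderFactory();
        var repository = new Shop.Orders.OrderRepository();

        Shop.Orders.Order order = factory.Create();
        repository.Save(order);

        Console.WriteLine(repository.GetById(order.Id).Id); // 1
    }
}
```

With a layout grouped by utility purpose, the three order-related classes would be spread over Entities, Factories, and Repositories folders, each shared with classes from unrelated parts of the domain.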

Cohesion and SRP

The notion of cohesion is akin to the Single Responsibility Principle. SRP states that a class should have a single responsibility (a single reason to change), which is similar to what highly cohesive code does.

The difference here is that while high cohesion does imply that the code has related responsibilities, it doesn’t necessarily mean the code should have only one. I would say SRP is more restrictive in that sense.

Summary

Let’s summarize with the following:

  • Cohesion represents the degree to which a part of a code base forms a logically single, atomic unit.
  • Coupling represents the degree to which a single unit depends on others.
  • It’s impossible to achieve full decoupling without damaging cohesion, and vice versa.
  • Try to adhere to the “high cohesion and low coupling” guideline on all levels of your code base.
  • Don’t fall into the trap of destructive decoupling.





  • Uchitha Ranasinghe

    Excellent post mate thanks. Looking at your example of highly cohesive solution structure, how practical is it to have your repositories (OrderRepository) in your DomainModel project? My repositories usually have their own project (less cohesive).
    Thanks again.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Thank you!

      I personally tend to consider repositories part of the domain model. They reside higher than entities but still I think they are part of it:

      https://lh4.googleusercontent.com/-aus9zEiQMSY/VHTyHvJkz_I/AAAAAAAAAKE/fN5PlZvzzE8/w587-h554-no/Onion.png

      This picture reflects how I usually form layers in software projects. The 2 inner layers are parts of the domain model, the other layers belong to other parts of the system.

      • Chris Dunn

        I keep the repository interfaces (IOrderRepository) in my domain layer and the implementations in the infrastructure layer (SqlServerOrderRepository). That’s one other way to do it, though you need to be very disciplined and make sure no business logic creeps out of the domain layer into the infrastructure layers.

  • Michael G.

    Good post yet again.

    I have one question though.

    Doesn’t the use of “new OrderLine” inside the AddLine method make it more difficult to unit test the code?

    The reason why I ask is that an OrderLine could be fairly simple, but often it can be subject to discounts, which follow some business rules. Moving such logic to a factory is usually the preferred way to deal with instantiation of complex objects, rather than putting it in the OrderLine constructor.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Good question. Generally, it is better to treat a whole aggregate as a unit for testing. Here, OrderLine is part of the Order aggregate (it doesn’t have a lot of meaning outside of it) and thus should be tested with the Order itself.

      The business rules regarding discounts, if implemented inside the Order aggregate, can also be unit tested along with the Order class. Here I touched on this topic in a bit more detail: http://enterprisecraftsmanship.com/2015/08/03/tdd-best-practices/

  • Andriy Chubarev

    Haha, destructive decoupling is about me right now.