DRY revisited

By Vladimir Khorikov

Another principle we should follow when building a software project is the DRY principle. The abbreviation stands for Don’t Repeat Yourself. While it seems straightforward and intuitive, there is more to this principle than meets the eye. Let’s see why.

The DRY principle

The DRY principle states that every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

This principle is often taken to mean avoiding code duplication. While it’s true that in most cases following DRY means eliminating repeated code, the principle itself is not about code duplication.

Let’s take an example:

public class Product
{
    /* Other members */
    public string Name { get; set; }

    public override string ToString()
    {
        return Name;
    }
}

public class Customer
{
    /* Other members */
    public string Name { get; set; }

    public override string ToString()
    {
        return Name;
    }
}

Here, the Product and the Customer classes have two members in common. We could refactor them and extract those members, like this:

public class NamedEntity
{
    public string Name { get; set; }

    public override string ToString()
    {
        return Name;
    }
}

public class Product : NamedEntity
{
    /* Other members */
}

public class Customer : NamedEntity
{
    /* Other members */
}

This version now complies with the DRY principle. Or does it? Was there really a single piece of knowledge spread across these entities?

Can it be that two classes have identical parts yet still don’t break DRY?

Yes. The DRY principle restricts the duplication of domain knowledge; it doesn’t put any bounds on the actual code required to express that knowledge.

The fact that the two entities have the same functionality doesn’t mean they violate DRY. In fact, the Product and the Customer classes each hold their own semantics. It just so happens that, in this particular case, they express those semantics with identical code.

From the domain’s point of view, these entities develop separately from each other. They both represent different parts of the domain and thus don’t violate the DRY principle.
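As a rough illustration of what developing separately could look like, here is a hypothetical sketch of the two classes after a couple of new requirements. The Sku and Email properties and the output formats are assumptions invented for this example; they are not part of the original classes.

```csharp
using System;

public class Product
{
    public string Name { get; set; }
    public string Sku { get; set; }   // hypothetical property, for illustration only

    // Products are now displayed together with their SKU.
    public override string ToString()
    {
        return Name + " [" + Sku + "]";
    }
}

public class Customer
{
    public string Name { get; set; }
    public string Email { get; set; } // hypothetical property, for illustration only

    // Customers are now displayed together with their email address.
    public override string ToString()
    {
        return Name + " <" + Email + ">";
    }
}
```

Each class changed for its own reasons, and neither change affected the other. With a shared NamedEntity base class, both would have had to override the inherited ToString, which suggests the common code never represented a single piece of knowledge.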

Introducing a base class purely because two or more entities have some members in common is generally a bad practice. I call it utility inheritance.

DRY and utility methods

If the DRY principle forces us to keep every piece of domain knowledge in a single place, what about the code that doesn’t contain any? Would the duplication of such code be a violation of DRY?

No, because, again, the DRY principle is about the knowledge that is essential to your domain. Utility methods don’t contain such knowledge.

It doesn’t mean we should introduce unnecessary duplication all over the helper methods, though. It just means that if we can’t avoid it, it’s not such a big deal.

Eric Lippert gives a great example in his article:

static Func<A, R> Memoize<A, R>(this Func<A, R> function)
{
    var cache = new Dictionary<A, R>();
    return argument =>
    {
        R result;
        if (!cache.TryGetValue(argument, out result))
        {
            result = function(argument);
            cache[argument] = result;
        }
        return result;
    };
}

Here, we have a pretty simple utility method for memoizing a function’s output. The problem is that if we want to introduce a memoizer for a function with 2, 3, or more parameters, we would need to write methods that almost fully duplicate each other.

There’s no way to avoid this duplication. At the same time, such methods don’t contain any knowledge about the application domain, so it’s perfectly fine to have multiple similar versions of them.
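To make the duplication concrete, here is a minimal sketch that places a two-argument overload next to the single-argument one. The MemoizeExtensions class name and the use of Tuple as the cache key are assumptions made for this example, not necessarily how the referenced article does it.

```csharp
using System;
using System.Collections.Generic;

public static class MemoizeExtensions
{
    // Single-argument version, same logic as the method shown above.
    public static Func<A, R> Memoize<A, R>(this Func<A, R> function)
    {
        var cache = new Dictionary<A, R>();
        return argument =>
        {
            R result;
            if (!cache.TryGetValue(argument, out result))
            {
                result = function(argument);
                cache[argument] = result;
            }
            return result;
        };
    }

    // Two-argument version: almost a copy of the method above.
    // Assumption: the two arguments are combined into a Tuple,
    // which compares by value, to form the cache key.
    public static Func<A1, A2, R> Memoize<A1, A2, R>(this Func<A1, A2, R> function)
    {
        var cache = new Dictionary<Tuple<A1, A2>, R>();
        return (argument1, argument2) =>
        {
            var key = Tuple.Create(argument1, argument2);
            R result;
            if (!cache.TryGetValue(key, out result))
            {
                result = function(argument1, argument2);
                cache[key] = result;
            }
            return result;
        };
    }
}
```

Calling add.Memoize() on a Func<int, int, int> would pick the second overload and invoke the underlying function only once per distinct pair of arguments. The two method bodies differ only in how the cache key is built, yet neither encodes anything about the application domain.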

DRY and bounded contexts

One of the most controversial examples in terms of the DRY principle is code in different bounded contexts.

For example, we could have two contexts, Sales and Manufacture, with a Product entity in each of them:

namespace Sales
{
    public class Product
    {
        public string Name { get; set; }
        public int Number { get; set; }
        public string Description { get; set; }

        public void AttachToManager(SalesManager manager)
        {
            /* … */
        }
    }
}

namespace Manufacture
{
    public class Product
    {
        public string Name { get; set; }
        public int Number { get; set; }
        public string Description { get; set; }

        public Product[] Disassemble()
        {
            /* … */
        }
    }
}

Note that the two classes have a lot of members in common. A natural tendency in such a situation is to merge them into a single entity. However, combining them wouldn’t be the best design decision.

From the domain point of view, these two classes have completely different semantics. While the Sales.Product entity carries knowledge about how products are sold, Manufacture.Product is all about how they are manufactured.

They do have a similar physical representation, but it wouldn’t make a lot of sense to have a joint “all-in-one” version of the Product class. We are better off keeping these pieces of knowledge apart. That makes them more focused and less prone to semantic mismatch, that is, a situation where a property or a method means one thing in one context and a totally different thing in the other.
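For contrast, here is a hypothetical sketch of what a merged “all-in-one” class might look like. The empty SalesManager class, the Manager property, and the method bodies are placeholders assumed for this example only; the original method bodies are elided and are not reconstructed here.

```csharp
using System;

// Hypothetical stand-in for the SalesManager type used by Sales.Product.
public class SalesManager { }

// Hypothetical merged entity, shown only to illustrate the problem.
public class Product
{
    public string Name { get; set; }

    // Ambiguous once merged: each context may assign this number a different meaning.
    public int Number { get; set; }

    public string Description { get; set; }

    // Placeholder state, only meaningful to the sales context.
    public SalesManager Manager { get; private set; }

    // Sales behavior, now also visible to manufacturing code.
    public void AttachToManager(SalesManager manager)
    {
        Manager = manager; // placeholder body
    }

    // Manufacturing behavior, now also visible to sales code.
    public Product[] Disassemble()
    {
        return new Product[0]; // placeholder body: components are not modeled
    }
}
```

Every consumer of this class sees members that only make sense in the other context, and any change requested by one context has to be checked against the other.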

Summary

  • The DRY principle is about domain knowledge.
  • Don’t confuse adhering to DRY with getting rid of code repetition.
  • There are cases where code duplication is perfectly fine.

  • Anders Baumann

    Hi Vladimir.

    Thanks for a great article. I found it very useful. Especially the part that just because two entities have the same functionality doesn’t mean that they violate DRY. I have often faced this issue not only between entities but also between methods inside an entity. I have merged two or more seemingly identical methods only to find out that the semantics differ in a slight way. To solve the problem I have introduced one or several flags in the merged method. This merged method now of course violates SRP. I realize that I am better off keeping the original methods and maybe have them call a common sub method.

    For a long time I wanted to introduce generic solutions everywhere I could. As Greg Young once said: “Developers have a tendency to attempt to solve specific problems with general solutions.” I find that to be very true. I can also see that this kind of thinking leads to unnecessary complexity. Instead of being general, code should be specific.

    I also really liked your article about cohesion and coupling. In particular the part about destructive decoupling. I have definitely been guilty of doing destructive decoupling only so that I could unit test each class in isolation. But after reading the cohesion/coupling article and your series on TDD I now know there is a better way.

    Thanks and keep up the good work!

    Anders

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Anders, thank you for your kind comment!

      I can relate to every word. I also used to do all these things, which led me to the same results: SRP violations and unfocused, highly coupled code with a lot of complexity.

      I’m glad you also found my previous articles helpful; such feedback is what motivates me to write. Thanks a lot!

  • http://blog.shaunfinglas.co.uk Shaun Finglas

    Nice post. I’ve blogged about this in the past and we are pretty much on the same page. Your first example is a perfect example of why bounded contexts are important.

    http://blog.shaunfinglas.co.uk/2015/06/dry-vs-coupling-in-production-code.html

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Thank you!
      And thanks for the link, another interesting blog for me to follow :)

  • http://www.ozemail.com.au/~markhurd/ Mark Hurd

    Extending your example: Suppose you have Manufacture, Sales, SpareParts and Orders, all with a form of Product. Some of these are clearly in an “is-a” relationship, especially your original example: what they sell is a product they manufactured. (Actually, that statement points out an incorrect assumption: some products sold may not be manufactured by the company, just re-badged, for example.) I’m thinking there is a kernel of important properties for all forms of Products: Name, Part Number and Description, as you mention, and all but Manufacture need a Recommended Retail Price and an Actual Price. I think I can still justify not having any inheritance here, but it does seem a little harder. I guess it depends somewhat upon how much code could be reused by manipulating IProduct or ProductCore.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Yeah, the decision here definitely depends on how they relate to each other and whether or not they share some business logic (domain knowledge).