Why following software design best practices decreases code complexity

By Vladimir Khorikov

Most of us agree that in many cases following best practices leads to better code. Namely, it decreases complexity and lets us reason more easily about even large software systems.

But why is that so, exactly? Today, we’ll take two design principles – separation of concerns and immutability – and see how (and why!) they decrease the complexity of the code.

Complexity

Let’s start with a code sample and see how we can measure its complexity:

public class Customer
{
    public bool IsLoyalCustomer { get; set; } // true or false
    public LoyaltyProgram LoyaltyProgram { get; set; } // Standart or Extended
    public int LoyaltyProgramEnrollmentYear { get; set; } // 2015 or 2014
    public string ContactName { get; set; } // Julie or Steve
    public JobTitle ContactJobTitle { get; set; } // Manager or CEO
    public string ContactEmail { get; set; } // info@mail.com or support@email.com
}

Let’s also say each property might have only 2 values, just to make it a bit easier for us to count.

How difficult is it to reason about such a class? Or, to put it another way, how complex is it? The complexity of the code is directly related to the number of possible states it may take. In this case, it will be the number of unique instances of the class, such as:

new Customer
{
    IsLoyalCustomer = true,
    LoyaltyProgram = LoyaltyProgram.Standart,
    LoyaltyProgramEnrollmentYear = 2015,
    ContactName = "Julie",
    ContactJobTitle = JobTitle.Manager,
    ContactEmail = "info@mail.com"
};

new Customer
{
    IsLoyalCustomer = false,
    LoyaltyProgram = LoyaltyProgram.Standart,
    LoyaltyProgramEnrollmentYear = 2014,
    ContactName = "Steve",
    ContactJobTitle = JobTitle.Manager,
    ContactEmail = "support@mail.com"
};

And so on. The total number of such instances is 2 * 2 * 2 * 2 * 2 * 2 = 2^6 = 64.
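
To double-check that count, here is a minimal sketch that enumerates every combination by brute force. The CustomerStates class and its AllInstances method are hypothetical names introduced only for this illustration, and the enum values are modeled as plain strings to keep the snippet self-contained:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CustomerStates
{
    // The two values each property is assumed to take
    // (copied from the comments in the Customer class).
    private static readonly object[][] ValuesPerProperty =
    {
        new object[] { true, false },
        new object[] { "Standart", "Extended" },
        new object[] { 2015, 2014 },
        new object[] { "Julie", "Steve" },
        new object[] { "Manager", "CEO" },
        new object[] { "info@mail.com", "support@email.com" },
    };

    // Builds the Cartesian product of all property values:
    // one array per distinct Customer instance.
    public static List<object[]> AllInstances()
    {
        IEnumerable<IEnumerable<object>> seed = new[] { Enumerable.Empty<object>() };
        return ValuesPerProperty
            .Aggregate(seed, (acc, values) =>
                acc.SelectMany(prefix => values.Select(v => prefix.Append(v))))
            .Select(combo => combo.ToArray())
            .ToList();
    }
}
```

CustomerStates.AllInstances().Count comes out as 64, one entry for each distinct combination of the six two-valued properties.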

The second attribute that adds to the overall complexity is the possibility of change. How many variations are there in which a Customer object can mutate during its lifetime? Well, each of the 64 possible instances can either stay unchanged or move to any of the other 63 states. That gives us 64^64 (roughly 3.94 × 10^115) potential transitions.

That’s a huge number. In the real world, not all of them would be legal, of course. Also, we don’t reason about each of the transitions separately. What usually happens is we implicitly apply some heuristics in order to sort them out. That allows us to reason about such code without getting overwhelmed.

We employ a set of assumptions in order to reduce the complexity while thinking about the problem. The issue here is that these assumptions are often incomplete or even plain wrong. The code tends to behave unexpectedly in edge cases. We, in turn, have to revisit our assumptions and take into account many of the combinations we didn’t pay attention to previously. The only reliable way to decrease the complexity is to reduce the number of scenarios that could possibly happen to the code.

Applying separation of concerns

Let’s see how a more fine-grained composition can alleviate the situation. We can see the Customer class mainly consists of two unrelated parts: one that is about the customer’s loyalty, and another that is about the contact person.

We can split the class into two cohesive pieces, like this:

public class Customer
{
    public Loyalty Loyalty { get; set; }
    public ContactPerson ContactPerson { get; set; }
}

public class ContactPerson
{
    public string Name { get; set; } // Julie or Steve
    public JobTitle JobTitle { get; set; } // Manager or CEO
    public string Email { get; set; } // info@mail.com or support@email.com
}

public class Loyalty
{
    public bool IsLoyal { get; set; } // true or false
    public LoyaltyProgram Program { get; set; } // Standart or Extended
    public int EnrollmentYear { get; set; } // 2015 or 2014
}

Now that we can reason about these pieces separately, how can the overall complexity be measured? The new version gives us 8^8 + 8^8 = 33,554,432 possible combinations, and that is way, way better than in the first version.

Separation of concerns is an important concept and we now can see why. A class with 6 properties is orders of magnitude more complex than two classes with 3 properties each.
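
The comparison can be reproduced with a few lines of code. This is only a sketch of the counting model used in this article (states raised to the power of states for a single class, with separately reasoned classes added together); the ComplexityModel class and its methods are hypothetical helpers written for this illustration, and the additive step assumes the two classes really can be reasoned about independently:

```csharp
using System.Linq;
using System.Numerics;

public static class ComplexityModel
{
    // A class with n two-valued properties has 2^n states. The model counts
    // states^states, since every state may stay put or move to any other state.
    public static BigInteger ForClass(int propertyCount)
    {
        int states = 1 << propertyCount;
        return BigInteger.Pow(states, states);
    }

    // Assumption of the model: classes that can be reasoned about
    // separately are added together rather than multiplied.
    public static BigInteger ForSeparateClasses(params int[] propertyCounts) =>
        propertyCounts.Aggregate(BigInteger.Zero, (sum, n) => sum + ForClass(n));
}
```

Under these assumptions, ComplexityModel.ForClass(6) is 64^64, a 116-digit number, while ComplexityModel.ForSeparateClasses(3, 3) is 8^8 + 8^8 = 33,554,432.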

Applying immutability

Let’s see how making only a single property immutable in the ContactPerson class changes its complexity.

Currently, the number of unique ContactPerson objects is 8 and the number of possible transitions each of them might take is also 8. That gives us 8^8 = 16,777,216 combinations.

If we make, say, the Name property immutable, the number of possible transitions would become 4, which gives us 8^4 = 4,096 combinations overall. That is more than 3 orders of magnitude less, only for a single property!
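
In C#, one way to make Name immutable is a get-only auto-property that is assigned in the constructor. The snippet below is a minimal sketch under that assumption; the constructor signature is illustrative, and the JobTitle enum members are taken from the comments in the earlier listing:

```csharp
// Assumed definition; the article only shows the values Manager and CEO.
public enum JobTitle { Manager, CEO }

public class ContactPerson
{
    // Get-only auto-property: it can be assigned only in the constructor,
    // so Name cannot change after the object is created.
    public string Name { get; }

    // These two remain mutable, as in the scenario described above.
    public JobTitle JobTitle { get; set; }
    public string Email { get; set; }

    public ContactPerson(string name, JobTitle jobTitle, string email)
    {
        Name = name;
        JobTitle = jobTitle;
        Email = email;
    }
}
```

With this version, an assignment such as person.Email = "support@email.com" still compiles, whereas an assignment to person.Name outside the constructor is rejected by the compiler, so every transition that would change the name is ruled out.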

And if we make both classes – ContactPerson and Loyalty – fully immutable, the number of possible situations we need to reason about during a Customer instance lifetime would only be 8 + 8 = 16: each class still has 8 possible objects, but none of them can change.

Results

I bet you already knew following these design principles leads to a better design. Now we can supplement our gut feeling with actual numbers.

The number of situations that could potentially happen to some code directly affects our ability to reason about it. The smaller that number, the easier the code is to maintain and extend.

Separation of concerns helps us reduce the number of ways in which these concerns can interact with each other and prevents combinatorial explosion, the situation where adding only a single new concern to an existing entity makes its complexity skyrocket.

Immutability attacks the problem from the other end. It reduces the number of transitions a piece of code might take and thus decreases the total number of situations we need to think of while reasoning about the code.

Summary

Code simplicity is not just a matter of style or taste. Fighting complexity is often the only path to a successful project. Moreover, in most cases, the complexity of an application’s code base is the single most important factor that affects our ability to develop it.

Adhering to software design best practices helps us with that.




  • David Raab

    Separating Loyalty and Contact does reduce the complexity, but the reason why it does has nothing to do with your definition of complexity. Your definition of complexity is overall just wrong. We already discussed this in the KISS article, where you also mentioned a video with Rich Hickey (Simple Made Easy – http://www.infoq.com/presentations/Simple-Made-Easy). I would suggest rewatching the video, as he clearly explains what complexity is and how we achieve simplicity through abstraction.

    The overall short explanation is that simplicity is about dealing with “one thing” at a time, while everything that deals with more than one thing is complex. But it has nothing to do with how many values something can have. Something like that would not be helpful at all anyway. Because just using a string would end up having an “infinite” amount of possibilities. Adding any more fields to such a type wouldn’t even increase complexity in your definition, because infinite is already infinite; it cannot be increased further.

    Simplicity is not about how many values something can take, or how many methods or functions it has. It is about whether all those things deal with one thing or not. For example, a List is simple. And when we look at F#, we even have a List module with around 100 functions, and it is still simple. Why? Because everything the List module does is only handling stuff about List. It is all about transforming a List, and nothing else. Thus a List is simple. A List is not complex because it can theoretically handle an infinite amount of elements or can even work with every type. A List is a perfect example of abstraction and simplicity.

    In your example the complexity of your Customer class is basically “2”, not “64”. It is two because your Customer class handles two things. It handles “Loyalty” and a “Contact”. Those are two different things. And sure, putting those two different things each into its own separate thing reduces complexity.

    That is the reason why complexity is reduced, not because the possible amount of different values is decreased. The amount of possible different values has nothing to do with complexity at all.

    Immutability, which you named at the beginning, also reduces complexity because a mutable variable is really two things. A mutable variable is really a combination of a value *and* time in one thing. A mutable variable always has a current value. Changing a variable basically mimics time implicitly. An immutable variable, on the other hand, just represents a fixed value frozen in time.

    So overall your result is correct, as is everything about separation of concerns and so on. But your definition of what complexity really is, is not correct. And I also think it is not helpful: since you can view every string, int, float, etc. as an infinite amount of possibilities, you would end up with an infinite amount of possible values anyway. And no, three infinite variables are not more infinite than a single infinite variable.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      I understand your point and see where it comes from. Nevertheless, the explanation you propose has an incorrect premise: you assume that any “one thing” has the same complexity. In your interpretation, a cohesive class with 2 members would be of the same complexity as another cohesive class with 10 members which is obviously not the case. This makes this concept pretty much useless when it comes to the actual comparison between different design decisions.

      Because just using a string would end up having an “infinite” amount of possibilities

      It wouldn’t. The thing that adds up to complexity is not the number of states a single parameter can take, it’s the number of such parameters. I tried to make the math in the article as simple as possible, hence the properties can take only 2 possible values. To be absolutely precise, we need to use the notion of set cardinality. The complexity of the unrefactored version in such case would be (6*N)^(6*N) instead of 64^64. This article is a good place to start if you want to learn more on that: https://en.wikipedia.org/wiki/Cardinality .

      Adding any more field to something that contains wouldn’t even increase complexity in your definition because infinite is already infinite, it cannot get further increased.

      That is not true, see the explanation above regarding the cardinality of sets.

      as you can view every string or int, float etc. as an infinite amount of possibilities you would anyway end up in an infinite amount of possible values

      The same here.

      Simplicity is not about how much values something can take

      That is exactly what simplicity (and complexity) is about. Admittedly, it’s not the only parameter that makes up code complexity, but it’s the most important one. I recommend this paper to learn more on that: http://shaffner.us/cs/papers/tarpit.pdf

      • Shang Tsung

        i think you have an error on the line

        “Now that we can reason about these pieces separately, how the overall complexity can be measured? The new edition gives us ….. 33554432”

        you add the number of variants, whereas you should multiply them, so the result would be (8^8) * (8^8) and not 8^8 + 8^8; actually, “complexities” in a given situation should always be multiplicative, not additive

        • http://enterprisecraftsmanship.com/ Vladimir Khorikov

          You raise a very good point, the line you are pointing to is the weakest element in the whole article. Additivity here means we can somehow reason about the two properties separately which is obviously not the case as they are used in a single class.

          At the same time, I don’t think that the result here should be multiplicative, because Separation of Concerns does provide a decrease in complexity in the sense that we are able to reason about the two classes separately to some extent. Additivity is the best operation I could come up with in this example. You are absolutely right in saying it is not quite accurate here.

          • David Raab

            I can also reason about every single value from the first definition, as each of those 6 variables only had two possible values.

            And still we are calculating “2*2*2*2*2*2”, not “2+2+2+2+2+2”. Because the amount of possibilities I can get is 64, not 12.

            And if you add two classes together where each class can have 8 possible values, which is what Loyalty and ContactPerson are, then the amount of possibilities is also “8*8”, not “8+8”.

            That also means that 6 variables with 2 values each end up with the same amount of possibilities as two variables that can have 8 values each.

            The last one still reduces complexity. But the reason does not lie in the amount of possible values being somehow reduced. It is not reduced; it is the same. It reduces complexity because Contact and Loyalty are really two distinct parts, and they have nothing in common. Nothing in Loyalty should ever access something from Contact. They are really two distinct parts that were put into one big class.

            So separating two distinct parts into different parts is exactly what simplification/abstraction is. And that is why complexity is reduced. Even if the amount of possible values isn’t reduced at all. That is exactly what you describe as “Separation of Concerns”. It definitely reduces the complexity.

            But the reason is not that the amount of possible values is reduced. It reduces complexity because parts that should be distinct get separated. And relying on cardinality is also why you found it hard to explain why it reduces complexity. The reason is that cardinality is not a tool to measure complexity at all.

      • David Raab

        At first, you are talking about cardinality. But cardinality is not complexity. Just because you calculated that there are 64 possible values doesn’t mean it has a complexity of 64. Complexity and cardinality are two completely different things. What I’m saying is that calculating cardinality is pretty much useless, as you quickly end up with an infinite amount of possible values anyway. But you don’t have to agree with me; instead, let me show you the error in your calculation, and you will also see why calculating the cardinality and using it as a measure of complexity doesn’t make sense.

        First, there exist product types and sum types. And when we are talking about C#, I just want to point out that C# doesn’t have a sum type, so if you have a “+” in your calculation you can already assume that you did something wrong. A class itself is a product type. To calculate the amount of possible values a compound type can have, you have to multiply the amounts of possible values. That’s also what you did in your first example. You calculated 2*2*2*2*2*2 = 64. So you have 64 possible values. Each of them has 63 possible transitions. But already here you made the first error. You don’t get the amount of transitions by calculating the exponent. You get the amount by multiplying. The formula is (N * (N-1)). A simple example: two booleans. There exist 4 states: (true,true) (true,false) (false,true) (false,false). Every one of these states has three other possible transitions. So you end up with (4 * (4-1) = 12). You don’t end up with (4**4 = 256). I can even print all the possibilities briefly:

        true,true ->
        true,false
        false,true
        false,false

        true,false ->
        true,true
        false,true
        false,false

        false,true ->
        true,true
        true,false
        false,false

        false,false ->
        true,true
        true,false
        false,true

        4 states, each with 3 transitions. 12 transitions from 4 initial states. So the result is “12+4”. Or in other words, to calculate all combinations with transitions you do (N*N), not (N**N). With 64 values that already makes a big difference. The correct number is

        64 * 64 = 4096

        not

        64 ** 64 = 3.94 * (10**115)

        But while this is wrong, that doesn’t even matter so much, as you continuously do it wrong.

        Now here comes the big error. Your calculation of your separated version, where you calculate the cardinality, is completely wrong. Let’s do the math. Here are the possible values of your classes.

        ContactPerson: 2 * 2 * 2 = 8 possible values
        Loyalty: 2 * 2 * 2 = 8 possible values
        Customer: 8 * 8 = 64 possible values

        Or in other words, the cardinality of Customer is the same. There still exist 64 possible values. Because Customer wraps two values with 8 possible values each, you also end up with 64 possible values. And because you have 64 possible values, the amount of transitions stays the same. That’s also why I said that this doesn’t matter so much.

        And it also makes sense that everything stays the same, as just wrapping variables each in its own class and consuming the whole class in another doesn’t reduce the amount of possible values. In the end it is exactly the same. And that is an important aspect. Because if you think that complexity is the amount of possible values or the cardinality, then it means your refactoring into three classes had no impact at all. In fact, separating stuff into different classes never changes the cardinality. At least not as long as you use a product type. And a class is a product type.

        And just for the sake of showing what a sum type is: a Discriminated Union found in F# is a sum type. Doing a


        type CustomerContact = {
            name : string
            jobTitle : JobTitle
            email : Email
        }

        type CustomerLoyalty = {
            isLoyal : bool
            program : LoyaltyProgram
            enrollmentYear : int
        }

        type Customer =
            | Contact of CustomerContact
            | Loyalty of CustomerLoyalty

        In that case Customer would be a sum type. If every field in the record only had two possible values, then the cardinality of Customer would be “8 + 8”, because a Discriminated Union can only hold a CustomerContact OR a CustomerLoyalty. So the amount of possibilities is “8 + 8” instead of “8 * 8”. Note, though, that such a Customer definition would not make much sense anyway. But I wanted to add a definition of what a sum type is.

        Product Type: Class, Tuple, Record
        Sum Type: Discriminated Union

        So your definition of complexity is useless because the cardinality stays the same, and you come up with wrong numbers because your math is wrong. With your definition that cardinality is complexity, it would mean a class with 6 variables has the same complexity as splitting it up into two distinct classes and then adding them together.

        And that is an important point, because if you realise this, your definition of complexity doesn’t work at all, and you have to find another one. And what I described, which is basically just Rich Hickey’s definition of complexity, is way more useful.

        • http://mdbs99.com Marcos Douglas Santos

          Liked

        • http://enterprisecraftsmanship.com/ Vladimir Khorikov

          I tend to agree with you. I think the explanation I tried to come up with in this article is indeed not an appropriate way to describe code complexity.

          • Shang Tsung

            still a great blog and a great article, as it forces one to think hard and provokes a discussion, which is always a good thing, also as it involves a number of concepts and perspectives on a given situation i think it adds a lot of value to the overall subject and has good points in it, like for example the discussion of different ways of reorganizing the code