Functional C#: Primitive obsession

By Vladimir Khorikov

The topic described in this article is a part of my Applying Functional Principles in C# Pluralsight course.

This is the second article in my Functional C# blog post series.

What is primitive obsession?

Primitive obsession means using primitive types to model the domain. For example, this is what a Customer class might look like in a typical C# application:

public class Customer
{
    public string Name { get; private set; }
    public string Email { get; private set; }

    public Customer(string name, string email)
    {
        Name = name;
        Email = email;
    }
}

The problem here is that when you want to enforce validation rules specific to your domain, you inevitably end up scattering validation logic all over your source code:

public class Customer
{
    public string Name { get; private set; }
    public string Email { get; private set; }

    public Customer(string name, string email)
    {
        // Validate name
        if (string.IsNullOrWhiteSpace(name) || name.Length > 50)
            throw new ArgumentException("Name is invalid");

        // Validate e-mail
        if (string.IsNullOrWhiteSpace(email) || email.Length > 100)
            throw new ArgumentException("E-mail is invalid");
        if (!Regex.IsMatch(email, @"^([\w\.\-]+)@([\w\-]+)((\.(\w){2,3})+)$"))
            throw new ArgumentException("E-mail is invalid");

        Name = name;
        Email = email;
    }

    public void ChangeName(string name)
    {
        // Validate name
        if (string.IsNullOrWhiteSpace(name) || name.Length > 50)
            throw new ArgumentException("Name is invalid");

        Name = name;
    }

    public void ChangeEmail(string email)
    {
        // Validate e-mail
        if (string.IsNullOrWhiteSpace(email) || email.Length > 100)
            throw new ArgumentException("E-mail is invalid");
        if (!Regex.IsMatch(email, @"^([\w\.\-]+)@([\w\-]+)((\.(\w){2,3})+)$"))
            throw new ArgumentException("E-mail is invalid");

        Email = email;
    }
}

Moreover, the exact same validation rules tend to leak into the application layer:

[HttpPost]
public ActionResult CreateCustomer(CustomerInfo customerInfo)
{
    if (!ModelState.IsValid)
        return View(customerInfo);

    Customer customer = new Customer(customerInfo.Name, customerInfo.Email);
    // Rest of the method
}

public class CustomerInfo
{
    [Required(ErrorMessage = "Name is required")]
    [StringLength(50, ErrorMessage = "Name is too long")]
    public string Name { get; set; }

    [Required(ErrorMessage = "E-mail is required")]
    [RegularExpression(@"^([\w\.\-]+)@([\w\-]+)((\.(\w){2,3})+)$",
        ErrorMessage = "Invalid e-mail address")]
    [StringLength(100, ErrorMessage = "E-mail is too long")]
    public string Email { get; set; }
}

Clearly, this approach violates the DRY principle, which calls for a single source of truth: every piece of domain knowledge in your software should have a single authoritative source. In the example above, there are at least three: the Customer constructor, the ChangeName and ChangeEmail methods, and the attributes on CustomerInfo.

How to get rid of primitive obsession?

To get rid of primitive obsession, we need to introduce two new types that gather all the validation logic currently spread across the application:

public class Email
{
    private readonly string _value;

    private Email(string value)
    {
        _value = value;
    }

    public static Result<Email> Create(string email)
    {
        if (string.IsNullOrWhiteSpace(email))
            return Result.Fail<Email>("E-mail can't be empty");

        if (email.Length > 100)
            return Result.Fail<Email>("E-mail is too long");

        if (!Regex.IsMatch(email, @"^([\w\.\-]+)@([\w\-]+)((\.(\w){2,3})+)$"))
            return Result.Fail<Email>("E-mail is invalid");

        return Result.Ok(new Email(email));
    }

    public static implicit operator string(Email email)
    {
        return email._value;
    }

    public override bool Equals(object obj)
    {
        Email email = obj as Email;

        if (ReferenceEquals(email, null))
            return false;

        return _value == email._value;
    }

    public override int GetHashCode()
    {
        return _value.GetHashCode();
    }
}

public class CustomerName
{
    public static Result<CustomerName> Create(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            return Result.Fail<CustomerName>("Name can't be empty");

        if (name.Length > 50)
            return Result.Fail<CustomerName>("Name is too long");

        return Result.Ok(new CustomerName(name));
    }

    // The rest is the same as in Email
}

The beauty of this approach is that whenever validation logic (or any other logic attached to those classes) changes, you need to change it in one place only. The fewer duplications you have, the fewer bugs you get, and the happier your customers become!

Note that the constructor of the Email class is private, so the only way to create an instance is through the Create method, which performs all the required validations. This way, we make sure that an Email instance is in a valid state from the very beginning and that all its invariants are met.

This is how the controller can use those classes:

[HttpPost]
public ActionResult CreateCustomer(CustomerInfo customerInfo)
{
    Result<Email> emailResult = Email.Create(customerInfo.Email);
    Result<CustomerName> nameResult = CustomerName.Create(customerInfo.Name);

    if (emailResult.Failure)
        ModelState.AddModelError("Email", emailResult.Error);
    if (nameResult.Failure)
        ModelState.AddModelError("Name", nameResult.Error);

    if (!ModelState.IsValid)
        return View(customerInfo);

    Customer customer = new Customer(nameResult.Value, emailResult.Value);
    // Rest of the method
}

The instances of Result<Email> and Result<CustomerName> explicitly tell us that the Create method may fail and that, if it does, we can find out why by examining the Error property.
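
The Result type itself is not shown in this article. Below is a minimal sketch of what such a type could look like. It assumes only the members used in the examples here (Ok, Fail, Failure, Error and Value); the Success property and the choice to throw when Value is read on a failed result are assumptions made for illustration and may differ from the actual implementation:

```csharp
using System;

// Hypothetical minimal Result type, written only to make the examples above compile.
public class Result
{
    public bool Success { get; }
    public string Error { get; }
    public bool Failure => !Success;

    protected Result(bool success, string error)
    {
        Success = success;
        Error = error;
    }

    // Factory methods matching the calls used in Email.Create and CustomerName.Create
    public static Result<T> Ok<T>(T value) => new Result<T>(value, true, null);
    public static Result<T> Fail<T>(string error) => new Result<T>(default(T), false, error);
}

public class Result<T> : Result
{
    private readonly T _value;

    // Assumption: reading Value on a failed result is a programming error, so it throws.
    public T Value
    {
        get
        {
            if (Failure)
                throw new InvalidOperationException("A failed result has no value.");
            return _value;
        }
    }

    internal Result(T value, bool success, string error) : base(success, error)
    {
        _value = value;
    }
}
```

With this shape, a caller has to check Failure before touching Value, so a failed validation cannot silently turn into a usable object.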

This is what the Customer class can look like after the refactoring:

public class Customer
{
    public CustomerName Name { get; private set; }
    public Email Email { get; private set; }

    public Customer(CustomerName name, Email email)
    {
        if (name == null)
            throw new ArgumentNullException("name");
        if (email == null)
            throw new ArgumentNullException("email");

        Name = name;
        Email = email;
    }

    public void ChangeName(CustomerName name)
    {
        if (name == null)
            throw new ArgumentNullException("name");

        Name = name;
    }

    public void ChangeEmail(Email email)
    {
        if (email == null)
            throw new ArgumentNullException("email");

        Email = email;
    }
}

Almost all of the validations have been moved to the Email and CustomerName classes. The only checks left are null checks. They can still be pretty annoying, but we’ll see how to handle them better in the next article.

So, what benefits do we get by getting rid of primitive obsession?

  • We create a single authoritative knowledge source for every domain problem we solve in our code. No duplications, only clean and DRY code.
  • Stronger type system. The compiler works twice as hard for us: it is now impossible to mistakenly assign an email to a customer name field, because that would result in a compiler error.
  • No need to validate values passed in. If we get an object of type Email or CustomerName, we can be sure that it is in a correct state.
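
The second point can be illustrated with a short sketch. The wrapper types below are simplified stand-ins (public constructors, no validation) so that the block is self-contained, and the Demo class is hypothetical; the article's real types keep their constructors private and go through Create:

```csharp
using System;

// Simplified stand-ins for the article's wrapper types (validation omitted for brevity).
public sealed class Email
{
    public string Value { get; }
    public Email(string value) { Value = value; }
}

public sealed class CustomerName
{
    public string Value { get; }
    public CustomerName(string value) { Value = value; }
}

public class Customer
{
    public CustomerName Name { get; private set; }
    public Email Email { get; private set; }

    public Customer(CustomerName name, Email email)
    {
        Name = name;
        Email = email;
    }
}

// Hypothetical helper class used only for this illustration.
public static class Demo
{
    public static Customer Build()
    {
        var name = new CustomerName("Jane Doe");
        var email = new Email("jane@example.com");

        // Passing the arguments in the wrong order does not compile:
        // var wrong = new Customer(email, name);
        //   error CS1503: cannot convert from 'Email' to 'CustomerName'

        return new Customer(name, email);
    }
}
```

With two string parameters, the swapped call would compile and the mistake would only surface at runtime, if at all.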

There’s one detail I’d like to point out. Some people tend to wrap and unwrap primitive values multiple times during a single operation:

public void Process(string oldEmail, string newEmail)
{
    Result<Email> oldEmailResult = Email.Create(oldEmail);
    Result<Email> newEmailResult = Email.Create(newEmail);

    if (oldEmailResult.Failure || newEmailResult.Failure)
        return;

    string oldEmailValue = oldEmailResult.Value;
    Customer customer = GetCustomerByEmail(oldEmailValue);
    // Customer.Email has a private setter, so the change goes through ChangeEmail
    customer.ChangeEmail(newEmailResult.Value);
}

Instead, it is better to use the custom types across the whole application and unwrap them only when the data leaves the domain boundaries, i.e. when it is saved to the database or rendered to HTML. In your domain classes, use them as much as possible. The result is cleaner and more maintainable code:

public void Process(Email oldEmail, Email newEmail)
{
    Customer customer = GetCustomerByEmail(oldEmail);
    // Customer.Email has a private setter, so the change goes through ChangeEmail
    customer.ChangeEmail(newEmail);
}

The other side: limitations

Unfortunately, creating custom types in C# is not as neat as in functional languages like F#. That will probably change in C# 7 if we get record types and pattern matching, but until then we have to deal with the overall clunkiness of this approach.
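
For what it is worth, record types did eventually arrive, in C# 9 rather than C# 7, and they take over the equality boilerplate. Below is a minimal sketch, assuming C# 9 or later. The TryCreate helper is hypothetical and stands in for the Create method so that the block does not depend on Result, and the regex check is left out for brevity:

```csharp
using System;

// Sketch assuming C# 9+: the compiler generates Equals, GetHashCode, == and != for records.
public sealed record Email
{
    public string Value { get; }

    // The constructor stays private, so validation cannot be bypassed.
    private Email(string value) => Value = value;

    // Hypothetical helper: reports failure through the return value
    // instead of the article's Result<Email>.
    public static bool TryCreate(string input, out Email email)
    {
        email = null;

        if (string.IsNullOrWhiteSpace(input) || input.Length > 100)
            return false;

        email = new Email(input);
        return true;
    }
}
```

Two Email instances created from the same string compare as equal without any hand-written Equals or GetHashCode.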

Because of that, I find some really simple primitives not worth wrapping. For example, a money amount with a single invariant stating that the amount can’t be negative could probably still be represented as a decimal. That would lead to some duplication of validation logic, but, again, it is probably the simpler design decision even in the long run.

As usual, use common sense and weigh the pros and cons in every single situation. And don’t hesitate to change your mind, even multiple times.

Summary

With immutable and non-primitive types, we are getting closer to designing applications in C# in a functional way. Next time, I’ll show how to mitigate the billion dollar mistake.






  • Michal Fiedler

    Are you using your own Result implementation, or do you use some library for that?

  • Edward Brey

    Any reason to prefer “class Email” over “struct Email”?

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      That’s because of ORMs. Generally speaking, ORMs don’t handle mapping on structs very well, that’s why I use classes. If you don’t use relational databases or do O/R mapping by yourself, there’s no reason not to use structs – they are cheaper, more lightweight, and do a better job as wrappers.

      Check out my Value Objects article ( http://enterprisecraftsmanship.com/2015/01/03/value-objects-explained/ ) in which I discuss this topic a little bit.

      • Edward Brey

        Doesn’t the ORM just pick up the string type of the implicit operator conversion?

        • http://enterprisecraftsmanship.com/ Vladimir Khorikov

          To make it work, you’ll need to create conversion operators for both ways, i.e.

          public static implicit operator string(Email email)
          and
          public static implicit operator Email(string email)

          Adding the second operator will break encapsulation as it will allow us to create Email instance bypassing the Create method.

          As a possible solution, you could create an internal Email property of a string type and bind it to the database, and that will work:

          protected virtual string EmailInternal { get; set; }
          public virtual Email Email
          {
              get { return Email.Create(EmailInternal).Value; }
              set { EmailInternal = value; } // relies on the implicit Email-to-string conversion
          }

          Another reason why I prefer classes to structs is that structs don’t support inheritance. With structs, you need to duplicate equivalence operators (namely, operator == and operator !=). With classes, it’s a bit less code.

        • http://enterprisecraftsmanship.com/ Vladimir Khorikov

          I just realized that my answer didn’t cover why an ORM will allow you to bind the Email class and not an Email struct.

          The difference between them is that with classes, you can bind the private _value field of the Email class to a DB column (NHibernate supports such binding), whereas with a struct Email you can’t. I have to admit that I rarely use this functionality, and the primary reason for “classes over structs” is the code duplication that arises from the lack of inheritance support.

  • Sean

    Why use a .Create(..) method instead of the default public constructor which accepts an email argument?

    One advantage of the Create(..) method is that it allows constructor dependency injection, but are there other reasons?

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      The main advantage of a static Create() method over a constructor is that it allows you to return a Result instead of an Email. With a constructor, you must always return an Email instance to the caller, even if the email string wasn’t correct. With Result, you can just return “failure” in this case. That allows for instantiating Email instances only when they are in a valid state.

  • Venkat Raj

    What is this Result? Is it your custom class or a built-in class? Please explain.

    public static Result Create(string email)