Separation of Concerns in ORM

By Vladimir Khorikov

Last week we compared Entity Framework and NHibernate from a DDD perspective. Today, I’d like to dive deeper into what Separation of Concerns (SoC) is and why it is so important. We’ll look at some code examples and features that break the boundaries between the domain and persistence logic.

Separation of concerns in ORM

There are several concerns we deal with in software development. In most applications, at least three of them are clearly defined: UI, business logic, and database. The notion of SoC is closely related to the Single Responsibility Principle (SRP). You can think of SoC as SRP applied not to a single class but to the whole application. In most cases, the two notions can be used interchangeably.

In the case of ORM, SoC is all about separating domain logic from persistence logic. You can say that your code base has good Separation of Concerns if your domain entities don’t know how they are persisted and the database doesn’t contain any business logic. Of course, it’s not always possible to separate these concerns completely. Sometimes consistency or performance requirements force you to cross the boundaries. But you should always aim for as clean a separation as possible.

We can’t just isolate the domain and persistence logic; we also need something that glues them together. That is where ORM comes into play. An ORM lets us map entities to the appropriate database tables in such a way that neither the domain entities nor the database know about each other.
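To make the "glue" idea concrete, here is a minimal hand-rolled sketch of what a mapping layer does. It is a stand-in for a real ORM mapping, not how Entity Framework or NHibernate work internally. The table name, the column names, and the CustomerMapping class are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Domain entity: knows nothing about tables or columns.
public sealed class Customer
{
    public string Name { get; }
    public bool IsPreferred { get; private set; }

    public Customer(string name, bool isPreferred = false)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("Name is required", nameof(name));
        Name = name;
        IsPreferred = isPreferred;
    }

    public void Promote() => IsPreferred = true;
}

// Mapping layer: the only place that knows both the entity and the table shape.
// "Customers", "CustomerName" and "Preferred" are hypothetical names.
public static class CustomerMapping
{
    public const string Table = "Customers";

    // Entity -> row: translate domain values into column values.
    public static Dictionary<string, object> ToRow(Customer customer) =>
        new Dictionary<string, object>
        {
            ["CustomerName"] = customer.Name,
            ["Preferred"] = customer.IsPreferred ? 1 : 0
        };

    // Row -> entity: rebuild the domain object from column values.
    public static Customer FromRow(IReadOnlyDictionary<string, object> row) =>
        new Customer((string)row["CustomerName"], (int)row["Preferred"] == 1);
}
```

Renaming a column here touches only CustomerMapping; the Customer class and the code that uses it stay the same.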

Separation of Concerns: ORM, Domain Model, Database

Why is SoC so important?

There is a lot of information about how to separate an application’s concerns. But why bother? Is it really that important?

The number of possible implementations without Separation of Concerns

When you keep different responsibilities together in a single class, you have to maintain their consistency simultaneously in every operation within that class. That quickly leads to a combinatorial explosion. Moreover, complexity grows much faster than most developers think: every additional responsibility multiplies the number of combinations the class has to handle, which can increase its complexity by an order of magnitude.

To handle the complexity, we need to separate these responsibilities:

The number of possible implementations with Separation of Concerns
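To put rough numbers on the difference, here is a back-of-the-envelope sketch. It assumes a simplified model in which each concern can be in some number of states: when concerns are tangled together the counts multiply, and when they are isolated the counts add up. The ComplexityEstimate class and the state counts are illustrative assumptions, not measurements.

```csharp
using System;
using System.Linq;

public static class ComplexityEstimate
{
    // All concerns live in one class: every state of one concern can meet
    // every state of the others, so the counts multiply.
    public static long Tangled(params int[] statesPerConcern) =>
        statesPerConcern.Aggregate(1L, (total, states) => total * states);

    // Concerns are isolated: each one is reasoned about on its own,
    // so the counts add up.
    public static long Separated(params int[] statesPerConcern) =>
        statesPerConcern.Sum(states => (long)states);
}
```

With three concerns of ten states each, ComplexityEstimate.Tangled(10, 10, 10) gives 1000 combinations to keep consistent, while ComplexityEstimate.Separated(10, 10, 10) gives 30.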

Separation of Concerns is not just a matter of good-looking code. SoC is vital to development speed. Moreover, it’s vital to the success of your project.

A human can hold only about seven objects, nine at most, in working memory. An application without properly separated concerns overwhelms a developer very quickly because of the huge number of combinations in which the elements of these concerns can interact with each other.

Separating concerns into highly cohesive pieces lets you ‘divide and conquer’ the application you develop. It’s much easier to handle the complexity of a small, isolated component that is loosely coupled to the application’s other components.

When persistence logic leaks into domain logic

Let’s look at some examples of persistence logic leaking into domain logic.

Case #1: Dealing with object’s persistent state in a domain entity

public void DoWork(Customer customer, MyContext context)
{
    if (context.Entry(customer).State == EntityState.Modified)
    {
        // Do something
    }
}

The current persistence state of an object (e.g. whether the object already exists in the database) has nothing to do with the domain logic. Domain entities should operate only on data that pertains to business logic.

Case #2: Dealing with Ids

public void DoWork(Customer customer1, Customer customer2)
{
    if (customer1.Id > 0)
    {
        // Do something
    }

    if (customer1.Id == customer2.Id)
    {
        // Do something
    }
}

Dealing with Ids is probably the most frequent way persistence logic infiltrates the domain. An Id is an implementation detail of how your entities are saved in the database. If you want to compare your domain objects, just override the equality members in the base entity class and write ‘customer1 == customer2’ instead of ‘customer1.Id == customer2.Id’.
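Here is a minimal sketch of what such a base class might look like. It is one possible implementation, not the only one: it assumes a numeric Id where 0 means "not saved yet", and it ignores ORM proxy types, which a real project would have to account for when comparing runtime types.

```csharp
using System;

public abstract class Entity
{
    // The Id is still there for the ORM, but only this base class looks at it.
    public long Id { get; protected set; }

    public override bool Equals(object obj)
    {
        if (!(obj is Entity other)) return false;
        if (ReferenceEquals(this, other)) return true;
        if (GetType() != other.GetType()) return false;

        // Assumption: Id == 0 means the entity has no identity yet,
        // so two such objects are never treated as the same entity.
        if (Id == 0 || other.Id == 0) return false;

        return Id == other.Id;
    }

    public static bool operator ==(Entity a, Entity b)
    {
        if (a is null && b is null) return true;
        if (a is null || b is null) return false;
        return a.Equals(b);
    }

    public static bool operator !=(Entity a, Entity b) => !(a == b);

    // Entities that compare equal share the same type and Id, so they hash alike.
    public override int GetHashCode() => (GetType(), Id).GetHashCode();
}

public class Customer : Entity
{
    public string Name { get; }

    public Customer(long id, string name)
    {
        Id = id;
        Name = name;
    }
}
```

With this in place, domain code compares customers directly (customer1 == customer2) and never mentions the Id.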

Case #3: Segregating domain entity properties

public class Customer
{
    public int Number { get; set; }
    public string Name { get; set; }

    // Not persisted in database: can store anything here
    public string Message { get; set; }
}

If you tend to write such code, you should stop and rethink your model. Code like this signals that you have included some irrelevant elements in your entity. In most cases, you can refactor your model and get rid of such elements.
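One possible refactoring is sketched below. The original snippet doesn’t say what Message is for, so this assumes it is a transient text the UI shows next to the customer. Under that assumption, the message can travel alongside the entity in a separate class instead of living inside it. CustomerViewModel is a hypothetical name.

```csharp
using System;

// The entity keeps only what belongs to the domain.
public class Customer
{
    public int Number { get; }
    public string Name { get; }

    public Customer(int number, string name)
    {
        Number = number;
        Name = name ?? throw new ArgumentNullException(nameof(name));
    }
}

// Hypothetical UI-level wrapper: carries the non-persisted message
// without polluting the entity.
public sealed class CustomerViewModel
{
    public Customer Customer { get; }
    public string Message { get; }

    public CustomerViewModel(Customer customer, string message)
    {
        Customer = customer ?? throw new ArgumentNullException(nameof(customer));
        Message = message;
    }
}
```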

When domain logic leaks into persistence logic

Case #1: Cascade deletion

Configuring cascade deletion in the database is one example of such a leak. The database itself should not contain any logic about when to trigger data deletion. This logic is clearly a domain concern. Your application code (C#, Java, etc.) should be the only place where such logic lives.
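As a rough sketch, the rule about what gets removed together with a customer can sit in the entity itself. The business rule used here (a customer with unpaid orders cannot be closed) and the Order and Customer members are invented for illustration. The sketch also assumes the ORM is mapped so that orders removed from the collection are deleted from the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public bool IsPaid { get; }

    public Order(bool isPaid)
    {
        IsPaid = isPaid;
    }
}

public class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    public IReadOnlyList<Order> Orders => _orders;
    public bool IsClosed { get; private set; }

    public void AddOrder(Order order) => _orders.Add(order);

    // The decision about when child data may be deleted lives here,
    // not in an ON DELETE CASCADE clause.
    public void Close()
    {
        if (_orders.Any(order => !order.IsPaid))
            throw new InvalidOperationException("Cannot close a customer with unpaid orders.");

        // Assumption: the ORM mapping turns removed orders into row deletes.
        _orders.Clear();
        IsClosed = true;
    }
}
```

A database-level cascade would have deleted the unpaid orders silently; here the domain model gets a chance to refuse.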

Case #2: Stored procedures

Creating stored procedures that mutate data in the database is another example. Don’t let your domain logic leak into the database; keep the code with side effects in your domain model.

I have to point out two special cases, though. Firstly, in most cases, it’s okay to create read-only stored procedures. Putting code with side effects into the domain model and code with no side effects into stored procedures is perfectly aligned with the CQRS principles.

Secondly, there are cases when you can’t avoid putting some domain logic in SQL statements. If, for example, you want to delete a batch of objects that fit some condition, an SQL DELETE statement would be much faster than loading the objects through the ORM and deleting them one by one. In these cases, you are better off using plain SQL, but be sure to place it with other database-specific code (for example, in repositories).
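A minimal sketch of such a repository method follows, using only the System.Data interfaces. The table name, the column name, and the "inactive customer" condition are assumptions made for this example. The "@cutoff" parameter prefix follows SQL Server conventions; other providers may expect a different one.

```csharp
using System;
using System.Data;

public sealed class CustomerRepository
{
    // Hypothetical table and column names.
    public const string DeleteInactiveSql =
        "DELETE FROM Customers WHERE LastOrderDate < @cutoff";

    private readonly IDbConnection _connection;

    public CustomerRepository(IDbConnection connection)
    {
        _connection = connection ?? throw new ArgumentNullException(nameof(connection));
    }

    // Deletes the whole batch in one statement and returns the number of rows removed.
    // The connection is assumed to be open already.
    public int DeleteInactiveSince(DateTime cutoff)
    {
        using (IDbCommand command = _connection.CreateCommand())
        {
            command.CommandText = DeleteInactiveSql;

            IDbDataParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@cutoff";
            parameter.Value = cutoff;
            command.Parameters.Add(parameter);

            return command.ExecuteNonQuery();
        }
    }
}
```

The SQL stays in one database-specific class, so the domain model still never sees it.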

Case #3: Default database values

Default values in database tables are another example of domain logic residing in the database. The values that an entity has by default should be defined in its code; they shouldn’t be left at the mercy of your database.
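For instance, instead of DEFAULT constraints on the table, the entity can state its own defaults in the constructor. The properties and the particular default values below are hypothetical and serve only to show the idea.

```csharp
using System;

public enum CustomerStatus
{
    Regular,
    Preferred
}

public class Customer
{
    public string Name { get; }
    public CustomerStatus Status { get; private set; }
    public decimal CreditLimit { get; private set; }

    public Customer(string name)
    {
        Name = name ?? throw new ArgumentNullException(nameof(name));

        // Defaults live next to the rest of the domain logic,
        // not in DEFAULT constraints on the table.
        Status = CustomerStatus.Regular;
        CreditLimit = 500m;
    }
}
```

Anyone reading the class sees what a new customer looks like without opening the database schema.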

Think about how hard it is to piece together such fragments of domain logic when they are spread across the application. It’s much better to keep them in a single place.

Summary

Most of the leaks come from thinking not in terms of the domain, but in terms of data. Many developers perceive the applications they develop in exactly that way. For them, entities are just storage for the data they transfer from the database to the UI, and an ORM is just a helper that saves them from copying this data from SQL queries to C# objects manually. Sometimes it is hard to make the mental shift. But if you do, you will open up a brand new world of expressive code models that allow for building software much faster, especially on large projects.

Of course, it’s not always possible to achieve the level of separation we want. But in most cases, nothing keeps us from building a clean and cohesive model. Most failed projects fail not because they can’t fulfil some non-functional requirements. Most of them are buried under a pile of messy code that prevents developers from changing anything in it. Any change committed to such code leads to cascading breakages all over the application.

The disaster can be avoided only by breaking your code apart. Divide and conquer. Separate and implement.





  • GSerjo

    Nice one 🙂

  • Vlad Mihalcea

    To get the most out of a relational database you have to embrace it, not substitute it with an abstraction layer that pretends it can isolate the database entirely.

    Database default values, triggers and rules are extremely important as the database will outlast any given application life cycle. The database is also managed by the DBA, who doesn’t know of the application-level data constraints.

    Although, from a theoretical point of view, it’s fine to have a fully functional domain-driven design, that’s only applicable when you store your object graphs in memory or in a graph database/document-oriented database. The moment you want to transform the tree-like entity object structure into relational algebra, you will have to make a choice. If you choose a clean object-oriented domain model and don’t look after the database operations executed by the ORM tool, then the application performance will surely suffer. If you choose to leak the database into the data access layer, you can get a better handle on the database operations at the price of sacrificing pure-theory object-oriented modelling. In the end, it’s your choice.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Hi Vlad, thank you for your comment.

      The data-driven approach to software development indeed has some benefits. Performance is one of them. At the same time, in my experience, building a domain layer and extracting all the business logic to it almost always pays off greatly.

      “The database is also managed by the DBA, who doesn’t know of the application-level data constraints.”

      That is true in many cases. Nevertheless, it is often considered a bad practice. Application developers should be in charge of the database they use. Having a dedicated DBA is fine, but his role should be more of a consultant, not “the main DB guy”. I write about this topic in more detail here: http://enterprisecraftsmanship.com/2015/01/10/how-to-build-microservices-wrong/

      Also, using a database as an integration point between several applications is not a good idea in the long run either.

      • Vlad Mihalcea

        The DBA should be part of the team, indeed, but in reality few back-end engineers know much beyond SQL-92, such as window functions, recursive common table expressions, PIVOT, and MERGE, to name a few. Finding a developer who knows how to read a query execution plan is also challenging. Although it would be ideal to have developers skilled in databases as well, that’s more the exception than the norm.
        We’ve been successfully using the database as an integration point, even for a large system where we were heavily using EIP as well. The enterprise integration was more like the glue code that binds and conveys the information from one subsystem to the other, but in the end the database is where the truth resides.

        • http://enterprisecraftsmanship.com/ Vladimir Khorikov

          Good points!

          I agree, a database can be successfully used as an integration point in some cases. However, problems arise when you need to implement any more or less non-trivial refactoring that relates not only to the code but also to the DB structure. The issue with the database being a single point of truth is that you cannot evolve it as often as you might want to. Such databases become rigid as there is more than one app depending on their structure.

          Essentially, the database in such a situation becomes not only a data storage but a contract itself. It defines how an app should store its data to properly communicate with other apps. In my experience, it’s almost always a good idea to separate those responsibilities.

          • Vlad Mihalcea

            Every database structure change must be packaged along with the application code base, so that the enterprise application and the database evolve as a unit. Using FlywayDB is a good way to accommodate incremental migration scripts.

            It’s worth noting that you don’t have to use a single schema for the whole application. You might want to split the data models based on each module’s responsibilities and integrate only on a core schema, while modules can also use their own specific database schemas.

  • johnnywell

    Sweet 🙂