Interfaces for repositories: do or don’t?



Today’s topic is about interfaces for repositories. Should you introduce them? Or maybe it’s better to use repositories as is? Let’s see.

Interfaces for repositories

Introducing interfaces for repositories is a common practice. Even if you don’t use the Repository pattern per se, you may have found the idea of hiding database operations behind some kind of abstraction appealing. Here is the simplest example (input validation is omitted for brevity):

public class UserController
{
    private readonly IUserRepository _repository;

    public UserController(IUserRepository repository)
    {
        _repository = repository;
    }

    public void VerifyEmail(int userId, string verificationCode)
    {
        User user = _repository.GetById(userId);
        user.Verify(verificationCode);
        _repository.Save(user);
    }
}

As you can see, the controller accepts an interface and uses it to load users from the database and save them back. The interface has a single implementation, which is another common practice. Below is its code:

public interface IUserRepository
{
    User GetById(int userId);
    void Save(User user);
}

public class UserRepository : IUserRepository
{
    public User GetById(int userId)
    {
        /* … */
    }

    public void Save(User user)
    {
        /* … */
    }
}

What issues do you see here?

One issue is that the above interface doesn’t constitute an actual abstraction. It just duplicates the concrete class’s functionality. The Principle of Reused Abstractions tells us that, for an interface to be a genuine abstraction, it needs to have more than one implementation.

The goal this practice pursues is testability. IUserRepository is more of a technical construct here and has little to do with the actual business logic. It allows us to introduce seams to our code base. Those seams help separate its fast parts (the C# code) from slow ones (the database) and test only the former.

The violation of the Reused Abstractions Principle is justified as long as the seams we’ve chosen lie at the boundaries of the bounded context we work on.

But are they?

External systems your application communicates with fall into two groups: those it fully controls and those it has no control over. The former are part of the bounded context; the latter are not. An application database (a database fully devoted to a single bounded context) falls into the first group. It belongs to your application only and is not shared with anyone else, so it’s part of the bounded context.

So what we have here is a seam that is not aligned with the actual application boundaries. The database resides inside the bounded context but we still introduce an interface for it so that we can mock it in tests. As I wrote previously, mocks are good for substituting communications with external bounded contexts but generally not as useful for communications inside one because of false positives they tend to introduce in such cases.
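To make the concern concrete, here is a minimal sketch of the kind of test this seam enables. The internals of User, the FakeUserRepository class, and its SaveCalls counter are all assumptions made for illustration; the article shows none of them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal domain class: the verification rule is an assumption.
public class User
{
    private readonly string _expectedCode;

    public int Id { get; }
    public bool IsEmailVerified { get; private set; }

    public User(int id, string expectedCode)
    {
        Id = id;
        _expectedCode = expectedCode;
    }

    public void Verify(string verificationCode)
    {
        if (verificationCode == _expectedCode)
            IsEmailVerified = true;
    }
}

public interface IUserRepository
{
    User GetById(int userId);
    void Save(User user);
}

public class UserController
{
    private readonly IUserRepository _repository;

    public UserController(IUserRepository repository)
    {
        _repository = repository;
    }

    public void VerifyEmail(int userId, string verificationCode)
    {
        User user = _repository.GetById(userId);
        user.Verify(verificationCode);
        _repository.Save(user);
    }
}

// Hand-written test double (hypothetical) that replaces the database.
public class FakeUserRepository : IUserRepository
{
    private readonly Dictionary<int, User> _users = new Dictionary<int, User>();

    public int SaveCalls { get; private set; }

    public void Add(User user) => _users[user.Id] = user;

    public User GetById(int userId) => _users[userId];

    public void Save(User user)
    {
        SaveCalls++;
        _users[user.Id] = user;
    }
}

public static class Demo
{
    public static void Main()
    {
        var repository = new FakeUserRepository();
        repository.Add(new User(1, "abc123"));
        var controller = new UserController(repository);

        controller.VerifyEmail(1, "abc123");

        Console.WriteLine(repository.GetById(1).IsEmailVerified); // True
        Console.WriteLine(repository.SaveCalls);                  // 1
    }
}
```

The check passes without ever running the real UserRepository, so a broken query or mapping in that class would go unnoticed. It also pins the test to the exact sequence of repository calls rather than to the observable outcome.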

Are there ways to avoid this drawback and still have fast and reliable tests? There are.

When it comes to unit testing, one of the most powerful techniques you can (and should) apply is refactoring towards the Humble Object design pattern.

When you’ve got a class that both communicates with slow resources (such as a database) and contains important business logic, you have a dilemma. You can’t skip testing it: leaving this business logic uncovered is dangerous. But testing it is hard: direct communication with the database will slow down the entire test suite.

What you need to do is strip this class of all its complexity, move that complexity to an isolated domain class, and test that class alone. In our case, this work is already done:

public void VerifyEmail(int userId, string verificationCode)
{
    User user = _repository.GetById(userId);
    user.Verify(verificationCode); // Code to test
    _repository.Save(user);
}

The User domain class is the one doing the “heavy lifting” here. The rest of the method is just orchestration: it loads the required data through the repository and saves the result of the operation back to the database.

It’s a great example of the Humble Object design pattern in action: the cyclomatic complexity of the code inside VerifyEmail is one, meaning that it doesn’t introduce any business logic on its own. It means we can skip testing that method and not lose anything in terms of test coverage. We can still keep an integration test to check how the whole thing works together but it won’t affect performance much as it would be the only one touching the database. Any edge cases can be validated with fast unit tests.
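A fast unit test of this kind might look like the sketch below. The article does not show what User looks like inside, so the constructor, the IsEmailVerified property, and the two rules in Verify are assumptions made for illustration. The checks use plain exceptions instead of a test framework to keep the snippet self-contained.

```csharp
using System;

// Hypothetical domain class: fields and verification rules are assumptions.
public class User
{
    private readonly string _expectedCode;

    public bool IsEmailVerified { get; private set; }

    public User(string expectedCode)
    {
        _expectedCode = expectedCode;
    }

    public void Verify(string verificationCode)
    {
        if (IsEmailVerified)
            throw new InvalidOperationException("Email is already verified.");

        if (verificationCode != _expectedCode)
            throw new ArgumentException("Invalid verification code.", nameof(verificationCode));

        IsEmailVerified = true;
    }
}

public static class UserTests
{
    public static void Main()
    {
        // Happy path: the correct code verifies the email.
        var user = new User("abc123");
        user.Verify("abc123");
        Check(user.IsEmailVerified, "correct code should verify the email");

        // Edge case: a wrong code is rejected and leaves the state untouched.
        var other = new User("abc123");
        bool rejected = false;
        try
        {
            other.Verify("wrong");
        }
        catch (ArgumentException)
        {
            rejected = true;
        }
        Check(rejected && !other.IsEmailVerified, "wrong code should be rejected");

        Console.WriteLine("All checks passed.");
    }

    private static void Check(bool condition, string message)
    {
        if (!condition)
            throw new Exception("Check failed: " + message);
    }
}
```

No repository, interface, or test double appears anywhere in the test: the domain class is exercised entirely in memory.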

Note that neither integration tests, nor unit tests would require seams that “abstract” the database out from the rest of the code. Unit tests just don’t involve anything other than isolated domain logic. Integration tests verify the database directly as part of the bounded context.

This guideline holds even if you don’t use the Repository pattern. You could refer to ISession or DbContext in your application services directly, and that would still be perfectly fine. There is no need to introduce an additional abstraction layer on top of the calls to the database as long as your domain model is properly isolated. You’ll be able to test the isolated part with unit tests and the whole thing with higher-level integration tests.
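As a rough sketch, the controller from the beginning of the article could depend on the concrete repository directly. The dictionary inside UserRepository is only a stand-in so that the snippet runs on its own; in a real application that class would talk to the application database, and the integration test would run against that database. The internals of User are likewise assumed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal domain class: the verification rule is an assumption.
public class User
{
    private readonly string _expectedCode;

    public int Id { get; }
    public bool IsEmailVerified { get; private set; }

    public User(int id, string expectedCode)
    {
        Id = id;
        _expectedCode = expectedCode;
    }

    public void Verify(string verificationCode)
    {
        if (verificationCode == _expectedCode)
            IsEmailVerified = true;
    }
}

// Concrete repository with no interface on top of it.
// The dictionary is a stand-in for real database access.
public class UserRepository
{
    private readonly Dictionary<int, User> _storage = new Dictionary<int, User>();

    public User GetById(int userId) => _storage[userId];

    public void Save(User user) => _storage[user.Id] = user;
}

public class UserController
{
    private readonly UserRepository _repository;

    public UserController(UserRepository repository)
    {
        _repository = repository;
    }

    public void VerifyEmail(int userId, string verificationCode)
    {
        User user = _repository.GetById(userId);
        user.Verify(verificationCode);
        _repository.Save(user);
    }
}

public static class Demo
{
    public static void Main()
    {
        var repository = new UserRepository();
        repository.Save(new User(1, "abc123"));
        var controller = new UserController(repository);

        controller.VerifyEmail(1, "abc123");

        Console.WriteLine(repository.GetById(1).IsEmailVerified); // True
    }
}
```

The controller's code is unchanged apart from the type of its dependency, which is the point: removing the interface costs nothing as long as the logic worth unit testing lives in User.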

Summary

Interfaces for repositories are usually a sign that you are introducing seams inside your bounded context. This is a code smell, as the only seams you should have are those that separate your bounded context from others.

Try to refactor your code towards the Humble Object design pattern instead. Introduce an isolated domain model that doesn’t touch any out-of-process dependencies. Cover it with unit tests. Cover the whole thing with a few integration tests that do touch the database.

Neither of the two scenarios above requires you to introduce interfaces for repositories.





  • cmllamacho

    Interesting article. I’m a little confused because it sort of conflicts with some other pieces of advice I’ve seen regarding making good unit tests, specifically test public methods or the api of your component.

    In this case the Verify method belongs to a domain class, User, but let’s assume the UserController needs to do something with the returned value of the repository, a use case rule that needs to be tested, but doesn’t need to be public, for example, verify that the user passed down, with a full order or something. Would you keep that small method private and test it or refactor it to a public case?

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      The use case you are describing looks like another operation on the domain model. If so, I would introduce a new method in one of the domain classes and test it the same way as the Verify method.

  • jamesej67

    You describe a very elegant way of avoiding introducing unnecessary interfaces solely for testing purposes. I am against a profusion of interfaces in code when this is not required because they violate DRY as they repeat the signature of the classes which implement them, increasing the cost of a change to that signature. They also complicate code navigation particularly to someone who is unfamiliar with a code base, because where you see a reference to an interface it becomes harder to find the code that will actually execute, as it is elsewhere.

    Furthermore the use of an interface often falsely implies an ability to implement it in any way chosen without explicitly defining what may be implicit expectations in the way that interface is used. E.g. in this case nothing about the IUserRepository interface itself requires that just after saving a User with a given Id you will be able to retrieve that user, but the code which makes use of it will probably expect that behaviour and break if a third party chooses to implement it another way.

    • http://cv.zerkms.com Ivan Kurnosov

      DRY is about not repeating *knowledge* not signatures.

      • jamesej67

        I’m not sure how the signature shared by an interface and its implementer is not knowledge. However, on the whole the compiler will ensure consistency between the interface and the classes that implement it, except for a few edge cases. But you still have the maintenance overhead of changing the code in multiple files to achieve a change in the signature of the interface. And the other issues I mention in my comment.

  • https://github.com/Mykezero Mykezero

    Great advice! This was something that had always bothered me when designing my code: that the domain logic is almost always wrapped between things connecting to resources outside the code.

    Like you’ve mentioned, I have been using the repository pattern as a “technical construct” to make the code more testable, but also, to avoid writing integration tests for that code.

    I don’t have a really good setup for integration tests: There’s no set of scripts to bootstrap a test database from scratch, so I’m very cautious to avoid adding potentially harmful integration tests, since they are connecting to a real resource.

    That really hurts my confidence that my code will work when it’s about to go live, since I lose that additional feedback loop that everything is working.

    I was going to ask if you had an article on Integration Testing, but it looks like there’s two of them at the end of the post!

    Thanks for taking the time to share your knowledge: I appreciate it greatly ^^

  • http://whiteknight.blogspot.com/ Whiteknight

    Your mention of the Humble Object pattern is really part of a larger discussion on the Single Responsibility. Humble Objects aren’t the goal, SRP is the goal and Humble Objects are usually the result. Treating Humble Object as the goal misses the bigger picture and might cause you to miss some opportunities to improve your code.

    Interfaces over repositories or any other class don’t strike me as a code smell. Interfaces serve several important roles; polymorphism is only one of these. They also support encapsulation and isolation of dependencies. It doesn’t matter to my domain class whether my IUserRepository uses SQL Server or MongoDB or something else. Those are implementation details. Having my domain class have a direct dependency on SqlServerUserRepository or MySystem.Data.SqlServer.UserRepository instead of IUserRepository forces your domain code to care what the underlying implementation is, even if that “knowledge” is isolated to the namespace imports of the code file.

    When your domain classes start to care about the implementation of your repository, you start breaking encapsulation. If both sides of the call know what the implementation is, it’s very easy to start cheating and violating encapsulation. Now your domain starts to have some logic which depends on a particular DB implementation, and now you have much worse than a “smell”.

    An interface allows the consumer class to explicitly state “I don’t care what the implementation is, so long as the following API and workflow is supported”. Maybe we are running in a live-test mode and want to use a Null Object repository which no-ops every call, or we’re in a unit test where we want to fill in a stub or a mock, or maybe we want to wrap our IUserRepository behind a caching layer to improve performance. All of these are perfectly valid use-cases and none of them are the business of the class consuming IUserRepository.

    Saying “I don’t want to do extra work now, because I don’t know if I will need it in the future” is very different from saying “I am going to do things in a lazy way now, because this system has no future to worry about.”

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Domain classes shouldn’t depend on concrete implementation of Repository, that’s true. But they also shouldn’t depend on any interfaces the repository exposes. That would violate domain model isolation (i.e. separation of domain and non-domain logic).

      Interfaces used for “abstracting” database code from domain classes is a hack. It doesn’t matter if a domain class depends on the concrete implementation or its interface – it’s still a dependency they shouldn’t possess. Here I wrote more about domain model isolation:

      http://enterprisecraftsmanship.com/2016/09/01/domain-model-isolation/
      http://enterprisecraftsmanship.com/2016/10/05/how-to-know-if-your-domain-model-is-properly-isolated/

      However, application logic can depend on repositories. From encapsulation/abstraction point of view, it doesn’t matter if it depends on interfaces or concrete implementations.

      • http://cv.zerkms.com Ivan Kurnosov

        > “But they also shouldn’t depend on any interfaces the repository exposes. ”

        For that reason those belong to the domain, not to the infrastructure.

        • http://enterprisecraftsmanship.com/ Vladimir Khorikov

          They can’t belong to the domain, the domain should not be conflated with any persistence concerns.

          • http://cv.zerkms.com Ivan Kurnosov

            And it’s not “persistence concerns”, it’s an interface. It’s a dependency on a contract, not on implementation. DDD and Hexagonal Architecture are built around this idea: the dependencies come from the outer layer and are defined as contracts (interfaces), that are fulfilled by the adapters.

            Simply check http://alistair.cockburn.us/Hexagonal+architecture or any other article.

          • http://enterprisecraftsmanship.com/ Vladimir Khorikov

            It’s important to distinguish what that contract represents. The domain can define an interface and can accept implementations of that interface from the outside but the interface itself should be meaningful for the domain. Saving something to the database is not a domain’s concern, hence the corresponding interface would not be a meaningful contract from the domain’s perspective.

            Another aspect is purity. Defining such an interface in the domain layer would make the domain model impure. Here’s more on domain model isolation (purity), exactly the case you are talking about: http://enterprisecraftsmanship.com/2016/10/05/how-to-know-if-your-domain-model-is-properly-isolated/

          • http://cv.zerkms.com Ivan Kurnosov

            > The domain can define an interface and can accept implementations of that interface from the outside but the interface itself should be meaningful for the domain

            Your opinion does not align with how it’s treated in hexagonal or ddd

  • Andreas Horwath

    This is an interesting post, but I remain somewhat skeptical. In general I am in favor of keeping things as simple as possible, and that includes introducing interfaces only when there’s a compelling reason for them, but this looks like it’s one of those cases. As Mark Seemann argues in this excellent post, the loose coupling you gain from programming to an interface has a number of benefits that go well beyond the ability to use dynamic mocks in unit tests. In addition, I agree with what Ivan pointed out in an earlier comment: Accessing infrastructure functionality via an interface exposed by the domain is how it’s done in hexagonal architecture and I for one wouldn’t want to be without the benefits of that architectural pattern.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Thanks for your comment.

      As you mentioned, introducing an interface shouldn’t be a goal in and of itself; this interface should help solve some particular problem. In Mark’s post, this problem is implementing the Decorator pattern – which is indeed a very sensible reason for such an interface. In most cases, however, its introduction is not justified and thus should be avoided due to YAGNI.

      Regarding the second part of your comment. One of the most important things in DDD is keeping the domain model separate from infrastructural parts of the code. It’s not always possible to keep things perfectly isolated but that’s a good goal to have in mind. Introducing interfaces in the domain model that are not directly related to the problem domain hinders that isolation and potentially complicates the code due to mixing unrelated concerns.

      • Andreas Horwath

        Thanks for your answer. In my experience, the need for a decorator arises quite frequently when accessing infrastructure functionality (for adding cross-cutting concerns such as caching and resilience while still adhering to the SOLID principles), so this looks like a case where the introduction of an interface is warranted. And once you decide to access some infrastructure functionality via an interface, you’d better use interfaces for all of it, or else you’ll end up with messy (and possibly circular) dependencies.

        Regarding the issue of separating domain code from infrastructure concerns, I realize that you have already discussed the pros and cons of this approach in your post about domain model isolation. I have mixed feelings about this. While I like the idea of keeping the domain model functionally pure, my fear is that by moving calls to impure functions to an additional application layer, I will end up spreading my domain logic across two layers, thereby reducing cohesion. And if given a choice, I would choose high cohesion over functional purity anytime.

        • http://enterprisecraftsmanship.com/ Vladimir Khorikov

          The fact that the need for interfaces arises in some (or even most) places doesn’t justify using them in places with no such need. That is an example of premature generalization. I understand the desire to keep the code uniform but the YAGNI principle is more important.

          This is a dichotomy, indeed. You can’t have both the domain logic “wholeness” and functional purity, unfortunately. I personally lean towards the latter but I understand the arguments for the opposite side of the spectrum.

          • Andreas Horwath

            I agree it’s largely a matter of personal preference of how to weigh functional purity against domain completeness (or high cohesion).

            However, the other question (do you need interfaces for repositories?) is a totally different beast, IMO: If you decide to forgo interfaces, you are violating the DIP, and it doesn’t really matter whether you access your repository from within the domain layer or from some application service, because those are all high-level components, while the repository is more low-level. So, in order to observe the DIP, there’s no way to get around the interface (which should be defined within the calling layer). So, what we have here is a far-reaching decision that affects the overall design (and, in particular, the direction of the dependencies), and in such cases I prefer to go for the more flexible design, even if it entails some coding overhead (which will be small anyway).

            BTW, I recently came across this very interesting blog post by Steven van Deursen about the use of the command/handler pattern for modeling queries, which seems to follow the SOLID principles even more consistently than the classic repository pattern. I’d be interested in hearing what you think about it.

          • http://enterprisecraftsmanship.com/ Vladimir Khorikov

            The pattern Steven van Deursen describes in his blog is one of the most wide-spread implementations of CQRS. Jimmy Bogard’s MediatR takes a very similar approach. It does help with SRP, especially when repositories become too bloated.

            But getting back to the initial discussion. Interfaces have nothing to do with the DIP. The DIP specifically says that higher level abstractions should not depend on abstractions of lower levels. An interface is a purely technical structure, it may or may not represent an abstraction. Just as a class may or may not represent it, this depends on the context. Here’s Mark Seemann’s blog on this topic: http://blog.ploeh.dk/2010/12/02/Interfacesarenotabstractions/ So as long as a class represents an abstraction meaningful to the code using it and it’s an abstraction of higher level, this code does follow the DIP.

          • Andreas Horwath

            Just because some interfaces are poor abstractions doesn’t mean we should ditch them altogether. In fact, Mark specifically says (in the comments section): “Good abstractions will still be interfaces (or base classes)”. So let’s take his advice and work towards better abstractions. I realize there may be reasons for not introducing an interface at all, but in such a case we are no longer coding against an abstraction (which may be a conscious decision, of course).

          • http://enterprisecraftsmanship.com/ Vladimir Khorikov

            I’m not suggesting getting rid of interfaces altogether. I’m against extremes on both sides. I agree that the decision should be made on a case by case basis.

          • http://enterprisecraftsmanship.com/ Vladimir Khorikov

            Here’s a post where I also wrote on this topic BTW: http://enterprisecraftsmanship.com/2015/06/02/interfaces-vs-interfaces/

  • http://pyranja.github.io/ Chris Borckholder

    This approach reminds me of functional programming principles, where the domain model would consist of pure functions, while database interactions are impure. Then the type system would actually prevent db access from inside the domain model – leading to the code style you showed. Impure interactions are confined to the application level (controllers), which may tap into the domain model to perform pure computations.

    Thank you for this insight.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      That’s exactly how I see it too. There’s a lot of benefits in combining together the concepts from FP and DDD.

  • Trevor Higbee

    I’m not sure I understand. Can I re-state in my words? You’re saying that VerifyEmail() is so trivially simple that we don’t need to unit test it. We just need to make sure that user.Verify() is covered. And because the repository is used in this trivially simple function, and because this function isn’t under unit tests, we never need to mock out the repository. And if you want to test the entire interaction that’s orchestrating the calls to VerifyEmail() and other calls, just use an integration test that actually hits the database inside your bounded context. Is that about right?

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      That’s not what I meant. Here’s how I would re-phrase it:

      1) You don’t need to cover the whole VerifyEmail() with unit tests if you follow the Humble object pattern. Only the user.Verify() part of it.
      2) If you want to check how VerifyEmail() works with the database, don’t use mocks, use the real database instead (integration testing). That will provide better security.

      • Trevor Higbee

        Thank you.