Interfaces for repositories: do or don't?
Today’s topic is about interfaces for repositories. Should you introduce them? Or maybe it’s better to use repositories as is? Let’s see.
Interfaces for repositories
Introducing interfaces for repositories is a common practice. Even if you don’t use the Repository pattern per se, you may have found appealing the idea of hiding database operations behind some kind of abstraction. The simplest example is this (input validation is omitted for brevity):
public class UserController
{
    private readonly IUserRepository _repository;

    public UserController(IUserRepository repository)
    {
        _repository = repository;
    }

    public void VerifyEmail(int userId, string verificationCode)
    {
        User user = _repository.GetById(userId);
        user.Verify(verificationCode);
        _repository.Save(user);
    }
}
As you can see, the controller accepts an interface and uses it to load users from the database and save them back. The interface has a single implementation, which is another common practice. Below is its code:
public interface IUserRepository
{
    User GetById(int userId);
    void Save(User user);
}

public class UserRepository : IUserRepository
{
    public User GetById(int userId)
    {
        /* ... */
    }

    public void Save(User user)
    {
        /* ... */
    }
}
What issues do you see here?
One issue is that the above interface doesn’t constitute an actual abstraction. It just duplicates the concrete class’s functionality. The Principle of Reused Abstractions tells us that, for an interface to be a true abstraction, it needs to have more than one implementation.
The goal this practice pursues is testability. IUserRepository is more of a technical construct here and has little to do with the actual business logic. It allows us to introduce seams to our code base. Those seams help separate its fast parts (the C# code) from slow ones (the database) and test only the former.
The violation of the Reused Abstractions Principle is justified as long as the seams we’ve chosen lie at the boundaries of the bounded context we work on.
But are they?
External systems your application communicates with can be divided into two groups: those it fully controls and those it has no control over. The former are part of the bounded context while the latter are not. An application database (a database fully devoted to a single bounded context) falls into the first group. It belongs to your application only and is not shared with anyone else. It’s part of the bounded context.
So what we have here is a seam that is not aligned with the actual application boundaries. The database resides inside the bounded context but we still introduce an interface for it so that we can mock it in tests. As I wrote previously, mocks are good for substituting communications with external bounded contexts but generally not as useful for communications inside one because of false positives they tend to introduce in such cases.
Are there ways to avoid this drawback and still have fast and reliable tests? There are.
When it comes to unit testing, one of the most powerful techniques you can (and should) apply is refactoring towards the Humble Object design pattern.
When you’ve got a class that both communicates with slow resources (such as a database) and possesses important business logic, you have a dilemma. You can’t skip testing it: leaving this business logic uncovered is dangerous. And it’s hard to test: direct communication with the database will slow down the entire test suite.
What you need to do is strip this class of all the complexity, move that complexity to an isolated domain class, and test that class alone. In our case, this work is already done:
public void VerifyEmail(int userId, string verificationCode)
{
    User user = _repository.GetById(userId);
    user.Verify(verificationCode); // Code to test
    _repository.Save(user);
}
The User domain class is the one doing the "heavy lifting" here. The rest of the method is just orchestration that loads the required data and saves the results of the operation back to the database.
It’s a great example of the Humble Object design pattern in action: the cyclomatic complexity of the code inside VerifyEmail is one, meaning that it doesn’t introduce any business logic on its own. It means we can skip testing that method and not lose anything in terms of test coverage. We can still keep an integration test to check how the whole thing works together but it won’t affect performance much as it would be the only one touching the database. Any edge cases can be validated with fast unit tests.
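To make the last point concrete, here is a minimal sketch of such unit tests. The article does not show the User class, so its shape below (the constructor taking the expected code, the IsVerified flag, the exception types) is purely an assumption for illustration. The checks are plain C# rather than a specific test framework so that the example stays self-contained:

```csharp
using System;

// Hypothetical domain class. The original article does not show User;
// the constructor, IsVerified and the exception types are assumptions.
public class User
{
    private readonly string _expectedCode;

    public bool IsVerified { get; private set; }

    public User(string expectedCode)
    {
        _expectedCode = expectedCode;
    }

    public void Verify(string verificationCode)
    {
        if (IsVerified)
            throw new InvalidOperationException("Email is already verified.");

        if (verificationCode != _expectedCode)
            throw new ArgumentException("Wrong verification code.", nameof(verificationCode));

        IsVerified = true;
    }
}

public static class UserTests
{
    // Happy path: a matching code marks the user as verified.
    public static void Verify_with_matching_code_marks_user_as_verified()
    {
        var user = new User("abc123");

        user.Verify("abc123");

        Check(user.IsVerified, "user should be verified");
    }

    // Edge case: a wrong code is rejected and the state stays unchanged.
    public static void Verify_with_wrong_code_is_rejected()
    {
        var user = new User("abc123");
        bool rejected = false;

        try { user.Verify("wrong"); }
        catch (ArgumentException) { rejected = true; }

        Check(rejected, "wrong code should be rejected");
        Check(!user.IsVerified, "user should stay unverified");
    }

    private static void Check(bool condition, string message)
    {
        if (!condition) throw new Exception("Test failed: " + message);
    }

    public static void Main()
    {
        Verify_with_matching_code_marks_user_as_verified();
        Verify_with_wrong_code_is_rejected();
        Console.WriteLine("All tests passed.");
    }
}
```

Neither test touches a repository, an interface, or a mock: the domain class is exercised entirely in memory, which is what keeps these tests fast.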
Note that neither integration tests, nor unit tests would require seams that "abstract" the database out from the rest of the code. Unit tests just don’t involve anything other than isolated domain logic. Integration tests verify the database directly as part of the bounded context.
Note that this guideline holds even if you don’t use the Repository pattern. You could refer to ISession or DbContext in your application services directly, and that would still be perfectly fine. No need to introduce an additional abstraction layer on top of the calls to the database as long as your domain model is properly isolated. You’ll be able to test the isolated part with unit tests and the whole thing with higher level integration tests.
Summary
Interfaces for repositories are usually a sign that you are introducing seams inside your bounded context. This is a code smell, as the only seams you should have are those that separate your bounded context from others.
Try to refactor your code towards the Humble Object design pattern instead. Introduce an isolated domain model that doesn’t touch any out-of-process dependencies. Cover it with unit tests. Cover the whole thing with a few integration tests that do touch the database.
Neither of the two kinds of tests above requires you to introduce interfaces for repositories.