Should you Abstract the Database?
This article is inspired by a tweet that I thought would be a good topic for discussion: should you abstract your database?
Here’s the tweet itself:
At an org using Oracle our team paid attention to separation of concerns. Another team didn’t. When management told us to switch everything to DB2 by Monday, the other team spent the whole weekend doing the migration. We completed everything by Friday evening and enjoyed our weekend with family.
This is a widely held opinion among programmers. But even though it’s quite popular, it is incorrect. It is the myth of abstracting the database.
It advocates for abstracting your database such that if the need arises, you are able to quickly switch away from that database to another one.
Let’s dissect this guideline and see what exactly is wrong with it.
1. Vendor lock-in
The main benefit of abstracting your database is the avoidance of database vendor lock-in. In other words, if, for whatever reason, you want to move away from your current DBMS, you can easily do that.
To enable such a move, you should design your data access layer such that it supports multiple database implementations. For that, you introduce an IDataAccess interface and then come up with an implementation for your current DBMS.
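To make the guideline concrete, here is a minimal sketch of what such a layer tends to look like. Only the IDataAccess name comes from the text above; the methods, the customer table, and the use of SQLite as a stand-in for the current DBMS are assumptions made purely for illustration.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class IDataAccess(ABC):
    """Vendor-neutral contract the rest of the application codes against."""

    @abstractmethod
    def add_customer(self, customer_id: int, name: str) -> None: ...

    @abstractmethod
    def get_customer_name(self, customer_id: int) -> Optional[str]: ...


class SqliteDataAccess(IDataAccess):
    """Implementation for the current DBMS (SQLite is an assumed stand-in)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_customer(self, customer_id: int, name: str) -> None:
        self._connection.execute(
            "INSERT INTO customer (id, name) VALUES (?, ?)", (customer_id, name)
        )

    def get_customer_name(self, customer_id: int) -> Optional[str]:
        row = self._connection.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def register_customer(data_access: IDataAccess, customer_id: int, name: str) -> None:
    # Application code sees only the interface, never the concrete DBMS.
    data_access.add_customer(customer_id, name)
```

Under this design, switching databases means writing a second class that implements the same interface. Every new query, however, has to be added to both the interface and each implementation.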
2. Seen vs unseen
The guideline sounds reasonable at first, but when making a decision, you should consider both sides of it:
The desirable results, which are the seen part of the equation.
The unintended consequences, which are the unseen part.
You will always have both sides, no matter the decision. Not discussing the unintended consequences of your particular decision is either ignorance or lying to yourself.
In the context of abstracting your database:
The seen part is the benefits the abstraction gives you.
The unseen part is the (often hidden) cost of that abstraction.
We discussed the benefits already — it’s the ability to quickly switch from one database to another.
What is the cost of this abstraction?
The cost is all the time that it took you to enable this switch.
And this is why abstracting your database is an anti-pattern. People often appreciate the benefits of this abstraction because they are seen: the fact that the switch was easy to do is obvious to everyone.
But people rarely account for its downsides because they are unseen. Once you introduce additional complexity to enable a database switch, you perceive it as a given. You just don’t have anything to compare it with, and so the additional time it took you to maintain this abstraction remains unnoticed.
But that additional time is huge. As the saying goes:
Weeks of coding can save you hours of planning.
👆👆 This is exactly how I feel when I see someone bragging about saving 2 days on a database switch.
It probably cost them weeks of additional time to enable such a quick switch. And it will cost more in the future!
Introducing an abstraction is not just a one-off activity. You will have to maintain it for the whole duration of your project, even if you never need to do another switch ever again.
3. The lowest common denominator
Another issue with abstracting your database (aside from having to maintain the abstraction itself) is that you can’t use advanced functionality present in your current DBMS.
You are bound to the lowest common denominator — the functionality that is present in all relational databases, which means you can’t unlock your database’s full potential.
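As a rough illustration of that trade-off, the sketch below writes the same "insert or increment" operation twice: once using only statements that practically every relational database accepts, and once using a vendor-specific upsert. SQLite (version 3.24 or later) is assumed here only because it ships with Python; the page_view table and the function names are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE page_view (page TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
)


def record_hit_portable(conn: sqlite3.Connection, page: str) -> None:
    # Lowest-common-denominator version: two plain statements.
    cursor = conn.execute(
        "UPDATE page_view SET hits = hits + 1 WHERE page = ?", (page,)
    )
    if cursor.rowcount == 0:
        conn.execute("INSERT INTO page_view (page, hits) VALUES (?, 1)", (page,))


def record_hit_native(conn: sqlite3.Connection, page: str) -> None:
    # Vendor-specific version: a single upsert (SQLite 3.24+ syntax).
    conn.execute(
        "INSERT INTO page_view (page, hits) VALUES (?, 1) "
        "ON CONFLICT(page) DO UPDATE SET hits = hits + 1",
        (page,),
    )


def hits(conn: sqlite3.Connection, page: str) -> int:
    # Helper for reading the counter back.
    row = conn.execute(
        "SELECT hits FROM page_view WHERE page = ?", (page,)
    ).fetchone()
    return row[0]
```

The native version is a single statement, but its syntax differs between database products, so a vendor-neutral data access layer would have to settle for the portable one.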
4. How to do it then?
What’s the right approach when it comes to databases then?
Just follow YAGNI and work with your current database as if you are never going to switch away from it. Chances are you never will, and all your abstractions will go to waste anyway.
But what if you do need to switch?
Approach the process of switching not just as dropping in another implementation of IDataAccess, but as porting: you are porting your application from one DBMS to another.
The main tool that will help you succeed with the porting is quality integration tests: tests that don’t mock your database.
I had a couple of occasions in my past projects where we had to switch from one database to another, and a quality integration suite was the savior — we did the switch seamlessly and with no bugs, all thanks to that suite.
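A test of this kind might look like the sketch below. The CustomerRepository class and the customer table are hypothetical, and an in-memory SQLite database is used only to keep the example self-contained; in a real project the test would connect to an instance of the same DBMS that runs in production.

```python
import sqlite3
import unittest


class CustomerRepository:
    """Hypothetical repository that talks to the database directly."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def rename(self, customer_id: int, new_name: str) -> None:
        self._connection.execute(
            "UPDATE customer SET name = ? WHERE id = ?", (new_name, customer_id)
        )

    def get_name(self, customer_id: int):
        row = self._connection.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


class CustomerRepositoryIntegrationTest(unittest.TestCase):
    def setUp(self) -> None:
        # A real database connection, not a mock.
        self.connection = sqlite3.connect(":memory:")
        self.connection.execute(
            "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        self.connection.execute(
            "INSERT INTO customer (id, name) VALUES (1, 'Alice')"
        )

    def tearDown(self) -> None:
        self.connection.close()

    def test_rename_is_persisted(self) -> None:
        repository = CustomerRepository(self.connection)
        repository.rename(1, "Bob")
        # The assertion goes through real SQL, so it would catch a query
        # that stops working after the port to another DBMS.
        self.assertEqual(repository.get_name(1), "Bob")
```

Because the test exercises actual queries, rerunning the suite against the new DBMS shows which statements need to be ported.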
For more info about integration testing, check out my Unit Testing book.
5. Separation of concerns vs Abstraction
Finally, I have an issue with the wording chosen by the tweet author:
At an org using Oracle our team paid attention to separation of concerns.
He is conflating separation of concerns with abstracting the database.
Separation of concerns is just tackling your application concerns separately. In other words, not entangling your domain logic with the persistence logic. It doesn’t mean you have to use additional layers of abstraction to account for a database switch in the future.
Unlike database abstraction, separation of concerns is always a must-have. You want to be able to focus on different application concerns separately (in isolation) from each other. But you don’t want to put unnecessary layers of abstraction on top of those concerns.
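One way to picture the difference is the sketch below: the domain class contains no SQL, the repository contains no business rules, and there is no interface layered on top for the sake of a future switch. The Order and OrderRepository names, the orders table, and the use of SQLite are assumptions for illustration only.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Order:
    """Domain concern: business rules only, no SQL and no connection."""

    order_id: int
    total: float
    is_shipped: bool = False

    def ship(self) -> None:
        if self.total <= 0:
            raise ValueError("Cannot ship an empty order")
        self.is_shipped = True


class OrderRepository:
    """Persistence concern: a concrete class tied to the current DBMS."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def get(self, order_id: int) -> Order:
        row = self._connection.execute(
            "SELECT id, total, is_shipped FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        return Order(order_id=row[0], total=row[1], is_shipped=bool(row[2]))

    def save(self, order: Order) -> None:
        self._connection.execute(
            "UPDATE orders SET total = ?, is_shipped = ? WHERE id = ?",
            (order.total, int(order.is_shipped), order.order_id),
        )
```

The two concerns can be read and changed in isolation, yet the repository remains free to use whatever the current database offers.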
6. Summary
When making a decision, consider both sides: the (seen) benefits and the (often unseen) drawbacks.
In the context of abstracting your database:
The benefits are the ability to avoid vendor lock-in and quickly switch from one DBMS to another.
The drawbacks are the maintenance costs of the abstraction (both one-off and on-going) and the inability to use the full set of database features.
Treat a database switch as porting.
To enable seamless migration from one DBMS to another, have a quality integration test suite that doesn’t mock your database.