Leaking domain knowledge to tests

Leaking domain knowledge to tests is another common anti-pattern.

You can often see it in tests that cover complex algorithms. Let’s take this SUT (system under test) for example:
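A minimal sketch of such an SUT, written here in Python. The `Calculator` class and its `add` method are assumed names chosen for illustration:

```python
class Calculator:
    """The system under test (SUT): adds two numbers."""

    @staticmethod
    def add(value1: int, value2: int) -> int:
        # The whole "algorithm" is a single addition.
        return value1 + value2
```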

There are not many ways in which you could test this (admittedly, not that complex) calculation logic. The first one that comes to mind is this:
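Sketched in Python against an assumed `Calculator.add` method (the class is repeated so the snippet runs on its own), such a test might look like this:

```python
class Calculator:
    """Illustrative SUT; the name and method are assumptions."""

    @staticmethod
    def add(value1: int, value2: int) -> int:
        return value1 + value2


def test_adding_two_numbers() -> None:
    # Arrange
    value1 = 1
    value2 = 3
    expected = value1 + value2  # the expected value is computed in the test

    # Act
    actual = Calculator.add(value1, value2)

    # Assert
    assert actual == expected


test_adding_two_numbers()
```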

Take two values, calculate the expected result, and then compare this result with what Calculator returns. You could even realize that this test is a good candidate for parametrization and add a couple more test cases at almost no additional cost, like so:

This is an obvious solution and it looks good at first glance. However, this is an anti-pattern. We are not testing the calculation algorithm here. Instead, we are leaking the algorithm from the SUT into the test itself.

Look once again at the line in the test that computes the expected value: it adds value1 and value2. That is the very operation Calculator uses to produce the result.

The test duplicates Calculator’s internals. It possesses the knowledge about how the SUT should be implemented.

Of course, it might not seem like a big deal; after all, it’s just an addition operator. But that’s only because the example itself is rather simplified. I have seen tests that covered quite complex calculation algorithms and did nothing but re-implement those algorithms in the Arrange part. Basically, someone duplicated them almost in full by copy-pasting the code of the SUT.

The problem with this approach is that such tests don’t verify anything. It’s impossible to find any issues with the SUT when the tests fully reproduce its content. Relying on such tests is not that different from relying on tautological assertions like this:

Which is pretty much as useful as having no assertions at all.

Some people argue that there is some value in these tests. At least they solidify the implementation and protect you from introducing any bugs to it. After all, wouldn’t such a test turn red if you break something in the calculation logic?

It would. However, that doesn’t make them valuable. There are four components that, when multiplied together, determine the value of a test:

• High chance of catching a regression bug.
• Low chance of producing a false positive.
• Fast feedback.
• Low maintenance cost.

And by “multiplied” I mean in a mathematical sense. That is, if a test gets zero in one of the above components, its value turns to zero as well. Therefore, in order to be valuable, this test needs to score at least something in all four categories.
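To make the arithmetic concrete, here is a small sketch. The 0-to-1 scores and the `value_of_test` helper are assumptions made for illustration, not a real metric:

```python
from math import prod


def value_of_test(
    regression_protection: float,
    false_positive_resistance: float,
    feedback_speed: float,
    maintainability: float,
) -> float:
    """Multiply four hypothetical scores, each between 0 and 1."""
    return prod(
        [regression_protection, false_positive_resistance, feedback_speed, maintainability]
    )


# A single zero wipes out the whole product, however good the rest is.
assert value_of_test(0.9, 0.0, 1.0, 1.0) == 0.0
# Mediocre but non-zero scores still leave some value.
assert value_of_test(0.5, 0.5, 0.5, 0.5) == 0.0625
```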

How would you describe the Calculator test in terms of the categories listed above? It scores pretty high in the 1st, 3rd, and 4th categories. It does have a high chance of catching a regression bug. It’s also fast and doesn’t require much effort to maintain.

However, it scores almost zero in the 2nd category because it has no way of distinguishing legitimate failures from false positives. And why would it? All this test can tell is whether the SUT implements its logic exactly as it used to. If there’s any change in that logic, the test will point it out regardless, even if it’s merely a bug fix.

Assume the following situation. Let’s say you change the complex algorithm and the test turns red. What would you do? Most likely, you will just copy the new version of that algorithm to the test and be done with it. You won’t spend your time pondering why exactly it failed, which is completely understandable because the test has no authority here. It’s a mere duplication of the current state of the algorithm.

Avoid leaking domain knowledge to tests

So how do you avoid leaking domain knowledge to tests, then? You need to stop implying any specific implementation when writing tests. That’s generally a good practice, by the way. The more you treat the SUT as a black box, the more resilient to refactoring your tests become.

But how do you do that with a complex calculation algorithm? Simple: hard-code the results of its calculation into the test.

That is, instead of computing the expected value inside the test, state it as a literal.

It might seem counterintuitive but hard-coding the results is a good practice when it comes to unit testing. (In fact, not only unit testing but that’s outside the scope of this article.)

The trick here is to have them pre-calculated using something other than the code of the SUT itself. Ideally, ask someone proficient in this topic (a domain expert) to calculate them manually and then use those results when testing your implementation. Of course, that’s only when the algorithm itself is complex enough (we are all experts in summing up two digits 🙂 ).

Alternatively, if you refactor some legacy application, use the legacy code to produce those results for you and then use them as expected values in your tests. I think you get the idea: obtain those pre-calculated results using any means other than invoking the SUT. Otherwise, it wouldn’t be any different from copying the SUT’s implementation directly to the test.
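The legacy route can be sketched as follows. Both `legacy_price_with_tax` and `price_with_tax` are hypothetical functions invented for this illustration: the legacy routine is run once to record results, and the test for the new code then relies only on the recorded literals.

```python
def legacy_price_with_tax(net_cents: int) -> int:
    """Hypothetical stand-in for the legacy routine being replaced."""
    return net_cents + net_cents * 20 // 100


def record_expected_values(inputs: list[int]) -> list[tuple[int, int]]:
    """Run the legacy code once and return (input, output) pairs."""
    return [(x, legacy_price_with_tax(x)) for x in inputs]


# One-off run before the refactoring; the printed pairs are then pasted
# into the test below as hard-coded expectations.
print(record_expected_values([0, 100, 999]))

RECORDED = [(0, 0), (100, 120), (999, 1198)]


def price_with_tax(net_cents: int) -> int:
    """Hypothetical new implementation under test."""
    return net_cents * 120 // 100


def test_matches_recorded_legacy_results() -> None:
    for net_cents, expected in RECORDED:
        assert price_with_tax(net_cents) == expected


test_matches_recorded_legacy_results()
```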

Another, more sophisticated way to test an algorithm is to use property-based testing. That’s a separate big topic, though, so I won’t be diving into it here. But keep in mind that it’s an option too.
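For a rough flavor of the idea only, here is a hand-rolled sketch that uses the standard `random` module and an assumed `Calculator.add` method. It checks general properties of addition on random inputs without ever restating `value1 + value2`:

```python
import random


class Calculator:
    """Illustrative SUT; the name and method are assumptions."""

    @staticmethod
    def add(value1: int, value2: int) -> int:
        return value1 + value2


def test_add_properties(runs: int = 200, seed: int = 0) -> None:
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    for _ in range(runs):
        a = rng.randint(-1000, 1000)
        b = rng.randint(-1000, 1000)
        c = rng.randint(-1000, 1000)
        add = Calculator.add
        assert add(a, b) == add(b, a)                  # commutativity
        assert add(a, 0) == a                          # zero is the identity
        assert add(add(a, b), c) == add(a, add(b, c))  # associativity


test_add_properties()
```

Dedicated libraries such as Hypothesis generate the inputs and shrink failing cases for you.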

Summary

• Leaking domain knowledge to tests is an anti-pattern because of the false positives such tests generate.
• This anti-pattern is common when testing a complex algorithm.
• Instead of duplicating the domain logic, pre-calculate the results of the algorithm with the help of a domain expert or external software and hard-code them into your tests.

If you enjoy this article, check out my Pragmatic Unit Testing training course.

• http://paulwheeler.com Paul

Nice article. I think the Add(1, 3) in the last example should be Add(value1, value2)

Indeed. Fixed it, thanks

• http://www.duanewingett.info Dibley

I think you should do a short Pluralsight course on these testing nuggets!

Interesting idea! Would make for a nice short course, indeed.

• http://www.duanewingett.info Dibley

Well, I look forward to seeing it in a few months then.

• Stephan

Thanks for the article. I really like the following: “And by “multiplied” I mean in a mathematical sense. That is, if a test gets zero in one of the above components, its value turns to zero as well.”

Thank you!

• Luís Barbosa

Hi Vladimir, first and foremost, an excellent article. It would be great if you could write some topics about property-based testing.

Not that I have a lot of experience with it but I’ll see what I can put together.

• Luís Barbosa

Even something introductory would be excellent. Since I am suggesting, characterization testing would be another interesting topic.

• Anders Baumann