Testing

What is the point of having tests? Why do we write them? To prove that the code is correct (in some cases, yes)? To meet test coverage requirements? Or maybe we don’t write them at all? If it is a single-use program or a proof of concept, then the last approach is usually best. Sounds heretical, but hear me out.

Where tests shine is in code maintainability. Having tests is indispensable when writing a piece of code that is intended to live. Code that lives is code that changes. Code that has multiple people working on it. Tests give you a semblance of security about the correctness of your changes during refactors and modifications.

I once heard a description of legacy code: code becomes legacy when it reaches the point where no one dares to change it. I think it is spot-on. Having tests extends a code’s lifespan (all code eventually becomes legacy). Being less scared of introducing breaking changes and having runnable use cases gives people courage. You will need it when asked to modify that piece of code that hasn’t been touched for the last 10 years.

Tests also offer great insight into the purpose of a particular piece of code. A runnable documentation of sorts. Documentation that has to be updated as functionality changes. When going into a new code base, my first reaction is usually to start my reading from the tests.

However, it’s not all sunshine and roses.

Note: I don’t want to get into a discussion about what is a unit test, what is an integration test, etc. I have seen people fight to the death over nomenclature whose sole purpose is to ease communication. As long as both parties understand what they are referring to, you can call it whatever your heart desires. My preferred approach is not to call them anything specific and just state exactly what I am testing: a single method, a single class, a collection of classes, a connection with a database, a REST endpoint, etc.

The problem with having tests, as with all things, is that there is a way to overdo them. Many developers (myself included) jump on the testing train (and rightly so) and write a test for every class and every method. This doesn’t sound so bad. After all, you have really well tested code, right? In theory, yes. The problem is that tests cost. They cost time: to write, to review, to fix, to run. Eventually you will reach a point where adding yet another test brings very little additional value. This is the law of diminishing returns. The trick is to strike a balance. There are a number of techniques that try to remedy this, like the test pyramid. My experience with those techniques is that trying to adhere to their rules often causes us to write tests that we usually would not write, but have to, in order to fit the framework. I have experimented with multiple approaches and I want to share what has worked best for me.

I like the idea of thinking about tests as documentation. You know, that thing we always say we need, but no one ever writes, and even if it exists it immediately stops being up to date. Approaching tests as documentation means that they need to cover your business use cases. In my experience, when taking the DDD approach, use cases are usually scenarios that describe how a given bounded context interacts with other contexts. This means that our tests should reside on that boundary. They should run against boundary entry points. For a microservice this means testing the REST/SOAP/whatever endpoints. For a modular monolith, it means testing against the modules’ external APIs.
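To make this concrete, here is a minimal sketch of what testing against a module’s external API can look like. It uses JUnit 5 and AssertJ for brevity, and the toy OrdersApi class stands in for a real module’s boundary entry point; all the names are invented for illustration.

```java
import static org.assertj.core.api.Assertions.assertThat;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

// Toy "orders" module: the only thing other contexts (and our tests) see is this API.
class OrdersApi {
    private final List<String> placed = new ArrayList<>();

    String placeOrder(String customerId) {        // boundary entry point
        String id = "order-" + customerId + "-" + (placed.size() + 1);
        placed.add(id);
        return id;
    }

    List<String> shippableOrders() {              // observable outcome
        return List.copyOf(placed);
    }
}

class OrdersApiTest {
    @Test
    void placedOrderBecomesShippable() {
        OrdersApi orders = new OrdersApi();

        String id = orders.placeOrder("customer-42");

        // We assert on what other contexts can observe, not on internal state.
        assertThat(orders.shippableOrders()).contains(id);
    }
}
```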

Does that mean that there should never be tests for individual classes/methods? No. There are bound to be cases where you are delivering functionality that has a huge number of variations, e.g. a credit score calculator. Testing such functionality through the module API might be impractical. In such a case a test for an individual class is highly recommended, with a caveat: this class should be a leaf. What does that mean?

This is not my idea, but I really like the analogy. When we look at our class dependency graph, there are usually classes that are roots, i.e. nothing depends on them. Those are usually our module APIs, aka boundary entry points. This is what we test. The other distinctive classes are those that depend on no other class in our module. Those are usually our repositories, utilities, etc., aka leaves. Why do we want the class that needs individual testing to be a leaf? Precisely because it has no dependencies. We do not have to mock any other service and thereby assume how it behaves. If we had mocks and the behavior changed, we would need to remember to fix them all. Without mocks we can truly test this class as standalone functionality. To make our class a leaf we usually need to refactor it out from the middle of the dependency graph, since, in my experience, such functionality ends up inside domain service classes. As an added benefit, such an operation usually improves our design.
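For illustration, here is a sketch of what such a leaf could look like, reusing the credit score example from above. The scoring rule and the class names are invented; the point is that the class has no dependencies, so its tests need no mocks.

```java
import static org.assertj.core.api.Assertions.assertThat;

import java.math.BigDecimal;
import java.math.RoundingMode;
import org.junit.jupiter.api.Test;

// A leaf: it depends on nothing else in the module, so it can be tested
// directly and exhaustively, with no mocks. The rule below is made up.
class CreditScoreCalculator {
    int score(BigDecimal monthlyIncome, BigDecimal monthlyDebt) {
        if (monthlyIncome.signum() <= 0) {
            return 0;
        }
        BigDecimal ratio = monthlyDebt.divide(monthlyIncome, 2, RoundingMode.HALF_UP);
        return ratio.compareTo(new BigDecimal("0.40")) > 0 ? 300 : 700;
    }
}

class CreditScoreCalculatorTest {
    private final CreditScoreCalculator calculator = new CreditScoreCalculator();

    @Test
    void highDebtToIncomeRatioLowersTheScore() {
        assertThat(calculator.score(new BigDecimal("1000"), new BigDecimal("800")))
                .isEqualTo(300);
    }

    @Test
    void lowDebtToIncomeRatioGivesAGoodScore() {
        assertThat(calculator.score(new BigDecimal("1000"), new BigDecimal("100")))
                .isEqualTo(700);
    }
}
```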

In summary, we want to test our bounded context APIs, aka module entry points, aka dependency tree roots. For special functionality that needs an individual suite of tests, we extract that functionality into a leaf class and test it in isolation. In my experience, this approach strikes the best cost-benefit balance. This rule of thumb is also quite simple to follow. No need for arguments about whether we need to test those getters with five other services mocked. As a matter of fact, I have become a firm believer that the existence of a mock is a code smell and a sign that the code probably should be refactored.

I think any text about tests would be incomplete without a mention of test coverage and its measurement. I will not go into much detail here, other than to say that I am against measuring line coverage and much prefer measuring branch coverage. I think it allows for better test optimization. Unfortunately, I have not found any good tools for measuring it, especially for large, multi-module code bases. Maybe I will eventually write my own or try to fix an existing one.
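A tiny, contrived example of why the distinction matters:

```java
class Pricing {
    // A single test calling discountPercent(true) executes every line of this
    // method, so line coverage reports 100%. The 'false' branch of the if is
    // never taken, though, so branch coverage reports only 50%.
    int discountPercent(boolean loyalCustomer) {
        int percent = 0;
        if (loyalCustomer) percent = 10;  // both branches live on this one line
        return percent;
    }
}
```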

As a last note, I would like to share my approach to writing tests in Java. Again, this is my preferred way. I really like how Spock (a Groovy test framework) forces the developer to use the BDD approach. I try to write all my tests using it; however, there is a huge downside to it: Groovy. As mentioned before, we write tests for code that lives, changes, and will be refactored. Groovy, being a dynamically typed language, is abysmal during refactors. Because of this, I write the tests (Specifications) in Spock, but all the fixture setup is done in plain Java. This way we get the best of both worlds: nice, readable tests, and fixtures that can react to refactors.
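As an illustration, here is a sketch of the Java side of that split; the Spock specification itself would be written in Groovy and simply call this builder. The LoanApplication type and the fixture are invented for the example.

```java
import java.math.BigDecimal;

// Plain Java fixture: because it is Java, IDE refactors (renames, signature
// changes) update it automatically, which Groovy code would not survive as well.
public final class LoanApplicationFixture {

    private String customerId = "customer-42";
    private BigDecimal amount = new BigDecimal("10000");

    public static LoanApplicationFixture aLoanApplication() {
        return new LoanApplicationFixture();
    }

    public LoanApplicationFixture withCustomerId(String customerId) {
        this.customerId = customerId;
        return this;
    }

    public LoanApplicationFixture withAmount(BigDecimal amount) {
        this.amount = amount;
        return this;
    }

    public LoanApplication build() {
        return new LoanApplication(customerId, amount);
    }

    // Minimal stand-in type so the sketch is self-contained.
    public record LoanApplication(String customerId, BigDecimal amount) {}
}
```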

There is also the subject of context bootstrapping. The most common way in Java, by far, is to wire your whole application using Spring and, during tests, start the whole context. I find this approach to be OK in a microservice world, but I still much prefer bootstrapping the dependencies manually. First of all, the tests will run much faster; second, I can clearly see how my dependencies interact and therefore check for smells and warning signs. Remember: Dependency Injection is not equivalent to IoC containers. This approach is pretty much the only sensible way in a large monolithic application.
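A sketch of what manual bootstrapping in a test can look like, with invented types standing in for a real module:

```java
import static org.assertj.core.api.Assertions.assertThat;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

// Invented types: a repository port, an in-memory adapter, and a module API
// that the test wires together by hand.
interface PaymentRepository {
    void save(String paymentId);
    List<String> all();
}

class InMemoryPaymentRepository implements PaymentRepository {
    private final List<String> payments = new ArrayList<>();
    @Override public void save(String paymentId) { payments.add(paymentId); }
    @Override public List<String> all() { return List.copyOf(payments); }
}

class PaymentsApi {
    private final PaymentRepository repository;

    PaymentsApi(PaymentRepository repository) { this.repository = repository; }

    String registerPayment(String customerId) {
        String id = "payment-for-" + customerId;
        repository.save(id);
        return id;
    }
}

class ManualBootstrapTest {
    @Test
    void moduleWiredByHandWithoutAnIoCContainer() {
        // Dependency injection done by hand: the whole object graph is
        // constructed explicitly, no container start-up, no @SpringBootTest.
        PaymentsApi payments = new PaymentsApi(new InMemoryPaymentRepository());

        assertThat(payments.registerPayment("customer-42"))
                .isEqualTo("payment-for-customer-42");
    }
}
```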

I have not mentioned so-called end-to-end tests, i.e. testing by, for example, clicking through the front-end application deployed on a close resemblance of the production environment. That is a topic for a whole other post.
