Integration and Unit Test Suitability
When do you write a unit test and when do you write an integration test?
The key test type should match the type of code under test. It's quite a simple formula:
- Integration code should be tested with integration tests.
- Algorithmic code should be tested with unit tests.
- Mixed code (tightly coupled algorithmic and integration code) should be tested with integration tests.
Is it really that simple?
It's commonly accepted that there are times when unit tests work very well and integration (or end to end) tests do not, and vice versa. What is not commonly agreed upon is why or when this is the case.
Indeed this question is mired in controversy. I think this is for five reasons:
- What is a unit test? This definition is vague and often argued over.
- What is an integration test? This definition is vague and argued over too.
- "Integration code" and "algorithmic code" are terms I needed to coin because they have no commonly used equivalent.
- A culture of dogmatism surrounds unit tests and TDD, e.g. "Am I suggesting 100% test coverage? No, I’m demanding it. Every single line of code that you write should be tested. Period." -- Uncle Bob
- The so-called "testing pyramid", which I believe is misleading and wrong (explained below).
For the purposes of answering this question, I will try to explain what I mean by integration code, algorithmic code and mixed code.
I will also document the definitions of unit test, integration test and end to end test which I am using.
Algorithmic code definition
Algorithmic code is code which makes decisions and runs calculations in memory. It does not interact with the outside world except via the code API calls used to invoke it. It could be stateful like a state machine or stateless like a simple method to calculate insurance premiums.
An entire project can be algorithmic code heavy (e.g. a financial model or a parser). An entire project can also be algorithmic code lite (e.g. a CRUD app).
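As a minimal sketch of what algorithmic code and its unit test might look like, here is a stateless premium calculation. The function name, the parameters and the pricing rule are all made up for illustration; the point is only that nothing here touches the outside world.

```python
def annual_premium(age: int, base_rate: float, claims_last_5y: int) -> float:
    """Hypothetical pricing rule: a pure, in-memory calculation with no I/O."""
    # Invented rule: drivers under 25 pay 50% more.
    age_factor = 1.5 if age < 25 else 1.0
    # Invented rule: each recent claim adds 20% to the premium.
    claims_factor = 1.0 + 0.2 * claims_last_5y
    return round(base_rate * age_factor * claims_factor, 2)


def test_young_driver_with_one_claim():
    # A unit test only needs to call the function and check the result.
    assert annual_premium(age=22, base_rate=500.0, claims_last_5y=1) == 900.0


def test_older_driver_without_claims():
    assert annual_premium(age=40, base_rate=500.0, claims_last_5y=0) == 500.0
```

Tests like these need no setup, no mocks and no services, which is why they run in milliseconds.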
Integration code definition
Integration code is code which interacts with the outside world. This code could call REST APIs, build and run queries on a database, present a web page to a browser, or receive and send messages on a message queue.
All code that is not exclusively algorithmic is integration code.
Integration code is always stateful. Most businesses write a lot more of this type of code than algorithmic code, but interview tasks tend to focus more on candidates' ability to write algorithmic code.
Mixed code definition
Mixed code is tightly coupled algorithmic and integration code. An example could be a method that calculates insurance premiums using a complex formula that also makes direct calls to the database.
This is often seen as an antipattern (more on this below).
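A rough sketch of mixed code, assuming a hypothetical `customers` table and an invented pricing rule, could look like this. The database query and the calculation live in the same function, so neither can be exercised without the other. The test uses an in-memory SQLite database purely to keep the example self-contained.

```python
import sqlite3


def premium_for_customer(conn: sqlite3.Connection, customer_id: int) -> float:
    """Hypothetical mixed code: a direct query plus pricing rules in one place."""
    # Integration part: talks straight to the database.
    row = conn.execute(
        "SELECT age, claims FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no customer with id {customer_id}")
    age, claims = row
    # Algorithmic part: the (invented) pricing rules.
    age_factor = 1.5 if age < 25 else 1.0
    return round(500.0 * age_factor * (1.0 + 0.2 * claims), 2)


def test_premium_for_customer_against_real_database():
    # Integration test: a real (in-memory) SQLite database, no mocks.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, age INTEGER, claims INTEGER)"
    )
    conn.execute("INSERT INTO customers VALUES (1, 22, 1)")
    assert premium_for_customer(conn, 1) == 900.0
    conn.close()
```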
Unit Test Definition
The most boring definition I can think of is:
- Written using an xUnit-style testing framework (pytest, JUnit, unit.js).
- Interacts with code APIs only - creating classes, calling methods, etc. on the code base under test.
- Mocks or stubs out I/O.
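A small sketch of a test that fits this definition, using pytest-style test functions and `unittest.mock` from the Python standard library. The function `latest_rate`, the injected `http_get` callable and the URL are hypothetical; the only point is that the test calls a code API and replaces the I/O with a stub.

```python
from unittest.mock import Mock


def latest_rate(http_get, currency: str) -> float:
    """Hypothetical function: fetches a rate through an injected callable."""
    payload = http_get(f"https://example.invalid/rates/{currency}")
    return float(payload["rate"])


def test_latest_rate_parses_payload():
    # The network call is stubbed out; no real I/O happens in this test.
    fake_get = Mock(return_value={"rate": "1.25"})

    assert latest_rate(fake_get, "EUR") == 1.25
    fake_get.assert_called_once_with("https://example.invalid/rates/EUR")
```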
Integration Test Definition
The most boring definition I can think of is:
- Written using any kind of framework (xUnit included).
- May interact directly with code APIs and may also interact directly with the app's interface (e.g. calling a REST API, clicking buttons on pages, etc.).
- Will probably use some kinds of I/O (e.g. reading and writing an actual file, actually calling a REST API).
- May interact with real or mock services (e.g. staging database, local database, APIs).
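By way of contrast with a unit test, here is a minimal sketch of an integration test that does real I/O. The `save_settings` and `load_settings` helpers are invented for the example; the test writes an actual file to a temporary directory and reads it back.

```python
import json
import tempfile
from pathlib import Path


def save_settings(path: Path, settings: dict) -> None:
    """Hypothetical integration code: writes settings to disk as JSON."""
    path.write_text(json.dumps(settings), encoding="utf-8")


def load_settings(path: Path) -> dict:
    """Hypothetical integration code: reads settings back from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


def test_settings_round_trip_through_a_real_file():
    # Real file I/O in a temporary directory that is cleaned up afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "settings.json"
        save_settings(path, {"theme": "dark", "retries": 3})
        assert path.exists()
        assert load_settings(path) == {"theme": "dark", "retries": 3}
```

There is almost nothing algorithmic to check here, so a unit test with the file system mocked out would mostly be testing the mock.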
End to End Test Definition
For the purposes of this essay, this is just another kind of integration test - testing at the boundaries of a project.
What constitutes an end to end test will vary depending on where you consider the boundaries of the project to be. For example, a REST API used by a mobile app could have an end to end test that calls its API and checks the response. Or you could say that a "true" end to end test on this project would have to test the usage of the mobile app directly and the API indirectly. The distinction does not matter though.
Is mixed code really an antipattern? Why should I test mixed code with integration tests?
Mixed code is often seen as an antipattern. This type of code is often called "untestable". This is a misnomer. It isn't actually untestable; you just can't write good unit tests for it.
It is also extremely common. If you join a new project, there is a good chance that there is a lot of mixed code.
This type of code can be tested with either unit tests or integration tests, but unit tests on this type of code will require a lot of mocks and/or stubs which will be both ugly and fragile.
Integration tests on this type of code will suffer from a different problem - they will be much slower. If there is a lot of complex decision-making or calculations going on in these methods then an integration test suite can take hours while the equivalent unit tests would take seconds.
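To illustrate what "ugly and fragile" can mean in practice, here is a sketch of a unit test on a hypothetical piece of mixed code. The function, the table and the pricing rule are all invented; `Mock` comes from the standard library's `unittest.mock`.

```python
from unittest.mock import Mock


def quote_for_customer(conn, customer_id: int) -> float:
    """Hypothetical mixed code: a query and pricing rules in one function."""
    row = conn.execute(
        "SELECT age, claims FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    age, claims = row
    age_factor = 1.5 if age < 25 else 1.0
    return round(500.0 * age_factor * (1.0 + 0.2 * claims), 2)


def test_quote_with_mocked_connection():
    conn = Mock()
    # The mock has to mirror the exact call chain the implementation uses.
    conn.execute.return_value.fetchone.return_value = (22, 1)

    assert quote_for_customer(conn, 7) == 900.0

    # Asserting on the SQL text ties the test to implementation details:
    # rewording the query or switching to fetchall() breaks the test even
    # when the observable behaviour is unchanged.
    conn.execute.assert_called_once_with(
        "SELECT age, claims FROM customers WHERE id = ?", (7,)
    )
```

The test passes, but it would also pass if the SQL were wrong for the real schema, and it fails on harmless refactors. That is the trade-off being described here.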
I take the following approach to mixed code:
- If the integration tests on this type of code are comprehensive enough and give quick enough feedback, I don't worry about it. This is very often the case. This is an approach that is in stark contrast to many others.
- If they don't run fast enough, I write integration tests on it anyway, and over time, refactor it to be "unit testable". This could mean using [hexagonal architecture](https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)), which is a common pattern separating integration code from algorithmic code.
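One way such a refactoring might end up, sketched under the assumption of a hypothetical `customers` table and an invented pricing rule: the calculation becomes a pure function, and the database access moves behind a small interface (a "port") with a SQLite implementation (an "adapter"). All names here are illustrative.

```python
import sqlite3
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Customer:
    age: int
    claims: int


class CustomerRepository(Protocol):
    """Port: what the pricing code needs from the outside world."""

    def get(self, customer_id: int) -> Customer: ...


def price(customer: Customer, base_rate: float = 500.0) -> float:
    """Algorithmic code: pure calculation, unit testable without mocks."""
    age_factor = 1.5 if customer.age < 25 else 1.0
    return round(base_rate * age_factor * (1.0 + 0.2 * customer.claims), 2)


class SqliteCustomerRepository:
    """Adapter: integration code, covered by integration tests."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def get(self, customer_id: int) -> Customer:
        row = self._conn.execute(
            "SELECT age, claims FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no customer with id {customer_id}")
        return Customer(age=row[0], claims=row[1])


def premium_for_customer(repo: CustomerRepository, customer_id: int) -> float:
    # Thin glue: fetch through the port, then apply the pure pricing rules.
    return price(repo.get(customer_id))
```

With this split, the many pricing cases can be covered by fast unit tests on `price`, while a handful of integration tests check that `SqliteCustomerRepository` really works against a database.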
What's wrong with the testing pyramid?
The testing pyramid, promoted by Google, advocates the idea that unit tests are necessary for all types of code and that they should form the majority of your test suite, largely because they are fast.
Its proponents do state that "even if the units work well in isolation, you do not know if they work well together". However, the pyramid assumes that units are always algorithmically complex enough to require a unit test. This part is wrong.
Many applications (e.g. CRUD apps) actually have so little algorithmic code that they do not need a single unit test, but will still benefit from integration tests.
Contrariwise, many projects have so little integration code that they do not require a single integration test (e.g. some financial models with simple inputs and outputs, parsers, etc.).
As the pyramid's proponents point out, higher level tests do tend to suffer more from flakiness, and they are sometimes harder to debug. These are both tricky engineering problems, but they are both soluble.
Examples of high quality integration tests
If you'd like to see examples of open source, high quality, state of the art integration tests, I've built 4 types of project here.
They are flake resistant and have built in debugging tooling to make tracking down errors as easy as it is in a unit test.