Code Coverage Considered Harmful

Code coverage is a measure of what proportion of a code base is executed as part of a test suite. At best it is a useful tool for writing tests, but at worst it falls victim to Goodhart’s law and incentivises bad or useless tests written just to push the number higher.

The Map Is Not The Territory

Code coverage is a very easy concept to explain and display to management. This makes it beguiling to use as a metric for test quality and coverage, but just because code is executed by a test doesn’t mean the code is being tested. All code coverage actually guarantees is the proportion of the code base that doesn’t crash the test runner; everything else goes unmeasured.

Code is only tested when the expected behaviour is checked against the actual behaviour. Just running code without checking what it actually does is almost useless, yet that is what code coverage measures.
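
As a minimal pytest-style sketch (the function and test names are made up for illustration), the test below earns full line coverage of apply_discount yet would pass no matter what the function returned, because it never checks the result:

    def apply_discount(price, percent):
        # Toy function under test (purely illustrative).
        return price - price * percent / 100

    def test_apply_discount_runs():
        # Executes every line of apply_discount, so a coverage tool reports it
        # as fully covered, but there is no assertion: any return value, even
        # a wildly wrong one, lets this test pass.
        apply_discount(200, 10)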

Code coverage vs quality tests

When a measure becomes a target, it ceases to be a good measure.

Marilyn Strathern, ‘Improving Ratings’: Audit in the British University System

Above I explained why code coverage doesn’t tell you much about test quality, but it is worse than that: code coverage actively incentivises bad tests.

First off, what makes a good test? A good test:

  • Is tied directly to specific project requirements, preferably only one.
  • Tests as small a slice of the code base as possible, to make tracking down bugs easier.
  • Checks the actual behaviour of the code against expected behaviour, failing if they don’t match.
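
For contrast with the assertion-free sketch above, a test that ticks all three boxes might look something like this (the requirement ID and the username rule are hypothetical):

    # REQ-042 (hypothetical requirement): usernames are stored lower-case,
    # with surrounding whitespace removed.
    def normalise_username(name):
        return name.strip().lower()

    def test_usernames_are_normalised_per_req_042():
        # One requirement, one small unit, and an explicit check of the
        # actual result against the expected result.
        assert normalise_username("  Alice ") == "alice"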

Code coverage encourages the exact opposite kind of test. The bigger the bite of the code base each test takes, the faster code coverage goes up. In addition, code coverage takes no account of whether a test actually checks anything; all it measures is whether the code is run.

Obviously code coverage doesn’t prevent good tests from being written, but test engineers are almost human, and if your performance is measured by code coverage rather than test quality, the temptation is always there.

How to use code coverage, and what to use instead

Despite all this, code coverage does have its uses. The important information from a code coverage tool is not the overall percentage but exactly which parts of the code base have yet to be tested at all; knowing which code no test has touched is genuinely useful to the people writing the tests.
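
As a rough sketch of that workflow with coverage.py (the run_test_suite entry point is hypothetical, and pytest users would more likely lean on the pytest-cov plugin), the part of the output worth reading is the list of missing lines, not the headline percentage:

    import coverage

    cov = coverage.Coverage()
    cov.start()
    run_test_suite()   # hypothetical: however your tests are actually invoked
    cov.stop()
    cov.save()

    # show_missing lists the exact lines no test touched, which is the
    # genuinely useful part of the report, rather than the overall percentage.
    cov.report(show_missing=True)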

As an alternative to measuring code coverage, I propose tracking requirements coverage: the percentage of requirements that have associated tests. Requirements coverage can then be combined with code coverage to reveal missing or implicit requirements and unnecessary code. Tracking requirements coverage isn’t as straightforward as code coverage, but tagging each test with the associated requirement(s) would work.
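
One possible sketch of that tagging uses pytest’s custom markers; the requirement IDs, the master requirements list, and the reporting at the end are all assumptions, and a real project might pull its requirement list from an issue tracker instead:

    # test_users.py  (the requirement ID is hypothetical)
    import pytest

    @pytest.mark.requirement("REQ-042")
    def test_usernames_are_normalised():
        assert normalise_username("  Alice ") == "alice"   # from the earlier sketch

    # conftest.py -- count which requirements have at least one tagged test.
    # Register the custom marker under [pytest] markers in pytest.ini to
    # silence unknown-marker warnings.
    REQUIREMENTS = {"REQ-001", "REQ-002", "REQ-042"}        # hypothetical master list

    def pytest_collection_modifyitems(session, config, items):
        covered = {
            mark.args[0]
            for item in items
            for mark in item.iter_markers(name="requirement")
        }
        missing = REQUIREMENTS - covered
        # Printed for brevity; a real setup might write a report or fail the build.
        print(f"Requirements coverage: {len(covered)}/{len(REQUIREMENTS)}, "
              f"untested: {sorted(missing)}")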

Conclusions

  1. Never use code coverage as a measure of test quality.
  2. Hide the actual percentage covered from anyone too pointy-haired to understand the limitations of what code coverage tells you.
  3. Instead of tracking code coverage, track requirements coverage.