Mutation Testing works

Disclaimer: These are Google topics, but everything here is based on publicly available resources. I’m writing this not because I get paid for it (I don’t), but because I am truly excited about it, it’s now public, and I hope it will find wider adoption.

Goran Petrović and Marko Ivanković published a nice paper on the “State of Mutation Testing at Google” a while ago [1].

I’m a fan of this, because Mutation Testing spots real test coverage issues which regular line-based coverage won’t catch. The approach presented in the paper runs mutation testing in the context of a code review, which narrows the surfaced issues to the code you’re already working on, and keeps them actionable.


To get an idea of how the approach in the paper works in practice, let’s assume we send out a code review that introduces a line such as if (a == b || b == 1) {, but we didn’t bother to test the new code properly. With mutation testing enabled, the infrastructure would catch this and report something like the following (from page 2 of the paper):

Changing this 1 line to

  if (a != b || b == 1) {

does not cause any test exercising them to fail.

Consider adding test cases that fail when the code is mutated to
ensure those bugs would be caught.

Mutants ran because goranpetrovic is whitelisted.

How did it figure that out?

Mutation Testing runs on a proposed change during a code review, so the problem is defined (a) by a set of changes to production code and (b) implicitly by the tests exercising that code.

The mutation testing infrastructure now attempts the following:

  1. Purposefully modify program logic in the code under test (e.g. negate an if condition) [2].
  2. Check whether any of the tests start failing.

We messed up the code under test, so with good “logical coverage”, at least one of the tests should now fail. If all tests still pass, we have apparently changed logic that wasn’t properly tested.
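To make the two steps concrete, here is a minimal sketch in Python. It is not the infrastructure from the paper: the mutation is written out by hand instead of being generated automatically, and all names (original, mutant, suite_passes, mutant_survives, weak_tests, strong_tests) are hypothetical. It reuses the condition from the example above and shows a mutant surviving a weak test suite and being caught by a stronger one.

```python
from typing import Callable, List, Tuple

Predicate = Callable[[int, int], bool]
# A test case is (a, b, expected result of the original condition).
TestCase = Tuple[int, int, bool]


def original(a: int, b: int) -> bool:
    """The condition from the example change: a == b || b == 1."""
    return a == b or b == 1


def mutant(a: int, b: int) -> bool:
    """The hand-written mutation from the report: a != b || b == 1."""
    return a != b or b == 1


def suite_passes(code: Predicate, tests: List[TestCase]) -> bool:
    """Return True if every test case passes against the given code."""
    return all(code(a, b) == expected for a, b, expected in tests)


def mutant_survives(tests: List[TestCase]) -> bool:
    """A mutant survives if the suite stays green after the mutation."""
    # The suite has to pass on the unmutated code to begin with.
    assert suite_passes(original, tests), "suite must pass on unmutated code"
    return suite_passes(mutant, tests)


# Weak suite: executes the line, but only with b == 1,
# where both versions of the condition return True.
weak_tests = [(1, 1, True), (2, 1, True)]

# Stronger suite: adds inputs where `a == b` alone decides the outcome.
strong_tests = weak_tests + [(2, 2, True), (2, 3, False)]

print(mutant_survives(weak_tests))    # True: the mutation goes unnoticed
print(mutant_survives(strong_tests))  # False: a test fails on the mutant
```

Note that the weak suite still executes the mutated line, so line-based coverage would look fine. Only the mutation shows that the a == b part of the condition was never actually checked.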

Mutation Testing is underrated

Even though it seems to come from a more academic corner, Mutation Testing works in practice. The examples in the paper are not made up, and they saved me from real bugs.

I hope that there will be more implementations in the wild at some point. It’s worth the effort.

  1. Google started publishing more material on internal developer tooling in recent years. I ♥ it because I can now point to papers rather than to Google PR when people ask about it. :)

  2. The list of implemented mutations is at the top of page 4 of the paper.