There are many ways to write code and tests.

You can write code with tests, without tests, tests first, tests last, tests sometimes, tests on Tuesdays, tests when you have the time, and on and on.

Because there are so many different ways of writing tests, the quality of tests also varies.

We have tools in modern languages to tell us how much of a given piece of code is exercised by tests. This is known as code coverage or test coverage and often results in a percentage, e.g. “This method is 87% tested.”

These percentages can be useful, but they can also be problematic because they treat every code path with equal weight. Not all code paths are created equal, and some naturally have more business value than others. What if the 20% of your code that was not covered was the 20% that handled billing?

So code coverage as a raw number is useful but not necessarily something to optimize for. For those who practice it, test-driven development often results in very high code coverage because code only gets written to make a failing test pass.

But not all code is created equal. Before we begin refactoring some code, we need tests in place to make sure the changes we make do not cause regressions. There’s a useful tactic for doing this, inspired by a throwaway line on a Destroy All Software screencast.

It’s called Counting the Contexts.

First, a background in RSpec and context blocks

This post assumes that you are using a Behavior-Driven Development framework like RSpec or Mocha. We assume BDD here because its tests are written in a different style, where “contexts” are named, nested, and reused.

In an xUnit- or Minitest-style testing framework, you will see tests written like this:

class TestUser < Minitest::Test
  def test_full_name_includes_the_middle_initial_when_middle_name_is_given
    user = User.new('Roy', 'Gee', 'Biv')
    assert_equal 'Roy G. Biv', user.full_name
  end
end

Because we have only two primitives here (classes and methods), and they are difficult to nest, you will often see lots of context jammed into method names or a big setup function. Long method names are tedious to read, and large setup functions are less than ideal because each test often will not need everything that was set up.

RSpec and other BDD-style frameworks let you nest contexts to provide structure to the code you write. Let’s see how that same test looks in RSpec:

RSpec.describe User do
  describe '#full_name' do
    subject do
      described_class.new(first_name, middle_name, last_name)
    end
    let(:first_name) { 'Roy' }
    let(:last_name) { 'Biv' }

    context 'middle name is given' do
      let(:middle_name) { 'Gee' }
      it { is_expected.to eq('Roy G. Biv') }
    end
  end
end

We have more lines of code here, but we also have some more structure. It’s very clear where we would add the test for when a middle name is not given. That new context block would only specify what changed and the new assertion.

Let’s see how both of these look when we write a test for no middle name. (Never mind that an API that takes nil as a positional parameter is pretty crappy):

class TestUser < Minitest::Test
  def test_full_name_includes_the_middle_initial_when_middle_name_is_given
    user = User.new('Roy', 'Gee', 'Biv')
    assert_equal 'Roy G. Biv', user.full_name
  end

  def test_full_name_does_not_include_middle_initial_when_no_middle_name_is_given
    user = User.new('Roy', nil, 'Biv')
    assert_equal 'Roy Biv', user.full_name
  end
end

vs. RSpec:

RSpec.describe User do
  describe '#full_name' do
    subject do
      described_class.new(first_name, middle_name, last_name)
    end
    let(:first_name) { 'Roy' }
    let(:last_name) { 'Biv' }

    context 'middle name is given' do
      let(:middle_name) { 'Gee' }
      it { is_expected.to eq('Roy G. Biv') }
    end

    context 'middle name is not given' do
      let(:middle_name) { nil }
      it { is_expected.to eq('Roy Biv') }
    end
  end
end

So we’ve added the same number of lines of code as in the Minitest example, but the RSpec version has less duplication in both the test bodies and the test names. Because there is less duplication, it’s very easy to see exactly what differs between the two tests. Our primate brains do not need to engage reading skills, and instead can just glance at shapes. Being able to quickly see what is being tested makes for a great test.

Now that we’re past some contrived examples of two different styles of testing, let’s see how these context blocks help us spot-check the test coverage.

Let’s Count

Let’s take a look at a more complicated example, based on something I’ve seen in the wild:

class FrequentFlier
  def self.build(customer, starting_balance = nil, status_level = nil)
    starting_balance = promotion_starting_balance || starting_balance || 0
    status_level = promotion_starting_status || status_level

    # Family members of airline employees get so many perks!
    if customer.related_to_employee?
      starting_balance += 10_000

      if ![:diamond, :platinum].include?(status_level)
        status_level = :gold
      end
    end

    {
      account_number: AccountNumber.generate,
      balance: starting_balance || 0,
      status_level: status_level || nil,
      miles_multiplier: customer&.miles_multiplier || 1
    }
  end
end

Alright, that looks sufficiently fabricated but shares the same shape as the code that inspired this post. Now let’s look at the corresponding test:

RSpec.describe FrequentFlier do
  subject do
    described_class.build(customer, starting_balance, status_level)
  end

  let(:customer) do
    build(:customer, related_to_employee: related_to_employee)
  end
  let(:starting_balance) { 1_000 }
  let(:status_level) { nil }

  context 'customer is related to an employee' do
    let(:related_to_employee) { true }

    it do
      is_expected.to match({
        account_number: an_instance_of(AccountNumber),
        balance: 11_000,
        status_level: :gold,
        miles_multiplier: 1
      })
    end
  end

  context 'customer is not related to an employee' do
    let(:related_to_employee) { false }

    it do
      is_expected.to match({
        account_number: an_instance_of(AccountNumber),
        balance: 1_000,
        status_level: nil,
        miles_multiplier: 1
      })
    end
  end
end

Now, let’s Count the Contexts: 2. The key to Counting the Contexts is that we want the number of contexts to match the number of conditional branches within the code. At first glance, it looks like we’re in pretty good shape: there’s only one top-level if block, and each side of it has a context.
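Counting the contexts can be done by eye, but as a rough sketch it can also be scripted. The count_contexts helper below is hypothetical, and its regex assumes each context sits on its own line in the form context '...' do, which will not hold for every spec file:

```ruby
# Crude count of `context` blocks in a spec file's source text.
# Assumption: one `context '...' do` per line, quoted description.
def count_contexts(spec_source)
  spec_source.scan(/^\s*context\s+['"]/).size
end

# A trimmed-down copy of the spec above, with the bodies removed.
spec_source = <<~SPEC
  RSpec.describe FrequentFlier do
    context 'customer is related to an employee' do
    end

    context 'customer is not related to an employee' do
    end
  end
SPEC

puts count_contexts(spec_source) # prints 2
```

The number it prints is only half of the comparison. The other half, the number of branches in the code under test, still has to be counted by reading the method.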

But after squinting at the code a bit, we realize there are many gremlins. Powerful language features are unfortunately also places to hide conditional logic. Default parameters, || fallbacks (and their cousin, the conditional assignment operator ||=), and safe navigation operators (&.) all make the execution of code conditional. They can be tough to spot, especially in long methods, because they are not as obvious as an if ... else.
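To make those gremlins concrete, here is a minimal sketch in plain Ruby. The User struct and the greeting method are made up for illustration and do not come from the code above. The method contains no if at all, yet it has three places where execution can go one of two ways:

```ruby
# Hypothetical example, not from the post: three hidden branches, no `if`.
User = Struct.new(:nickname)

def greeting(user = nil)      # hidden branch 1: argument omitted or given
  name = user&.nickname       # hidden branch 2: user is nil or present
  name ||= 'stranger'         # hidden branch 3: name is falsy or truthy
  "Hello, #{name}!"
end

puts greeting                      # Hello, stranger!
puts greeting(User.new('Roy'))     # Hello, Roy!
puts greeting(User.new(nil))       # Hello, stranger!
```

A line-based coverage tool would report every line of greeting as covered after the first call alone, even though only one side of each branch has run.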

For us to achieve 100% coverage of this method, we need to cover all of its conditional branches. Before refactoring an important piece of code, I will try to bring the test coverage up to 100%.

Now, let’s count how many context blocks there should be:

  1. starting_balance default parameter sets it to nil
  2. status_level default parameter sets it to nil
  3. starting_balance reassignment to a promotional balance
  4. status_level reassignment to a promotional level
  5. customer is related to an employee
  6. customer is related to an employee and has a :diamond or :platinum status level
  7. customer is present and returns a miles_multiplier
  8. customer is nil, or its miles_multiplier is nil, so the multiplier falls back to 1

Our new total number of contexts: at least 8. Our existing tests only have 2. A code coverage tool without intraline analysis would say we were 100% covered.[1] In reality, we are closer to 25% covered. Lordy.
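As a sanity check on that list, here is a plain-Ruby sketch that exercises each of the eight paths once. It is not the RSpec version: build_account is a stand-in for FrequentFlier.build, the Customer struct replaces the factory, the account number is left out, and the promotion_balance and promotion_status keyword arguments are assumptions, since the original never shows where promotions come from.

```ruby
# Stand-in customer; the real code uses a factory-built object.
Customer = Struct.new(:related_to_employee, :miles_multiplier) do
  def related_to_employee?
    related_to_employee
  end
end

# Stand-in for FrequentFlier.build. Promotions are passed in as keyword
# arguments here (an assumption) so that each branch can be reached.
def build_account(customer, starting_balance = nil, status_level = nil,
                  promotion_balance: nil, promotion_status: nil)
  starting_balance = promotion_balance || starting_balance || 0
  status_level = promotion_status || status_level

  if customer.related_to_employee?
    starting_balance += 10_000
    status_level = :gold unless %i[diamond platinum].include?(status_level)
  end

  {
    balance: starting_balance,
    status_level: status_level,
    miles_multiplier: customer&.miles_multiplier || 1
  }
end

relative = Customer.new(true, nil)
stranger = Customer.new(false, 2)

# One entry per item in the list above: [label, actual, expected].
CASES = [
  ['balance defaults to 0', build_account(stranger)[:balance], 0],
  ['status defaults to nil', build_account(stranger)[:status_level], nil],
  ['promotional balance wins',
   build_account(stranger, 500, nil, promotion_balance: 5_000)[:balance], 5_000],
  ['promotional status wins',
   build_account(stranger, 0, :silver, promotion_status: :platinum)[:status_level], :platinum],
  ['relative gets bonus and gold',
   build_account(relative, 1_000).values_at(:balance, :status_level), [11_000, :gold]],
  ['relative keeps platinum',
   build_account(relative, 0, :platinum)[:status_level], :platinum],
  ['multiplier comes from customer', build_account(stranger)[:miles_multiplier], 2],
  ['multiplier falls back to 1', build_account(relative)[:miles_multiplier], 1]
].freeze

FAILURES = CASES.reject { |_, actual, expected| actual == expected }.map(&:first)
puts FAILURES.empty? ? "all #{CASES.size} paths pass" : "failing: #{FAILURES.join(', ')}"
```

Each row maps to one context block you would write in the real spec. The nested RSpec version would share the setup through let and override only what changes in each context.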

Before making any changes to this method, we will want to augment the existing test cases to bring this up to a safer level.[2] I’ll leave adding the full RSpec test cases as an exercise for the reader.

Conclusion

Hopefully, you find this technique useful when examining existing code. It’s a simple way to gut-check a test coverage metric.

Over time, it’s easy to get good at this as well. Count the contexts, then count the conditional paths through the code. If they are way off, you probably need more tests.

I don’t usually see this type of context drought in code that has been TDD-ed. Instead, the Design Pressure is felt quickly when writing so many context blocks. More than 3 or 4 context blocks for a single method and I start thinking that this method has too many responsibilities.


Special thanks to Alex Navasardyan, Ben Vinegar, Marc Köhlbrugge, Justin Duke, and Nathaniel Watts for providing feedback on early drafts of this post.

  1. I assume that a code coverage tool with intraline analysis should be possible, but I have not seen one. If you know of any, drop me a line. 

  2. Alex Navasardyan pointed out that in languages like Elm and Scala, the code simply won’t compile if every conditional case is not accounted for.