The Cost of Unit Testing

Article summary

Tests We Can All Agree On
Risky Business
Now What?

How far should we take unit testing? Should every line of code be covered by a unit test? What about code that’s hard to test? Let’s look at the cost and value of unit testing in a couple of different situations.

Tests We Can All Agree On

It’s easy to see the value of unit tests when we’re writing pure functions, whose output depends only on their input. Let’s say we’re writing a helper function called next_valentines_day for our greeting card application. Here are some tests that we might write:

describe 'next_valentines_day' do
  it "returns next year's valentine's day when after Feb 14" do
    expect(next_valentines_day('2014-02-15')).to eq('2015-02-14')
  end
  it "returns same year's valentine's day when on Feb 14" do
    expect(next_valentines_day('2014-02-14')).to eq('2014-02-14')
  end
end

Even with just these two tests, we’ve significantly reduced the risk of calculating the wrong Valentine’s Day in our software. That means we’re less likely to lose valuable time fixing bugs in the code or dealing with other fallout from the code being wrong (like decreased greeting card sales).

From these tests, we might come up with this implementation in Ruby:

require 'date'

# Returns the next Valentine's Day on or after the given ISO 8601 date string.
def next_valentines_day(date_str)
  date = Date.parse(date_str)
  same_year_valentines_day = Date.new(date.year, 2, 14)

  if date <= same_year_valentines_day
    same_year_valentines_day.to_s
  else
    same_year_valentines_day.next_year.to_s
  end
end

This implementation works for now, but in the future we may decide to change it. We may want to swap out the underlying date library. We might discover a performance issue that needs to be fixed. We may want to extract some code that's been duplicated into a helper function. But none of these changes will invalidate the tests that we've written because we're not changing the actual behavior of the function.

Risky Business

Sometimes, we find ourselves writing code like this:

renderStatusBar: function() {
  var statusBarView = this.getStatusBarView();
  statusBarView.render();
}

This code is responsible for displaying the proper status bar on a web page. It doesn't display the status bar directly; instead, it delegates the rendering to a separate view class. Because it works through this side effect rather than returning a value, it's not a proper function like our date helper above.

How would we unit test this? How about this:

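var statusBarView = new SnazzyStatusBarView();
var mockStatusBarView = sinon.mock(statusBarView);
// Expectations must be registered before the code under test runs.
mockStatusBarView.expects('render').once();
sinon.stub(subject, 'getStatusBarView').returns(statusBarView);

subject.renderStatusBar();

// Fails the test unless render was called exactly once.
mockStatusBarView.verify();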

Unit testing with Sinon.JS

First, we stub out our private getStatusBarView method so that it returns a view wrapped in a Sinon mock. Then we call our method under test. Finally, we verify that the render method was called on the mocked view. (All together, this test verifies that our method calls render on whatever getStatusBarView returns, which is exactly what we wrote it to do.)

But writing tests like this is risky.

First, we risk losing time writing the test in the first place. It's tricky to properly stub, mock, and spy on the internals of a method. Dealing with asynchronous side effects can be even harder, and we might not even realize the side effects are asynchronous until the test begins to fail sporadically.

Second, if we decide to refactor a method like this, our test will fail even though we haven't broken any user-level functionality. We risk losing a large amount of time fixing broken tests like this, and the risk becomes greater as we add more tests.
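To make this concrete, here is a minimal sketch with hand-rolled stand-ins instead of Sinon. The names makeView, makeOriginal, makeRefactored, and the page object are hypothetical and exist only for illustration. The refactored version inlines the getter, the page ends up identical, and yet the implementation-coupled check can no longer pass.

```javascript
// Hypothetical stand-in for a view that writes to the page.
function makeView(page) {
  return { render: function () { page.statusBar = 'Connected'; } };
}

// Original shape: renderStatusBar fetches the view through a getter.
function makeOriginal(page) {
  var view = makeView(page);
  return {
    getStatusBarView: function () { return view; },
    renderStatusBar: function () { this.getStatusBarView().render(); }
  };
}

// Refactored shape: the getter is inlined, the page ends up the same.
function makeRefactored(page) {
  var view = makeView(page);
  return { renderStatusBar: function () { view.render(); } };
}

// Implementation-coupled check, mirroring a stub-and-mock style test.
function implementationTestPasses(makeSubject) {
  var subject = makeSubject({});
  var calls = 0;
  if (typeof subject.getStatusBarView !== 'function') {
    return false; // nothing left to stub after the refactor
  }
  subject.getStatusBarView = function () {
    return { render: function () { calls += 1; } };
  };
  subject.renderStatusBar();
  return calls === 1;
}

// What the user would see after rendering.
function renderedStatusBar(makeSubject) {
  var page = {};
  makeSubject(page).renderStatusBar();
  return page.statusBar;
}

console.log(implementationTestPasses(makeOriginal));   // true
console.log(implementationTestPasses(makeRefactored)); // false
console.log(renderedStatusBar(makeOriginal) === renderedStatusBar(makeRefactored)); // true
```

The failing check tells us nothing about what the user sees; it only reports that the internal wiring changed.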

On the other hand, we may avoid refactoring because we're aware of this high cost of test maintenance. In this case, we risk our code becoming stale and even more difficult to work with.

Now What?

The root problem is that we tested implementation when we should have tested behavior. We asserted that render was called on a mock object. But we don't actually care what method was called on what object. We don't even care whether objects or methods are used at all. We do care that the correct status bar is shown on the page, but we never actually check that.

We forced ourselves into testing the implementation of renderStatusBar because we defined our test unit incorrectly. We thought that because we created a JavaScript class with some methods, we should create a corresponding test class. However, renderStatusBar makes no sense as an isolated test unit. Instead, it's a method that's tightly coupled to another class.

To fix our problem, we should test units that make sense. Integration tests treat the full front-to-back stack as a unit and allow us to test behavior. They would likely test our rendering method adequately.

Because integration tests are slower and sometimes less reliable, we may want another type of test. We can create an actual unit test that directly calls renderStatusBar without stubbing any of its methods. The test would rely on all of the subviews.
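Such a test might look roughly like the sketch below. Everything here is an assumption made for illustration: the subviews ConnectionView and UnreadCountView, the constructor arguments of SnazzyStatusBarView, the makePage helper, and the plain object standing in for a DOM element. The point is only that nothing is stubbed and the assertion looks at what was rendered.

```javascript
// Hypothetical subviews; each one writes real output into a shared element.
function ConnectionView(el) { this.el = el; }
ConnectionView.prototype.render = function () {
  this.el.parts.push('Online');
};

function UnreadCountView(el, count) { this.el = el; this.count = count; }
UnreadCountView.prototype.render = function () {
  this.el.parts.push(this.count + ' unread');
};

// Hypothetical status bar view that composes the real subviews.
function SnazzyStatusBarView(el, unreadCount) {
  this.el = el;
  this.subviews = [new ConnectionView(el), new UnreadCountView(el, unreadCount)];
}
SnazzyStatusBarView.prototype.render = function () {
  this.el.parts = [];
  this.subviews.forEach(function (view) { view.render(); });
  this.el.text = this.el.parts.join(' | ');
};

// The object under test, shaped like the article's renderStatusBar.
function makePage(el, unreadCount) {
  return {
    getStatusBarView: function () {
      return new SnazzyStatusBarView(el, unreadCount);
    },
    renderStatusBar: function () {
      var statusBarView = this.getStatusBarView();
      statusBarView.render();
    }
  };
}

// The test: no stubs, no mocks; assert only on what was rendered.
var el = { text: '' }; // plain object standing in for a DOM node
makePage(el, 3).renderStatusBar();
console.log(el.text === 'Online | 3 unread'); // true
```

Because the assertion only inspects the rendered text, renderStatusBar can be restructured freely as long as the status bar still reads the same.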

Conversation

Pat Shaughnessy says:

May 28, 2014

Thoughtful article; thanks!

I would turn this around: Instead of deciding which functions are worth writing unit tests for, write more pure functions in the first place. This isn’t always possible (especially with JavaScript when you’re often highly coupled to the HTML document/DOM) but by writing more functional code in the first place you make it much easier to test and maintain your code later. In fact, this is one of the (maybe the most important) reasons for using TDD… you end up with more code that’s easy to test, since you force yourself to write those tests when you design the target code.

Ben Nash says:

Thanks Pat! I agree that making our code as functional as possible is the way to go.
