These intermittent test failures will not stand, man

I recently spent some time cleaning up a large test suite that had fallen into a bit of disrepair. The project was a Ruby on Rails web application tested with RSpec and Cucumber. The biggest problem was that various tests would fail intermittently. Some tests would pass consistently when run by themselves, but would frequently fail when the whole suite was run. Re-running the suite once or twice would often result in all tests passing, so the non-deterministic tests were largely ignored, eventually becoming “broken windows”:http://pragprog.com/the-pragmatic-programmer/extracts/software-entropy.

In his article on “Eradicating Non-Determinism in Tests”:http://martinfowler.com/articles/nonDeterminism.html Martin Fowler writes:

bq. Non-deterministic tests have two problems, firstly they are useless, secondly they are a virulent infection that can completely ruin your entire test suite. As a result they need to be dealt with as soon as you can, before your entire deployment pipeline is compromised.

I am now a firm believer that non-deterministic tests need to be addressed immediately. Here are some of the problems I ran into and what I did to make sure the tests pass every time they are run.

h3. Unique constraints with random test data

The project uses “randexp”:https://github.com/benburkert/randexp to populate fields in “Factory Girl”:https://github.com/thoughtbot/factory_girl factories. Several models have fields that require unique values, and as the test suite grew, a unit test would occasionally fail because of a duplicate entry.

To prevent this from happening, I added a sequence counter to the factory definitions for every unique field, as described in the “Factory Girl documentation”:https://github.com/thoughtbot/factory_girl/wiki/Usage.

FactoryGirl.define do
  factory :product do
    sequence(:name) {|n| "#{/\w{10}/.gen} #{n}" }
    description { /[:sentence:]/.gen }
  end
end
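
To see why the counter helps, here is a rough plain-Ruby stand-in for a sequence. It is only a sketch, not Factory Girl's implementation: the NameSequence class and its next_value method are made up for illustration. The random part is drawn from a deliberately tiny pool, so it repeats constantly, yet every generated value is still unique because of the counter.

```ruby
# Hypothetical stand-in for a factory sequence (not Factory Girl's code):
# each call to next_value passes an incrementing counter to the block.
class NameSequence
  def initialize(&block)
    @counter = 0
    @block = block
  end

  def next_value
    @counter += 1
    @block.call(@counter)
  end
end

# Only two possible random words, so the random part collides all the time.
names = NameSequence.new { |n| "#{%w[widget gadget].sample} #{n}" }
generated = Array.new(100) { names.next_value }

# The counter suffix keeps every value distinct despite the collisions.
puts generated.uniq.length == generated.length
```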

h3. Date changes during the test run

A number of features in the application deal with dates, so quite a few acceptance tests calculate an expected date and then walk through the application looking for the expected dates. The application and tests are configured to work with these dates in UTC. Since midnight UTC is 7pm EST, it was not uncommon for the test suite to be running during the date change in UTC. A test would start running in one day, and by the time it finished the date had changed and the expectations would not be met.

I resolved this issue by using “timecop”:https://github.com/travisjeffery/timecop to set the current time to 8:00 AM UTC before each test. When doing something like this, you need to be careful if your web app is running in a different process than your tests. If that is the case, the timecop adjustment affects only your tests, not your application. To update the time in the application process, I added some “test-only” API calls and used “rest-client”:https://github.com/archiloque/rest-client to set the time remotely from the tests.
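
A minimal sketch of the test-side half of this might look like the following. The morning_utc helper is a hypothetical name, and I am assuming Timecop.travel here; Timecop.freeze would work similarly if the clock should not advance at all. The hook is only registered when RSpec and timecop are already loaded, as they would be in a typical spec_helper.

```ruby
# Hypothetical helper: returns 8:00 AM UTC on the UTC date of `now`.
def morning_utc(now = Time.now)
  utc = now.getutc # getutc returns a new Time and leaves `now` untouched
  Time.utc(utc.year, utc.month, utc.day, 8, 0, 0)
end

# 7:30 PM at UTC-5 is already the next day in UTC.
p morning_utc(Time.new(2016, 4, 2, 19, 30, 0, "-05:00"))
# => 2016-04-03 08:00:00 UTC

# Assumes rspec and timecop have been required elsewhere; skipped otherwise.
if defined?(RSpec) && defined?(Timecop)
  RSpec.configure do |config|
    # Move the clock well away from midnight UTC before each example.
    config.before(:each) { Timecop.travel(morning_utc) }
    # Restore the real clock so the adjustment does not leak between examples.
    config.after(:each) { Timecop.return }
  end
end
```

This covers only the test process. The application process still needs its own adjustment, such as the test-only API calls described above.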

h3. Leaving a page before an AJAX request has completed

The project makes heavy use of client-side JavaScript and AJAX, and many of the pages retrieve additional data via AJAX after the page has loaded. Occasionally a test would fail while navigating from page to page without stopping to look at anything on a page. The errors that arose were cryptic and seemingly random.

These failures came up more frequently with the HtmlUnit browser used by the “Culerity”:https://github.com/langalex/culerity framework. But even after switching to “capybara-webkit”:https://github.com/thoughtbot/capybara-webkit some unexplained errors would still arise in these situations.

I was able to eliminate these random failures by waiting for pending AJAX requests to finish before navigating to another page. Usually this is done by waiting for some kind of indicator in the DOM, as described in the “Asynchronous JavaScript” section of the “Capybara documentation”:https://github.com/jnicklas/capybara.
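
When a page has no convenient DOM indicator, one alternative is to poll for a condition with a timeout. The helper below is a sketch of that idea rather than the exact code from this project; wait_for_condition and WaitTimeout are hypothetical names. It uses a monotonic clock so that changes to the wall-clock time do not distort the timeout.

```ruby
# Hypothetical error raised when the condition never becomes true.
class WaitTimeout < StandardError; end

# Hypothetical helper: calls the block repeatedly until it returns a
# truthy value, or raises WaitTimeout once `timeout` seconds have passed.
def wait_for_condition(timeout: 5, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Possible use in a Capybara step before leaving a page, assuming the
# application uses jQuery for its AJAX requests:
#
#   wait_for_condition { page.evaluate_script('jQuery.active').zero? }
```

The jQuery.active counter is only meaningful if every request goes through jQuery. For anything else, the block would need to check whatever signal the application exposes.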

h3. Not waiting long enough

The test suite runs on a continuous integration server that is shared with other projects. When the load on the CI server is heavy, the whole test suite can slow down, which means something that normally takes a second or two might jump up to 5 or 6 seconds. The default wait time for Capybara is 2 seconds. That seems to work pretty well for smaller test suites, or tests run by themselves on a lightly loaded machine, but I found it to be insufficient.

I changed the default wait time to 5 seconds.

Capybara.default_wait_time = 5

I also made use of the using_wait_time feature to wait even longer for some of the slower AJAX requests in the application.

using_wait_time 10 do
  page.should have_css('#product-loaded')
end

h2. Never again

I know it can be time-consuming to figure out why a test is failing randomly, but I am going to make a concerted effort to never again ignore non-deterministic tests.

I used the techniques discussed to rid my test suite of its intermittent test failures. Have you run into other problems like these, and if so, were you able to get your tests passing every time?

Conversation

Mitch VanDuyn says:

April 2, 2016

Hey thanks so much… this has been invaluable in getting our (pretty intensive) reactrb front end tests to all work reliably

Patrick Bacon says:

April 3, 2016

That’s great to hear Mitch! Glad I could help.

Giancarlos says:

July 15, 2016

Wow!!, nice job mate, thanks for all!
