System tests are a crucial piece of testing any application. I’m a big believer in isolated unit tests as well, but if I had to pick just one kind of test to use, it would be system tests. There is no substitute for actually exercising the full code base from top to bottom.
That said, I don’t know if any one thing about software development has caused me more frustration than system tests. System tests frequently involve implicit assumptions about time or synchronicity. This is especially true on a web stack, where the browser’s JavaScript processing generally occurs in a different process or thread than the code making assertions about the state of the DOM. Hence the many hours I’ve spent investigating “wobbly” tests, usually to discover that if some sequence of independent events happens in just the right way, my test will blow up.
Why System Tests Fail
Often these failures are the result of a drift between user experience and automated behavior. For example, the test might be trying to input data into multiple fields far faster than any human could, and each field update might trigger some asynchronous processing, or even an AJAX request. Or the test failure might be because I am not asserting on the right CSS selector to guarantee that asynchronous processing is complete.
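To make the selector point concrete, here is a toy, self-contained illustration in plain Ruby (not Capybara itself) of why a waiting assertion beats a one-shot check: retry the condition until a deadline, which is roughly what Capybara’s `have_css`/`have_content` matchers do internally. The `wait_until` helper is my own sketch, not a Capybara API.

```ruby
# Retry a block until it returns truthy or the deadline passes, the way
# Capybara's waiting matchers retry a DOM query internally.
def wait_until(timeout: 2, interval: 0.05)
  deadline = Time.now + timeout
  until yield
    raise "condition not met within #{timeout}s" if Time.now > deadline
    sleep interval
  end
  true
end

# Simulate asynchronous work (e.g. an AJAX request finishing) on another thread.
done = false
worker = Thread.new { sleep 0.2; done = true }

result = wait_until { done } # keeps retrying until the background work completes
worker.join
```

A one-shot `done == true` check right after starting the thread would fail intermittently; the retry loop succeeds deterministically as long as the work finishes inside the timeout.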
In any case, unexpected system test failures are a major pain. I’ve started to give serious thought to ways we might avoid this problem without compromising the integrity of our tests, but that’s a long blog post for another day.
Tracking Down Failures in Capybara
In the meantime, nearly any project using Capybara for system tests will need to investigate intermittent failures at some point. Usually there are three or four places I go looking for answers:
- The RSpec and/or Capybara error message and call stack at the point of test failure
- A screenshot at the time of failure, courtesy of Capybara-webkit’s save_screenshot functionality
- The HTML/DOM at the time of failure (save_page)
- JavaScript console log messages
Side note: ideally, rather than items 2 and 3, I would be able to inspect the captured HTML/DOM with its CSS applied, i.e., the page as it actually rendered. But as far as I know, no tool makes this easy.
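For reference, the kind of RSpec hook that gathers these artifacts on failure looks roughly like this. This is a sketch, not any gem’s actual implementation: the artifact path is illustrative, `save_screenshot` and `save_page` are standard Capybara session methods, and `console_messages` is specific to the capybara-webkit driver.

```ruby
RSpec.configure do |config|
  config.after(:each, type: :feature) do |example|
    next unless example.exception # only collect artifacts on failure

    base = "./spec/artifacts/example"   # in practice, name this per-test
    page.save_screenshot("#{base}.png") # screenshot at time of failure
    page.save_page("#{base}.html")      # HTML/DOM at time of failure
    # JS console output (capybara-webkit driver only)
    File.write("#{base}.log",
               page.driver.console_messages.map { |m| m[:message] }.join("\n"))
  end
end
```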
We used Circle CI on a recent project, and from time to time there were system test failures on the cloud instance that ran the tests; moreover, these failures were hard-to-impossible to reproduce locally. Circle CI lets you ssh into the machine where the test failed, but a) just because you’ve ssh’ed in is no guarantee that the test will fail again when you re-run it, and b) you can only stay ssh’ed in as long as you don’t start running new tests.
If You Can’t Beat ’em, Automate ’em
Wanting to see more details than just the RSpec/Capybara error message for these failures, I automated the generation of all the artifacts in the list above any time a system test fails. The output files are named based on the test file and line number, to make them easy to match with the failing test. Since I put them in the “artifacts” directory for Circle CI, the web interface provided links to each file and made it easy to inspect the artifacts and try to figure out what went wrong. In particular, the presence of JavaScript console messages helped elucidate some CI-only test failures that had been utterly opaque before.
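The file-naming idea can be sketched in a few lines of plain Ruby. The helper name and exact scheme here are illustrative, not the gem’s actual code; it just turns an RSpec example location like `./spec/features/login_spec.rb:42` into a filesystem-friendly basename.

```ruby
# Turn an RSpec example location ("./spec/features/login_spec.rb:42")
# into a safe artifact basename ("spec_features_login_spec_42").
def artifact_basename(location)
  location
    .sub(%r{\A\./}, "")   # drop the leading "./"
    .gsub(%r{[/:]}, "_")  # replace path separators and the line-number colon
    .sub(/\.rb_/, "_")    # drop the ".rb" extension fragment
end

puts artifact_basename("./spec/features/login_spec.rb:42")
# => "spec_features_login_spec_42"
```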
Not wanting to repeat myself every time I start up a new project using Capybara, I created the autopsy gem. Simply
require 'autopsy'
in your spec_helper.rb, and you’re off. If you want to change the output directory from “./spec/artifacts”, simply set
Autopsy.artifacts_path = "./some/other/path"
Now the next time you are surprised by a time-sensitive test failure, you’ll already have all the gory details at hand when you begin your investigation. Happy sleuthing!