Finding good test input can be tricky. Even with loads of unit tests, bugs still get through.
Consider a wear-leveling algorithm for flash memory — it takes a series of write operations and spreads them over the flash, because individual blocks can only be written and erased so many times before they wear out. If any write sequence leads to writes concentrating in specific blocks, something isn’t working.
While dozens of tests could sufficiently exercise the algorithm’s boundary conditions, actually finding them all can be as difficult as correctly implementing the algorithm in the first place. These tests would also be overly coupled to the implementation, adding a maintenance burden if implementation details change.
The important thing isn’t that those tests’ examples continue to work, but that the algorithm works in general. Besides, it’s easy to check if the wear leveling works for any particular input — have a test mode that updates per-block write counters but doesn’t actually write, feed the input to it, and check if the counters are updating evenly. Why not just generate input and let that find the edge cases?
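As a rough illustration, the counter-based check described above might look something like this sketch. All of the names here (`wear_sim`, `choose_block`, `simulate_writes`, `writes_are_even`) are made up for illustration, and the round-robin chooser is only a placeholder for real wear-leveling logic:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_COUNT 64

struct wear_sim {
    size_t next;                         /* state for the stand-in chooser */
    uint32_t write_counts[BLOCK_COUNT];  /* per-block write counters */
};

/* Stand-in for the wear-leveling logic under test. A real implementation
 * would weigh erase counts, free space, and block health; this placeholder
 * just rotates round-robin so the sketch has something to check. */
static size_t choose_block(struct wear_sim *sim, uint32_t logical_addr) {
    (void)logical_addr;
    return sim->next++ % BLOCK_COUNT;
}

/* Test mode: feed a write sequence through the chooser, updating
 * counters only -- no actual flash writes. */
static void simulate_writes(struct wear_sim *sim,
                            const uint32_t *addrs, size_t len) {
    for (size_t i = 0; i < len; i++) {
        sim->write_counts[choose_block(sim, addrs[i])]++;
    }
}

/* The check: no block should absorb much more than its fair share of
 * the writes. The tolerance factor of 2 is an arbitrary choice. */
static bool writes_are_even(const struct wear_sim *sim, size_t total) {
    uint32_t limit = (uint32_t)(2 * (total / BLOCK_COUNT) + 1);
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (sim->write_counts[i] > limit) { return false; }
    }
    return true;
}
```

With a check like `writes_are_even` in hand, any generated write sequence can be verified cheaply, without knowing anything about the algorithm’s internals.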
Similarly, many bugs result from incorrect assumptions rather than from failing to accurately implement a design. When developers write tests for their own code, the code and tests tend to share the same expectations, so input that “shouldn’t” happen but is permitted by interfaces and types can often uncover bugs.
But how do you write tests for assumptions you don’t know you’re making?
Besides writing test cases based on specific examples, developers can also use a complementary approach called “property-based testing”. Rather than checking the result for a specific input, a property is asserted (“for any possible input, [some condition] should hold”), and a test runner searches for counterexamples. Typically, if it finds a combination of arguments that causes the property to fail, it then searches for simpler versions of those arguments that still fail, and finally prints the minimal failing input.
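To make the generate/check/shrink loop concrete, here is a toy sketch of that mechanism in plain C. It is nothing like a real property-testing library — the names (`saturating_double`, `prop_double_ge`, `find_counterexample`) are invented, and the shrinking strategy (greedy halving) is far cruder than what real tools do — but it shows the shape of the search:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* A deliberately buggy doubling routine: it is supposed to saturate at
 * UINT32_MAX, but instead wraps around for x >= 2^31. */
static uint32_t saturating_double(uint32_t x) {
    return x * 2; /* bug: no saturation, unsigned wraparound */
}

/* The property: doubling should never make a value smaller. */
static bool prop_double_ge(uint32_t x) {
    return saturating_double(x) >= x;
}

/* Toy runner: try random inputs, and when one fails the property,
 * greedily shrink it by halving for as long as the smaller input still
 * fails, then report the (locally) minimal failing input. */
static bool find_counterexample(bool (*prop)(uint32_t),
                                unsigned trials, uint32_t *out) {
    for (unsigned i = 0; i < trials; i++) {
        /* assemble a full 32-bit value; rand() may only yield 15 bits */
        uint32_t x = ((uint32_t)(rand() & 0x7fff) << 17)
                   | ((uint32_t)(rand() & 0x7fff) << 2)
                   | ((uint32_t)rand() & 0x3);
        if (!prop(x)) {
            while (x > 0 && !prop(x / 2)) { x /= 2; }
            printf("minimal failing input: %u\n", (unsigned)x);
            *out = x;
            return true;
        }
    }
    return false;
}
```

Run with enough trials, `find_counterexample(prop_double_ge, …)` reliably stumbles onto some value at or above 2^31 and reports a shrunk version of it. Real libraries add typed generators, smarter shrinking, and reproducible seeds on top of this basic loop.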
Property-based testing stresses programs differently than tests biased by how the program “should” work. Like using fuzz testing to find crashes or security vulnerabilities, this can discover edge cases that have not been covered by unit tests. It also generates thousands of tests with just a few lines of code, so it’s a great way to get quick feedback on code that is still evolving. Since it treats the code as a black box, code changes that affect the implementation but not the external interface won’t lead to extra test upkeep.
Tools for Property-Based Testing
This style of testing is associated with a Haskell testing tool called QuickCheck, but similar libraries have been ported to other languages (such as Erlang’s QuickCheck port or triq, Clojure’s test.check, and Scala’s ScalaCheck).
I’ve been working on an implementation for C, “theft”, so I can use property-based testing on embedded code. C has very different design constraints than those other languages (in particular: no automated memory management, and a general lack of reflection about types / structs), but the overall workflow is similar — write property tests to check whether the code under test is working as expected, run the tests with generated input, and address any edge cases found.
More about theft in part two tomorrow!