While unit and integration testing can verify that code correctly handles known edge cases, randomized testing can detect the edge cases themselves. The programmer describes their assumptions about a function’s output and how to randomly generate representative input, then lets the test runner search for counterexamples. This is particularly useful when checking algorithms’ invariants, and when auditing software for security vulnerabilities (commonly called fuzz testing).
One of the best-known implementations of randomized testing is QuickCheck, a Haskell library. It uses the Arbitrary typeclass as a hook to specify how values of a given data type are generated, matches those generators with the inferred argument types of each property, and searches for inputs that violate it. There’s nothing specific to Haskell about the underlying idea, however: clones exist for Erlang and several other languages.
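As a rough illustration of that idea in plain Lua, a property check is little more than a loop that generates an input from a seed, runs the property, and reports the seed of the first failure so it can be replayed. This is only a sketch: `make_rng`, `check`, and `gen_byte` are hypothetical names, not part of QuickCheck or Lunatest, and the example property is deliberately false so that the search has something to find.

```lua
-- Park-Miller generator, so a given seed replays the same values everywhere.
-- (Hypothetical helper; a real library would supply its own.)
local function make_rng(seed)
  local state = seed % 2147483646 + 1          -- keep state in [1, 2147483646]
  return function(lo, hi)
    state = (state * 16807) % 2147483647
    return lo + state % (hi - lo + 1)          -- integer in [lo, hi]
  end
end

-- Run `prop` against `trials` generated inputs.
-- Returns true, or false plus the seed and input of the first counterexample.
local function check(prop, gen, trials, base_seed)
  for i = 1, trials do
    local seed = base_seed + i
    local input = gen(make_rng(seed))
    local ok, result = pcall(prop, input)      -- an error counts as a failure
    if not (ok and result) then
      return false, seed, input
    end
  end
  return true
end

-- Generator: one random integer in [0, 255].
local function gen_byte(rng) return rng(0, 255) end

-- A deliberately wrong assumption: "every byte is below 200".
local ok, seed, input = check(function(b) return b < 200 end, gen_byte, 100, 0)
print(ok, seed, input)
```

Because the failing seed is returned along with the input, calling `gen_byte(make_rng(seed))` rebuilds exactly the same counterexample later.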
Lunatest, my testing library for Lua, also has functionality for randomized testing. Here is an example of it being used on an implementation of run-length encoding, a simple data compression algorithm that is particularly useful on embedded platforms (which usually do not have the space for libraries like zlib).
- I specified how to generate appropriate input (a sorted array of 8-bit integers, since my use case for the library involves compressing sorted, delta-encoded integer keys).
- I added two basic unit tests for encoding and decoding known input.
- I described four properties I expect to hold:
  - Encoding and decoding should be lossless.
  - Encoding should compress the data (with varying success).
  - Encoding the same data multiple times should have rapidly diminishing returns, as almost all repeating values should be removed in the first pass.
  - Encoded values should no longer have long runs of duplicate values.
- I implemented my encode and decode functions. (These are the final versions.)
- The property tests detected a couple of edge cases:
  - When I had exactly 2 literal zeroes, I wasn’t escaping them properly.
  - I had an off-by-one error with content immediately following runs of zeroes. (Oops.)
- Finally, I added the seeds that triggered those failures to the lossless property test, identified the underlying issues, added regression tests to document them more clearly, and made the new tests pass.
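The workflow above can be sketched end to end in plain Lua. This is not the Lunatest API, and it is not the encoder described in this post: the flat (count, value) format and every name here (`make_rng`, `gen_sorted_bytes`, `encode`, `decode`, `same`, `longest_run`) are assumptions made for illustration. Only two of the four properties are checked, and the bound of 3 on run length is specific to this hypothetical format with inputs shorter than 255 items.

```lua
-- Park-Miller generator so a seed replays identically (hypothetical helper).
local function make_rng(seed)
  local state = seed % 2147483646 + 1
  return function(lo, hi)
    state = (state * 16807) % 2147483647
    return lo + state % (hi - lo + 1)
  end
end

-- Input generator: a sorted array of 0..64 integers, each in [0, 255].
-- The upper bound varies per array so some inputs contain long runs.
local function gen_sorted_bytes(rng)
  local t, hi = {}, rng(0, 255)
  for i = 1, rng(0, 64) do t[i] = rng(0, hi) end
  table.sort(t)
  return t
end

-- Assumed format: a flat list of (count, value) pairs, count in 1..255.
local function encode(input)
  local out, i = {}, 1
  while i <= #input do
    local value, count = input[i], 1
    while count < 255 and input[i + count] == value do
      count = count + 1
    end
    out[#out + 1] = count
    out[#out + 1] = value
    i = i + count
  end
  return out
end

local function decode(encoded)
  local out = {}
  for i = 1, #encoded, 2 do
    for _ = 1, encoded[i] do out[#out + 1] = encoded[i + 1] end
  end
  return out
end

-- Element-wise array comparison.
local function same(a, b)
  if #a ~= #b then return false end
  for i = 1, #a do
    if a[i] ~= b[i] then return false end
  end
  return true
end

-- Length of the longest run of equal adjacent elements.
local function longest_run(t)
  local best, run = 0, 0
  for i = 1, #t do
    run = (t[i] == t[i - 1]) and run + 1 or 1
    best = math.max(best, run)
  end
  return best
end

local properties = {
  lossless     = function(input) return same(decode(encode(input)), input) end,
  no_long_runs = function(input) return longest_run(encode(input)) <= 3 end,
}

-- Try each property on 500 seeds and record any (property, seed) that fails.
local failures = {}
for name, prop in pairs(properties) do
  for seed = 1, 500 do
    if not prop(gen_sorted_bytes(make_rng(seed))) then
      failures[#failures + 1] = { name = name, seed = seed }
    end
  end
end
print(#failures == 0 and "all properties held"
      or (#failures .. " failing (property, seed) pairs"))
```

Any seed recorded in `failures` can be pinned as a regression test, since `gen_sorted_bytes(make_rng(seed))` rebuilds the same input every time. Unit tests on known input sit alongside the properties, for example checking that `encode({0, 0, 0, 7})` produces the pairs `3, 0, 1, 7`.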
Randomized testing is complementary to other automated tests: it can find edge cases in the space outside of the test coverage. While it’s important to keep potential edge cases in mind while developing, focused random stress can bring many unexpected bugs into the light.