QuickCheck in Ruby

Scott recently posted about the theft property-based testing library in C. Theft is very similar to Haskell’s QuickCheck. Theft tends to have a little less magic than Quick Check for generating random input and for failure case reduction. Theft makes them more manual, but also gives you better control of how they work. The theft ruby gem is a direct port to allow the same kind of testing in Ruby. Let’s take a look at the value in property-based testing and walk through a contrived example in Ruby.

Testing complicated algorithms can be, well, complicated. Typically, a developer will try to reason about their code and come up with a representative set of test cases to exercise the normal flow of the algorithm as well as all of the edge cases. This approach can leave holes in test coverage and brittle tests to maintain. That leads us to property-based testing. Basically, generate valid, randomly generated input and validate that a particular property is true.

Scott’s post does a good job of showing a real world example on the heatshrink compression library. I’ll use a contrived example to show how the theft gem can be used in Ruby.

Ruby Property-Based Testing Example

Let’s say we’ve written a new sorting algorithm, and we’d like to make sure it’s a stable sorting algorithm. A stable sort simply means that any elements that sort equally maintain their original order once sorted. Our Sorter.sort method is all setup to sort an Enumerable of BoxedFixNums. We box them because Ruby FixNums are global, and 5 is actually the same instance as any other 5, making the stable sort issue moot.

To use theft, we need to tell it about our property, a lambda that returns :pass / :fail, and our arguments. For each argument we must provide a descriptor that fulfills these requirements:

  1. setup generates the input based on the random number generator
  2. shrink attempts to shrink the input
  3. to_s prints out the input when a failure is detected
  4. hash hashes the input for results caching purposes

Generate Random Data

Each argument to the function under test needs a way to generate random inputs based on a random number generator. Theft maintains its own random number generator (RNG) to avoid interaction with the code under test. Always use the RNG that is passed into your descriptor. Spurious calls to rand() can cause theft to give different or inconsistent results. Our sorting example only requires a list of BoxedFixNums to be generated:

class BoxedFixNumArgDescriptor
  def self.setup(rng, args={})
    [].tap do |ar|
      rng.rand(0..20).times do |i|
        ar << b(rng.rand(0..100))
      end
    end
  end
end

Validate Property

We need to make sure that our property holds true. Let's define a lambda that takes our generated input. Our function only needs one input, so the lambda only takes one argument. We return :pass if the sort is stable and :fail if it is not.

  property_sorting_should_be_stable = lambda do |generated_arg| # , other_generated_arg, etc 
    sorted = Sorter.sort(generated_arg)

    sorted.chunk{|item| item.n}.each do |n, items|
      if items.size > 1
        # lazily spot check
        first = items.first
        last = items.last

        if generated_arg.index(first) > generated_arg.index(last)
          return :fail unless sorted.index(first) > sorted.index(last)
        else
          return :fail unless sorted.index(first) < sorted.index(last)
        end
      end
    end

    :pass
  end

Failure Reduction

The best feature of theft is how it can reduce a failure case to its minimal input to help you diagnose the bug. Theft is searching through the different possibilities for simpler input that still causes the failure. To accomplish this, each argument descriptor needs a shrink method that is given an input and tries a number of tactics to shrink the input. Theft will try each tactic until shrink returns :dead_end or the input no longer fails the property test, then it increments the tactic. Once all tactics have been tried, indicated by returning :tried_all_tactics, theft will print out the shrunken input set. A Descriptor must always shrink the input, return :dead_end, or return :tried_all_tactics. Failure to do so may cause infinite loops. Let's look at our BoxedFixNumArgDescriptor shrinking a list of BoxedFixNums:

class BoxedFixNumArgDescriptor
  def self.shrink(list, tactic)
    size = list.size
    if size < 2
      return :dead_end 
    end
    case tactic
    when 0 # first half
      return list[0..size/2]
    when 1 # second half
      return list[size/2..-1]
    when 2 # drop first
      return list[1..-1]
    when 3 # drop last
      return list[0..-2]
    when 4 # drop middle
      mid = size/2
      return list[0..mid-1] + list[mid+1..-1]
    end

    return :tried_all_tactics
  end
end

Put it All Together

Here's the output of our run. I've left in the initial failure to emphasize the input reduction:

  t = Theft::Runner.new autosize: true
  property_sorting_should_be_stable = ... # see above
  config = {
    description: "sorting should be stable",
    property: property_sorting_should_be_stable,
    arg_descriptors: [BoxedFixNumArgDescriptor],
    trials: 30
  }

  t.run config

Output:

ruby examples/stable_sort.rb =>
seed: 1408724650
.failed on trial: 1
[66] 26,14,63,70,23,100,38,64,41,92,74,88,28,38,79,76,3,3,90,80,81,61,54,19,39,66,31,4,11,100,43,53,15,96,23,16,25,99,13,76,19,1,69,41,44,52,60,17,16,60,76,8,84,76,24,16,53,51,18,0,69,79,18,88,89,59
shrunk to
[3] 76,3,3
...

Next Steps

This is an early release with a lot of potential. Feedback and pull requests are very welcome.

  • command line utility
  • cleaner DSL
  • rspec integration

Overall property-based testing can help get better test coverage and to create a flexible, less brittle test suite. Check out the theft gem next time you want to write property based tests in Ruby.