Real-time Generation of Numeric Data Fixtures in IronPython

Recently I’ve been working on a .NET app that communicates with multiple internet end points. Each end point is a data collector, and numeric data is retrieved over these connections in real time.

It is important for the app to be able to do some analysis of the data, such as minimum and maximum value, or the rate of change of a value. In order to write meaningful system tests over this functionality, I needed to simulate the data that will be generated by the end points.

Weighing My Options

There are a number of ways one could go about this. For example, the numeric data could be stored as binary-file fixtures and split up into “frames” for each response. However, this approach is problematic for two reasons. First, if the “frame rate” of the simulated end point fluctuates substantially, either the data output will get out of sync with the intent, or the code that is sending the fixture data has to be smart enough to interpolate frames. Second, it is non-trivial to create the amounts and variety of data required for sufficient testing, to the point where you would very likely want to be able to automatically generate the data given certain rules. By the time you are doing that, you might as well have your data simulator move up a level of abstraction.

I decided to create a simulation engine that can generate these responses. The engine builds a response from the results of a number of functions f(x), where x is time and the return value is whatever type of data is expected in a given response, generally a scalar like a float or int but also sometimes an array.
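As a rough sketch of that idea, written in Python since that is the language this post ends up using, a response can be assembled by sampling a set of named functions at a single moment in time. The field names, the `FIELDS` table, and the `build_response` helper are all invented for illustration; they are not the engine's actual API.

```python
import math

# Hypothetical fixture: each field of a response maps to a function of time (seconds).
FIELDS = {
    "temperature": lambda t: 20.0 + 0.5 * t,                    # scalar float
    "counter": lambda t: int(t) * 3,                            # scalar int
    "spectrum": lambda t: [math.sin(t + i) for i in range(4)],  # array
}

def build_response(fields, time):
    """Sample every field function at the same instant and collect the results."""
    return dict((name, fn(time)) for name, fn in fields.items())

response = build_response(FIELDS, 2.0)
```

Because every field is just a function of time, the simulated frame rate can fluctuate freely: the engine simply asks for the values at whatever time the next response happens to be due.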

Because I wanted to be able to define or alter these functions in my data fixtures, I needed some degree of scripting. It’s probably possible, with a lot of effort, to compile external C# code inside a pre-compiled project. However, this didn’t seem like a lot of fun, and moreover, it seemed like overkill for the simple little math functions I wanted to write. Another option would have been to write a parser & script engine that only handled one or two functions. However, I know from experience that developing a custom scripting engine is a bad idea. No matter how small it is intended to be, a scripting engine is one of those tasks that grows and grows into a behemoth, into a giant Katamari ball of code.

Using IronPython

I decided my best shot was to try and leverage a scripting language that already existed. After a bit of googling I settled on IronPython. I was really impressed by how easy it was to get a Python instance up and running in my .NET project. We’re talking JRuby easy, maybe easier. In a few hours, I was running my Python scripts in the project and generating responses.

Because IronPython uses dynamic typing and because I knew I would be generating up to several hundred responses per second, I did a little testing to make sure there were no performance problems. Other than the initial compilation required for the first invocation, everything was really fast. I decided to pay that compilation cost up front before the simulation actually began, by calling the function with time equal to zero and ignoring the result.
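A minimal sketch of that warm-up step, assuming the fixture functions are available as a list. The names `warm_up` and `sample_value` are hypothetical, and the `calls` list exists only so the warm-up call can be observed in this example.

```python
def warm_up(functions):
    """Call each function once at time zero and discard the result."""
    for fn in functions:
        fn(0)

calls = []

def sample_value(time):
    calls.append(time)  # recorded only to make the warm-up call visible here
    return time * 2

# Pay the first-invocation cost before the simulation starts.
warm_up([sample_value])
```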

I could have just written a plain function in Python to return the numeric data value, of the form value = f(time), such as the following:

def get_a_value(time):
  return time * 2

However, I knew one common thing I would want to do was linearly interpolate between two numbers over a fixed span of time. To solve this, I decided my scripts would use higher-order functions:

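def lerp(start, end, over):
  # Bounds used to clamp the result, whichever direction the ramp runs
  lo = min(start, end)
  hi = max(start, end)
  def lerp_t(time):
    # Interpolate linearly, clamped so the value never leaves [lo, hi]
    return max(lo, min(hi, start + ((end - start) * time / float(over))))
  return lerp_t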

This way, when I want to do a linear interpolation, I don’t have to write any new code other than invoking the higher-order function:

lerp(100,200,5) # returns a function to go from 100 to 200 over 5 seconds
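To make the behavior concrete, here is a small self-contained example that repeats the `lerp` definition from above and samples the returned function at a few times. The variable names `ramp` and `samples` are just for illustration.

```python
def lerp(start, end, over):
    lo = min(start, end)
    hi = max(start, end)
    def lerp_t(time):
        return max(lo, min(hi, start + ((end - start) * time / float(over))))
    return lerp_t

ramp = lerp(100, 200, 5)

# Sample at the start, the midpoint, the end, and well past the end.
samples = [ramp(t) for t in (0, 2.5, 5, 10)]
# The values are 100, 150, 200 and 200: the last one is clamped at the end value.
```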

And from there it’s just a hop, skip, and a jump to chaining higher-order functions together for more complex behavior:

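# .then() is not defined by lerp as shown above; this assumes the returned function is wrapped in an object that provides it
lerp(100,200,5).then(lerp(200,100,50)) # value dies off gradually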
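Plain Python functions have no `then` method, so chaining like this needs a little support code. The sketch below is one possible way to get it, not necessarily how the original fixtures did it: a hypothetical `Timed` wrapper pairs each function with its duration, and `lerp` is changed to return that wrapper instead of a bare function.

```python
class Timed(object):
    """Hypothetical helper: a function of time plus how long it runs."""
    def __init__(self, fn, duration):
        self.fn = fn
        self.duration = duration

    def __call__(self, time):
        return self.fn(time)

    def then(self, following):
        """Run this segment first, then `following` with its clock restarted at zero."""
        first = self
        def chained(time):
            if time < first.duration:
                return first(time)
            return following(time - first.duration)
        return Timed(chained, first.duration + following.duration)

def lerp(start, end, over):
    lo = min(start, end)
    hi = max(start, end)
    def lerp_t(time):
        return max(lo, min(hi, start + ((end - start) * time / float(over))))
    # Assumption: return the wrapper so that .then() is available.
    return Timed(lerp_t, over)

# Rise from 100 to 200 over 5 seconds, then fall back to 100 over the next 50.
envelope = lerp(100, 200, 5).then(lerp(200, 100, 50))
```

Since `then` returns another `Timed`, segments can keep being chained, and the combined duration (55 seconds here) is carried along for the next link.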

I’m really glad I chose IronPython for this task. It saved me a ton of work not to have to create and interpret my own scripts. Where have you pulled in a library that has saved you tremendous effort over crafting your own code?