Testing and Legacy Code, A Primer

In the last few weeks, customers and potential clients have asked me on several occasions how Test-Driven Development relates to legacy code (incidentally, one fitting definition for legacy code is code having no test suite).

As much as we all might like to throw out legacy code (especially when it’s craptacular) and “do it right”, that’s often an entirely impractical option or just a plain dumb thing to do. The reality is that there’s no magic formula or one-size-fits-all procedure for testing legacy code. Each case is unique and requires a thoughtful, responsible approach. That said, here are some guidelines for tackling the challenge.

Continuous Integration:

  • Set it up. First. Get a stable, automated, repeatable build process in place.
  • Add your system and unit tests (below) to the build as you create them.
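The two steps above can be sketched as a single fail-fast build script. This is a minimal illustration, not a real CI setup; the `make` targets are hypothetical stand-ins for whatever your build actually invokes:

```python
# Minimal sketch of an automated, repeatable build: run each step in order
# and fail the whole build the moment any step fails. The commands are
# hypothetical placeholders for your project's real build and test targets.
import subprocess
import sys

BUILD_STEPS = [
    ("compile",      ["make", "clean", "all"]),
    ("unit tests",   ["make", "check"]),
    ("system tests", ["make", "systest"]),
]

def run_build(steps):
    """Run each (name, command) step; return the first failing step's name, or None."""
    for name, cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return name
    return None

if __name__ == "__main__":
    failed = run_build(BUILD_STEPS)
    if failed:
        sys.exit("build failed at step: " + failed)
    print("build succeeded")
```

Wiring this into a scheduler or a commit hook gives you the "stable, automated, repeatable" property: every build runs the same steps, in the same order, with no human judgment involved.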

Evaluating Risk:

  • Hold in-depth discussions with your customer about the realities of touching legacy code and how testing will let everybody sleep better at night. Get buy-in.
  • With your customer, identify the most important and most used features of the current software.
  • Interview anyone with experience with the legacy system to determine what areas of product functionality have historically been unstable or most challenging to get “right.”
  • Use static analysis to reveal the most complex and brittle areas of the codebase. Tools that measure cyclomatic complexity, perform abstract interpretation, and the like can help identify the most problematic areas of the legacy code.
  • Use everything you learn in this process to fuel your decision making when you’re up to your armpits in untested code.
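To make the static-analysis bullet concrete, here is a toy cyclomatic-complexity counter built on Python's `ast` module. It only illustrates the idea of ranking functions by branch count; real tools (commercial or open source) go much further:

```python
# Toy cyclomatic-complexity estimator for Python source. It counts branch
# points per function so you can rank functions from riskiest to safest.
# A real static-analysis tool does far more; this only shows the principle.
import ast

# Node types that introduce an additional execution path.
_BRANCHES = (ast.If, ast.For, ast.While, ast.Try,
             ast.With, ast.BoolOp, ast.ExceptHandler)

def complexity_by_function(source):
    """Return {function_name: complexity} for a Python source string."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Complexity = 1 + number of branch points in the body.
            branches = sum(isinstance(n, _BRANCHES) for n in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores
```

Sorting the resulting scores from highest to lowest gives you a rough worklist: the highest-scoring functions are strong candidates for the first round of tests.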

System Tests:

  • Before adding to or modifying any part of the legacy code, wrap automated system tests around the code, concentrating on the most important and/or riskiest features. Ideally, wrap system tests around all the major features of the product.
    • If appropriate and valuable, create a simulator and/or logging application to exercise/capture the behavior of the legacy system with the outside world.
    • Consider off-the-shelf software for automating/generating interface click events, network traffic, messages on the communication bus, etc.
  • Delay refactoring as long as possible while implementing system tests. Sometimes, though, one must change the code to make it testable under automation.
    • If you must refactor to add automated system testing, first develop a sufficient set of acceptance tests to reveal anything you break while refactoring.
    • Acceptance tests in this case will likely take the form of manual steps for exercising and verifying the software. Since acceptance tests are, technically, specified by the customer, gather considerable input from the customer and/or end user while assembling them.
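One common shape for these system tests is the "golden master": run the legacy program as a black box and compare its output against a known-good copy captured before you touched anything. A minimal sketch, using a stand-in command in place of your real legacy executable:

```python
# Sketch of a black-box "golden master" system test: run the legacy program
# untouched and compare its output to a previously captured, known-good copy.
# The command and expected output below are stand-ins for your real system.
import subprocess

def run_system(cmd, stdin_text=""):
    """Run the legacy program as a subprocess; capture exit code and stdout."""
    result = subprocess.run(cmd, input=stdin_text, text=True,
                            capture_output=True)
    return result.returncode, result.stdout

def check_against_golden(cmd, stdin_text, golden_output):
    """True if the program still produces the captured known-good output."""
    code, out = run_system(cmd, stdin_text)
    return code == 0 and out == golden_output

# In practice, point this at the legacy executable and at output you
# captured *before* changing any code.
assert check_against_golden(["echo", "hello"], "", "hello\n")
```

The point is that the test asserts what the system does today, not what it "should" do; any change in behavior, good or bad, shows up as a failure you must then explain.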

Unit/Integration Tests:

  • Set up a unit testing framework and write all new code in a test-first manner.
  • Wherever new code touches legacy code, add unit and/or integration tests to the boundary between the two. Refactor the legacy code to support the new tests, taking care to evaluate where testing will deliver the greatest value given the accompanying risk.
  • From your previous risk analysis, employ surgical strikes of refactoring and testing within the riskiest code – even code that may not directly relate to your present tasks. An action of this sort requires careful consideration, conversation with your customer, and consensus among your development team. While you may not need to touch all areas of code, eliminating instability and bugs outside the scope of your work will improve overall project perception and engender confidence in your efforts. After all, it can be difficult to convince a customer that a particular bug was present long before you started work in the code.
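At the boundary between new and legacy code, these unit tests often start life as characterization tests: assertions that pin down what the legacy routine currently does, bugs and all, before any refactoring. A sketch, where `legacy_price` is a hypothetical untested routine standing in for your real legacy code:

```python
# Characterization tests at the new/legacy boundary. `legacy_price` is a
# hypothetical legacy routine; the tests record its CURRENT behavior
# (including the undocumented discount) so refactoring can't silently change it.
import unittest

def legacy_price(qty, unit):              # stand-in for an untested legacy routine
    total = qty * unit
    if qty > 10:
        total = total * 0.9               # undocumented bulk discount
    return round(total, 2)

class CharacterizeLegacyPrice(unittest.TestCase):
    # These assert what the code DOES today, not what it "should" do.
    def test_small_order(self):
        self.assertEqual(legacy_price(2, 3.50), 7.0)

    def test_bulk_discount_kicks_in_above_ten(self):
        self.assertEqual(legacy_price(11, 1.00), 9.9)

if __name__ == "__main__":
    unittest.main()
```

With the current behavior pinned, you can refactor the routine to support the new code and its tests, confident that any unintended change of behavior will fail a test rather than surface in production.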

Re-evaluate:

  • If all goes well, you’ll slowly chip away at the code that makes you feel the dirtiest. As you feel cleaner over time, code that once seemed impractical to test may warrant reconsideration in light of the project’s overall goals and testing needs.
  • As you progress, periodically repeat the risk evaluation you performed at the outset.
  • Wash. Rinse. Repeat. Ah… feel the clean.