6 Tips for Testing Large Legacy Codebases

Have you ever been dropped into a new project and asked to increase test coverage? Have you needed to provide confidence in the code, lower production failures, and make selling the stability of your product easier? You are likely in the daunting position of adding tests to a large (probably legacy) codebase.

Tests provide value for a number of reasons, including preventing simple bugs, verifying expected results, and defining error behavior. Unfortunately, it’s often the case that code is written without testing in mind. Projects have a habit of growing quickly. If tests aren’t part of the core product, a large amount of untested/uncovered code will be produced.

Adding tests is a daunting task — getting 100% coverage is not likely and not the goal. The goal is to provide confidence around crucial code. Here are a few methods I use to quickly navigate large amounts of uncovered code and make meaningful testing updates.

1. Break Down Major Areas of the Code

Take some time to understand the underlying constructs that exist in your application. This will help you categorize the different types of code (_classes_) that need coverage. Recognizing these categories will help you avoid testing only an isolated portion of the code, and it will give you useful examples for future tests in each category.

During this process, it’s important to recognize how each category behaves differently and how this will influence testing. Providing coverage in each category will make future tests easier and faster to write. A clear example of this could be an [MVC](https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller) application. The models and controllers will have computational logic well suited for unit tests, while the view may require behavioral or end-to-end tests.
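As a rough illustration of the "computational logic well suited for unit tests" case, here is a minimal sketch in Python using the standard `unittest` module. The `LineItem` model and its fields are hypothetical stand-ins, not taken from any particular application.

```python
import unittest
from dataclasses import dataclass


@dataclass
class LineItem:
    """Hypothetical model: one line on an order."""

    unit_price_cents: int
    quantity: int

    def total_cents(self) -> int:
        # Pure computation with no I/O, so a plain unit test is enough.
        return self.unit_price_cents * self.quantity


class LineItemTest(unittest.TestCase):
    def test_total_multiplies_price_by_quantity(self):
        self.assertEqual(LineItem(250, 4).total_cents(), 1000)

    def test_zero_quantity_totals_zero(self):
        self.assertEqual(LineItem(250, 0).total_cents(), 0)
```

Saved in a test module, this can be run with `python -m unittest`. A view would more likely need a browser-driven or end-to-end test, which is a different category with its own example.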

This may seem like added upfront work, but you know that [Dubiously Skilled Teammember] will be tasked with extending your tests, so make sure to provide bulletproof examples that will limit your later rework.

2. Highlight Important Classes/Features

Now that you know the different categories of code within your application, highlight important examples of each. This can be based on a number of factors, such as the security importance of a class, the sheer amount of code, or the number of classes dependent on it.

As stated above, code coverage is not the goal, so providing initial confidence in core logic will go much further than 100% coverage on auxiliary code. An added benefit is that you can now trust the code is covered and not worry every time [Previously Mentioned Teammember]’s commit touches mission-critical code.

3. Build Up from the Base

You have now keyed into the important code you want to cover. However, as is often the case, that crucial code is built on a number of base classes and interfaces. Testing both the crucial code and the underlying inherited behavior in a single place makes tests more brittle and increases the amount of test code needed.

Instead, determine the dependency chain, and work from the base to understand and cover the code. This is more work than just covering a single class, but it will help you understand the core mechanics, provide confidence in shared code, and reduce repeated code for each inheriting class.
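One way this can look in practice, sketched in Python: the shared behavior is tested once against the smallest possible subclass, and each inheriting class is tested only for what it adds. `BaseExporter`, `CsvExporter`, and `_EchoExporter` are hypothetical names invented for this example.

```python
import unittest


class BaseExporter:
    """Hypothetical base class shared by several exporters."""

    def export(self, rows):
        # Shared behavior: drop empty rows, then delegate formatting.
        return [self.format_row(row) for row in rows if row]

    def format_row(self, row):
        raise NotImplementedError


class CsvExporter(BaseExporter):
    def format_row(self, row):
        return ",".join(str(value) for value in row)


class _EchoExporter(BaseExporter):
    """Minimal subclass that exists only to exercise the base class."""

    def format_row(self, row):
        return row


class BaseExporterTest(unittest.TestCase):
    def test_export_skips_empty_rows(self):
        # Covered once here, so subclass tests do not need to repeat it.
        self.assertEqual(_EchoExporter().export([[1], [], [2]]), [[1], [2]])


class CsvExporterTest(unittest.TestCase):
    def test_format_row_joins_values_with_commas(self):
        # Only the behavior CsvExporter adds on top of the base.
        self.assertEqual(CsvExporter().format_row([1, "a"]), "1,a")
```

If the base class later changes how it filters rows, only `BaseExporterTest` needs updating.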

4. Keep Refactors Minimal at First

When you open a class that no one has looked at in ten years, it’s easy to immediately jump to rewriting the class from the ground up. Stop. That would be a crucial mistake — you could easily introduce new bugs or break existing behavior.

Take this opportunity to write tests around the expected behavior, for example with end-to-end and integration tests. This will guard against future failures and make it possible to refactor with confidence.
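At a smaller scale, the same idea is sometimes called a characterization test: record what the code does today, surprises included, before changing anything. The sketch below assumes a hypothetical legacy function, `legacy_format_name`, standing in for the untouched class.

```python
import unittest


def legacy_format_name(first, last):
    """Stand-in for untouched legacy code (hypothetical)."""
    if not last:
        return first.upper()
    return last.upper() + ", " + first


class LegacyFormatNameCharacterization(unittest.TestCase):
    def test_full_name_is_last_name_first(self):
        self.assertEqual(legacy_format_name("Ada", "Lovelace"), "LOVELACE, Ada")

    def test_missing_last_name_uppercases_first_name(self):
        # Surprising, but this is what the code does today. Pin it down
        # so a later refactor cannot change it by accident.
        self.assertEqual(legacy_format_name("Ada", ""), "ADA")
```

Once tests like these pass against the old implementation, a rewrite has something concrete to be checked against.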

If the code is not compatible with good testing, make minimal, noninvasive changes to allow for testing. For example, take the time to abstract an interface for a common class, which will allow you to safely mock the expected behavior and isolate the effects of a refactor.
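A minimal sketch of that kind of noninvasive change, in Python: the dependency is described by a small abstract interface and passed in, so a test can substitute a mock. `Mailer`, `SignupService`, and the welcome-mail behavior are assumptions made up for illustration.

```python
from abc import ABC, abstractmethod
from unittest.mock import Mock


class Mailer(ABC):
    """Hypothetical interface extracted from a concrete mail client."""

    @abstractmethod
    def send(self, to: str, subject: str) -> None:
        ...


class SignupService:
    def __init__(self, mailer: Mailer):
        # The dependency is injected rather than constructed internally.
        self._mailer = mailer

    def register(self, email: str) -> bool:
        if "@" not in email:
            return False
        self._mailer.send(email, "Welcome")
        return True


def test_register_sends_welcome_mail():
    mailer = Mock(spec=Mailer)  # rejects attributes that Mailer lacks
    assert SignupService(mailer).register("a@example.com") is True
    mailer.send.assert_called_once_with("a@example.com", "Welcome")


def test_register_rejects_invalid_email_without_sending():
    mailer = Mock(spec=Mailer)
    assert SignupService(mailer).register("not-an-email") is False
    mailer.send.assert_not_called()
```

The production code keeps using the real mail client through the same interface, so the change needed to enable testing stays small.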

5. Write Clear, Documented Example Tests

You’ve finished creating the initial code testing, reports are being automatically generated, and your supervisor is happy. Job well done…

Unfortunately, your supervisor doesn’t want to show _just_ 20% code coverage to the C-Suite, so they task [Unlucky But Still Questionably-Skilled Teammate] with extending the test coverage. You’re now doubly swamped with having to answer every question about every class for every test. Stop. Roll this back. This is all preventable.

To save yourself from your coworkers (and yourself), take the time to document the test setup, the requirements for each category of code, and a quick start for running tests. This will pay dividends in the future when onboarding new coworkers or updating tests for new behavior.
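As one possible shape for such an example, here is a sketch of a self-documenting test module in Python. The `tests` directory, the listed conventions, and the `slugify` function are all hypothetical; the point is only that the quick start and the rules for the category live next to a test that can be copied.

```python
"""Example test for the 'model' category (hypothetical project layout).

Quick start:
    python -m unittest discover -s tests

Conventions for this category:
    * One test class per production class or function.
    * No network or database access; use fakes or mocks instead.
"""
import unittest


def slugify(title: str) -> str:
    """Stand-in for production code under test."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    """Copy this class as a starting point for new model tests."""

    def test_spaces_become_hyphens(self):
        # Arrange and act
        slug = slugify("Hello Big World")
        # Assert
        self.assertEqual(slug, "hello-big-world")
```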

6. Abstract & Extract Difficult/Unique Constructs and Mocks

To take the last tip one step further, you can become someone’s hero in a matter of seconds with little-to-no added work. As you are writing tests around crucial and complex code, it’s common to run into tricky testing situations. This comes in many forms, from verifying log behavior to establishing behavior of callback functions (or nested callbacks) to creating complex mocks of common services.

When you take the time to notice test code that is both complex and common, you can save immense time in the future by making a note of it in a testing README. This may sometimes go unnoticed, but pair it with the quick start from above, and you will save yourself and your coworkers time writing good tests.
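For instance, verifying log output and callback behavior is the kind of snippet worth recording. The sketch below uses Python's `unittest.TestCase.assertLogs` and `unittest.mock.Mock`; the `billing` logger name and the `charge` function are hypothetical.

```python
import logging
import unittest
from unittest.mock import Mock

logger = logging.getLogger("billing")  # hypothetical logger name


def charge(amount, on_done):
    """Stand-in for production code that logs and invokes a callback."""
    if amount <= 0:
        logger.warning("rejected charge of %s", amount)
        on_done(False)
        return
    on_done(True)


class ChargeTest(unittest.TestCase):
    def test_rejected_charge_logs_warning_and_reports_failure(self):
        on_done = Mock()  # records how the callback was invoked
        # assertLogs fails the test if nothing is logged at this level.
        with self.assertLogs("billing", level="WARNING") as captured:
            charge(0, on_done)
        self.assertIn("rejected charge of 0", captured.output[0])
        on_done.assert_called_once_with(False)

    def test_valid_charge_reports_success(self):
        on_done = Mock()
        charge(5, on_done)
        on_done.assert_called_once_with(True)
```

A short README entry pointing at a test like this saves the next person from rediscovering how to capture logs or assert on a callback.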


Using these tips and a bit of patience, you can create a strong base of tests and provide a foothold for good testing practices as the project grows and matures.