How to Evaluate an Unfamiliar Codebase in 15 Minutes

When working at a software consultancy, you may be asked to evaluate an unfamiliar codebase. There are many reasons you might need to make such an evaluation: your company is taking over development of a legacy codebase, you’re considering an application rewrite, or you’re thinking of integrating with a third-party library.

I recently asked a group of senior developers which key indicators they look for when making a quick assessment of the quality of a codebase. I’ve turned their most common responses into a list of questions to help you.

While this list can be helpful for evaluating code quality, I would urge you to resist using a codebase to assess a person’s abilities. Remember, you don’t know what context the person was working in when they wrote the code. There may have been external factors and constraints that led them to make decisions they wouldn’t have made otherwise.

Is There Any Documentation?

If I open up a repo and there is no documentation to be found, that’s an immediate red flag. No one loves writing documentation, but I think we can all agree that it’s important. Things that should be documented in every project include:

  • What is required to set up the development environment?
  • How do you perform a build?
  • Is there a Continuous Integration environment set up?
  • How can application logs be accessed?
  • Is there an analytics platform in use?
  • What is the high-level system architecture?

Is the Build Process Reproducible?

When picking up a legacy codebase for the first time, it can be very difficult to create a build if there are no instructions or automation in place. Look in the root of the project for configuration files related to a continuous integration server. Be sure to check for hidden dot-folders like “.circleci/”. Not only is an automated build system helpful to ensure production build artifacts can be reproduced, but it can also give you a clue as to what steps are necessary to create a build locally.
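
As a rough sketch of that first look, the check for CI configuration can be scripted. The file and folder names below are the conventional locations for a handful of popular CI services; the list is my own assumption, it is far from exhaustive, and the function name find_ci_configs is purely illustrative.

```python
from pathlib import Path

# Conventional config locations for a few CI services. This mapping is an
# assumption for illustration; extend it for whatever services you expect.
CI_MARKERS = {
    ".circleci/config.yml": "CircleCI",
    ".github/workflows": "GitHub Actions",
    ".gitlab-ci.yml": "GitLab CI",
    ".travis.yml": "Travis CI",
    "Jenkinsfile": "Jenkins",
}


def find_ci_configs(repo_root):
    """Return a sorted list of (service, relative_path) pairs that exist under repo_root."""
    root = Path(repo_root)
    found = [
        (service, relative_path)
        for relative_path, service in CI_MARKERS.items()
        if (root / relative_path).exists()
    ]
    return sorted(found)


# Example usage (the path is hypothetical):
# for service, relative_path in find_ci_configs("path/to/repo"):
#     print(f"{service}: {relative_path}")
```

An empty result doesn’t prove there is no automated build, since the pipeline may be configured entirely on the CI server, but it does tell you to go ask.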

Is It Tested? Do the Tests Pass?

Testing has become so mainstream now that you can find a testing framework for pretty much every programming language out there. At the very least, I like to see a suite of unit tests. I’d award bonus points for integration or system tests. If the project has a test directory but it just has one example test in it, that earns negative points! If you do find tests, see if you can run them easily and verify that they are passing. If the project is set up with CI, you might also be able to verify the presence and health of the test suite by looking at the most recent builds in the CI server dashboard.

Was the Separation of Concerns Principle Followed?

In my opinion, following Separation of Concerns is one of the most important aspects of a healthy codebase. Look for how the application is broken up. Do you see massive files and classes with thousands of lines of code? Or is the application broken up into manageable classes/modules with designated responsibilities? Large files/classes tend to be full of tangled logic (spaghetti code). Because that code is hard to reuse, chunks of it often get copy-pasted into other parts of the application, which violates the DRY principle.
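
One way to spot the massive files quickly is to rank source files by line count. The sketch below assumes a set of file extensions and a few directories to skip (version control and vendored dependencies); both are my own choices for illustration and should be adjusted to the project at hand.

```python
from pathlib import Path

# Directories that usually hold third-party or generated code (an assumption).
SKIP_DIRS = {".git", "node_modules", "vendor"}


def largest_files(repo_root, extensions=(".py", ".rb", ".js", ".ts", ".java"), top=10):
    """Return up to `top` (line_count, relative_path) pairs, longest file first."""
    root = Path(repo_root)
    counts = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if not path.is_file() or SKIP_DIRS.intersection(relative.parts):
            continue
        if path.suffix not in extensions:
            continue
        # errors="replace" keeps the scan going if a file is not valid UTF-8.
        with path.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        counts.append((line_count, relative.as_posix()))
    # Sort by descending length, then by path so ties are deterministic.
    return sorted(counts, key=lambda item: (-item[0], item[1]))[:top]
```

Line count alone doesn’t prove tangled logic, but the files at the top of this list are usually the right place to start reading.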

Are There a Lot of Large Functions?

Another thing to look for is large functions, especially functions that are over 100 lines long. Also, look for deep nesting. While there are cases where long functions are appropriate, they should be pretty rare in a codebase. Long functions are typically difficult or impossible to unit test and also very challenging to debug when something’s not working properly.
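
If the codebase happens to be written in Python, the standard-library ast module can flag long and deeply nested functions for you. This is a minimal sketch under that assumption; other languages need their own parser or a linter. The 100-line threshold comes from the rule of thumb above, and the helper names are illustrative.

```python
import ast

# Statement types counted as one level of nesting. Note that an `elif`
# branch is parsed as a nested `if`, so it adds a level in this simple count.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def long_functions(source, max_lines=100):
    """Return (name, first_line, length) for each function longer than max_lines."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                results.append((node.name, node.lineno, length))
    return sorted(results, key=lambda item: item[1])


def max_nesting(node, depth=0):
    """Return the deepest level of nested control-flow blocks below node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


# Example usage (the file name is hypothetical):
# source = open("some_module.py", encoding="utf-8").read()
# print(long_functions(source), max_nesting(ast.parse(source)))
```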

How Is the Application State Managed?

Thoughtful state management is very important, so try to determine what strategy was used for managing state. Was a global application store used, and if so, was care taken to ensure that state updates are done atomically and safely? Are there a lot of impure functions that mutate global state, producing side effects? If so, that can lead to problems that are difficult to debug. It also means that, as a developer, you need an understanding of how the entire application works and how various components interact with global state before you can safely make any changes.

How Old Are the Core Dependencies?

I won’t get into discussing which core dependencies should or should not be used because that’s very subjective. However, you should look to see that whatever is in use is relatively current. The unfortunate reality of modern software is that it has a short shelf-life. Technology continues to move forward, and if you don’t keep up with it, things will be much more difficult for you in the future.

For example, if the application is built using Ruby on Rails, what versions of Ruby and Rails are used? The maintainers of these tools tend to stop releasing feature enhancements, bug fixes, and security patches for anything but the most recent one or two versions. If you pick up a legacy project and have to perform a Rails upgrade before you can start doing feature development, you’re going to want to plan enough time for that.
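
For a Rails project, the resolved versions can usually be read from Gemfile.lock. The sketch below assumes the layout Bundler typically writes: resolved gems indented four spaces under "specs:", and an optional "RUBY VERSION" section. Treat that layout as an assumption and check it against the actual lockfile. The function name and the version numbers in the usage note are illustrative only.

```python
import re


def locked_versions(lockfile_text, names=("rails",)):
    """Return a dict of resolved versions for the named gems, plus Ruby if recorded."""
    versions = {}
    # Resolved gems look like "    rails (6.1.4)"; their own dependencies are
    # indented further, so exactly four leading spaces filters those out.
    for match in re.finditer(r"^ {4}(\S+) \(([^)]+)\)$", lockfile_text, re.MULTILINE):
        gem, version = match.groups()
        if gem in names:
            versions[gem] = version
    # The RUBY VERSION section holds a line such as "   ruby 2.7.4p191".
    ruby = re.search(r"^[ \t]+ruby (\d+(?:\.\d+)+)", lockfile_text, re.MULTILINE)
    if ruby:
        versions["ruby"] = ruby.group(1)
    return versions
```

Given a lockfile that contains "    rails (6.1.4)" and "   ruby 2.7.4p191", this returns {"rails": "6.1.4", "ruby": "2.7.4"}. You can then compare those numbers against the versions the maintainers currently support.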

Assessing a codebase is a difficult task to perform in a short amount of time. However, this list should give you some things to think about when perusing an unknown repository. If you’ve been tasked with making a quality assessment of a project in the past, I’d love to hear what key indicators you relied on to evaluate it.