To Squash or Not to Squash: The Merge Debate

Your software development team has worked super hard and gotten your super awesome feature reviewed and approved. You’re ready to merge that pull request into your develop branch and see the sweet results. But wait…there’s one last decision to make. There’s a little dropdown button by the merge, and you have to choose. Do you do a standard merge commit? Or do you squash merge?

I’ve seen and had a lot of debates on the topic, and I’ve heard a lot of good points. One thing I see a lot of discussions miss is putting the choices into the context of a regular workflow. This can allow a much clearer picture of the pros and cons of each and help people make informed decisions about their team and codebase.

The Merge Commit

This is the stock, standard merge option, and it’s the default on most Git hosting platforms if no customization is done. In many ways, it’s easy to understand why, as it can be viewed as the most straightforward. It takes all the commits you’ve made in your branch and adds them to the base branch (along with a merge commit to reconcile the differences). The entire history of your feature branch is available to view and search through. If you want to see a small set of code changes associated with a sub-feature, or if you want to find a bug with Git bisect, everything you need is there. It works in basically any scenario as long as you have good commit hygiene.
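To make that concrete, here’s a minimal sketch of a standard merge on the command line, which is roughly what the merge button does for you. Everything in it is made up for illustration: the throwaway repo, the app.txt file, the branch names, and the commit messages. It assumes Git is installed.

```shell
set -e

# Throwaway repository so nothing real is touched (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop   # name the initial branch "develop"
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Two commits on a feature branch.
git checkout -q -b feature
echo "step one" >> app.txt
git commit -q -am "Add step one"
echo "step two" >> app.txt
git commit -q -am "Add step two"

# Standard merge: --no-ff forces a merge commit even when a
# fast-forward would have been possible.
git checkout -q develop
git merge -q --no-ff -m "Merge branch 'feature' into develop" feature

# Both feature commits plus the merge commit now appear in the history.
git log --oneline --graph
```

The develop branch ends up with four commits: the initial one, the two from the feature branch, and the merge commit that ties them together.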

The downside is that it can make your Git history rather messy if you’re not careful. Maybe you’re pairing and want to pass code back and forth, or you had a lot of small feedback items in a pull request. Maybe you commit at the end of every day in case you’re sick the next day and someone else needs to pick up your story. Your branch will end up with a LOT of commits, and most of them will just be tiny messages. Worst case, you end up scrolling through an endless sea of lines that just say… “WIP.” I know it’s scary, but it could happen to YOU.

The core problem here is that, in a lot of workflows, commits don’t really represent logical changes to the code. This is what the good hygiene line in the earlier paragraph was referring to. If your commits don’t represent actual logical changes, you’ll need to clean them up with a rebase before merging. Skip that step and all the noise lands in your shared Git history. This isn’t the end of the world, of course, as things like Git bisect still work, but it’s not the most pleasant thing to look through. I often scroll through history to see what happened in a file or a branch, so it’s nice when it’s readable.
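For a sense of what that cleanup can look like, here’s one possible sketch using fixup commits and an autosquash rebase. This is just one way to tidy a branch, not the only one. The repo, file names, and commit messages are hypothetical, and the editor variables are set to `true` only so the script runs without opening an editor.

```shell
set -e

# Throwaway repository (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "login form" > login.txt
git add login.txt
git commit -q -m "Add login form"

# A small follow-up fix, recorded as a fixup of the previous commit
# instead of a standalone "fixed spacing" commit.
echo "login form, spaced" > login.txt
git commit -q -a --fixup=HEAD

# Fold the fixup into its target. Setting the editors to `true`
# accepts the generated rebase plan without any interaction.
GIT_SEQUENCE_EDITOR=true GIT_EDITOR=true \
  git rebase -q -i --autosquash develop

# Only one tidy commit remains on the feature branch.
git log --oneline develop..feature
```

In day-to-day use you would normally run the interactive rebase with your real editor so you can reorder or reword commits by hand.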

The Squash Commit

This is the common alternative to the standard merge commit. It turns all your changes into one single commit and then adds that to the base branch. This neatly sidesteps the commit hygiene issues that come with standard merge commits. You can just push that squash merge button no matter how many commits you had for whatever reason. The result is a very neatly organized Git history that’s very easy to read and follow. There’s no rebasing required, and reverting the whole change means reverting a single commit. This makes it especially good for newer developers bombarded with new things to learn. Learning to rebase when you don’t understand Git to begin with can be a lot.
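Here’s a rough command-line sketch of a squash merge, followed by the one-commit revert mentioned above. As before, the repo, the app.txt file, and the commit messages are invented for illustration, and a hosting platform’s squash button may differ in details such as the generated commit message.

```shell
set -e

# Throwaway repository (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# A feature branch with the kind of commits nobody wants to read later.
git checkout -q -b feature
echo "step one" >> app.txt
git commit -q -am "WIP"
echo "step two" >> app.txt
git commit -q -am "WIP"
echo "step three" >> app.txt
git commit -q -am "fixed spacing"

# Squash merge: stage the combined changes, then record them as one commit.
git checkout -q develop
git merge -q --squash feature
git commit -q -m "Add super awesome feature"

# The history shows just the initial commit and the squashed one.
git log --oneline

# Undoing the whole feature is a single revert.
git revert --no-edit HEAD > /dev/null
cat app.txt
```

After the revert, app.txt is back to its original contents, and none of the “WIP” commits ever show up on develop.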

This all sounds like a dream come true, but there can be a pretty big catch. On the base branch, all of your branch’s commit history is replaced with one big commit. This means if there are relevant sub-features in your single pull request, they end up all bundled as one blob of change. This can make finding specific reasons for changes harder as they get bundled and obscured with overarching generic commit messages. It can also make debugging harder (through Git bisect or similar commit analyzing tactics), because the smallest unit those tools can point to is the whole squashed commit. This isn’t that big of a deal when pull requests are small, but that can be difficult to guarantee.
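To illustrate the debugging point, here’s a small sketch of Git bisect running against a history that kept its individual commits. The repo, file, commit messages, and the “BUG” marker line are all hypothetical stand-ins for a real regression and a real test command.

```shell
set -e

# Throwaway repository (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
good=$(git rev-parse HEAD)   # last commit known to work

git checkout -q -b feature
echo "step one" >> app.txt
git commit -q -am "Add step one"
echo "BUG" >> app.txt        # stand-in for the change that breaks things
git commit -q -am "Add step two"
echo "step three" >> app.txt
git commit -q -am "Add step three"

# Standard merge, so the individual commits survive on develop.
git checkout -q develop
git merge -q --no-ff -m "Merge branch 'feature' into develop" feature

# Bisect between the bad tip and the good start. The check exits
# non-zero (bad) whenever the BUG line is present.
git bisect start develop "$good" > /dev/null
git bisect run sh -c '! grep -q BUG app.txt' > /dev/null
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1

echo "First bad commit: $culprit"
# prints: First bad commit: Add step two
```

Had the same branch been squash merged, bisect would have only one candidate to report: the single squashed commit containing all three steps.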

The Flaws in Context

The standard merge commit basically requires understanding rebase and how to use it properly. Given how important Git is and how useful rebase is for other reasons, this is a pretty easy pill to swallow. Even for newer developers, it’s just a really good skill to teach them. The bigger problem is that it relies on developers always staying on top of their commit history. In a busy work environment with a lot of things happening, it can be difficult to remember. Someone leaves one last bit of PR feedback and you fix it super quick and click that merge, and now you have a commit with “fixed spacing” in your main branch history. Inevitably, this will happen if your team only uses standard merge commits because people aren’t perfect and this process requires a lot of diligence.

Conversely, the squash merge requires you to keep your PRs to a reasonable size. No matter how well you plan out your stories though, it’s impossible to guarantee they stay small. Scope increases, wrinkles appear, and people need to adapt. Even if your story stays small, if it ends up representing more than one logical change, it’s still a pretty good idea to organize the commits manually and skip the squashing. This means no matter how good your story hygiene is, you still need the standard merge commit to be an option even if squashing is your primary method.

So which one wins?

It depends.

The truth is both options have their merits, and the right choice depends heavily on the situation your team and codebase are in. It’s impossible to guarantee small pull requests with every single story, so the standard merge commit at least needs to be enabled. That said, it requires a lot of diligence to gain the benefits it offers, so it’s good to leave squash merges enabled as an option too. That allows developers to see the dropdown and select squash in cases where multiple commits aren’t necessary. This will help avoid some of the junk that standard merge commits accumulate in Git history. It will also save developers a fair amount of time and mistakes from constant manual rebases over the long run.

I think the real decision is which one you make the default selection. Leaving standard merge commits on as default helps ensure nobody accidentally merges giant PRs as a single blob. Making squash the default reduces rebase time and helps ensure less junk in Git history.

I typically make squashing the default option, as most of the benefit of having it as an option in the first place is to require less diligence. The natural evolution of that is to make it default so people can’t forget to select it. That said, I typically work in smaller teams with a lot of control over the development process. This means we have the power to make our stories represent single logical changes and keep them small.

A lot of factors go into this decision based on how the pros and cons affect your team. On top of that, both require some form of good practice and due diligence to gain benefits without big drawbacks, so neither one is really free. Carefully weigh the options and choose wisely!
