Article summary
I recently set up a project with hosting on Heroku. However, I had code spread across several repositories that all needed to be deployed to the same place. This is a problem because deploying to Heroku essentially means pushing to a git remote: if I pushed from two different repositories to that same remote, the pushes would collide.
One possible solution was git submodules, but they are finicky so I was hoping for something simpler. After a bit of investigation, I discovered that git has a feature called subtrees that could be used to handle this.
Using a subtree is simpler than using a submodule for two major reasons: subtrees will be present when the repository is cloned, and the other users on the team don’t need to do anything special to integrate with it.
My Repository Setup
With subtrees, we can nest external repositories in the main repository. I used this to include a deploy repository (which is connected to Heroku) as the build directory for a Middleman application. This way, Heroku only has the static site, but we can avoid copying files between repositories manually.
It’s simple to set up:
1. Add a git remote for the external repository.
First, we need to connect the two repositories: git remote add -f external-repo https://github.com/user/my-repo.git
2. Pull down the contents of the new remote into the main repository.
Run: git subtree add --prefix ./my-external-code-path external-repo master --squash
This copies all the code on the external repo’s master branch into the prefix we specified. The --squash flag records the external history as a single commit instead of importing every commit.
3. Push code back upstream.
After this step you’ll have all the code in place, so if all you wanted was to grab a dependency, you’re finished. If, like me, you need to contribute code that you worked on in your main repo, you can push just the subtree to the remote you added earlier: git subtree push --prefix ./my-external-code-path external-repo master
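To try the three steps without touching a real project, here is a minimal sketch that runs them end to end against throwaway local repositories. Everything apart from the three git commands from the steps above is an assumption made for illustration: the temporary directory, the bare external.git repository standing in for the GitHub URL, the index.html file, and the example user identity. The prefix is also written without the leading ./ here, which is a choice of this sketch rather than something the steps require.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no real repository is touched.
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the external (deploy) repo.
# In the article this would be a hosted URL instead.
git init -q --bare external.git

# Seed the external repo with one commit on master.
git init -q seed
cd seed
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"
echo "hello" > index.html                 # hypothetical file
git add index.html
git commit -q -m "Initial deploy content"
git push -q "$work/external.git" HEAD:refs/heads/master
cd "$work"

# The main repository that will contain the subtree.
git init -q main
cd main
git config user.email "you@example.com"
git config user.name "Example User"
echo "main project" > README
git add README
git commit -q -m "Initial commit"

# Step 1: add a remote for the external repository and fetch it (-f).
git remote add -f external-repo "$work/external.git"

# Step 2: pull the external master branch into the prefix as one squashed commit.
git subtree add --prefix my-external-code-path external-repo master --squash

# Step 3: change a file inside the subtree, commit it in the main repo,
# then push only the subtree's history back to the external repo.
echo "updated" >> my-external-code-path/index.html
git commit -q -am "Update deploy content"
git subtree push --prefix my-external-code-path external-repo master
```

After the script finishes, running git log --oneline master inside external.git should show the change made from the main repository, while the README from the main repository never reaches the external one.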
Why Use Subtrees?
A lot of modern web development is centered around APIs and single-page HTML applications that connect via REST. Even though the two areas are really separate concerns, they are frequently combined because they are worked on by the same people and also because many platform-as-a-service hosts use a git repo as a deployment mechanism. Using subtrees can help work around these limitations and help enforce proper separation of concerns.