So you have a bunch of Git repositories for individual projects and you want to bring them together into a monorepo. Maybe you want to share code without constantly publishing dependencies. Maybe you want to simplify testing or deploy everything together with infrastructure-as-code.
But those original repositories carry a lot of history. Some have even brought history along from pre-Git days. Keeping this history would be really nice, but you also want to organize projects into their own folders. What can you do?
Why You Can’t Just Copy Code In
Let’s back up a moment — you actually can just copy code into a new monorepo. The problem is, you won’t have that history. It’ll be as if all projects just appeared one day, fully written. If that’s not important, you can stop reading now. Do the simplest thing that can possibly work.
But if you want to keep history, read on. You can weave several repositories together, moving each repository into its own folder, and keep every change made in every repository.
Visualizing Where You Want to Be
An important first step is to visualize where you want to go with this migration. For example, you might want to import each project into a new folder in the root directory. This is the approach I took with an internal project recently, since each imported repository was one of a collection of interoperating services.
If you have several projects of the same kind, you may want to sort them into different subfolders of top-level folders such as “services,” “modules,” and “tools.” This is the approach my client and my team took with a brand-new monorepo, though it would’ve also been a great strategy if importing.
You may also have a sort of super-project that ties the other projects together. These make great candidates as a starting point for the root of your monorepo. You can weave the other projects into this one.
The structure you pick primarily influences discoverability. It matters less for day-to-day use if you have well-named files that can be opened with a fuzzy finder. See if the structure makes sense to someone who doesn’t know as much about your projects. Check it with someone who used the old projects a lot and will now be using the monorepo as well.
All this said, feel free to experiment on your own for a bit. Just be careful not to push or share your work until you’re confident in it, because teammates may start building on top of whatever you publish. Making changes after the fact is possible but risks muddying the carefully imported history you’re curating now.
Importing Branches from Other Repositories
Gather the following items:
- The Git remote URLs for the projects you want to import
- The main branch name for each project you want to import
- Where you want to put each imported project (a folder, or right in the root)
The heavy lifting will be done by git-filter-repo. You can install it from Homebrew.
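The Homebrew formula has the same name as the tool, git-filter-repo. As a small sketch, the helper below checks whether Git can already find it and prints the install command if not. The function name have_filter_repo is made up for illustration.

```shell
# Hypothetical helper: succeed if git-filter-repo is installed,
# otherwise print an install hint and fail.
have_filter_repo() {
  # Look on PATH first, then in Git's own directory of subcommands.
  if command -v git-filter-repo >/dev/null 2>&1 ||
     [ -x "$(git --exec-path 2>/dev/null)/git-filter-repo" ]; then
    return 0
  fi
  echo "git-filter-repo not found; try: brew install git-filter-repo" >&2
  return 1
}
```

Running have_filter_repo before an import turns a missing tool into a clear message instead of a failure halfway through.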
To make things even easier, I’ve written a script called import-branch. All it takes are the three things I asked you to gather earlier.
$ ./import-branch git@example.org:example/widget-api main api
This will pull down the “main” branch of an imaginary “widget-api” repository and merge it into the working branch in your current project, rewriting commits as necessary to put files in their new “api” folder location as if they’d always been there. It’ll also clean up after itself, even if the merge itself fails.
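To make those steps concrete, here is a minimal sketch of how a helper like this could be put together. It is not the actual import-branch script. The function name, the commit message, and the assumption that the rewrite uses git-filter-repo’s --to-subdirectory-filter option are all illustrative.

```shell
# Hypothetical sketch of an import-branch-style helper (not the author's script).
# Usage: import_branch <remote-url> <branch> [<folder>]
# Run it from inside the monorepo, on the branch you want to merge into.
import_branch() (
  if [ "$#" -lt 2 ]; then
    echo "usage: import_branch <remote-url> <branch> [<folder>]" >&2
    exit 2
  fi
  url=$1
  branch=$2
  folder=${3:-}

  # Work in a throwaway clone so the original repository is never rewritten.
  tmp=$(mktemp -d) || exit 1
  status=0
  {
    git clone --quiet --single-branch --branch "$branch" "$url" "$tmp/src" &&
    if [ -n "$folder" ]; then
      # Rewrite every commit so its files live under $folder.
      # --force skips filter-repo's fresh-clone check, which is fine for a scratch clone.
      git -C "$tmp/src" filter-repo --force --to-subdirectory-filter "$folder"
    fi &&
    # Fetch the rewritten branch and merge it, even though the histories share no commits.
    git fetch --quiet "$tmp/src" "$branch" &&
    git merge --allow-unrelated-histories \
      -m "Import $branch of $url into ${folder:-the repository root}" FETCH_HEAD
  } || status=$?

  # Remove the scratch clone whether or not the steps above succeeded.
  rm -rf "$tmp"
  exit "$status"
)
```

The function body runs in a subshell, so its variables and exit calls never leak into your interactive shell. Leaving out the folder argument skips the rewrite and merges the project straight into the root.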
Play with this step until you’ve got it just the way you want it. Remember — you’re merging things into your local working copy right up until the moment you push it up and share it with others. This is the time to make sure you’re happy.
If you’re not happy with the way one import went, it’s not difficult to back up (in your local working copy, of course) and try again:
$ git reset --hard HEAD~1
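This command moves your branch back one commit along its first parent, which is the commit immediately before the merge, and discards any uncommitted changes in your working copy. It assumes the merge is the most recent commit on your branch. Since the commit being dropped is a merge commit, everything you imported goes with it, so you can try again.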
Note: This is a great command for this use case, but don’t use it indiscriminately! You can easily throw away important history if you’re not careful.
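One way to make the reset harder to misuse is to wrap it in a guard. The sketch below is an illustration, and the function name undo_last_import is hypothetical. It only resets when the latest commit really is a merge and nothing uncommitted would be lost.

```shell
# Hypothetical guard around `git reset --hard HEAD~1`.
# Refuses to run unless HEAD is a merge commit and the working tree is clean.
undo_last_import() {
  # A merge commit has a second parent; an ordinary commit does not.
  if ! git rev-parse --verify --quiet "HEAD^2" >/dev/null; then
    echo "HEAD is not a merge commit; refusing to reset" >&2
    return 1
  fi
  # Any output from `git status --porcelain` means modified or untracked files exist.
  if [ -n "$(git status --porcelain)" ]; then
    echo "uncommitted changes present; refusing to reset" >&2
    return 1
  fi
  git reset --hard "HEAD~1"
}
```

The check is deliberately conservative: untracked files would survive a hard reset, but they block the undo here anyway.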
Weaving Everything Together
Bringing in commits isn’t all you’ll have to do, of course. You’ll also need to make sure your new monorepo functions properly.
- You will probably want to put together a Docker Compose file that launches every service together for development and testing.
- Tools included in each project may work fine, or they may need to be changed. Tools that every project has (such as a deploy tool) should be refactored to minimize duplication, if possible.
- If you have any kind of infrastructure-as-code setup (for example, using Terraform), consider setting it up so everything can be provisioned in one run.
When I last did this, I took these integration steps whenever I pulled in a new project, slowly building up the Docker Compose file and investigating deployment needs.
You can also do it all at once at the end if that works best; but, whatever you do, make sure you share something that works with your team! A shiny, new, but broken monorepo won’t make anyone’s day.
Once the monorepo is up and running, you can archive the old projects — for example, by replacing the README with a pointer to the monorepo.
There may be a period when you’re maintaining old repos and the monorepo at the same time. Keep this period as short as possible to reduce confusion. If you need to keep pulling changes from the old projects until you’re ready, import-branch has your back there, too.
If you run it again exactly as before, it’ll bring in just new changes from the old repos, if there are any. If there aren’t, it’ll leave everything untouched.
I created an “import-all” script for this transitional period that runs import-branch once for each repository to check. It’s a short-term hack, but it’ll keep everything in sync until we’re ready to switch.
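A wrapper like that can be very small. The sketch below is a guess at the general shape and not the actual import-all script. The list format (one "url branch folder" entry per line), the IMPORT_BRANCH override, and the example URLs in the usage note are all assumptions.

```shell
# Hypothetical import-all sketch: reads "<url> <branch> <folder>" lines from stdin
# and runs the import command once per line, stopping at the first failure.
import_all() {
  while read -r url branch folder; do
    # Skip blank lines and lines starting with '#'.
    case "$url" in
      '' | '#'*) continue ;;
    esac
    # Redirect stdin so the import command (or ssh underneath it)
    # cannot swallow the remaining lines of the list.
    "${IMPORT_BRANCH:-./import-branch}" "$url" "$branch" "$folder" </dev/null || return 1
  done
}
```

You could feed it a list file, for example import_all < repos.txt, where repos.txt holds lines such as "git@example.org:example/widget-api main api". Stopping at the first failure keeps a conflicted merge from being buried under later imports.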