Lessons from an SVN Server Migration

Recently, we rebuilt Atomic’s SVN server. We wanted to upgrade to the latest Ubuntu LTS release and to manage the server with Chef. Provisioning the server and bootstrapping it with Chef was straightforward. However, preparing the server to host our SVN repositories and migrating all of the data posed some challenges. Along the way, I was reminded of some useful commands and techniques, and I learned how to fix a few problems.

Unlike git, which allows us to clone a new bare repository from any existing one, SVN repositories must be exported or ‘dumped’ to a portable format (called a ‘dumpfile’), transferred to the new location, and then loaded into a new, empty repository.

Exporting Existing SVN Repositories

The command svnadmin dump ./myrepository > dumpfile.svn will write all revisions for the repository out to disk.

To reduce the size of the dumpfile, we can calculate the deltas between revisions and only dump these: svnadmin dump --deltas ./myrepository > smaller_dumpfile.svn. This takes more processing power and time, but can greatly reduce the size of the file on disk.

To save even more space, we can gzip the final dumpfile: svnadmin dump --deltas ./myrepository | gzip -c > smallest_dumpfile.svn.gz
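One caveat with the piped form: in a plain POSIX shell, the exit status of a pipeline is that of its last command, so a dump that fails early can still leave a valid but empty gzip file behind. The following is a minimal sketch of a sanity check, not a full verification. The function name check_dump is hypothetical; it relies only on gzip -t and on the fact that dumpfiles begin with an "SVN-fs-dump-format-version:" header line.

```shell
#!/bin/sh
# check_dump FILE  (hypothetical helper, not part of Subversion)
# Returns 0 if FILE is an intact gzip archive whose first line looks like
# the start of an SVN dumpfile; returns 1 otherwise.
check_dump() {
    file=$1

    # gzip -t tests archive integrity without writing any output.
    if ! gzip -t "$file" 2>/dev/null; then
        echo "$file: not an intact gzip archive" >&2
        return 1
    fi

    # A dump that failed before writing anything produces an empty stream,
    # so look for the format-version header on the first line.
    first_line=$(gzip -dc "$file" | head -n 1)
    case $first_line in
        SVN-fs-dump-format-version:*) return 0 ;;
        *) echo "$file: missing dumpfile header" >&2; return 1 ;;
    esac
}

# Example use (assumes svnadmin is installed and ./myrepository exists):
#   svnadmin dump -q --deltas ./myrepository | gzip -c > smallest_dumpfile.svn.gz
#   check_dump smallest_dumpfile.svn.gz || echo "dump needs attention"
```

This only catches the obvious failure modes (a corrupt archive or an empty dump). A dump that dies partway through would still pass, so comparing revision counts after loading remains worthwhile.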

Creating a New SVN Repository

Before actually importing the dumpfile containing our SVN repository, we must first create the SVN repository structure. In most cases, a simple svnadmin create newrepo will achieve this.

However, depending on the desired repository filesystem type (fsfs vs. bdb) or need for backwards compatibility with older versions of SVN, we may need to pass in some additional options: svnadmin create --fs-type fsfs --compatible-version 1.6 newrepo

Importing an SVN Repository from a Dumpfile

After the SVN repository structure has been created, we can load the dumpfile from STDIN: svnadmin load newrepo < dumpfile.svn. This reads in all revisions from the file and commits them to the new SVN repository’s filesystem.

To maintain the UUID between the old repository and the newly created repository (so that SVN clients will treat the repositories as identical), we can tell svnadmin to take the UUID from the dumpfile: svnadmin load --force-uuid newrepo < dumpfile.svn. When loading into an empty repository, svnadmin uses the dumpfile’s UUID by default, so the flag mostly serves to make the intent explicit.

If our dumpfile is gzipped, we can decompress and load on the fly: cat smallest_dumpfile.svn.gz | gzip -d | svnadmin load --force-uuid newrepo
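After a load, it is worth confirming that the new repository really matches the old one. Here is a minimal sketch that compares the UUID and the youngest revision number using svnlook. The function name same_repo_state is hypothetical, and it assumes both repositories are reachable as local paths (on separate servers, the svnlook commands would be run on each side and the output compared).

```shell
#!/bin/sh
# same_repo_state OLD NEW  (hypothetical helper)
# Returns 0 if both repositories report the same UUID and the same
# youngest revision; returns 1 otherwise.
same_repo_state() {
    old=$1
    new=$2

    old_uuid=$(svnlook uuid "$old") || return 1
    new_uuid=$(svnlook uuid "$new") || return 1
    if [ -z "$old_uuid" ] || [ "$old_uuid" != "$new_uuid" ]; then
        echo "UUID mismatch: $old ($old_uuid) vs $new ($new_uuid)" >&2
        return 1
    fi

    old_rev=$(svnlook youngest "$old") || return 1
    new_rev=$(svnlook youngest "$new") || return 1
    if [ -z "$old_rev" ] || [ "$old_rev" != "$new_rev" ]; then
        echo "Revision mismatch: $old (r$old_rev) vs $new (r$new_rev)" >&2
        return 1
    fi
}

# Example use (paths are placeholders):
#   same_repo_state ./myrepository ./newrepo && echo "repositories match"
```

Matching UUIDs and revision numbers do not prove that every revision is identical, but they catch the common case of a load that stopped partway through.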

Errors We Ran Into

Because some of our old SVN repositories were created with a much earlier version of SVN, I encountered some errors when loading the dumpfiles into the new SVN repositories.

In particular, the following two issues came up:

  • Cannot accept non-LF line endings in 'svn:ignore' property
  • Cannot accept non-LF line endings in 'svn:log' property

These errors immediately halted the load and left the repository only partially migrated. They occurred because the old SVN repositories contained carriage returns (^M) in the properties sections, which SVN has rejected since version 1.6.

Two solutions were readily available:

  • Use the --bypass-prop-validation flag when loading the repo to ignore the problem.
  • Actually fix the problem and replace the carriage returns in the dumpfile prior to loading.

I opted to fix the problem rather than silently carry it into our new repositories. To do so, I used a sed command to find and replace carriage returns in the properties sections of the affected dumpfiles.

sed -e '/^svn:log$/,/^PROPS-END$/ s/^M/ /' -e '/^svn:ignore$/,/^PROPS-END$/ s/^M/\n/' original_file.svn > repaired_file.svn

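This causes sed to examine two separate address ranges, each defined by a pair of regular expressions. Within the “svn:log” range it replaces carriage returns with spaces, and within the “svn:ignore” range it replaces them with newlines. Both substitutions swap one byte for another, so the content lengths recorded in the dumpfile stay valid. The address ranges are specific to the dumpfile format: each begins at the property name and ends at the PROPS-END marker.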

Note that the ^M is not a circumflex accent followed by the capital letter M, but the carriage return control character (0x0D). It can usually be inserted by typing “CTRL + V, CTRL + M”.

Alternatively, just use the hex value in the sed command: sed -e '/^svn:log$/,/^PROPS-END$/ s/\x0D/ /' -e '/^svn:ignore$/,/^PROPS-END$/ s/\x0D/\n/' original_file.svn > repaired_file.svn
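The \x0D and \n escapes are GNU sed extensions and may not work with other sed implementations. A sketch of a more portable variant follows, assuming only a POSIX shell and sed: it obtains a literal carriage return from printf and uses a backslash followed by a real line break for the newline replacement. The function name fix_prop_crs is hypothetical.

```shell
#!/bin/sh
# fix_prop_crs  (hypothetical helper)
# Reads a dumpfile on stdin and writes the repaired stream to stdout.
# Carriage returns become spaces inside svn:log properties and newlines
# inside svn:ignore properties; everything else passes through untouched.
fix_prop_crs() {
    # Command substitution strips trailing newlines only, so this holds
    # exactly one carriage return character.
    cr=$(printf '\r')

    # In the second expression, the backslash followed by a line break is
    # the POSIX way to put a newline in a sed replacement.
    sed -e '/^svn:log$/,/^PROPS-END$/ s/'"$cr"'/ /' \
        -e '/^svn:ignore$/,/^PROPS-END$/ s/'"$cr"'/\
/'
}

# Example use:
#   fix_prop_crs < original_file.svn > repaired_file.svn
```

As in the original command, only the first carriage return on each line is replaced, which covers the usual case of a stray CR at the end of a line.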

One Final Trick

To speed up the migration of individual repositories from our old server to the new one (over 300 of them), I ended up with some fairly interesting command combinations to get everything processed quickly and in an unattended fashion.

One of the more helpful ones was:

# Run from the directory that holds the repositories on the old server.
for repo in *; do
	echo "Processing $repo"
	svnadmin dump -q --deltas "$repo" |
	gzip -c |
	ssh jk@newserver \
	"svnadmin create $repo &&
	gzip -d |
	sed -e '/^svn:log$/,/^PROPS-END$/ s/\x0D/ /' -e '/^svn:ignore$/,/^PROPS-END$/ s/\x0D/\n/' |
	svnadmin load -q --force-uuid $repo"
done

This iterates over the repositories in the current directory, dumps each one sequentially, pipes the gzipped dump stream over SSH to the new server, decompresses it there, runs it through sed to repair carriage returns in the properties sections, and finally loads the data into the newly created repository.
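With hundreds of repositories in one directory, stray files or non-repository directories can end up in the loop. As a sketch, the loop could be restricted to entries that look like SVN repositories. The helper names is_svn_repo and list_repos are hypothetical, and the check is only a heuristic based on the standard repository layout (a format file and a db directory at the top level).

```shell
#!/bin/sh
# is_svn_repo PATH  (hypothetical helper)
# Heuristic: a repository created by svnadmin has a top-level "format"
# file and a "db" directory.
is_svn_repo() {
    [ -f "$1/format" ] && [ -d "$1/db" ]
}

# list_repos DIR  (hypothetical helper)
# Prints the name of each entry in DIR that looks like a repository,
# one per line.
list_repos() {
    for entry in "$1"/*; do
        if is_svn_repo "$entry"; then
            basename "$entry"
        fi
    done
}

# Example use: feed only real repositories to the migration loop.
#   list_repos . | while read -r repo; do
#       echo "Would migrate $repo"
#   done
```

Note that the example only prints names. Swapping in the real pipeline takes a little care, because ssh inside a while read loop would otherwise consume the list of names from standard input.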