Lessons from an SVN Server Migration

Recently, we rebuilt Atomic’s SVN server. We wanted to upgrade to the latest Ubuntu LTS release and manage the server with Chef. Provisioning the server and bootstrapping it with Chef was straightforward. However, preparing the server to host our SVN repositories and migrating all of the data posed some challenges. Along the way, I was reminded of some useful commands and techniques, and learned how to fix a few problems.

Unlike git, which allows us to clone a new bare repository from any existing one, SVN repositories must be exported or ‘dumped’ to a portable format (called a ‘dumpfile’), transferred to the new location, and then loaded into a new, empty repository.

Exporting Existing SVN Repositories

The command svnadmin dump ./myrepository > dumpfile.svn will write all revisions for the repository out to disk.

To reduce the size of the dumpfile, we can calculate the deltas between revisions and only dump these: svnadmin dump --deltas ./myrepository > smaller_dumpfile.svn. This takes more processing power and time, but can greatly reduce the size of the file on disk.

To save even more space, we can gzip the final dumpfile: svnadmin dump --deltas ./myrepository | gzip -c > smallest_dumpfile.svn.gz
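Before transferring a compressed dumpfile, it can be worth a quick sanity check. The following is a minimal sketch, not part of our original migration: check_dump is a hypothetical helper name, and the check only confirms that the gzip stream is intact and that the first line looks like a dumpfile header.

```shell
# check_dump is a hypothetical helper, not part of Subversion.
# It succeeds only if the file is a valid gzip stream whose first
# line is the dump format header.
check_dump() {
    # gzip -t tests the compressed stream without writing any output
    gzip -t "$1" || return 1
    # A dumpfile starts with a line naming the dump format version
    gzip -dc "$1" | head -n 1 | grep -q '^SVN-fs-dump-format-version: '
}
```

For example, check_dump smallest_dumpfile.svn.gz && echo "looks OK" prints the message only when both checks pass. It does not prove that the dump will load cleanly.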

Creating a New SVN Repository

Before actually importing the dumpfile containing our SVN repository, we must first create the SVN repository structure. In most cases, a simple svnadmin create newrepo will achieve this.

However, depending on the desired repository filesystem type (fsfs vs. bdb) or need for backwards compatibility with older versions of SVN, we may need to pass in some additional options: svnadmin create --fs-type fsfs --compatible-version 1.6 newrepo

Importing a SVN Repository from a Dumpfile

After the SVN repository structure has been created, we can import or load the dumpfile from STDIN: svnadmin load newrepo < dumpfile.svn. This reads in all revisions from the file and commits them to the new SVN repository’s filesystem.

To maintain the UUID between the old repository and the newly created repository (so that SVN clients will treat the repositories as identical), we can tell the newly created repository to use the UUID from the dumpfile: svnadmin load --force-uuid newrepo < dumpfile.svn. By default, svnadmin load already adopts the dumpfile’s UUID when the target repository contains no revisions, so for a freshly created repository the flag mainly makes the intent explicit. If our dumpfile is gzipped, we can decompress and load on the fly: cat smallest_dumpfile.svn.gz | gzip -d | svnadmin load --force-uuid newrepo
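To confirm that the UUID carried over, we can read it straight out of the dumpfile and compare it with the loaded repository. This is a small sketch, assuming the UUID header line appears near the top of the dump stream (as it does in the dumpfiles svnadmin dump produces); dump_uuid is a hypothetical helper name.

```shell
# dump_uuid is a hypothetical helper: print the UUID recorded in a dumpfile.
# It reads the named file, or standard input when no file is given.
dump_uuid() {
    # Print the value of the first "UUID: " line, then stop reading
    sed -n '/^UUID: /{s/^UUID: //p;q;}' "$@"
}
```

The result can then be compared by eye with the output of svnlook uuid newrepo. For a compressed dumpfile, gzip -dc smallest_dumpfile.svn.gz | dump_uuid works the same way.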

Errors We Ran Into

Because some of our old SVN repositories were created with a much earlier version of SVN, I encountered some errors when loading the dumpfiles into the new SVN repositories. In particular, the following two issues came up:

– Cannot accept non-LF line endings in 'svn:ignore' property
– Cannot accept non-LF line endings in 'svn:log' property

This immediately halted the load process and left an incompletely-migrated repository. This occurred because the old SVN repositories contained older-style carriage returns (^M) in the properties section (which is no longer allowed since SVN 1.6).

Two solutions were readily available:

– Use the --bypass-prop-validation flag when loading the repo to ignore the problem.
– Actually fix the problem and replace the carriage returns in the dumpfile prior to loading.

I opted to actually fix the problem rather than silently carry around this potential problem into our new repositories. To do so, I used a sed command to find and replace all instances of carriage returns in the properties sections of affected dumpfiles:

sed -e '/^svn:log$/,/^PROPS-END$/ s/^M/ /' -e '/^svn:ignore$/,/^PROPS-END$/ s/^M/\n/' original_file.svn > repaired_file.svn

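This causes sed to separately examine two address ranges defined by regular expressions and, within each range, replace carriage returns with a single character: a space in “svn:log” and a newline in “svn:ignore”. The address ranges are specific to the dumpfile format and, in this case, target the “svn:log” and “svn:ignore” sections. Because each carriage return is swapped for exactly one character, the property lengths recorded in the dumpfile remain correct.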

Note that the ^M is not a circumflex accent followed by the capital letter M, but the carriage return control character (0x0D). It can usually be inserted by typing “CTRL + V, CTRL + M”.

Alternatively, just use the hex value in the sed command (the \x0D escape is a GNU sed extension): sed -e '/^svn:log$/,/^PROPS-END$/ s/\x0D/ /' -e '/^svn:ignore$/,/^PROPS-END$/ s/\x0D/\n/' original_file.svn > repaired_file.svn
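To see what the substitution does, here is a small experiment on a made-up fragment shaped like two dumpfile property sections. It is only a sketch: the sample contents are invented for illustration and are not a complete or loadable dumpfile, and the \x0D and \n escapes assume GNU sed.

```shell
# Build a tiny, invented sample with carriage returns in both properties
sample=$(mktemp)
fixed=$(mktemp)
printf 'K 7\nsvn:log\nV 12\nFix the bug\r\nPROPS-END\n\nK 10\nsvn:ignore\nV 9\n*.o\r\n*.a\r\nPROPS-END\n' > "$sample"

# Same expressions as in the article (GNU sed assumed)
sed -e '/^svn:log$/,/^PROPS-END$/ s/\x0D/ /' \
    -e '/^svn:ignore$/,/^PROPS-END$/ s/\x0D/\n/' "$sample" > "$fixed"

# Count carriage returns before and after the repair
cr_before=$(tr -cd '\r' < "$sample" | wc -c)
cr_after=$(tr -cd '\r' < "$fixed" | wc -c)

# Each CR is swapped for exactly one character, so the size should not change
size_before=$(wc -c < "$sample")
size_after=$(wc -c < "$fixed")

echo "carriage returns: $cr_before -> $cr_after"
echo "bytes: $size_before -> $size_after"
```

The same tr and wc pipeline can be pointed at a real dumpfile to check whether it contains any carriage returns at all, although on a real dumpfile it will also count those inside file contents, not just in properties.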

One Final Trick

To speed up the migration of individual repositories from our old server to the new one (over 300 of them), I ended up with some fairly interesting command combinations to get everything processed quickly and in an unattended fashion.

One of the more helpful ones was:

for repo in *
do
    echo "Processing $repo"
    # Dump and compress locally, then stream to the new server over SSH
    svnadmin dump -q --deltas "$repo" |
    gzip -c |
    ssh jk@newserver \
    "svnadmin create '$repo' &&
    gzip -d |
    sed -e '/^svn:log$/,/^PROPS-END$/ s/\x0D/ /' -e '/^svn:ignore$/,/^PROPS-END$/ s/\x0D/\n/' |
    svnadmin load -q --force-uuid '$repo'"
done

This lists the repositories in the current directory, dumps each sequentially, pipes the gzipped dumpfile contents over SSH to the new server, decompresses the data, runs it through sed to repair any incorrect line endings in the properties sections, and finally loads the data into the newly created repository.
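The loop assumes that every entry in the current directory is a repository. If that is not guaranteed, a guard along these lines could skip anything else. This is a sketch under an assumption, not something we used in the migration: is_svn_repo and list_repos are hypothetical helper names, and the test is only a layout heuristic (a repository made by svnadmin create has a top-level format file and a db directory).

```shell
# is_svn_repo is a hypothetical helper using a layout heuristic:
# succeed if the directory has a top-level `format` file and a `db` directory.
is_svn_repo() {
    [ -f "$1/format" ] && [ -d "$1/db" ]
}

# list_repos is a hypothetical helper: print the name of each entry
# directly under the given directory that passes the heuristic.
# The body runs in a subshell so the caller's working directory is unchanged.
list_repos() (
    cd "$1" || exit 1
    for entry in *
    do
        if is_svn_repo "$entry"; then
            printf '%s\n' "$entry"
        fi
    done
)
```

Inside a migration loop, a line such as is_svn_repo "$repo" || continue placed before the dump would skip stray files and directories.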

Conversation
  • Ali says:

    Wow this version is caodneme crashalicious. Continues the trend of very unstable dev versions lately. GMail, Google Reader, Google Docs, all Snap crash within 10 sec of page loading. Have opened a new incognito window with no extensions active and it exhibits the same behavior. Turned ALL flags off, still crashy. System: Windows 7 Ultimate(x64), 8gb ram. And before you flame me, I completely understand the dev version is going to be unstable at times. This is more that normal, and just reporting such. Keep up the good work. People signing up for the dev version and then “demanding” fixes and complaining without reporting bugs is a hassle for you guys, but I truly think Google allowing users to test such early versions is a huge reason for its rapid progression and it’s continual climb up in market share. So thanks! :-)

  • Garry says:

    First off, I have no Idea what Ali is referring to. This is Ubuntu 12.04 x64 with Firefox 26.0.

    Anyway, some remarks:
    You describe that the sed expressions replace ‘^M’ with space. This is only true for the first expression. The second one replaces it with ‘\n’. – Is that in order to not damage the ignore? (I.e. ignore respects spaces as part of the ignore pattern?)
    ‘^M’ is ‘\r’. With ‘\r’ in this article people could just copy and paste the line from the browser instead of copying and then replacing the copied TWO characters ‘^’ and ‘M’ by typing [Ctrl]+[v], [Ctrl][m].
    According to the svn manual, ‘--force-uuid’ is only necessary for accepting a UUID from a dump if the repository already contains data. So if it’s just been ‘svnadmin create’d this shouldn’t be required.

  • Dawuid says:

    Your ‘sed’ recipe saves my day. Thanks!

  • Meillo says:

    for repo in `ls -1` do … done

    is the same as

    for repo in * do … done

  • Antti says:

    Thank you for the informational post!

    I especially liked this part:

    “I opted to actually fix the problem rather than silently carry around this potential problem into our new repositories. ”

    If only more people were like you! Kudos!
