We recently revamped our backup system and decided that we wanted to have some manner of offsite backups in the cloud (as opposed to physical offsite backups).
This presented a challenge, however: we have a fair amount of data to back up, our Internet connection isn’t the fastest in the world, and we needed the data secured.
Considering Our Options
Our initial thoughts were that we should use something like rsync to reduce the amount of data that needed to be transferred, minimizing backup times and the impact on our Internet connection.
However, we also wanted to use some sort of standardized cloud storage (such as Amazon S3 or Rackspace Cloud) so that we could store large amounts of data relatively inexpensively.
Furthermore, we wanted to secure our backups so that only we would be able to access them. While Amazon and Rackspace offer data encryption, they control the encryption mechanism and the keys, which could become a problem if either provider suffered a security breach.
The Benefits of Duplicity
Fortunately, we found a nifty tool called Duplicity, which allowed us to do everything we wanted. Duplicity:
Keeps signatures and deltas on files and directories so that only modified files and directories need to be synchronized.
Allows full and incremental backups, preserving backup sets of arbitrary age.
Allows restoration from remote media without the need to retrieve all backup sets.
Duplicity has actually been around a long time (since 2002), with support for Amazon S3 since at least 2007. (It’s somewhat of a mystery why we didn’t find it sooner.)
Working with Duplicity is very simple. Arguably, the most difficult bit is setting up GnuPG. This Debian administrator article proved quite helpful in getting GnuPG set up.
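As a rough sketch of the GnuPG side, assuming GnuPG is already installed and you want a dedicated key pair for backups: create a key, then look up its ID so it can be handed to Duplicity. The `run` helper and the `DRY_RUN` variable are illustrative additions, not part of GnuPG; by default the script only prints the commands, since key generation is interactive.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it.
# (run and DRY_RUN are hypothetical helpers for this sketch.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Interactively create a new key pair (prompts for name, email, passphrase).
generate_key() {
    run gpg --gen-key
}

# List secret keys with long key IDs; note the ID of the backup key.
list_keys() {
    run gpg --list-secret-keys --keyid-format long
}

generate_key
list_keys
```

The key ID reported by the second command is what Duplicity expects as the value of its `--encrypt-key` option.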
It’s also quite easy to use Duplicity to do backups within your own network (using something like SCP), or to just store backups locally (using local file storage). There are a host of other options and features which Duplicity supports. For more information, you should check out the Duplicity features page, and read the Duplicity man pages.
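For instance, a local or SCP backup differs from a cloud backup only in the target URL. The sketch below assumes a hypothetical GnuPG key ID, source directory, mount point, and backup host; all of them are placeholders. As written it only prints the commands (the `run` helper and `DRY_RUN` are illustrative, not part of Duplicity).

```shell
#!/bin/sh
# Hypothetical values; substitute your own key ID, paths, and host.
GPG_KEY="ABCD1234"
SOURCE_DIR="/home/example/data"
LOCAL_URL="file:///mnt/backups/data"
# The double slash after the host name marks an absolute remote path.
SCP_URL="scp://backup@backuphost.example.com//srv/backups/data"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Back up to a directory on a locally mounted disk.
backup_local() {
    run duplicity --encrypt-key "$GPG_KEY" "$SOURCE_DIR" "$LOCAL_URL"
}

# Back up to another machine on the network over SSH.
backup_scp() {
    run duplicity --encrypt-key "$GPG_KEY" "$SOURCE_DIR" "$SCP_URL"
}

backup_local
backup_scp
```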
Below are just a few examples of utilizing Duplicity to create and manage backups.
Two ways you can execute Duplicity to encrypt data and back it up to Amazon S3:
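One possible pair of invocations, as a sketch rather than a definitive recipe: an explicit full backup, and a run that lets Duplicity choose between full and incremental. The key ID, source directory, and bucket name are placeholders, and the `s3+http://` URL scheme is the one documented for Duplicity’s older boto-based S3 backend, so newer releases may expect a different scheme. The script assumes `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `PASSPHRASE` are already exported, and it only prints the commands unless `DRY_RUN=0` is set (the `run` helper is illustrative).

```shell
#!/bin/sh
# Hypothetical values; substitute your own key ID, path, and bucket.
GPG_KEY="ABCD1234"
SOURCE_DIR="/home/example/data"
TARGET_URL="s3+http://example-backup-bucket/data"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. Force a full backup, starting a new backup chain.
backup_full() {
    run duplicity full --encrypt-key "$GPG_KEY" "$SOURCE_DIR" "$TARGET_URL"
}

# 2. Incremental when a recent chain exists; a new full backup
#    once the last full backup is more than 30 days old.
backup_auto() {
    run duplicity --encrypt-key "$GPG_KEY" --full-if-older-than 30D \
        "$SOURCE_DIR" "$TARGET_URL"
}

backup_full
backup_auto
```

The second form is the more convenient one to schedule from cron, because a single command covers both the periodic full backup and the incrementals in between.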
Two ways to check which files exist in the current backups on S3:
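A sketch of two listings, assuming the same placeholder bucket as a backup target: the files in the most recent backup, and the files as they stood three days ago. The bucket name is hypothetical, and the AWS credentials and `PASSPHRASE` are assumed to be exported already. The commands are printed rather than executed unless `DRY_RUN=0` is set (the `run` helper is illustrative).

```shell
#!/bin/sh
# Hypothetical bucket; substitute your own.
TARGET_URL="s3+http://example-backup-bucket/data"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. List the files contained in the latest backup.
list_latest() {
    run duplicity list-current-files "$TARGET_URL"
}

# 2. List the files as they were backed up three days ago.
list_three_days_ago() {
    run duplicity list-current-files --time 3D "$TARGET_URL"
}

list_latest
list_three_days_ago
```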
Two ways to restore files from a remote backup on S3:
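A sketch of two restores, again with placeholder names: the whole backup into an empty directory, and a single file as it existed three days ago. The bucket, the restore paths, and the file name `etc/hosts` are hypothetical. The `--file-to-restore` flag takes a path relative to the root of the backup and is the spelling used in older Duplicity man pages, so check the man page of your installed version. Credentials and `PASSPHRASE` are assumed to be exported, and nothing is executed unless `DRY_RUN=0` is set (the `run` helper is illustrative).

```shell
#!/bin/sh
# Hypothetical values; substitute your own bucket and restore paths.
SOURCE_URL="s3+http://example-backup-bucket/data"
RESTORE_DIR="/tmp/restore/data"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. Restore the entire latest backup into RESTORE_DIR.
restore_all() {
    run duplicity restore "$SOURCE_URL" "$RESTORE_DIR"
}

# 2. Restore one file, as it was three days ago, to a chosen location.
restore_one_file() {
    run duplicity restore --file-to-restore etc/hosts --time 3D \
        "$SOURCE_URL" /tmp/restore/hosts
}

restore_all
restore_one_file
```

Because only the requested file is fetched in the second form, it is the cheaper option on a slow connection.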
Justin is a DevOps practitioner at Atomic Object. He runs servers, troubleshoots the network, deploys apps, fixes bugs, manages backups, monitors monitoring, and does all manner of general problem solving for Atomic Object and our customers. He often works with configuration management tools like Chef and Puppet, and loves working with Linux.