Migrate Your Heroku PostgreSQL Database to AWS RDS

Using Heroku, it’s quick and easy to add a PostgreSQL database to your application, but over time these Heroku add-ons may not fit your needs. That’s where Amazon’s Relational Database Service (RDS) comes in! RDS allows much more granular control of the size and performance of your database, and it’s generally cheaper. I’ll share how I was able to successfully complete a migration from Heroku PostgreSQL to RDS.

Prerequisites

EC2

Heroku has its resources hosted on AWS, so staying in their network will make things faster. To do this we will use an EC2 instance to generate the data dump. I provisioned a t3.xlarge instance to take advantage of multiple jobs and speed up the migration process. PostgreSQL should be installed on your EC2 instance and the version should match the version of your Heroku PostgreSQL add-on. You can connect to the EC2 instance a few ways, but I recommend generating a key pair to connect using SSH.

RDS

An RDS instance sharing the same PostgreSQL version should be provisioned before the migration process. I provisioned a db.t3.xlarge instance to take advantage of parallel jobs and speed up the process. You should add the security group of the EC2 instance to the inbound rules of the RDS instance security group.

Heroku

To ensure that no changes to the database are being made while the backups are created, put your Heroku application into maintenance mode and scale down all dynos:

heroku maintenance:on

and

heroku ps:scale {dyno-type}=0

can be used to do this. If you aren’t sure what type of dynos are running on your application, you can use this

heroku ps

to list the type and quantity of each dyno.

Creating the backup

To create the backup, we’ll generate a backup directory on the EC2 instance we provisioned earlier based on the data stored in the Heroku PostgreSQL add-on. Begin by grabbing the name of the database that you want to migrate by using the following command on your Heroku application to list all of the environment variables

heroku config

Look for DATABASE_URL or HEROKU_POSTGRESQL__URL, these will be the name of the database. With the name found, we can then get the connection string of the specified database using the following command on the same Heroku application

heroku pg:credentials:url

We’ll then use this connection string in a command on the EC2 instance we provisioned earlier to generate our backup directory. The command we will use is:

pg_dump -Z {COMPRESSION-AMOUNT} -Fd -j {NUMBER-OF-JOBS} --no-acl --no-owner -d {HEROKU-DB-CONNECTION-URL} -f ./dump

The compression amount argument can be 0-9. We used 9 and didn’t see any issues. The number of jobs argument should be set to one job per CPU on your EC2 instance. We increased the amount and didn’t see any large gains, but using fewer jobs will certainly make the process slower. Now that we have created the backup, we need to populate our database with it!

Restoring the database

Now that we have created a backup of the Heroku PostgreSQL database, let’s restore it to AWS RDS. Before restoring, it’s helpful to test the connection to the database. Do this by connecting to the EC2 instance through your terminal and using the following command:

psql -h {RDS-INSTANCE-URL} -U {DB-USER} -d {DB-NAME} -p {PORT} -W

You will need a lot of the information in this command later, so it’s important to remember where you found them. You can find the RDS instance URL and port number on the Connectivity and Security tab in the RDS AWS console. And, you can find the database name and user in the configuration tab. When running this command, you’ll have to enter the database password.

Once you have confirmed you can connect to the RDS instance, we can begin the restoration process! I recommend using tmux to ensure the processes doesn’t timeout. While on the EC2 instance open a tmux session and run the following command to restore the database,

pg_restore --verbose --clean --no-acl --no-owner --if-exists -Fd -j {NUMBER-OF-JOBS} -h {RDS-INSTANCE-URL} -d {DB-NAME} -U {DB-USER} ./dump

For the restore command, I recommend using the same number of jobs that were used in the dump command. Once the command has finished, your RDS instance will be populated with the data that existed on the Heroku PostgreSQL add-on.

Connecting Heroku to RDS

At this point, you have two databases with the same information, but your Heroku application is using the same old database. So, let’s connect the application to the new RDS instance! To start, we’ll want to build a connection string following this pattern:

postgres://{DB-USER}:{DB-PASSWORD}@{RDS-INSTANCE-URL}:{PORT}/{DB-NAME}. Now let’s put that connection string into a Heroku environment variable. There are a few different ways you can do this, depending on how your application is set up. If your application’s DATABASE_URL variable is associated with the Heroku PostgreSQL add-on, then you will need to either detach or destroy the addon, I recommend detaching it in case we need to revert. To detach use the following command in your Heroku application

heroku addons:detach DATABASE -a {APP-NAME}

This will remove the DATABASE_URL environment variable from your Heroku application so let’s add it back with your new connection string! Simply run this

heroku config:add DATABASE_URL="{NEW-URL}"

with the connection string we created earlier.

Congratulations! Your Heroku application is now using the RDS instance! You can now scale your application dynos back up to normal levels and turn off maintenance mode.

Validation

Before deleting your Heroku PostgreSQL add-on, it’s best to do some sanity checks and make sure all the data was transferred correctly. A few things to check are the number of tables in both databases and the extensions that exist on each. You can also perform a couple of simple queries on each database to make sure the results match up.