Simple, Rails-Like Database Migrations Using Knex on AWS Lambda

For my current project, we’ve been using AWS Lambda as our backend. There are a handful of nice tutorials for connecting to an RDS database from a Lambda function. But I couldn’t find many for deploying new code along with a database migration.

Ideally, I’d like to be able to push new code along with a database migration and have both updated atomically. I suspect the large-scale way of solving this problem is to perform an update as a sequence of smaller, backwards-compatible updates. For Lambda functions with a lot of traffic, that’s probably the best way to go. But for Lambda functions without a lot of load, it would be a lot more convenient to be able to update both atomically.

So here’s how I solved that problem with Knex.js.

Migrate on Deploy

What makes this problem tricky is that at any given time, you could potentially have multiple invocations of your Lambda function running simultaneously. When you push out new code, not all of them will be updated at once.

It’s important to remember that there are two cases when code is run in your AWS Lambda function: when your function runs the first time in a given container (which is when your initialization gets run) and when your function handler gets invoked during normal operation.

So you should do the following for each Lambda:

  • During initial startup: Attempt to do a migration inside a transaction. If your database is up to date, you do nothing and continue. If it’s not, you perform the database migration. Because you’re running it inside a transaction, it will block all other Lambda functions from running on your migrating tables until after the migration is finished. Then continue as normal.
  • During normal operation: Do all DB operations inside a transaction, and begin that transaction by checking whether the database has the schema version you expect. If it doesn’t, bail out with an error that asks the client to try again later. Otherwise, proceed as normal.
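
The schema check in the second step can lean on Knex's migrate.status(). As far as I can tell, it resolves to the number of migrations recorded in the database minus the number of migration files shipped with the code, so 0 means the two agree. Here's a small sketch of how that number could be read. The describeMigrationStatus helper is my own illustration, not something Knex provides:

```javascript
// Hypothetical helper (not part of Knex): interpret the number that
// knex.migrate.status() resolves to.
// Assumption: status = (migrations recorded in the DB) - (migration files in this deploy).
function describeMigrationStatus(status) {
    if (status === 0) {
        // database and code agree on the schema
        return { inSync: true, reason: "schema matches this deploy" };
    }
    if (status < 0) {
        // this deploy ships migrations the database has not run yet
        return { inSync: false, reason: `database is behind by ${-status} migration(s)` };
    }
    // the database has migrations this deploy does not know about,
    // most likely because a newer deploy already migrated it
    return { inSync: false, reason: `database is ahead by ${status} migration(s)` };
}
```

A positive value is the interesting case for an old container: a newer deploy has already migrated the database, so the old code should stop touching it.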

The Code

The code ends up looking like this:


"use strict";
console.log("Loading function");
const environment = process.env.NODE_ENV || "development";
const knex_config = require("./knexfile");
const knexF = require("knex");
const knex = knexF(knex_config[environment]);

// This will run once when the Lambda container starts
const migrations = knex.migrate.latest()
    .then(function () {
        console.log("Migrations complete");
    }).catch((err) => {
        console.log("Error running migrations: ", err);
    });

const DB_OUT_OF_SYNC = "Database out of sync with Lambda function";
function withDB(f) {
    // make sure migrations finished, if they happened at all
    return migrations.then(() => {
        // start a transaction
        return knex.transaction((trx) => {
            // check to see if our database is in the state we expect
            return trx.migrate.status().then((status) => {
                if (status === 0) {
                    // we're good to go!
                    return f(trx)
                } else {
                    // bail out, we're likely being replaced by a newer Lambda function
                    throw DB_OUT_OF_SYNC;
                }
            })
        });
    });
}
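
To show how withDB might be used, here's a rough sketch of a handler built around it. The users table, the 503 response shape, and the makeHandler wrapper are all made up for illustration. makeHandler takes withDB as an argument only so the sketch stands on its own:

```javascript
// Same message as above, repeated here so the sketch stands alone
const DB_OUT_OF_SYNC = "Database out of sync with Lambda function";

// Hypothetical wrapper: takes a withDB-style function and returns a Lambda handler
function makeHandler(withDB) {
    return async function handler(event) {
        try {
            // "users" is an assumed table name, purely for illustration
            const rows = await withDB((trx) => trx("users").select("id", "name"));
            return { statusCode: 200, body: JSON.stringify(rows) };
        } catch (err) {
            // withDB throws a plain string; also accept an Error carrying the same message
            const message = err instanceof Error ? err.message : err;
            if (message === DB_OUT_OF_SYNC) {
                // the schema moved on without us, so ask the client to retry later
                return {
                    statusCode: 503,
                    headers: { "Retry-After": "5" },
                    body: JSON.stringify({ error: DB_OUT_OF_SYNC }),
                };
            }
            // anything else is a real failure; let Lambda report it
            throw err;
        }
    };
}
```

In the actual function, this would be wired up with something like exports.handler = makeHandler(withDB). A retried request should eventually land on a container running the new code, which is what makes bailing out safe here.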