Improve Realm Migrations with Rails-like Patterns

In a recent React Native project, Realm presented itself as the best database solution with built-in support for encryption. As I delved into the nuances of Realm, I realized that its built-in migration support is very bare. This raised concerns, particularly in light of a complex data model dictated by our client’s requirements.

I devised a solution to preempt these challenges – a migration pattern inspired by Rails, tailored for Realm. Introducing this pattern has substantially simplified the management of migrations and made tackling an ever-evolving data model much easier. This blog post will delve into the rationale and functionality behind our custom migration system and guide you through its implementation.

I’ve created a repository with a complete example for reference. Check it out if you’re interested in seeing our migration pattern in action.

The Shortcomings of Realm Migrations

Before we dig into the system my team uses, let’s talk about how Realm migrations work out of the box and then discuss their shortcomings.

Performing a migration in Realm typically involves a few steps. When you need to change the data schema—for example, by adding a new property, deleting an old one, or changing a property type—you must increment the schema version to trigger a migration. Let’s walk through the process with examples.

Let’s say you have the following Realm schema and configuration:

const PersonSchema = {
  name: "Person",
  properties: {
    name: "string",
    age: "int",
  },
};

const realm = new Realm({
  schemaVersion: 1,
  schema: [PersonSchema],
});

Now, you decide to add an email property to the Person object. Here’s how you could handle adding that property via a migration in Realm:

First, you would modify the Realm object’s schema to reflect the changes.

// The variable name change is not necessary, I'm just using it to illustrate the schema has changed!
const PersonSchemaV2 = {
  name: 'Person',
  properties: {
    name: 'string',
    age: 'int',
    email: 'string?'
  }
};

Next, you would increment the schema version to a number that’s higher than the previous one. Realm uses this version number to determine whether a migration is needed.

const realmV2WithoutMigration = new Realm({
  schemaVersion: 2,
  schema: [PersonSchemaV2],
});

Finally, you would create a migration function that has detailed logic for how existing objects should be migrated to fit the new schema.

const migration = (oldRealm: Realm, newRealm: Realm) => {
  // Only apply this migration if the existing database is older than schema version 2
  if (oldRealm.schemaVersion < 2) {
    const oldPeople = oldRealm.objects("Person");
    const newPeople = newRealm.objects("Person");

    // loop through all objects and set the new 'email' property to a default value
    for (let i = 0; i < oldPeople.length; i++) {
      newPeople[i].email = `${oldPeople[i].name.toLowerCase()}@example.com`;
    }
  }
};

const realmV2 = new Realm({
  schemaVersion: 2,
  schema: [PersonSchemaV2],
  onMigration: migration,
});

Now, the next time the Realm database is opened with a newer schema version, the migration function will be executed, changing your database.

While this process may appear adequate for straightforward schema updates, its limitations become more pronounced with each subsequent, more complex modification to your data model.

Manual Schema Versioning

For starters, each schema update requires a manual version increment. This is prone to human error since it is easy to either simply forget to do so, or accidentally mess up the version number while resolving a merge conflict from another branch that changed the schema. Leaving this change in the hands of developers can result in significant risk, especially in a production environment.

No Patterns to Follow

Furthermore, there is no set pattern for how to incrementally add migrations. In the above example, the migration function is only handling one simple schema change. But what if we had over 100 migrations, varying from high to low complexity? Without a pattern to follow, this could slowly turn into a giant function with thousands of lines of migration logic.
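To make the concern concrete, here is a rough sketch of how a single migration callback tends to grow as versions accumulate. The `MiniRealm` interface and the `isActive` change are hypothetical stand-ins invented for illustration; they are not part of Realm's API or of our project.

```typescript
// Hypothetical, minimal stand-in for the parts of Realm a migration touches.
// It is NOT the real Realm type; it only keeps this sketch self-contained.
interface MiniRealm {
  schemaVersion: number;
  objects(type: string): Record<string, any>[];
}

// One callback accumulates a version check per schema change.
const monolithicMigration = (oldRealm: MiniRealm, newRealm: MiniRealm): void => {
  const oldPeople = oldRealm.objects("Person");
  const newPeople = newRealm.objects("Person");

  if (oldRealm.schemaVersion < 2) {
    // v2: add a default email
    for (let i = 0; i < oldPeople.length; i++) {
      newPeople[i].email = `${oldPeople[i].name.toLowerCase()}@example.com`;
    }
  }
  if (oldRealm.schemaVersion < 3) {
    // v3 (hypothetical): add an `isActive` flag
    for (let i = 0; i < oldPeople.length; i++) {
      newPeople[i].isActive = true;
    }
  }
  // ...and so on: one more block per schema change
};
```

Every new schema change adds another block to the same function, and every block depends on someone having picked the right version number.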

Collaboration Conflicts

Moreover, managing this within a team can lead to some significant headaches. When multiple developers make schema changes concurrently, merge conflicts are almost guaranteed. Resolving these conflicts poses a risk for schema version mishaps. Imagine a situation where two developers independently implement divergent schema changes. Both individuals increment the version to 2 in their respective branches. Following a merge of these branches, the version remains incorrectly set at 2, even though there have been two sets of modifications. The schema version should now reflect 3 to capture the change accurately. This subtle yet critical error is easy to overlook and can lead to further issues down the line.

Error Propensity

Lastly, without a built-in mechanism to record which migrations have been applied, developers carry the burden of manually verifying that migration logic isn’t inadvertently repeated. While this can typically be managed by including a simple conditional check— if (oldRealm.schemaVersion < newSchemaVersion) {...}—within the migration function, the approach introduces yet another vector for human error.

These shortcomings indicate a need for a migration approach that can provide safety, traceability, and ease of use, especially as the data model grows in complexity or evolves rapidly.

Requirements for an Improved Migration Pattern

When evaluating the features of Rails migrations, three core goals stood out as essential for our pattern.

Scripted Generation of Migration Modules

In Rails, migrations are generated with a CLI command, such as: rails generate migration AddEmailToPerson. This command scaffolds a new migration file that encapsulates the changes, each with a predefined structure and singular focus. Emulating this was key in enhancing the developer experience around adding migrations. By developing a similar script for our Realm migrations, we ensured that each is treated as an independent unit that will be executed automatically as part of the migration lifecycle. Additionally, our script ensures migrations are generated in a way that safeguards against them being run multiple times or out of order.

Migration Tracking

Realm’s default setup lacks explicit migration tracking, prompting us to design a system similar to what is found in Rails. Each migration our script generates is logged in a MigrationRecord object within the database once it’s executed. This approach to record-keeping streamlines the migration process, as it alleviates the need for developers to manually verify whether a migration should run.

Automatic Schema Version Management

Lastly, we took aim at the tedious and error-prone task of manual schema version management. By automating schema version increments based on the presence and sequence of migration files, we minimized the potential for human error. The automated process parses the migration files, determines the next appropriate schema version, and applies it within Realm’s configuration. Developers no longer need to manually track or increment schema versions, dramatically reducing the risk of version-related issues and allowing us to maintain a reliable upgrade path for our database structures.
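As a minimal sketch of the idea, the schema version can be derived from the number of known migrations rather than typed by hand. The names below (`MigrationLike`, `schemaVersionFor`, and the second migration) are hypothetical and exist only for this example.

```typescript
// Hypothetical shape: only the name matters for versioning.
type MigrationLike = { name: string };

// The schema version is simply the number of migrations in the project.
// Adding a migration file therefore bumps the version with no manual edit.
const schemaVersionFor = (allMigrations: MigrationLike[]): number =>
  allMigrations.length;

const knownMigrations: MigrationLike[] = [
  { name: "migrationAddEmailToUser1629781234567" },
  { name: "migrationAddPhoneToUser1629781299999" },
];

console.log(schemaVersionFor(knownMigrations)); // prints 2
```

Under this scheme, if two branches each add one migration, the merged list contains both entries and the version advances by two, which is the outcome the manual approach tends to get wrong.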

Developing the Migration Pattern

Without further ado, let’s jump into the implementation of our custom migration pattern.

Types for Migrations

Since our project is using TypeScript, defining a type for our migrations was necessary:

export type RealmMigration = {
  name: string;
  migration: (oldRealm: Realm, newRealm: Realm) => void;
};

This type lets our migrations have a name that can be saved in the database and used to check if this migration has already run, and a migration function that contains the logic for this migration.
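For illustration, a migration of this shape for the earlier email example might look like the sketch below. To keep it runnable on its own, it declares a simplified `MiniRealm` interface in place of the real `Realm` type; that interface, the `SketchMigration` type, and the migration name are assumptions made for this example.

```typescript
// Simplified stand-in for Realm, declared only so this sketch runs by itself.
interface MiniRealm {
  objects(type: string): Record<string, any>[];
}

// Same shape as RealmMigration above, but typed against the stand-in.
type SketchMigration = {
  name: string;
  migration: (oldRealm: MiniRealm, newRealm: MiniRealm) => void;
};

const migrationAddEmailToPerson1629781234567: SketchMigration = {
  // The name doubles as the key stored in the database once the migration runs
  name: "migrationAddEmailToPerson1629781234567",
  migration: (oldRealm, newRealm) => {
    const oldPeople = oldRealm.objects("Person");
    const newPeople = newRealm.objects("Person");
    // Give every existing person a default email derived from their name
    for (let i = 0; i < oldPeople.length; i++) {
      newPeople[i].email = `${oldPeople[i].name.toLowerCase()}@example.com`;
    }
  },
};
```

Note that the migration body contains no schema version check; deciding whether it should run is left to the tracking logic rather than to the migration itself.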

We also need a schema for our migration record:

export const MigrationSchema = {
  name: "MigrationRecord",
  primaryKey: "id",
  properties: {
    id: "objectId",
    name: "string",
    createdAt: "date",
  },
};

Script for Generating Migrations

Next, we need a script that generates the migration and makes the project updates required for the migration to run. The following code generates our migrations and includes explanations within the comments.

Note: This script uses lodash, so be sure to install it before utilizing this script.

import { kebabCase, upperFirst, camelCase } from "lodash";
import fs from "fs";
import path from "path";

const TS_EXTENSION = ".ts";
// directory where the migration files are stored
const MIGRATIONS_DIR = path.join(__dirname, "../src/db/migrations");

const handleError = (errorMessage: string) => {
  console.error(errorMessage);
  process.exit(1);
};

// Helper function to generate migration name
// Example: migrationAddEmailToUser1629781234567
const generateMigrationName = (name: string, timestamp: string) =>
  `migration${upperFirst(camelCase(name))}${timestamp}`;

// Files that should not be included in the migration index file since they are not migrations
const excludedFiles = ["migrations.ts", "types.ts", "run-migrations.ts"];

// Fetch all migration file names without extension for example 1629781234567-add-email-to-user
const fetchMigrationFileNamesWithoutExt = (directory: string) =>
  fs
    .readdirSync(directory)
    .filter(
      (file) =>
        file.endsWith(TS_EXTENSION) && excludedFiles.includes(file) === false
    )
    .map((file) => file.replace(TS_EXTENSION, ""));

// Get the name of the migration from the command line arguments.
// For example, if the command is 'yarn generate-migration add email to user'
// then the migration name is 'add email to user'
// This supports multiple words in the migration name
// Calling `.slice(2)` removes the first two entries of process.argv, which are
// the path to the Node executable and the path to the script being run
const args = process.argv.slice(2);

if (args.length === 0) {
  handleError("Please provide a name for the migration");
}

const migrationName = args.join(" ");
const timestamp = `${Date.now()}`;
const fileName = `${timestamp}-${kebabCase(migrationName)}${TS_EXTENSION}`;
const filePath = path.join(MIGRATIONS_DIR, fileName);
const migrationNameInFile = generateMigrationName(migrationName, timestamp);

// Template for the new migration file
const template = `import { RealmMigration } from '../migrations/types';

export const ${migrationNameInFile}: RealmMigration = {
  name: "${migrationNameInFile}",
  migration: (oldRealm, newRealm) => {
  }
}`;

if (fs.existsSync(filePath)) {
  handleError("Migration already exists");
}

// Write the migration file to the migrations directory
fs.writeFileSync(filePath, template);

console.log("Migration created at", filePath);

// Get the path of the file that will be used to export all migrations
const indexFilePath = path.join(MIGRATIONS_DIR, "migrations.ts");

// Get and sort all the existing migration file names without extension, including the new migration
const filenamesWithoutExt = fetchMigrationFileNamesWithoutExt(MIGRATIONS_DIR);
filenamesWithoutExt.sort((a, b) => {
  const aTimestamp = Number.parseInt(a.split("-")[0]);
  const bTimestamp = Number.parseInt(b.split("-")[0]);
  return aTimestamp - bTimestamp;
});

// Parse the migration file names to get the migration name, timestamp and variable name
const migrationInfo = filenamesWithoutExt.map((filenameWithoutExtension) => {
  const migrationName = upperFirst(
    camelCase(filenameWithoutExtension.split("-").slice(1).join("-"))
  );
  const migrationTimeStamp = filenameWithoutExtension.split("-")[0];
  const migrationVarName = generateMigrationName(
    migrationName,
    migrationTimeStamp
  );
  return {
    fileName: filenameWithoutExtension,
    name: migrationName,
    timestamp: migrationTimeStamp,
    varName: migrationVarName,
  };
});

// Create the content of the migration index file, exporting all migrations
const indexFile = `// This file is auto-generated by scripts/generate-migration.ts
// Do not edit this file manually
// To add a new migration, run 'yarn generate-migration <migration name>'
import { RealmMigration } from '../migrations/types';

${migrationInfo
  .map(({ varName, fileName }) => `import { ${varName} } from './${fileName}';`)
  .join("\n")}
export const migrations: RealmMigration[] = [
  ${migrationInfo.map((migration) => migration.varName).join(",\n  ")},
];
`;

// Write the migration index file
fs.writeFileSync(indexFilePath, indexFile);

Since the script is written in TypeScript, we need a way to execute it. You can use any tool, but I am using tsx. Here is the entry in my package.json for running this script:

"generate-migration": "tsx scripts/generate-migration.ts"

By automating this process, we ensure new migrations integrate seamlessly into our development workflow.
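As a rough illustration of what the generator produces, the sketch below derives the file name and exported variable name for a given input. It replaces lodash's `kebabCase`, `camelCase`, and `upperFirst` with simplified helpers that only handle space-separated words, so treat `toKebab`, `toPascal`, and `describeMigration` as hypothetical stand-ins rather than equivalents.

```typescript
// Simplified stand-ins for the lodash helpers; they only split on whitespace.
const words = (input: string): string[] =>
  input.trim().split(/\s+/).map((w) => w.toLowerCase());

const toKebab = (input: string): string => words(input).join("-");

const toPascal = (input: string): string =>
  words(input)
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1))
    .join("");

// Mirrors how the generator combines the migration name and the timestamp.
const describeMigration = (migrationName: string, timestamp: string) => ({
  fileName: `${timestamp}-${toKebab(migrationName)}.ts`,
  varName: `migration${toPascal(migrationName)}${timestamp}`,
});

const described = describeMigration("add email to user", "1629781234567");
console.log(described.fileName); // 1629781234567-add-email-to-user.ts
console.log(described.varName); // migrationAddEmailToUser1629781234567
```

Because the timestamp appears in both the file name and the variable name, two developers who pick the same migration name at different times still end up with distinct files and exports.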

Running the Migrations

Once scripted, migrations need to be executed. Our run-migrations.ts has functionality for that:

import Realm from "realm";
import { MigrationRecord, MigrationSchema } from "../schema";
import { RealmMigration } from "./types";
import { migrations } from "./migrations";

// Takes an array of migrations and returns a Realm migration callback
export const buildMigrationRunner =
  (migrations: RealmMigration[]): Realm.MigrationCallback =>
  (oldRealm, newRealm) => {
    // Check if a migration should run by checking if a migration record exists
    const migrationShouldRun = (migration: RealmMigration) => {
      const migrationRecords = oldRealm.objects(MigrationSchema.name);
      return (
        migrationRecords.filtered("name = $0", migration.name).length === 0
      );
    };

    // Create a migration record for a migration that has run
    const createMigrationRecord = (migration: RealmMigration) => {
      const migrationRecord: MigrationRecord = {
        id: new Realm.BSON.ObjectId(),
        name: migration.name,
        createdAt: new Date(),
      };
      const newRecord = newRealm.create(MigrationSchema.name, migrationRecord);
      console.log(`Created migration record: ${newRecord.name}`);
    };

    // Run a migration and create a migration record
    const runMigration = (migration: RealmMigration) => {
      if (!migrationShouldRun(migration)) {
        return;
      }
      migration.migration(oldRealm, newRealm);
      createMigrationRecord(migration);
    };

    // If the schema version is the same, then no migrations need to be run
    // This is redundant since Realm will not call the migration callback if the
    // schema version is the same, but it's here for clarity
    if (oldRealm.schemaVersion === newRealm.schemaVersion) {
      console.log("Realm schema version is the same, skipping migrations");
      return;
    }

    console.log(
      `Migrating Realm from version ${oldRealm.schemaVersion} to ${newRealm.schemaVersion}`
    );

    // Run all migrations that have not been run yet
    const migrationsToRun = migrations.filter(migrationShouldRun);

    // If there are no migrations to run, we log and finish
    // This should not normally happen, since the schema version only changes
    // when a migration is added, but it's here for clarity
    if (migrationsToRun.length === 0) {
      console.log("No migrations to run");
      return;
    }

    for (const migration of migrationsToRun) {
      console.log(`Running migration: ${migration.name}`);
      runMigration(migration);
    }
  };

// When the app is first installed, migrations are not run because the schema
// has not changed, so no migration records are created. This function creates
// the missing records. Without it, the following situation could happen:
// 1. User installs app and begins using it (migrations aren't present yet since this is a new install)
// 2. We add a new migration
// 3. User upgrades to new version with new migration
// 4. All migrations are run including ones that did not need to run
// 5. User is sad
export const initializeMigrationRecordsIfNecessary = (realm: Realm) => {
  const migrationRecords = realm.objects(MigrationSchema.name);
  if (migrations.length === migrationRecords.length) {
    console.log("Migration records already exist, skipping creation");
    return;
  }
  const migrationRecordNamesSet = new Set(migrationRecords.map((x) => x.name));
  const missingMigrations = migrations.filter(
    (x) => !migrationRecordNamesSet.has(x.name)
  );
  console.log(
    "Migration records missing:",
    missingMigrations.map((x) => x.name)
  );
  realm.write(() => {
    for (const migration of missingMigrations) {
      const migrationRecord: MigrationRecord = {
        id: new Realm.BSON.ObjectId(),
        name: migration.name,
        createdAt: new Date(),
      };
      const newRecord = realm.create(MigrationSchema.name, migrationRecord);
      console.log(`Created migration record: ${newRecord.name}`);
    }
  });
};

With this script, we can trust that our migrations will execute only as necessary, safeguarding against errors during the process.
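To see the tracking logic in isolation, here is a simplified model that swaps Realm for a plain `Set` of applied migration names. `runPending`, `SimpleMigration`, and the sample migrations are hypothetical names for this sketch; the real runner reads and writes `MigrationRecord` objects instead.

```typescript
type SimpleMigration = { name: string; run: () => void };

// Runs every migration whose name is not yet recorded, in array order,
// and records each one after it runs. Returns the names that were executed.
const runPending = (
  allMigrations: SimpleMigration[],
  applied: Set<string>
): string[] => {
  const executed: string[] = [];
  for (const migration of allMigrations) {
    if (applied.has(migration.name)) {
      continue; // already recorded, skip
    }
    migration.run();
    applied.add(migration.name);
    executed.push(migration.name);
  }
  return executed;
};

const runLog: string[] = [];
const sampleMigrations: SimpleMigration[] = [
  { name: "migrationA1", run: () => runLog.push("A") },
  { name: "migrationB2", run: () => runLog.push("B") },
];

// migrationA1 is already recorded, so only migrationB2 runs
const appliedNames = new Set<string>(["migrationA1"]);
const firstPass = runPending(sampleMigrations, appliedNames);
// A second pass finds nothing left to run
const secondPass = runPending(sampleMigrations, appliedNames);
console.log(firstPass, secondPass);
```

The second call is a no-op, which is the property that makes it safe to invoke the runner on every schema version change.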

Putting It All Together

Bringing the migrations into our Realm setup culminates in the following configuration:

import Realm from "realm";
import { migrations } from "./migrations/migrations";
import {
  buildMigrationRunner,
  initializeMigrationRecordsIfNecessary,
} from "./migrations/run-migrations";
import { schema } from "./schema";

export const createRealm = () => {
  try {
    const realm = new Realm({
      // Array of schema records to use in the Realm
      schema: schema,
      // Using in memory realm for demonstration purposes
      inMemory: true,
      // Utilize the migration runner with the migrations that are generated
      // by the migration generator
      onMigration: buildMigrationRunner(migrations),
      // Set the schema version to the length of the migrations array
      schemaVersion: migrations.length,
    });
    initializeMigrationRecordsIfNecessary(realm);
    return realm;
  } catch (error) {
    console.error(error);
    throw error;
  }
};

With this setup, we’ve integrated automated migration creation and execution into our project. This approach has mitigated issues with migrations, and reduced the overhead for managing database changes in Realm.

Areas For Improvement

Our custom migration pattern has proven highly valuable for our team, yet, like all evolving systems, it has notable areas that would benefit from further development.

Testing and Validation

One thing we have yet to do is include automated testing and validation for migrations. While our system automates many aspects of the migration process, we’re still exploring ways to integrate tests that verify the behavior of each migration. The modular nature of our migrations brings us closer to this aim, since each migration can be exercised as an isolated unit within an automated test.
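One lightweight validation that fits this pattern is a check over the generated `migrations` array. The sketch below is an assumption about what such a check could look like, not something our project ships: `findMigrationProblems` is a hypothetical helper that relies on the generator's convention of ending each name with its timestamp.

```typescript
type NamedMigration = { name: string };

// Returns human-readable problems; an empty array means the list looks sane.
const findMigrationProblems = (allMigrations: NamedMigration[]): string[] => {
  const problems: string[] = [];
  const seen = new Set<string>();
  let previousTimestamp = -Infinity;

  for (const { name } of allMigrations) {
    if (seen.has(name)) {
      problems.push(`Duplicate migration name: ${name}`);
    }
    seen.add(name);

    // Generated names end with the Date.now() timestamp digits
    const match = name.match(/(\d+)$/);
    if (!match) {
      problems.push(`Missing timestamp suffix: ${name}`);
      continue;
    }
    const timestamp = Number.parseInt(match[1], 10);
    if (timestamp < previousTimestamp) {
      problems.push(`Out of order: ${name}`);
    }
    previousTimestamp = timestamp;
  }
  return problems;
};
```

A check like this could run in CI against the exported array. One known limitation of this sketch: a migration whose descriptive name itself ends in digits would have those digits read as part of the timestamp.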

Rollback Capabilities

Another avenue of improvement is rollback capabilities. Currently, our pattern doesn’t natively support rolling back executed migrations if unexpected issues arise post-deployment. Implementing a robust and safe rollback feature would allow for seamless reversion to previous states and increase the fault tolerance of our database management.

Realm Migrations with Rails-like Patterns

Our Rails-inspired migration pattern has greatly improved how we handle schema changes in Realm, making the process more reliable and maintainable. Are you considering adding this pattern to your project? Do you have your own pattern for migrations in Realm? I’d love to hear in the comments!
