Detecting Errors in Entity Framework Code First Migrations

Our team uses Entity Framework’s code first migrations to manage updates to our database schema. Having not worked much with EF before, we didn’t understand the mechanics of managing these migrations well. That, along with a lack of care in reviewing and validating new migrations, led to an embarrassing number of broken migrations reaching our main development branch.

Since then, I’ve learned more about EF code first migrations. However, we still make mistakes that are difficult to catch with code review alone. In this post, I’ll explain the different error cases we’ve encountered. I’ll also talk about how we’ve shifted some of the responsibility for detecting these onto our test pipelines, which increases our confidence in the migrations we let reach our main branch.

Generating and Applying Migrations

I have since read through more EF guides and experimented more with the tool. Here is my current understanding of generating and applying migrations:

  1. Before creating a new migration, changes are made to the models and model configuration classes.
  2. Running dotnet ef migrations add MyNewMigration adds a new Migration class and updates the DbContextModelSnapshot with the new changes to the model/model configuration.
    • The Up and Down methods of the generated Migration class define the schema changes and how to revert them. Running dotnet ef database update applies them to the development DB.
    • The model snapshot comes into play during the next migration. EF references the DB model snapshot to determine what schema changes can achieve the new updates to the models/model configurations.
    • If that next migration needs to be removed while a developer is iterating on the schema changes, the previous migration’s BuildTargetModel function is used to restore the model snapshot.
  3. Finally, when the time comes to put out a new release, the recommended way to apply changes to the production database is by generating and running SQL scripts.
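
The steps above map onto a handful of CLI calls. Here is a minimal sketch that wraps them in shell functions. The function names and arguments are placeholders of mine rather than anything EF provides, and project-specific arguments are left out.

```shell
#!/usr/bin/env bash
# Sketch of the day-to-day migration workflow.
# Function names are illustrative only; they are not part of the EF tooling.

# Step 2: scaffold a migration and update the model snapshot.
add_migration() {
    local name="$1"
    dotnet ef migrations add "$name"
}

# Apply pending migrations (their Up methods) to the development database.
apply_to_dev_db() {
    dotnet ef database update
}

# Undo the most recent migration while iterating: roll the development database
# back to the previous migration, then delete the latest migration's files.
undo_last_migration() {
    local previous="$1"   # name of the migration to roll back to
    dotnet ef database update "$previous" &&
        dotnet ef migrations remove
}

# Step 3: generate a SQL script to run against the production database.
script_for_release() {
    dotnet ef migrations script --idempotent --output migrations.sql
}
```

For example, add_migration AddWidgets followed by apply_to_dev_db covers the usual loop, and undo_last_migration takes the name of the migration you want to return to.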

This EF doc about Code First Migrations in Team Environments is a great resource for learning more about generating and applying migrations. It also provides concrete examples of some issues I’ll identify next.

Error Cases Encountered

With that context, let’s look at a few error cases we’ve encountered on our project. I’ll also review the EF CLI commands we run in our pipeline to detect these issues proactively. Note that the following examples omit project-specific configuration and arguments, such as the connection string or EF project name.

The migration is not regenerated with model/model configuration changes.

You’ll often need to change the model configuration based on feedback during code review or while iterating. If you then forget to regenerate the migration, the migration and model snapshot won’t contain those latest changes to the model.

If the snapshot and model are drastically out of sync, perhaps missing an entire table or column, your application code will fail while your test suite is running. However, something more subtle, like a new unique constraint or index, might go undetected and slip through to the main development branch. In that case, the migration SQL scripts generated to apply changes to the production DB would not contain all the desired changes.

Either way, you can detect this issue by generating a temporary test migration and inspecting its contents. If the current model and model configuration are in sync with the DbContextModelSnapshot, the generated test migration should be empty. To perform this validation, we checked in a target migration file we know to contain empty Up and Down methods. We then generate a new migration and compare the files.


# generate a new migration
dotnet ef migrations add EmptyMigration

# search the current directory for the generated file (ignoring the timestamp on the file name)
GENERATED="$(find . -type f -iname "*EmptyMigration.cs")"
if [[ $GENERATED == "" ]]; then
    echo "Couldn't find the generated empty migration file"
    exit 1
fi

# diff the files - ignoring any whitespace diff
EMPTY=true
diff --ignore-blank-lines --ignore-all-space ./TargetMigration.cs "$GENERATED" || EMPTY=false

# report the unexpected migration contents and fail the pipeline
if [[ $EMPTY != "true" ]]; then
    echo "Did not get an empty migration! Verify that the current model snapshot is up to date."
    echo "Generated Migration Contents:"
    cat "$GENERATED"
    exit 1
fi
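
If you would rather not keep a checked-in target file, a rougher alternative is to look for any migrationBuilder calls in the generated file. This is only a sketch: the function name migration_is_empty is my own, and it assumes the generated migration always goes through the migrationBuilder parameter, which is how EF scaffolds the Up and Down methods.

```shell
#!/usr/bin/env bash
# Heuristic: a scaffolded migration with empty Up and Down methods contains no
# "migrationBuilder." calls. The method signatures mention the parameter name,
# but never followed by a dot. migration_is_empty is a hypothetical helper name.
#
# Returns 0 if the file has no migrationBuilder calls, 1 if it has some,
# and 2 if the file does not exist.
migration_is_empty() {
    local file="$1"
    [[ -f "$file" ]] || return 2
    ! grep -q 'migrationBuilder\.' "$file"
}
```

In the script above, this would replace the diff step with something like migration_is_empty "$GENERATED" || EMPTY=false. On EF Core 8 or later, dotnet ef migrations has-pending-model-changes should cover this check natively; I mention it as a pointer rather than something this pipeline uses.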

The latest migration’s BuildTargetModel is out of sync with the current model snapshot.

If two developers work on unrelated schema changes in parallel, each developer’s migrations will not include the other developer’s changes. For the developer whose code changes reach the main development branch first, this is no problem. The model snapshot and the migration they added will contain identical schema configurations.

Then, when the second developer’s code is ready to go in, their changes to the model snapshot can be applied during the merge. In that case, the model snapshot will be consistent with the model/model configurations. However, if the second developer fails to regenerate their migration after the first developer’s migration goes in, the second developer’s BuildTargetModel snapshot won’t contain the schema changes added by the first developer’s migration.

In this scenario, the Up and Down methods of each migration are valid, so the problem only surfaces when a third developer adds a migration after the second developer’s. If that third developer gets it right on the first try, no problem. Otherwise, they will run into trouble when they revert to an earlier migration and remove their own incorrect migration. Reverting will go fine, but removing the migration restores the model snapshot from the previous migration’s stale BuildTargetModel, which leaves the snapshot without the first developer’s changes.
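
One way for the second developer to avoid this is to remove their migration, merge, and then re-add it so that BuildTargetModel is regenerated on top of the merged snapshot. The following is a sketch under several assumptions: the function name and its arguments are hypothetical, the main branch is called main unless told otherwise, and the migration being regenerated is the latest one on the current branch.

```shell
#!/usr/bin/env bash
# Regenerate this branch's migration on top of the main branch's migrations.
# regenerate_after_merge is a hypothetical helper, not an EF or git command.
regenerate_after_merge() {
    local my_migration="$1"        # name of the migration added on this branch
    local previous="$2"            # last migration shared with the main branch
    local main_branch="${3:-main}" # assumed default branch name

    # Roll the development DB back so the migration can be removed cleanly.
    dotnet ef database update "$previous" &&
    # Delete the migration files and restore the snapshot (model changes stay).
    dotnet ef migrations remove &&
    # Commit the removal so the merge starts from a clean working tree.
    git commit --all --message "Remove $my_migration before merging $main_branch" &&
    # Bring in the other developer's migration and snapshot changes.
    git merge "$main_branch" &&
    # Scaffold the migration again against the merged snapshot.
    dotnet ef migrations add "$my_migration"
}
```

The chain stops at the first failing step, so a merge conflict leaves you with the other developer’s changes to resolve before re-adding the migration by hand.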

To detect this kind of inconsistency between the model snapshot and the last migration’s BuildTargetModel, extend the above script with the following:


# check model snapshot content before removing test migration
CONTENTS_BEFORE=$(cat MyProject/DbContextModelSnapshot.cs)

# remove the test migration
dotnet ef migrations remove

CONTENTS_AFTER=$(cat MyProject/DbContextModelSnapshot.cs)

SNAPSHOTS_MATCH=true
diff --ignore-blank-lines --ignore-all-space <(echo "$CONTENTS_BEFORE" ) <(echo "$CONTENTS_AFTER") || SNAPSHOTS_MATCH=false
if [[ $SNAPSHOTS_MATCH != "true" ]]; then
    echo "Latest migration target model != current model snapshot!"
    exit 1
fi

There’s invalid syntax in the generated migration SQL scripts.

The final error we’ve run into is invalid syntax found in generated migration SQL scripts. Running dotnet ef migrations script generates SQL scripts used to deploy database changes to the production database. By default, each migration is wrapped in a transaction. So, if there is a syntax error (from custom SQL added) in the migration script, the problematic migration won’t be applied. But subsequent migrations will (assuming they don’t conflict with the skipped migration).

For our team, this issue manifested when a trailing semi-colon was omitted from custom SQL added to one of our migrations. When the --idempotent option is used, the SQL to apply a migration runs only if the migration in question is not found in the __EFMigrationsHistory table, so all schema changes (including custom SQL added to migrations) are wrapped in if blocks. Inside such a block, the missing semi-colon became a syntax error and the migration was skipped. Our application code then began to fail because a field expected by the model was missing from the production DB.

To check for this issue, we updated our validate migrations script to run the SQL scripts generated from the migrations.


# generate sql scripts from migrations
dotnet ef migrations script --idempotent --output migrations.sql

NO_SNEAKY_SYNTAX_ERRORS=true
psql -v ON_ERROR_STOP=1 -f migrations.sql || NO_SNEAKY_SYNTAX_ERRORS=false

if [[ $NO_SNEAKY_SYNTAX_ERRORS != "true" ]]; then
    echo "Got an error running generated migrations.sql - double check any custom SQL added in your migration (remember your trailing semi-colons!!)"
    exit 1
fi
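
The script needs a database to run against. Here is a minimal sketch that runs it on a throwaway PostgreSQL database, assuming the createdb and dropdb client utilities are available and connection settings come from the environment. The function name and the default database name are my own placeholders.

```shell
#!/usr/bin/env bash
# Run a generated SQL script against a scratch database, then drop the database.
# validate_script_on_scratch_db and the "migration_check" default are hypothetical.
validate_script_on_scratch_db() {
    local script="$1"
    local db="${2:-migration_check}"
    local status=0

    # Start from an empty database so every migration in the script is applied.
    createdb "$db" || return 1

    # ON_ERROR_STOP makes psql stop and exit non-zero at the first failing statement.
    psql -v ON_ERROR_STOP=1 -d "$db" -f "$script" || status=$?

    # Clean up whether or not the script succeeded, then report psql's result.
    dropdb "$db"
    return "$status"
}
```

Calling validate_script_on_scratch_db migrations.sql in place of the bare psql line keeps the rest of the check unchanged.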

Benefits of EF code first migrations

Improving our understanding of EF code first migrations has allowed us to set up some guardrails in our test pipeline. That lets us detect and prevent the kind of errors I’ve outlined. Again, I highly recommend EF’s Code First Migrations in Team Environments article for further reading and guidance on resolving these errors.
