The Hazards of Transactional Test Isolation

These days, it’s standard practice to run your unit and system tests with some degree of parallelism. When doing so, it’s important to ensure that the code being tested doesn’t interfere with other running tests. Usually, this comes down to the database layer.

There are a few options for accomplishing isolation at the database layer. The most tempting approach is to use transactional isolation. Just wrap each test so that its code runs inside a database transaction, and abort the transaction at the end of the test.
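As a minimal sketch of the idea, here is a test wrapper that always rolls back. It uses Python’s built-in sqlite3 module and an in-memory database purely for illustration; the `isolated_test` helper and the `users` table are hypothetical names, and a real suite would hook this into its test framework’s setup and teardown.

```python
import sqlite3
from contextlib import contextmanager

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN and ROLLBACK below are the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (name TEXT)")


@contextmanager
def isolated_test(connection):
    """Run a test body inside a transaction that is always rolled back."""
    connection.execute("BEGIN")
    try:
        yield connection
    finally:
        connection.execute("ROLLBACK")


def count_users(connection):
    return connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]


with isolated_test(conn) as db:
    db.execute("INSERT INTO users VALUES ('alice')")
    rows_during_test = count_users(db)  # the test sees its own write

rows_after_test = count_users(conn)  # the write is gone after rollback
```

The test body sees its own insert, and the table is empty again once the wrapper exits, with no separate cleanup step.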

This was my preferred approach for a while, for a few reasons:

  • There’s no need for multiple copies of your database.
  • There’s no need for tooling to create, migrate, or otherwise wrangle all those databases.
  • When it comes time to clean up after each test, aborting the transaction is easy, and it is faster than other approaches.

The last point is particularly important. When you have hundreds of tests interacting with the database, you also have to clean up the database hundreds of times. Every little bit of time spent adds up quickly.

Alas, while this approach can get you far, there are caveats.

1. In-Flight Transactions Can Interact

Even before you attempt to commit (or abort) a transaction, certain statements acquire row- or table-level locks, and those locks are held until the transaction ends. A test running in parallel that touches the same rows or tables may then block, time out, or deadlock. This isn’t as insidious as it sounds; if you create this problem, you’ll generally run into it quickly and be able to work around it. Nevertheless, I mention it as an illustration of how transactions are an imperfect approach to test isolation.
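The following sketch shows two uncommitted transactions getting in each other’s way. It assumes SQLite in its default journal mode, which locks the whole database file instead of individual rows or tables, so it exaggerates the effect compared to a server database; the file name and the `counters` table are made up for the example.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.db")

# timeout=0 makes a blocked connection fail immediately instead of waiting.
test_a = sqlite3.connect(path, isolation_level=None, timeout=0)
test_b = sqlite3.connect(path, isolation_level=None, timeout=0)
test_a.execute("CREATE TABLE counters (n INTEGER)")

# "Test A" writes inside its transaction and keeps the write lock
# for as long as the transaction stays open.
test_a.execute("BEGIN")
test_a.execute("INSERT INTO counters VALUES (1)")

# "Test B" tries to write while A is still in flight.
test_b.execute("BEGIN")
try:
    test_b.execute("INSERT INTO counters VALUES (2)")
    blocked = False
except sqlite3.OperationalError:
    # SQLite reports the conflict as a "database is locked" error.
    blocked = True

# Neither test has committed anything, yet B was affected by A.
if test_b.in_transaction:
    test_b.execute("ROLLBACK")
test_a.execute("ROLLBACK")
test_a.close()
test_b.close()
```

With a server database the second test would more likely wait on the lock than fail outright, but the point is the same: neither test committed, and they still interacted.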

2. Your Application Code Will Need to Learn about Savepoints

In standard SQL, it’s not possible to nest transactions. What would it mean to “commit” a transaction that’s still pending from within another transaction, anyway? Your application would have the false belief that its data has been persisted.

Strangely enough, that’s precisely the behavior you want in a test. In the real world, however, this behavior would be nonsense. Most databases instead provide savepoints. At any point in a transaction, you can declare a savepoint, and, later on, roll back to that point. Effectively, any intermediate statements are discarded.
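To make the mechanics concrete, here is a small sketch of a savepoint in action, again using Python’s sqlite3 module with an in-memory database. The table and savepoint names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE events (label TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO events VALUES ('kept')")

# Mark a point in the transaction we can return to later.
conn.execute("SAVEPOINT before_risky_step")
conn.execute("INSERT INTO events VALUES ('discarded')")

# Undo everything since the savepoint; the earlier insert survives.
conn.execute("ROLLBACK TO SAVEPOINT before_risky_step")

conn.execute("COMMIT")
labels = [row[0] for row in conn.execute("SELECT label FROM events")]
```

Only the row inserted before the savepoint remains after the commit.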

Transactions are necessary to ensure data integrity in updates involving more than one row, and they represent a kind of semantic negotiation between your running code and the database, so I’m willing to bet your application makes use of them. With transactional test isolation, this means that your application code must detect when it is being tested (that is, when it is already running inside the test’s transaction) and use savepoints instead.
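One way this could look, as a rough sketch: a hypothetical `transaction` helper that opens a real transaction when none is active and falls back to a savepoint when one already is. It relies on the `in_transaction` attribute of Python’s sqlite3 connections; the `place_order` function, the `orders` table, and the `app_tx` savepoint name are assumptions made for the example.

```python
import sqlite3
from contextlib import contextmanager

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (item TEXT)")


@contextmanager
def transaction(connection):
    """Use BEGIN/COMMIT normally, or a savepoint if already in a transaction."""
    nested = connection.in_transaction
    connection.execute("SAVEPOINT app_tx" if nested else "BEGIN")
    try:
        yield
    except BaseException:
        if nested:
            # Undo the nested work, then drop the savepoint itself.
            connection.execute("ROLLBACK TO SAVEPOINT app_tx")
            connection.execute("RELEASE SAVEPOINT app_tx")
        else:
            connection.execute("ROLLBACK")
        raise
    else:
        connection.execute("RELEASE SAVEPOINT app_tx" if nested else "COMMIT")


def place_order(item):
    # Application code: it believes it owns a transaction either way.
    with transaction(conn):
        conn.execute("INSERT INTO orders VALUES (?)", (item,))


def count_orders():
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


# Outside a test: a real transaction that commits.
place_order("book")
committed = count_orders()

# Inside a test: the harness owns the outer transaction, so the
# application's "transaction" silently becomes a savepoint.
conn.execute("BEGIN")
place_order("pen")
seen_by_test = count_orders()
conn.execute("ROLLBACK")
after_rollback = count_orders()
```

Note that in the second call `place_order` never issues a COMMIT at all, which is exactly the behavior change the test harness imposes on the code under test.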

Depending on which tools you are using, this may be trivial. For example, Knex.js will transparently (or perhaps opaquely?) use savepoints whenever you begin a “nested” transaction. Regardless, this brings us to the next hazard:

3. Savepoints Are Not Transactions

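In particular, there is no way to “commit” a savepoint. Releasing a savepoint only folds its changes into the enclosing transaction; nothing is persisted, and no commit-time checks run, until that outer transaction commits.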

One effect of this is that, when using transactional test isolation, your application will never be asked to resolve a serialization error raised at commit time. This leaves some important codepaths untested. How will your application behave when you have multiple serializable transactions in-flight at the same time? Well, you may not find out until production.

Additionally, your tests will be completely oblivious to violations of any constraints that have been declared deferrable and are in deferred mode. Deferred constraints are only checked when you attempt to commit a transaction. If you achieve test isolation using transactions, then at no point will your code ever attempt to commit the transaction. You’ll always roll back and skip those deferred constraints.
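The sketch below illustrates the gap using SQLite, which supports deferred foreign keys once `PRAGMA foreign_keys` is enabled. The `authors` and `books` tables and the `insert_orphan_book` helper are hypothetical; the same experiment on another database would use its own syntax for deferrable constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE books ("
    " id INTEGER PRIMARY KEY,"
    " author_id INTEGER REFERENCES authors(id)"
    " DEFERRABLE INITIALLY DEFERRED)"
)


def insert_orphan_book(finish):
    """Insert a book with no matching author, then end the transaction."""
    conn.execute("BEGIN")
    # Accepted for now: the foreign key check is deferred until commit.
    conn.execute("INSERT INTO books (author_id) VALUES (999)")
    try:
        conn.execute(finish)
        return "no error"
    except sqlite3.IntegrityError:
        # A failed COMMIT leaves the transaction open, so clean up.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        return "constraint violation"


with_rollback = insert_orphan_book("ROLLBACK")  # what the test harness does
with_commit = insert_orphan_book("COMMIT")      # what production does
```

The same invalid insert passes silently when the transaction is rolled back and fails only when it is committed, so a rollback-based test never observes the violation.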

Conclusion

Using transactions to isolate your tests is indeed faster and possibly easier than other options. Despite this, there are some notable tradeoffs.

If your application isn’t impacted by these caveats, then using transactional isolation may be the right choice for you.

On the other hand, on my current project, we’ve experienced all of the above issues and have since migrated to another approach. Despite the test suite being a few seconds slower, we have more confidence in our tests and have discovered a few bugs that weren’t exposed when we were using transactional isolation.