Every Rails project I’ve worked on has used Active Record. ORMs like Active Record have many benefits. They abstract you away from the database. They make querying data elegant and simple (in most cases).
However, sometimes Active Record is far less efficient than letting Postgres do the work itself.
Let’s look at an example. Say our Postgres database has the following schema. Our top-level table is a building. The building has many apartments. An apartment has many rooms. Each room has multiple pieces of furniture, and each piece of furniture has several materials. Your models might look something like this:
```ruby
class Building < ActiveRecord::Base
  has_many :apartments, dependent: :destroy
end

class Apartment < ActiveRecord::Base
  has_many :rooms, dependent: :destroy
end

class Room < ActiveRecord::Base
  has_many :furnitures, dependent: :destroy
end

class Furniture < ActiveRecord::Base
  has_many :materials, dependent: :destroy
end

class Material < ActiveRecord::Base
end
```
Now, let's say we want to remove a building. This building has 100 apartments, each apartment has 10 rooms, each room has 15 pieces of furniture, and each piece of furniture has five materials.
Using Active Record
Removing the building with Active Record is simple: you just need to call `building.destroy`. Because we set up all of those dependent destroys, Active Record will take care of removing all of the related records.
Active Record is also nice enough to run the destroy callbacks on any object that defines them. It does this by loading each object into memory and calling `destroy` on it. This transaction will instantiate 91,101 objects (1 building + 100 apartments + 1,000 rooms + 15,000 pieces of furniture + 75,000 materials). It takes my development machine about 77 seconds to run this scenario, so it's not very efficient. Let's try this a different way.
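To make the cost concrete, here is a rough sketch of what the cascade of `dependent: :destroy` calls does at this scale. This is plain Ruby with no database, and the recursion is a stand-in for Rails internals, not the real implementation; it just mirrors how every child row becomes an in-memory object before it can be destroyed:

```ruby
# Children per record at each level: a building has 100 apartments,
# an apartment has 10 rooms, a room has 15 pieces of furniture,
# a piece of furniture has 5 materials.
CHILD_COUNTS = [100, 10, 15, 5].freeze

# Simulate dependent: :destroy — each record is "loaded" (counted),
# then its children are destroyed one by one, recursively.
def destroy_record(depth, counter)
  counter[:objects] += 1
  return if depth >= CHILD_COUNTS.length

  CHILD_COUNTS[depth].times { destroy_record(depth + 1, counter) }
end

counter = { objects: 0 }
destroy_record(0, counter) # destroy the building
puts counter[:objects]     # => 91101
```

Every one of those 91,101 instantiations pays the cost of a query, object allocation, and callback dispatch, which is where the 77 seconds go.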
Databases are built to handle operations on large data sets. So, why not leverage the database for its purpose? Rails 4+ supports foreign keys. Let's add a migration to add foreign keys to our tables. It might look like this:
```ruby
class CreateForeignKeys < ActiveRecord::Migration
  def change
    add_foreign_key :apartments, :buildings, on_delete: :cascade
    add_foreign_key :rooms, :apartments, on_delete: :cascade
    add_foreign_key :furnitures, :rooms, on_delete: :cascade
    add_foreign_key :materials, :furnitures, on_delete: :cascade
  end
end
```
Now we can drop the `dependent: :destroy` options from our models. When we call `building.destroy`, instead of loading all of those objects into memory, Postgres will cascade the delete through the related records itself. It takes my machine 45 seconds to run the same operation using cascade deletes.
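With the foreign keys in place, the models can be slimmed down. Here is a sketch of what they might look like once the `dependent: :destroy` options are removed and the database owns the cleanup:

```ruby
class Building < ActiveRecord::Base
  has_many :apartments # Postgres cascades the delete; no dependent: :destroy
end

class Apartment < ActiveRecord::Base
  has_many :rooms
end

class Room < ActiveRecord::Base
  has_many :furnitures
end

class Furniture < ActiveRecord::Base
  has_many :materials
end

class Material < ActiveRecord::Base
end
```

The trade-off is that destroy callbacks on the child models will no longer fire: Postgres deletes those rows without ever loading them, so this approach only works when nothing in your application depends on those callbacks running.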
Active Record is a powerful tool. Because of this, we may shy away from letting the database handle work for us. But there are occasions where the database is the correct answer.
Some will argue that putting optimizations in the database makes the system opaque: new developers rolling onto a project may not understand why all apartments disappear when a building is removed. I would argue that one of the first things developers should do when joining a project is become familiar with its schema. If they do, they will understand why and how the deletion happens.
Let your database do what it was built to do!