Article summary
Every Rails project I’ve worked on has used Active Record. ORMs like Active Record have many benefits. They abstract you away from the database. They make querying data elegant and simple (in most cases).
However, sometimes Active Record is not more efficient than Postgres.
Let’s look at an example. Say our Postgres database has the following schema. Our top-level table is a building. The building has many apartments. An apartment has many rooms. Each room has multiple pieces of furniture, and each piece of furniture has several materials. Your models might look something like this:
class Building < ActiveRecord::Base
has_many :apartments, dependent: :destroy
end
class Apartment < ActiveRecord::Base
has_many :rooms, dependent: :destroy
end
class Room < ActiveRecord::Base
has_many :furnitures, dependent: :destroy
end
class Furniture < ActiveRecord::Base
has_many :materials, dependent: :destroy
end
class Materials < ActiveRecord::Base
end
Now, let's say we want to remove a building. This building has 100 apartments, each apartment has 10 rooms, each room has 15 pieces of furniture, and each piece of furniture has five materials.
Using Active Record
Removing the building with Active Record is simple. You just need to call building.destroy
. Because we set up all of those dependent destroys, Active Record will take care of removing all of the related records.
Active Record is so nice that it will also make sure to call the delete hooks on any object that has them. It does this by loading each object into memory and calling object.destroy
on it. This transaction will create 91,001 objects. It takes my development machine about 77 seconds to run this scenario. So, it's not very efficient. Let's try this a different way.
Using Postgres
Databases are built to handle operations on large data sets. So, why not leverage the database for its purpose? Rails 4+ supports foreign keys. Let's add a migration to add foreign keys to our tables. It might look like this:
class CreateForeignKeys < ActiveRecord::Migration
def change
add_foreign_key :apartments, :buildings, on_delete: :cascade
add_foreign_key :rooms, :apartments, on_delete: :cascade
add_foreign_key :furnitures, :rooms, on_delete: :cascade
add_foreign_key :materials, :furnitures, on_delete: :cascade
end
end
Now, when we call building.destroy
, instead of creating all of those objects in memory, Postgres will handle destroying all of the related records. It takes my machine 45 seconds to run the same operation using cascade deletes.
Conclusion
Active Record is a powerful tool. Because of this, we may shy away from letting the database handle work for us. But there are occasions where the database is the correct answer.
Some will argue that putting optimizations in the database causes opaqueness, and new developers rolling onto a project may not understand why all apartments disappear when a building is removed. I would argue that one of the first things developers should do when joining a project is to become familiar with the project schema. If they do, they will understand why and how the deletion happens.
Let your database do what it was built to do!
Do you have to remove “dependent: :destroy” in models after add the foreign keys?
No, you don’t have to remove them.
I have the same question as Fon. Do you delete the dependent: :destroy statements because this new approach of cascading simply over-rides that?
Thanks for a nice article!
A few notes:
1. “dependent: :destroy” explicitly invokes “destroy” on each associated object which invokes before/after _destroy callbacks. Sometimes you really need them.
2. Foreign key with “on_delete: :cascade” is very similar to “dependent: :delete_all”, the difference is the place where you actually perform this “DELETE”: on the database or on the application level.
3. “dependent: :delete_all” shows your “DELETE …” query in your application logs. Foreign key does it directly in the database, so you can’t see it in your logs.
4. Of course, “on_delete: :cascade” is much faster because it’s not written in Ruby.
Thank you for reading the post and for taking the time to leave a comment.
You are absolutely correct with your first note. This approach would not work if the model relied on any deletion callbacks.
One important thing to note about “dependent: :delete_all” is that it will only remove the next association. So, in this example, using “dependent: :delete_all” instead of foreign keys would leave orphaned records below the apartment model.
Very nice article, congrats!
Nice one. Thanks, Ryan.