There’s been a lot of buzz around the Rails world about the new ActiveRecord query interface, since the 3.0 release. It’s a fairly awesome tool, and I’ve been excited about how it lets you compose pieces of queries, which can help make code more DRY. However, it isn’t always clear how to get ActiveRecord and Arel to generate the correct SQL for the more difficult cases.
Here’s a case I ran into lately and how I solved it.
The Initial Solution
Suppose that your software product is sold to companies, and each company’s administrator can manage employees who work for any of his companies. To get all the employees a given user can access, you might:
Employee.joins(:companies). where("companies.id IN (?)", user.company_ids)
This will, unfortunately, give you duplicate employees when an employee works for more than one matching company, so (following the example in the Rails documentation) you might write:
Employee.joins(:companies). where("companies.id IN (?)", user.company_ids). select("DISTINCT(employees.id)")
This is closer, but the Employee objects you get back will not be very useful — they only have one column! Try this instead:
Employee.joins(:companies). where("companies.id IN (?)", user.company_ids). select("DISTINCT(employees.id), employees.*")
Success! Now go and make a method for that so you can write something like
EmployeeFinder.accessible_by(user). You can then use this encapsulated query all over your application to get that data, and if your business rules change, you’ll only have one place to make the update.
class EmployeeFinder def self.accessible_by(user) Employee.joins(:companies). where("companies.id IN (?)", user.company_ids). select("DISTINCT(employees.id), employees.*") end end
However, there is still one case where this will fall down: it’s not a composable query. Suppose your employees can upload their TPS reports, and that these files are stored in the
uploaded_files table. If you wanted to get all the files uploaded by a given user’s employees, you might try to write something like this:
It looks like it should work, but it’ll blow up with the error: @“Mysql2::Error:
Operand should contain 1 column(s)”@ What this cryptic error is really saying is
that ActiveRecord is trying to generate a subselect query that looks something like:
SELECT * FROM uploaded_files WHERE uploaded_by_id IN ( SELECT DISTINCT(employees.id), employees.* FROM employees JOIN companies_employees ON employees.id=companies_employees.employee_id JOIN companies ON companies_employees.company_id=companies.id WHERE companies.id IN (1, 2, 3) );
The problem is that the
"WHERE uploaded_by_id IN (...)" clause expects the
subquery to return one column only (the list of ids.) I’ve tried the below
approach, which seems like the logical way to overwrite it:
UploadedFile.where(uploaded_by: EmployeeFinder.accessible_by(user). select(:id))
This doesn’t work because ActiveRecord is just going to aggregate the select and add it on – not replace it. I’ve discovered that we can rewrite the select clause portion of the statement using Arel’s API:
employees = EmployeeFinder.accessible_by(user) employees.projections = [Arel::Nodes::SqlLiteral.new("employees.id")] UploadedFile.where(uploaded_by: employees)
This will help Arel produce a valid query that returns the information you’re interested in and without duplicating logic anywhere: your authorization logic is tightly wrapped up only in the EmployeeFinder.
It’s almost certainly possible to use PredicateBuilder or update the where clauses directly (see some of the code here), but I feel like if you’re going to take this approach the code above is minimally invasive.
The advantage of the above approach is this: You can find the records you are interested in with one single query to the database without error-prone duplicated logic. But there are other ways to do this:
- You could duplicate the employee finder logic and write it up as an explicit join.
- You could just run two queries.
Depending on your database and your measured performance characteristics (take into account this experience and the optimizations available to your DBMS), one of those might make more sense for you. Regardless, it’s one more tool for you to have in your toolbox.
Has anyone else out there run into this problem? I’m curious if people have solved this differently than I have.
Is there any particular reason why you check for distinct employee.id and not just leave ‘distinct employees.*’? This is what AR’s #uniq does.
Also, a useful method to know about is #except() http://api.rubyonrails.org/classes/ActiveRecord/SpawnMethods.html#method-i-except
Thanks for the feedback!
There’s no particular reason why I wrote it the way I did. I just copied from the documentation.
Thanks for the pointer to except. I didn’t realize that worked for select clauses as well (though it certainly makes sense now that you point it out.) That is certainly a more idiomatic solution than what I had above. I’m glad there’s a better way!
Try this instead:
where("companies.id IN (?)", user.company_ids).
I don’t think this is valid SQL for Postgres and it’s a distortion of the meaning of DISTINCT so even though it works on MySQL, it doesn’t make sense. The reason is that once DISTINCT eliminates duplicate rows based on the specified column, there’s no way to resolve the other columns.
Excellent point, Drew. I get sloppy sometimes since MySQL will allow things that it probably shouldn’t. As I recall that exact problem occurs with GROUP BY clauses as well (or at least, it did at one point.)
That will definitely work, but the disadvantage is that it’s going to make an extra round-trip to the database. Calling collect / map on the relation will cause it to be executed. For many applications and queries, that’s ok, but it’s always good to have other options available for when it isn’t.
Couldn’t you just query the employees via companies directly from the user? By using a has_many :through for example. And distinct them on employees.* for unique results. Seems like solving the wrong problem right now.
Why not just:
where(“companies.id IN (?)”, user.company_ids) ?
Because it doesn’t work ;)
Unless it’s fixed in a different Rails version, when I run that in my app, this code:
Employee.includes(:companies).where("companies.id IN (?)", [1, 2, 3]).to_sqlwill return
SELECT `employees`.* FROM `employees` WHERE (companies.id IN (1,2,3)).
While your code works, it’s a little to much sql for me and you are not taking advantage of half of what AR/ARel has to offer. I rewrote the code and now it also should work across ORMs
.merge(Company.where(:id => user.company_ids)) \
Meh – that doesn’t look good
Here’s a gist
I’m glad I posted this! I’m learning a lot from the comments. The merge method looks very useful. How did you learn about it? It doesn’t look very well documented (on either http://api.rubyonrails.org/classes/ActiveRecord/SpawnMethods.html#method-i-merge or on http://guides.rubyonrails.org/active_record_querying.html ). Do you have any recommendations for resources for learning these patterns?
The merge method is extremely useful. I use it all the time to merge scopes and another trick is to grab the arel_table off the model, it allows you to do additional complex queries such as unions, conditional joins, etc and then you can apply those to other composed queries with merge.
There is documentation in the code but the rails api docs must have not been generated with it yet. https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/spawn_methods.rb
As far as places to learn the patterns – you just have to dig through lots of partial blog posts, etc. Ryan Bates covered merge in a rails cast I believe but he still used some sql that didn’t leverage arel I think (http://railscasts.com/episodes/215-advanced-queries-in-rails-3)
Here is another good place to see a bit about what Arel has to offer –> http://www.slideshare.net/flah00/activerecord-arel
Also – If you have anyone from your team heading to Ohio in January to attend a conference called CodeMash, I’m giving an advanced ActiveRecord/ARel talk there. I’ll publish slides and push out some blog posts after the conference and I’ll link those back to you also.
[…] Making Queries More Composable with ActiveRecord and Arel Mike Swieton says it isn't always clear how to get ActiveRecord and Arel to tackle certain difficult database queries so he shares some pointers he figured out here. […]
Comments are closed.