Identifying a Rails 3.0.5 Performance Problem with perftools.rb

I’ve been anxious for a good opportunity to try out perftools.rb ever since I heard about it at RubyConf 2010 and read an article about using it to identify performance issues in a large, slow test suite. That opportunity presented itself after I upgraded my current project from Rails 3.0.4 to 3.0.5.

I updated the Gemfile to use gem "rails", "3.0.5", ran a bundle update rails, and kicked off the full test suite. Everything passed so I started up Passenger Standalone to do some manual testing. Right away I noticed that one of the main reports seemed a lot slower, so I checked the Rails development log for the processing time and compared it against what I was seeing in 3.0.4.

Using 3.0.4:

Completed 200 OK in 1877ms

Using 3.0.5:

Completed 200 OK in 9601ms

Here was a chance to try perftools.rb to find out what was causing this major slow down. My guess was that something had changed in ActiveRecord that was leading to a ton of new objects being instantiated, which in turn meant a huge increase in time spent in the garbage collector. To verify my guess I installed the rack-perftools_profiler middleware following the instructions for Rails 3. Along the way I changed the default printer to be text:

config.middleware.use ::Rack::PerftoolsProfiler, :default_printer => 'text', :bundler => true

In addition, I enabled class caching in the development.rb file to better mimic the production environment. With these changes in place I started the server again, and returned to the page in question in my browser. To invoke the perftools middleware I added ?profile=true to the end of the URL. Here are the top few lines of the output running Rails 3.0.4:

Total: 66 samples
      23  34.8%  34.8%       23  34.8% garbage_collector
       8  12.1%  47.0%        9  13.6% String#gsub
       5   7.6%  54.5%        5   7.6% BasicObject#instance_eval
       4   6.1%  60.6%        4   6.1% Mysql::Result#each_hash

And the same page running Rails 3.0.5:

Total: 236 samples
      84  35.6%  35.6%       84  35.6% Mysql#query
      53  22.5%  58.1%       53  22.5% garbage_collector
      22   9.3%  67.4%       22   9.3% Regexp#===
       5   2.1%  69.5%       38  16.1% ActiveRecord::ConnectionAdapters::Column#initialize

In 3.0.5 the application is spending more time in the Mysql#query method (84 samples) than the entire request took in 3.0.4 (66 samples)!. Looking back at the 3.0.4 profile output I had to go way down the list to find the Mysql#query method:

1   1.5% 100.0%        1   1.5% Mysql#query</code>

It turned out that my guess about more time being spent in garbage collection was kind of correct, but that was not biggest problem. Thanks to perftools.rb I had a better place to start looking – MySQL queries. After adding logging statements, stepping through code using ruby-debug19, and looking at a few recent commit diffs, I think I have identified the problem.

When ActiveRecord eager loads a has_one association, an AssociationProxy object is instantiated for each result record. The HasOneAssociation populates its @finder_sql variable in its constructor. If the association has a conditions hash it will be converted into a SQL string as part of this process. And this recent commit changed the way ActiveRecord handles association conditions:

Deprecated support for interpolated association conditions with the :conditions => ‘foo = #{bar}’ syntax, and added the new interpolation syntax which is :conditions => proc { “foo = #{bar}” }.

A side effect of this change is that the sanitized conditions for an eager loaded association are no longer being cached. The columns of the model’s table are used to convert from the conditions hash to a SQL string, and loading the columns for a model hits the MySQL database with a SHOW FIELDS FROM `table_name` query.

In my case, I am eager loading multiple has_one associations for a result set with several hundred rows. Because the sanitized conditions are no longer cached, MySQL is being queried for the columns over and over again.

I believe the following example models would encounter the problem I am seeing if you tried to load all Driver records while eager loading the favorite_car association:

class Driver < ActiveRecord::Base
  has_many :cars
  has_one :favorite_car, 
          :class_name => "Car", 
          :conditions => {:favorite => true}


class Car < ActiveRecord::Base
  belongs_to :driver


perftools.rb let me quickly identify where my application was spending the most time, rather than blindly stumbling around the code trying to figure out what had changed between Rails 3.0.4 and 3.0.5.

Now that I have identified the problem, my next step is to open a Lighthouse ticket for this. Then I’ll see if I can get the ActiveRecord tests running on my machine so I can attempt to put together a patch to fix it.

UPDATE 2011-03-20: I submitted a ticket for the problem: Ruby on Rails Lighthouse