wkhtmltopdf and Rails

Earlier this week Dustin Tinney and I needed to add simple PDF generation to a Rails application. We asked the Atomic Brain Trust for advice and were told to check out wkhtmltopdf.

wkhtmltopdf is a simple command-line utility that converts HTML to PDF using the WebKit rendering engine and Qt.

After downloading the application, we ran a quick test to see whether wkhtmltopdf would work well for us.

> ./wkhtmltopdf --print-media-type "http://google.com" output.pdf
Loading pages (1/6)
Counting pages (2/6)                                               
Resolving links (4/6)                                                       
Loading headers and footers (5/6)                                           
Printing pages (6/6)
Done

Everything worked great. The --print-media-type flag makes wkhtmltopdf render the page with its print stylesheet, and putting the source URL in quotes lets URLs containing shell-special characters such as & and ? work properly.
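Double quotes handle most URLs, but a URL that itself contains a double quote or a dollar sign can still trip up the shell. When the command is built from Ruby, as it is later in this post, two safer options are sketched below. The URL is a made-up example, and the system call is commented out so the snippet runs without the wkhtmltopdf binary present.

```ruby
require "shellwords"

# Hypothetical URL containing characters the shell would otherwise interpret.
url = "http://example.com/search?q=a b&sort=price"

# Option 1: escape the argument before interpolating it into a command string.
command = "./wkhtmltopdf --print-media-type #{Shellwords.escape(url)} output.pdf"
puts command

# Option 2: pass the arguments as a list so no shell parsing happens at all.
argv = ["./wkhtmltopdf", "--print-media-type", url, "output.pdf"]
# system(*argv)  # uncomment on a machine where ./wkhtmltopdf exists

# Both forms describe the same argument list.
puts Shellwords.split(command) == argv
```

Option 2 is usually the simpler choice, because the URL never passes through a shell and needs no escaping.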

Our next step was to spike the PDF creation logic into our application. Our strategy was to send print requests to an action that shells out to wkhtmltopdf, which in turn requests a second action in our application designed to render the printable page.

We did the following things to create our spike:

1.) Put the wkhtmltopdf shell script into our RAILS_ROOT/script directory.

2.) Create a destination pdfs folder in our RAILS_ROOT/public directory.

3.) Update routes and add the following action to our products controller.


def create_product_pdf
  source_url  = product_url(:id => params[:id])
  output_pdf  = "output_#{Time.now.to_i}.pdf"
  destination = Rails.root.join("public", "pdfs", output_pdf).to_s
  wkhtmltopdf = Rails.root.join("script", "wkhtmltopdf").to_s

  # Execute wkhtmltopdf. Passing an argument list bypasses the shell,
  # so the URL needs no quoting. system returns true only on success.
  if system(wkhtmltopdf, "--print-media-type", source_url, destination)
    redirect_to "/pdfs/#{output_pdf}"
  else
    head :internal_server_error
  end
end

4.) Update the view to send PDF print requests to the create_product_pdf action.

We figured that would be enough to make things work, but to our surprise the page hung when we pressed the PDF button. After 30 minutes of struggling, Dustin identified the problem. In our development environment we were using WEBrick to serve our pages, and in our setup it handled only one request at a time. Clicking the PDF button sent the first request, and during that action we shelled out to wkhtmltopdf. wkhtmltopdf then made a second request to the same development server to render the page. The server could not answer the second request until the first one finished, and the first one could not finish until wkhtmltopdf returned. The result: deadlock!

This problem does not exist in our production environment. We have Passenger set up on our production server, and it spawns multiple Ruby processes to handle requests.
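The difference between the two environments can be reproduced with a toy model. The sketch below is not WEBrick or Passenger; ToyServer and pdf_request are hypothetical names, threads stand in for server processes, and a short timeout stands in for the hang so the script terminates. A request handler makes a nested request to the same server, just as our action does through wkhtmltopdf.

```ruby
require "timeout"

# Toy model of a server: `workers` threads pull requests off a shared queue.
class ToyServer
  def initialize(workers)
    @queue = Queue.new
    workers.times do
      Thread.new do
        while (job = @queue.pop)
          handler, reply = job
          reply << handler.call(self)
        end
      end
    end
  end

  # Submit a request and wait up to `timeout` seconds for the response.
  def request(timeout: 1, &handler)
    reply = Queue.new
    @queue << [handler, reply]
    Timeout.timeout(timeout) { reply.pop }
  rescue Timeout::Error
    :timed_out
  end
end

# Models the PDF action: while handling the outer request, it asks the
# same server for the printable page and waits for the answer.
def pdf_request(server)
  server.request(timeout: 1) do |srv|
    srv.request(timeout: 0.3) { |_| "<html>product</html>" }
  end
end

p pdf_request(ToyServer.new(1))  # one worker: the nested request is never served in time
p pdf_request(ToyServer.new(2))  # two workers: the second worker serves the nested request
```

With one worker the nested request sits in the queue behind the very request that is waiting on it, so it times out. With two workers the second one picks it up and the page comes back.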

To make things work properly in our development environment, we stopped using WEBrick and began using Passenger Standalone instead.

Conversation
  • Garrett says:

    There is a gem that does this for you already: http://railscasts.com/episodes/220-pdfkit

    Nice article though!

  • Thanks for reading, Garrett. This gem looks very interesting. I will keep it in mind for the next time I need to integrate PDF support.
