Serving Remote PDF Files with Node.js and Express

Recently, I got my first exposure to Node.js by working on a small Express web app for a client. One of the things the app needed to do was forward PDF files from another web service to a browser. The task seems quite simple: just make an HTTP request for the data, then serve the same bytes through our endpoint. However, I ran into a couple of sticking points when loading the PDF data over HTTP and serving it correctly to the user, so I thought I would whip up a quick tutorial on how to do it right.

Loading PDF Data over HTTP

The first step in serving up our remote PDFs is to request them from our remote server. We can use Node’s built-in http.request for this. Unfortunately, most of the tutorials on this subject show you how to handle plain text data, not binary data like a PDF.

Handling binary data with http.request is pretty simple, but not very intuitive. Most tutorials go something like this:


var http = require('http');

var options = {
  method: 'GET',
  host: 'localhost',
  port: port, // port of the remote service, defined elsewhere
  path: '/file'
};

var request = http.request(options, function(response) {
  var data = '';

  response.on('data', function(chunk) {
    data += chunk;
  });

  response.on('end', function() {
    // do something with data
  });
});

request.end();

The problem here is that appending each chunk to a string implicitly converts it with toString(), and Node decodes it as UTF-8 by default. Any bytes that are not valid UTF-8 get replaced during that conversion, which corrupts binary data like a PDF and makes it unusable. Fortunately, the solution is simple. Each chunk passed to the 'data' handler is actually a Node Buffer object without any encoding assumptions (provided setEncoding has not been called on the response), so as long as we don’t pretend it is a string, it will work correctly:


var http = require('http');

var options = {
  method: 'GET',
  host: 'localhost',
  port: port, // port of the remote service, defined elsewhere
  path: '/file'
};

var request = http.request(options, function(response) {
  var data = []; // collect the raw Buffer chunks

  response.on('data', function(chunk) {
    data.push(chunk);
  });

  response.on('end', function() {
    data = Buffer.concat(data); // one Buffer holding the exact bytes received
    // do something with data
  });
});

request.end();

Simply collecting all the buffers and concatenating them with Buffer.concat is sufficient to preserve the original bytes.
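To see the difference without making a network request, here is a small sketch that runs both accumulation strategies over the same bytes. The byte values and the two-chunk split are made up for illustration; they just need to include bytes that are not valid UTF-8, as a real PDF usually does.

```javascript
// Made-up sample: "%PDF" followed by bytes that are not valid UTF-8.
var original = Buffer.from([0x25, 0x50, 0x44, 0x46, 0xff, 0xfe, 0x00, 0x80]);

// Pretend the response arrived as two chunks.
var chunks = [original.subarray(0, 5), original.subarray(5)];

// String accumulation: each chunk is decoded as UTF-8 when appended.
var asString = '';
chunks.forEach(function (chunk) {
  asString += chunk;
});
var fromString = Buffer.from(asString, 'utf8');

// Buffer accumulation: the bytes are kept exactly as received.
var fromBuffers = Buffer.concat(chunks);

console.log(original.equals(fromString));  // false
console.log(original.equals(fromBuffers)); // true
console.log(original.length, fromString.length); // 8 14
```

The string version comes back longer than the input because each invalid byte was swapped for a three-byte replacement character, and there is no way to recover the original bytes afterwards.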

Serving a PDF

If a PDF or other file is stored on disk alongside your application, Express makes serving it a breeze. At that point, you just need res.download or res.sendFile. Serving file data that is in memory is a little bit trickier. You might be tempted to just res.send(pdfData) and call it a day, but you would probably be disappointed in the result: by default, Express labels a Buffer as application/octet-stream, so the browser has no way to know it is a PDF or what to name the file. Fortunately, simply setting a few headers (which is what res.download does anyway) is enough to solve the problem. It works something like this:


res.writeHead(200, {
  'Content-Type': 'application/pdf',
  'Content-Disposition': 'attachment; filename=some_file.pdf',
  'Content-Length': pdfData.length
});
res.end(pdfData);

Notice that we used res.end instead of res.send, since res.writeHead has already written the response headers and res.send would try to set headers of its own.
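If you serve PDFs from more than one route, the header logic can be wrapped in a small helper. The sketch below is one way to do it, not anything Express provides: sendPdf and fakeRes are hypothetical names, and quoting the filename is my own addition that assumes the name contains no double quotes. The helper only relies on writeHead and end, so it should work with an Express response as well as a plain http.ServerResponse.

```javascript
// Hypothetical helper: sends an in-memory Buffer as a PDF download.
// Assumes `filename` contains no double quotes.
function sendPdf(res, pdfData, filename) {
  res.writeHead(200, {
    'Content-Type': 'application/pdf',
    'Content-Disposition': 'attachment; filename="' + filename + '"',
    'Content-Length': pdfData.length
  });
  res.end(pdfData);
}

// Minimal stand-in for a response object, so the helper can be tried
// without starting a server.
var fakeRes = {
  writeHead: function (status, headers) {
    this.status = status;
    this.headers = headers;
  },
  end: function (body) {
    this.body = body;
  }
};

sendPdf(fakeRes, Buffer.from('%PDF-1.4\n'), 'some_file.pdf');

console.log(fakeRes.status);                    // 200
console.log(fakeRes.headers['Content-Length']); // 9
```

In a real route handler you would pass Express's res and the Buffer built from the remote response in place of fakeRes and the sample bytes.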

If we put all of this together, it isn’t that much different than serving up any other request data in Express. Unfortunately, it took me a lot of time and googling to figure out all the subtle differences. Hopefully, you found this tutorial useful. I also put together a little sample app to demonstrate this code.