PDF Snapshot Testing with Node and GraphicsMagick

The task I am working on this week involves generating downloadable PDF files for customer and supplier orders. We’d like to drive the implementation using tests and be able to find any regressions in the code automatically.

After looking at different alternatives for accomplishing this, we decided to try a visual snapshot approach similar to the way Jest does snapshot testing.

We’ll be using GraphicsMagick and ImageMagick.

Setup

We need to install GraphicsMagick for our test scripts and ImageMagick for interactive use. I have found that GraphicsMagick is better supported via Node modules, while ImageMagick is a bit easier to use on the command line for generating comparisons and viewing them later.

To install these tools and the Node modules on a Mac, we can use:

brew install graphicsmagick
brew install imagemagick
yarn add gm @types/gm

Installation on other platforms may vary.

Code

First, we’ll define the paths that we’ll use to store our expected and actual PDF files. The PDFs are stored in a sub-directory (__pdf-snapshots) so that they will be close to our test code and easy to find.

import * as path from 'path';

// Snapshots live next to the test code so they are easy to find.
const reportPath = path.join(__dirname, '__pdf-snapshots');
const actualFileName = path.join(reportPath, 'actual.pdf');
const expectedFileName = path.join(reportPath, 'expected.pdf');

Next, we’ll define a simple function to fetch the PDF from our server. In our case, we generate the PDF on the fly and send it to the client.

const getPdf = async (id: number) => {
  const reportUrl = `https://localhost:3100/report/${id}`;

  // `download` is a helper (not shown here) that saves the response to the given path.
  await download(reportUrl, actualFileName);
};

Then, we’ll define a function that compares a single page of our newly generated (actual) PDF against the same page of our expected PDF. We simply call the compare function of the GraphicsMagick Node module/wrapper.

You can specify the page number of the PDF in square brackets ([]). In our case, we are comparing a simple PDF file with only one page, so you’ll see later in the code that we simply use page 0 for both files we are comparing.
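As a small illustration of the page syntax, here is a minimal sketch of a helper that builds the file-plus-page specifier. The name pageSpec is hypothetical and not part of the original code; it only formats the string and rejects page numbers that cannot be valid.

```typescript
// Hypothetical helper: builds the "file[page]" specifier used to select
// a single page of a PDF. Page numbers are zero-based, so page 0 is the first page.
const pageSpec = (file: string, page: number): string => {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError(`page must be a non-negative integer, got ${page}`);
  }
  return `${file}[${page}]`;
};

console.log(pageSpec('expected.pdf', 0)); // expected.pdf[0]
```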

We don’t want to allow anything except a very precise comparison, which is why the tolerance level is set to 0. If you want to learn more about this setting, you can consult the documentation for the compare function.

We would like to be able to await this function, so we’ve wrapped it in a promise to handle the callback style of the GraphicsMagick interface.

import * as gm from 'gm';

const isPdfPageEqual = (a: string, aPage: number, b: string, bPage: number) => {
  return new Promise<boolean>((fulfill, reject) => {
    gm.compare(`${a}[${aPage}]`, `${b}[${bPage}]`, {
      tolerance: 0,
    }, (err, isEqual) => {
      if (err) {
        // Return here so that we do not also call fulfill after rejecting.
        return reject(err);
      }

      fulfill(isEqual);
    });
  });
};
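The same wrapping pattern can be tried without GraphicsMagick installed. The sketch below is only an illustration: promisifyCompare, CompareFn, and fakeCompare are hypothetical names, and fakeCompare is a stand-in that treats two files as equal when their names match. It shows how a callback that reports either an error or several result values can be turned into a promise that keeps all of those values.

```typescript
// Shape of a Node-style callback that reports an error or two result values.
type CompareCallback = (err: Error | null, isEqual?: boolean, equality?: number) => void;
type CompareFn = (a: string, b: string, cb: CompareCallback) => void;

interface CompareResult {
  isEqual: boolean;
  equality: number;
}

// Wraps a callback-style compare function so that it can be awaited.
const promisifyCompare = (compare: CompareFn) =>
  (a: string, b: string): Promise<CompareResult> =>
    new Promise((resolve, reject) => {
      compare(a, b, (err, isEqual, equality) => {
        if (err) {
          reject(err);
          return; // stop here so that we never resolve after rejecting
        }
        resolve({ isEqual: Boolean(isEqual), equality: equality ?? 0 });
      });
    });

// Hypothetical stand-in for a real comparison: "equal" means the names match.
const fakeCompare: CompareFn = (a, b, cb) => {
  if (!a || !b) {
    cb(new Error('missing file name'));
    return;
  }
  cb(null, a === b, a === b ? 0 : 1);
};

const comparePages = promisifyCompare(fakeCompare);

comparePages('expected.pdf[0]', 'actual.pdf[0]')
  .then((result) => console.log(result.isEqual)); // false, because the names differ
```

Keeping both values in the result makes it possible to report how far apart the two pages are, not only whether they match.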

Next, we’ll define a snapshot function which operates as follows:

  • When the expected PDF file does not exist, we will simply copy the actual PDF over the expected PDF and pass the test.
  • When the expected PDF file does exist, we will compare it against the actual PDF and issue an error if they do not match.

When the expected and actual PDFs do not match, we can do a manual inspection. When we are satisfied, we can rerun the test with the UPDATE environment variable set. This will overwrite the expected PDF with the actual PDF and pass the test. We’ll see this in action later.

const snapshot = async () => {
  if (process.env.UPDATE || !(await exists(expectedFileName))) {
    // Awaiting pipe() does not wait for the copy to finish, so use copyFile instead.
    await fs.promises.copyFile(actualFileName, expectedFileName);
  } else {
    const helpText = [
      'Actual contents of PDF did not match expected contents.',
      'To see comparison of the expected and actual PDFs, run:',
      `compare -metric AE ${expectedFileName} ${actualFileName} /tmp/comparison.pdf; open ${expectedFileName} ${actualFileName} /tmp/comparison.pdf`,
    ].join('\n\n');

    return expect(await isPdfPageEqual(expectedFileName, 0, actualFileName, 0), helpText).to.be.true;
  }
};
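The decision at the top of the snapshot function can be isolated as a pure function, which makes its edge cases easy to check. This is a minimal sketch; chooseSnapshotAction and SnapshotAction are hypothetical names, and the flag is passed in as a parameter rather than read from process.env.

```typescript
type SnapshotAction = 'write-expected' | 'compare';

// Mirrors the condition `process.env.UPDATE || !exists`:
// any non-empty string counts as a request to update the snapshot.
const chooseSnapshotAction = (
  updateFlag: string | undefined,
  expectedExists: boolean,
): SnapshotAction => (updateFlag || !expectedExists ? 'write-expected' : 'compare');

console.log(chooseSnapshotAction(undefined, false)); // write-expected (first run)
console.log(chooseSnapshotAction(undefined, true));  // compare
console.log(chooseSnapshotAction('1', true));        // write-expected
console.log(chooseSnapshotAction('0', true));        // write-expected, since '0' is a non-empty string
```

Note the last case: because environment variables are strings, setting UPDATE=0 or UPDATE=false still overwrites the expected PDF. Leave the variable unset to run a normal comparison.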

Finally, we’ll write a simple test which exercises these functions.

describe('Report PDFs', () => {
  it('can generate a PDF', async () => {
    // generate test data
    const order = generateTestOrder();

    // fetch actual pdf
    await getPdf(order.id);

    // compare snapshot of actual and expected pdfs
    await snapshot();
  });
});

Execution

First, we’ll run the test.

We can see that both the actual and expected PDFs have the same timestamp.

Next, we’ll run our test again to see that only the actual PDF has been updated.

We can see that the timestamp for the actual PDF has changed, but the expected PDF hasn’t.

Then, we’ll modify our implementation and re-run our test.

We can see that our test detected a change between the actual and expected PDFs and reported it as a test failure.

Next, we will manually inspect the expected PDF, the actual PDF, and a visual comparison of the two. To do this, we can execute the command that is printed when the test fails.

The generated comparison PDF highlights the differences between the two files.

If, after manually inspecting the expected PDF, actual PDF, and comparison of the PDFs, we find that these changes are acceptable, we can simply re-run our test with the UPDATE environment variable set.

Finally, we can see that the timestamp of the expected PDF is updated.

We can add this new expected PDF to our repo and commit. If we are using a continuous integration environment, we will automatically see a test failure when the actual output differs from the expected output.