How to Upload PDF Files to Azure Blob Storage Using Node.js

I’m currently working on a software project that lets a prospective borrower apply for a mortgage. As part of the lending process, the borrower must upload PDF documents, such as a W-2 and a pay stub, so that a credit analyst can verify their information for underwriting. As a team, we discussed several ways to handle document uploads and settled on Microsoft Azure Storage and its easy-to-use Node SDK.

Getting Set Up

To start, navigate to the quickstart docs to get your account set up. You’ll need an Azure account, an Azure storage account, and the latest stable version of Node.js. The docs will walk you through the setup of each. Next, locate your storage account in the Azure Portal. In the storage account’s menu, go to Access Keys under Security + Networking. Click Show Keys, and copy the connection string associated with your storage account. You’ll need it later. Finally, add the Azure Storage Blob client package to your project using the npm command below.

npm install @azure/storage-blob

Now that the SDK is installed in your Node project, add the following code to a TypeScript file to create the Azure Blob Storage clients. If you haven’t already, create a container in your storage account from the Azure Portal and keep track of its name.


import { BlobServiceClient } from "@azure/storage-blob";
const blobServiceClient = BlobServiceClient.fromConnectionString(
  "YOUR-CONNECTION-STRING"
);
const containerClient = blobServiceClient.getContainerClient(
  "YOUR-CONTAINER-NAME"
);
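
The connection string grants access to the whole storage account, so you may prefer not to hardcode it. Here is a minimal sketch that reads it from an environment variable instead. The helper requireEnv and the variable name AZURE_STORAGE_CONNECTION_STRING are assumptions made for illustration, not part of the SDK.

```typescript
// Hypothetical helper: returns the value of a required environment variable,
// or throws if it is missing or empty. `env` defaults to process.env and can
// be replaced with a plain object in tests.
const requireEnv = (
  name: string,
  env: Record<string, string | undefined> = process.env
): string => {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
};

// Usage (the variable name is an assumption):
// const blobServiceClient = BlobServiceClient.fromConnectionString(
//   requireEnv("AZURE_STORAGE_CONNECTION_STRING")
// );
```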

Uploading Files to a Container

To upload a file to a container, we start with a base64-encoded string of the PDF and convert it to a Buffer. Next, we get a BlockBlobClient from the container client, passing the filename we want to give the PDF, and call its uploadData method to send the data to our Azure Storage container. Setting blobContentType stores the blob with the application/pdf content type.


const uploadDocumentToAzure = async () => {
  const data = Buffer.from("BASE-64-ENCODED-PDF", "base64");
  const blockBlobClient = containerClient.getBlockBlobClient("FILENAME-TO-UPLOAD");
  const response = await blockBlobClient.uploadData(data, {
    blobHTTPHeaders: {
      blobContentType: "application/pdf",
    },
  });
  if (response._response.status !== 201) {
    throw new Error(
      `Error uploading document ${blockBlobClient.name} to container ${blockBlobClient.containerName}`
    );
  }
};
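
Because Buffer.from silently skips characters that are not valid base64, a malformed string can still produce a Buffer. As a rough guard, the sketch below decodes the string and checks for the "%PDF-" signature that PDF files normally start with. The helper name decodePdfBase64 is hypothetical, and the check is only a sanity test, not full PDF validation.

```typescript
// Hypothetical helper: decodes a base64 string and verifies that the result
// starts with the ASCII signature "%PDF-" before it is uploaded.
const decodePdfBase64 = (base64: string): Buffer => {
  const data = Buffer.from(base64, "base64");
  if (data.subarray(0, 5).toString("latin1") !== "%PDF-") {
    throw new Error("Decoded data does not look like a PDF");
  }
  return data;
};

// Usage inside an upload function such as the one above:
// const data = decodePdfBase64("BASE-64-ENCODED-PDF");
// await blockBlobClient.uploadData(data, { /* same options as above */ });
```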

Downloading Files from a Container

To download a file from a container, we again get a BlockBlobClient for the filename. Then we call its download method with an offset of 0, which starts reading at the beginning of the blob; since no byte count is given, it reads through to the end of the file. In Node.js, the response exposes the content as a readable stream (readableStreamBody), so I’ve included a helper function, streamToString, that collects the stream and returns the PDF as a base64-encoded string.


const downloadDocumentFromAzure = async () => {
  const blockBlobClient = containerClient.getBlockBlobClient("FILENAME-TO-DOWNLOAD");
  const response = await blockBlobClient.download(0);
  if (response.readableStreamBody) {
    return await streamToString(response.readableStreamBody);
  } else {
    throw new Error(
      `Error downloading document ${blockBlobClient.name} from container ${blockBlobClient.containerName}`
    );
  }
};

const streamToString = async (
  readableStream: NodeJS.ReadableStream
): Promise<string> => {
  return new Promise((resolve, reject) => {
    const chunks: any[] = [];
    readableStream.on("data", (data) => {
      chunks.push(data);
    });
    readableStream.on("end", () => {
      resolve(Buffer.concat(chunks).toString("base64"));
    });
    readableStream.on("error", reject);
  });
};
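
As an alternative to wiring up event listeners by hand, the same conversion can be sketched with async iteration, which Node.js readable streams support. The name streamToBase64 is hypothetical, and the in-memory stream in the usage comment only stands in for response.readableStreamBody.

```typescript
import { Readable } from "node:stream";

// Hypothetical alternative to streamToString: reads every chunk with
// `for await`, then concatenates and base64-encodes the result.
// A stream error rejects the returned promise.
const streamToBase64 = async (
  readableStream: NodeJS.ReadableStream
): Promise<string> => {
  const chunks: Buffer[] = [];
  for await (const chunk of readableStream) {
    // Chunks may arrive as strings if the stream has an encoding set.
    chunks.push(typeof chunk === "string" ? Buffer.from(chunk) : chunk);
  }
  return Buffer.concat(chunks).toString("base64");
};

// Usage with an in-memory stream in place of response.readableStreamBody:
// const base64 = await streamToBase64(Readable.from([Buffer.from("%PDF-")]));
```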

Deleting Files from a Container

Finally, deleting files is fairly straightforward. Unlike uploading and downloading, there’s no need to instantiate a BlockBlobClient. Just call the deleteBlob method on the ContainerClient class with the filename you wish to delete.


const deleteDocumentFromAzure = async () => {
  const response = await containerClient.deleteBlob("FILENAME-TO-DELETE");
  if (response._response.status !== 202) {
    throw new Error(`Error deleting ${"FILENAME-TO-DELETE"}`);
  }
};
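
If the blob has already been removed, the delete call fails rather than returning a response. The sketch below treats "not found" as a non-error, assuming the thrown error carries the HTTP status in a statusCode property, as the SDK’s RestError does. The BlobDeleter interface and the deleteDocumentIfPresent name are hypothetical; the interface only describes the one ContainerClient method used here so the function can be exercised without a real storage account.

```typescript
// Minimal structural type for the part of ContainerClient used below
// (an assumption made to keep this sketch self-contained).
interface BlobDeleter {
  deleteBlob(blobName: string): Promise<unknown>;
}

// Hypothetical helper: resolves to true if the blob was deleted and to
// false if it did not exist; any other failure is rethrown.
const deleteDocumentIfPresent = async (
  container: BlobDeleter,
  blobName: string
): Promise<boolean> => {
  try {
    await container.deleteBlob(blobName);
    return true;
  } catch (error) {
    // Assumes the error exposes the HTTP status as `statusCode`.
    if ((error as { statusCode?: number }).statusCode === 404) {
      return false;
    }
    throw error;
  }
};

// Usage: await deleteDocumentIfPresent(containerClient, "FILENAME-TO-DELETE");
```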

That’s it. From setup through uploading, downloading, and deleting, Microsoft Azure Storage and its Node SDK let the team handle the whole document upload process.

Conversation
  • Can you also use Node.js Streams to upload arbitrarily large files (e.g. +2Gb) efficiently? Is there a requirement to send a string encoded in base64 rather than a plain stream of bytes in the Azure SDK?
