Managing Data Loaders in Node.js with Shared State

Caching is one of the harder parts of software development. My team appreciates the efficiency that Data Loader brings to our applications, but we still run into the classic problem with cached queries: invalidating the cache when the underlying data set has changed.

To help manage our data loaders, my team has been experimenting with storing our data loaders in an “app context” object that is passed around our server-side code. On our project, this context object is created for each request on our server via our GraphQL resolvers. Typically, this context will hold information about the current request, such as session info and the current user. But we found it useful for managing our data loader lifecycle, as well.

The Pattern

Most functions in our server-side code accept the app context as the first parameter. Whatever is stored on the app context can be read or modified by any function. This turns out to be really nice for data loaders: one function can do the work of looking up a piece of information, and another function that does the same lookup gets the cached result instantly.
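
As a rough illustration of that sharing, here is a minimal sketch in which two unrelated functions take the context first and end up sharing one lookup. The names `RequestContext`, `buildContext`, `findFoodById`, `foodLabel`, and `foodSummary` are hypothetical, and a plain `Map` of promises stands in for a real data loader.

```typescript
// Hypothetical record type and in-memory data store, for illustration only.
type Food = { id: number; name: string };
const FOOD_TABLE: Food[] = [
  { id: 1, name: "Apples" },
  { id: 2, name: "Cookies" },
];

// Hypothetical per-request context: request info plus a cache of lookups.
type RequestContext = {
  currentUser: string | null;
  fetchCount: number;
  foodCache: Map<number, Promise<Food | undefined>>;
};

function buildContext(currentUser: string | null): RequestContext {
  return { currentUser, fetchCount: 0, foodCache: new Map() };
}

// Every function takes the context first and consults its cache before
// touching the data store.
function findFoodById(ctx: RequestContext, id: number): Promise<Food | undefined> {
  const cached = ctx.foodCache.get(id);
  if (cached !== undefined) return cached;
  ctx.fetchCount += 1; // a real data-store hit
  const pending = Promise.resolve(FOOD_TABLE.find((food) => food.id === id));
  ctx.foodCache.set(id, pending);
  return pending;
}

async function foodLabel(ctx: RequestContext, id: number): Promise<string> {
  const food = await findFoodById(ctx, id);
  return food ? food.name.toUpperCase() : "UNKNOWN";
}

async function foodSummary(ctx: RequestContext, id: number): Promise<string> {
  const food = await findFoodById(ctx, id); // served from the shared cache
  return food ? `#${food.id} ${food.name}` : "not found";
}

async function demo(): Promise<number> {
  const ctx = buildContext("alice");
  await foodLabel(ctx, 1);
  await foodSummary(ctx, 1);
  return ctx.fetchCount; // both calls shared a single fetch
}
```

Caching the promise rather than the resolved value means two callers that ask for the same id at the same moment still share a single fetch.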

Here’s what it might look like in practice, where I’m calling into a repository to grab food records from a database. I’m tracking the database access calls with a counter on the app context.


const appContext = AppContext.buildAppContext();

await FoodRepository.findById(appContext, 1);
// Fetch count is one since cache is empty.
console.log(`fetch count: `, AppContext.getFetchCount(appContext));

await FoodRepository.findById(appContext, 1);
// Fetch count is still one since id 1 is in the cache.
console.log(`fetch count: `, AppContext.getFetchCount(appContext));

await FoodRepository.findById(appContext, 2);
// Id 2 is not in the cache, so the fetch count increases to two.
console.log(`fetch count: `, AppContext.getFetchCount(appContext));

await FoodRepository.findById(appContext, 1);
// Id 1 is in the cache, so the fetch count stays at two.
console.log(`fetch count: `, AppContext.getFetchCount(appContext));

// By clearing the data loaders, fetching a particular id will increase the fetch count
// since the cache is now empty.
AppContext.clearDataLoaders(appContext);

await FoodRepository.findById(appContext, 1);
// Id 1 is no longer in the cache, so the fetch count increases to 3
console.log(`fetch count: `, AppContext.getFetchCount(appContext));

console.log('fetching id 1');
await FoodRepository.findById(appContext, 1);
// Id 1 is in the cache, so the count stays at 3
console.log(`fetch count: `, AppContext.getFetchCount(appContext));
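
How `clearDataLoaders` works internally is not shown here. One plausible shape, sketched under the assumption that the context keeps every loader in a single list, is below. `CachingLoader`, `registerLoader`, `Clearable`, and the context fields are hypothetical names, and the loader is a simplified stand-in rather than the Data Loader library itself.

```typescript
// Simplified stand-in for a data loader: memoizes the promise per key.
class CachingLoader<K, V> {
  private readonly cache = new Map<K, Promise<V>>();
  private readonly fetchOne: (key: K) => Promise<V>;

  constructor(fetchOne: (key: K) => Promise<V>) {
    this.fetchOne = fetchOne;
  }

  load(key: K): Promise<V> {
    let pending = this.cache.get(key);
    if (pending === undefined) {
      pending = this.fetchOne(key);
      this.cache.set(key, pending);
    }
    return pending;
  }

  clearAll(): void {
    this.cache.clear();
  }
}

// Hypothetical context: anything with clearAll() can be registered on it.
type Clearable = { clearAll(): void };
type Context = { fetchCount: number; loaders: Clearable[] };

function buildAppContext(): Context {
  return { fetchCount: 0, loaders: [] };
}

// Create a loader that counts its fetches and registers itself on the context.
function registerLoader<K, V>(
  ctx: Context,
  fetchOne: (key: K) => Promise<V>,
): CachingLoader<K, V> {
  const loader = new CachingLoader<K, V>((key) => {
    ctx.fetchCount += 1; // counted only on a cache miss
    return fetchOne(key);
  });
  ctx.loaders.push(loader);
  return loader;
}

// Clearing walks the registry, so newly added loaders are covered automatically.
function clearDataLoaders(ctx: Context): void {
  for (const loader of ctx.loaders) loader.clearAll();
}
```

With this shape, a repository function such as `findById` would call `load` on its loader, and a single `clearDataLoaders` call empties every registered cache.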

However, we can potentially run into trouble if we forget to clear the cache after mutating the data set.


const appContext = AppContext.buildAppContext();

// Look up all records in the data store. These results are cached.
const allFoods1 = await FoodRepository.findAll(appContext);
console.log(`first food list: `, allFoods1); // Output is "Apples, Cookies, Pancakes"

// Create a new record in the data store.
await FoodRepository.createFood(appContext, 'Waffles');

// Repeat the same lookup. The new record is not in the list since the cache is still around.
const allFoods2 = await FoodRepository.findAll(appContext);
console.log(`second food list: `, allFoods2); // Output is still "Apples, Cookies, Pancakes"

// Clear the cache.
AppContext.clearDataLoaders(appContext);

// The new record is now included in the data set.
const allFoods3 = await FoodRepository.findAll(appContext);
console.log(`third food list: `, allFoods3); // Output is "Apples, Cookies, Pancakes, Waffles"

Notice that we don’t get the expected results until we clear the cache. In this example the fix is straightforward, but the same bug is much harder to spot in a larger, more complex system. For this reason, I’ve put the cache-clearing logic inside the create function itself.


// The create function for FoodRepository, with cache clearing.
// The return type assumes numeric ids, as in the findById calls above.
const createFoodV2 = async (appContext: AppContext.Type, name: string, price: number): Promise<number> => {
  const newId = DataStore.insertFoodRecord({ name, price });
  // Clear every data loader so later reads see the new record.
  AppContext.clearDataLoaders(appContext);
  return newId;
};

This makes the behavior much more predictable:


const appContext = AppContext.buildAppContext();

// Look up all the records. Results are cached.
const allFoods1 = await FoodRepository.findAll(appContext);
console.log(`first food list: `, allFoods1); // Output is "Apples, Cookies, Pancakes"

// Create a new record. This version of the create function will clear out the cache.
await FoodRepository.createFoodV2(appContext, 'Waffles', 2.12);

// The new record is now included in the data set as expected.
const allFoods2 = await FoodRepository.findAll(appContext);
console.log(`second food list: `, allFoods2); // Output is "Apples, Cookies, Pancakes, Waffles"

By clearing the cache in the create function, we can be sure that later queries through the same app context return up-to-date data.
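
If several mutating functions need the same treatment, the clear step could be factored into a small wrapper so that no function forgets it. This is only a sketch: `withCacheClear` and the simplified `Context` type are hypothetical, and the cache here is a bare `Map` rather than real data loaders.

```typescript
// Hypothetical, simplified context: one cache standing in for all data loaders.
type Context = { cache: Map<string, unknown> };

function clearDataLoaders(ctx: Context): void {
  ctx.cache.clear();
}

// Wrap a mutating function so the caches are cleared once it succeeds.
// If the mutation throws, the error propagates and the cache is left
// untouched; clearing in a finally block would be the more defensive choice.
function withCacheClear<Args extends unknown[], R>(
  mutate: (ctx: Context, ...args: Args) => Promise<R>,
): (ctx: Context, ...args: Args) => Promise<R> {
  return async (ctx: Context, ...args: Args): Promise<R> => {
    const result = await mutate(ctx, ...args);
    clearDataLoaders(ctx);
    return result;
  };
}

// In-memory data store and repository functions, for illustration only.
const foods: string[] = ["Apples", "Cookies", "Pancakes"];

async function findAll(ctx: Context): Promise<string[]> {
  const cached = ctx.cache.get("allFoods") as string[] | undefined;
  if (cached !== undefined) return cached;
  const snapshot = [...foods];
  ctx.cache.set("allFoods", snapshot);
  return snapshot;
}

const createFood = withCacheClear(
  async (_ctx: Context, name: string): Promise<number> => {
    foods.push(name);
    return foods.length; // treat the new length as the new record's id
  },
);
```

The trade-off is the same as in `createFoodV2`: every write throws away all cached reads for that context, which is simple and safe but gives up some cache hits.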

If you want to dig into this more, I set up an example project with this data loader management pattern.