Finding Records that Don’t Exist in PostgreSQL

I was recently tasked with writing a bit of code to find delivery route stops where nothing has to be unloaded. This code is intended to prevent a delivery driver from stopping at a store when there’s nothing in the truck for that customer that day.

To solve this, I used several features in PostgreSQL, including WITH statements, VALUE lists, and a UNION, along with some data provided as bindings from our JavaScript code.

Let’s start with a list of customers and stores that are in our delivery plan. In our production code, this is queried out of the database. For this example, I’ll simply hard-code them as a list of objects describing the stores.

const storesInPlan = [
  { customerName: 'MY CUSTOMER', storeName: '001' },
  { customerName: 'MY CUSTOMER', storeName: '002' },
  { customerName: 'MY CUSTOMER', storeName: '003' },
  { customerName: 'MY CUSTOMER', storeName: '004' },
];

We’ll then transform this shape into a string that can be used a placeholder in the SQL (which will correlate with bindings passed in while executing the query).


// Resulting string: '(?, ?), (?, ?), (?, ?), (?, ?)'
const planStopPlaceholders = storesInPlan.map(() => ('(?, ?)')).join(', ');

// resulting array of strings: ['MY CUSTOMER', '001', 'MY CUSTOMER', '002', 'MY CUSTOMER', '003', 'MY CUSTOMER', '004']
const planStopBindings = _.flatten(storesInPlan.map((storeInPlan) => ([storeInPlan.customerName, storeInPlan.storeName])));

To begin, we build up a rowset with all of the stores in our plan. This rowset is simply populated from the list of values we have provided in the binding placeholders below (planStopPlaceholders) using JavaScript string interpolation. The VALUES list allows us to transform a set of n-tuples into rows with named columns.

  WITH
    plan_stops AS
    (
        SELECT
            customer_name,
            store_name,
            0 AS count

          FROM
            (VALUES ${planStopPlaceholders}) AS plan_stops(customer_name, store_name)
    ),

When executed on its own, this snippet results in:

customer_name store_name count
MY CUSTOMER 001 0
MY CUSTOMER 002 0
MY CUSTOMER 003 0
MY CUSTOMER 004 0

Next, we find actual allocations from our database. This rowset may very well contain less rows than what we are looking for, which is precisely the point of this code.

    store_allocations AS
    (
        SELECT
            allocations.customer_name,
            allocations.store_name,
            COUNT(*)
        FROM
            allocations

        INNER JOIN orders ON orders.id = allocations.order_id
        INNER JOIN plans ON plans.id = orders.plan_id

        WHERE
            plans.id = ? AND
            allocations.quantity > 0 AND

        GROUP BY
            allocations.customer_name,
            allocations.store_name
    ),

When executed on its own, as an example, this snippet results in:

customer_name store_name count
MY CUSTOMER 002 42
MY CUSTOMER 003 75
MY CUSTOMER 004 11

Then, we UNION together the list of stops we have in our plan with the result set of stops that actually have allocations.

    plan_stops_with_allocations AS
    (
        SELECT customer_name, store_name, count FROM plan_stops
        UNION
        SELECT customer_name, store_name, count FROM store_allocations
    )

When executed, as an example, this snippet results in:

customer_name store_name count
MY CUSTOMER 001 0
MY CUSTOMER 002 0
MY CUSTOMER 003 0
MY CUSTOMER 004 0
MY CUSTOMER 002 42
MY CUSTOMER 003 75
MY CUSTOMER 004 11

Finally, we SUM up the values to find those which have a count of 0. The HAVING clause will filter out any rows where stores have product allocations, leaving us with stores that have no product allocations for the day.

  SELECT
    customer_name,
    store_name,
  FROM
    plan_stops_with_allocations
  GROUP BY
    customer_name,
    store_name
  HAVING
    SUM(count) = 0;

When executed all together, as an example, this snippet results in:

customer_name store_name
MY CUSTOMER 001

By creating a rowset of items that we want to check in our database and UNIONing them together with stores that have actual allocations, we can SUM up the results and find those which have none. We have essentially found records that don’t exist in our database.

Conversation
  • Dave says:

    Wouldn’t a LEFT JOIN with a test for NULL on the right side be much easier?? It could be done in just one SQL statement – something like this:

    SELECT
    customer_name,
    store_name
    FROM
    (VALUES ${planStopPlaceholders}) AS plan_stops
    LEFT OUTER JOIN
    (
    SELECT DISTINCT
    allocations.customer_name,
    allocations.store_name,
    FROM
    allocations
    INNER JOIN orders ON orders.id = allocations.order_id
    INNER JOIN plans ON plans.id = orders.plan_id
    WHERE
    plans.id = ? AND
    allocations.quantity > 0
    ) AS store_allocations ON
    plan_stops.customer_name = store_allocations.customer_name AND
    plan_stops.store_name = store_allocations.store_name
    WHERE
    store_allocations.customer_name IS NULL

  • Comments are closed.