odo: Atomic Counters from the Command Line

odo-atomic-counter

A couple of days ago, I read something that stuck in my mind:

“Monitoring trick: Add a line to the beginning of your runit run scripts
to bump a counter. Alert when the derivative is unreasonably high.” –
@nrr, 4:05 PM – 2 Nov 2014

I use runit to manage some personal stuff, and really appreciate how easy it is to get new services running under it. I can just plunk run scripts in new service and log directories, log to standard out, and runit takes care of the rest. Because there’s so little friction in setting up new services, I’ve used it for things that would never even occur to me under a more cumbersome framework.

Still, I couldn’t think of a command-line program to increment a counter in a file without race conditions, at least not without some sort of database. While deferring to systems like postgres makes a lot of sense in production, I figured writing a standalone program to do it in C would be a good learning exercise, and it might work well for more casual use cases.

A Deceptively Simple Problem

Reading a number from a file, adding 1 to it, and writing back to the file isn’t terribly difficult. But what about if multiple programs using the counter file try to change it at the same time? Maybe each reads 100 and tries to update it to 101, but changing it from 100 to 101 twice means that one count gets lost. Sometimes that’s okay, but a monitoring tool should be more reliable, and it’s a lot more work to build in that kind of reliability after the fact if the basic implementation is flawed.

After years of Unix systems programming and reading Stevens’s Advanced Programming in the Unix Environment, I had a pretty good idea what constructs would be involved. Also, Mike English wanted to pair on it with me, to see how that kind of program would be structured in C, and to get more familiar with Emacs.

Syscalls and Atomic Operations

First off, when I need to control how processes share files, I think of mmap(2), a system call for mapping memory to files on disk and/or sharing it across processes. mmap(2) can do many things, but in this case it hooks into the OS’s disk buffer cache and virtual memory systems — if multiple processes open the same file with mmap(2), their view into the file will be shared and kept in sync, instead of each reading a snapshot of the file into their own memory and getting whatever it contains at that instant.

(I could also have used flock(2) to lock the file, updated it, and close(2)’d it, but I’ve been making a point to learn lock-less concurrent operations. That way, I can completely avoid worrying about how locks behave if processes crash or otherwise behave unexpectedly. While it’s overkill in cases like this, it greatly simplifies interactions in more complex systems.)

Once all processes are working with the same memory, atomic operations on it can safely update the value. Instead of reading data, updating it, and writing bytes to the file, only to have it interleave with other processes reading/updating/writing, an atomic compare-and-swap CAS operation like __sync_bool_compare_and_swap can conditionally update from exactly X to exactly Y. If something else changed it first, the update fails, giving a chance to update and try again. It’s unlikely that this “you first”, “no, you first” series of interactions will repeat more than a few times before the update applies successfully, and it will always apply without losing any updates.

Composability

While a CAS update on a memory-mapped integer met my original goal, Mike and I realized that wouldn’t really compose well with other Unix tools – most of them work best with lines of text. Instead of a raw integeter, the counter could be a numeric string, such as “1234”. CAS can still update “1234” to “1235” if the numeric strings are treated as fixed-width numbers. I opted for 64-bit counters where supported, so “1234” becomes “00001234”, which is 0x3433323130303030 (as a little endian uint64_t), or 3,761,405,300,627,746,864, and increases to “00001235” (0x3533323130303030), or 3,833,462,894,665,674,800. With CAS, it doesn’t matter if the difference is 1 or 256. Because it has a fixed width, the string needs to be padded with zeroes; I decided to name it “odometer” because of how the numbers look.

I started off just wanting to add 1 to a number in a file, but doing it robustly led through virtual memory, dedicated processor instructions for atomic operations, and some string formatting tricks. Still, they’re all cleanly abstracted away — the program can be used without worrying about any of those details. It just works.