Nicer C99 APIs with Designated Initializers

While working on a library for property-based testing in C, I discovered a trick that can be used to make significantly nicer library interfaces in C99: “designated initializers”. As of C99, struct literals can have their fields set by name, in any order. The C99 standard explicitly updated the behavior for how fields in struct literals are handled:

6.7.8 point 21:

“If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.” (emphasis mine)
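As a minimal sketch of what this rule buys an API designer: the struct, its field names, its default values, and the `conn_resolve` function below are all hypothetical, made up for illustration and not taken from any library. Callers name only the fields they care about, in any order, and the callee treats every zeroed field as "use the default".

```c
#include <stddef.h>

/* Hypothetical options struct: a zero in any field means "use the default". */
struct conn_opts {
    const char *host;   /* NULL -> "localhost" */
    int port;           /* 0    -> 8080 */
    int retries;        /* 0    -> 3 */
};

/* Hypothetical API: returns a copy of *opts with defaults filled in
 * for every field the caller left zeroed. */
static struct conn_opts conn_resolve(const struct conn_opts *opts) {
    struct conn_opts out = *opts;
    if (out.host == NULL) { out.host = "localhost"; }
    if (out.port == 0)    { out.port = 8080; }
    if (out.retries == 0) { out.retries = 3; }
    return out;
}

/* Usage: only .retries and .host are named (in a different order than
 * they are declared); .port is implicitly zeroed by the initializer rule,
 * so conn_resolve() sees 0 there and applies its default. */
static struct conn_opts example_named_args(void) {
    return conn_resolve(&(struct conn_opts){ .retries = 5, .host = "sensor-1" });
}
```

One trade-off of this style is that an explicit zero cannot be distinguished from an omitted field, so it works best when zero is never a meaningful setting.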

Since objects with static storage duration are initialized to zero, this means that in C99 we can depend on any fields left out of a stack-allocated struct’s initializer being set to zero, rather than containing garbage data from previous stack frames. This is a huge improvement! If pointers to structs are used as arguments for function calls, it also gives C99 a portable way of using optional and keyword arguments.

Read more on Nicer C99 APIs with Designated Initializers…

theft: Property-Based Testing for C

Recently, I discovered a bug in my heatshrink data compression library, a result of a hidden assumption — I expected that input to uncompress would be padded with ‘0’ bits (like its compressed output), but if given trailing ‘1’ bits, it could get stuck: it detected that processing was incomplete, but polling for more output made no further progress.

Read more on theft: Property-Based Testing for C…

Property-Based Testing – Testing Assumptions You Don’t Know You’re Making

Finding good test input can be tricky. Even with loads of unit tests, bugs still get through.

Consider a wear-leveling algorithm for flash memory — it takes a series of write operations and spreads them over the flash, because individual blocks can only be written and erased so many times before they wear out. If any write sequence leads to writes concentrating in specific blocks, something isn’t working.
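A property like this can be checked mechanically. The sketch below is a hand-rolled loop, not theft's API, and the toy wear-leveler, the block count, and all function names are assumptions chosen for illustration. It feeds pseudo-random, deliberately skewed write sequences to the leveler and checks after every write that no block is far more worn than the least-worn one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKS 8

/* Toy wear-leveler (an assumption for illustration): every write goes
 * to the physical block with the fewest erases so far. */
struct flash { uint32_t erases[BLOCKS]; };

static void flash_write(struct flash *f, uint32_t logical_block) {
    (void)logical_block;  /* the toy policy ignores the logical address */
    size_t best = 0;
    for (size_t i = 1; i < BLOCKS; i++) {
        if (f->erases[i] < f->erases[best]) { best = i; }
    }
    f->erases[best]++;
}

/* Small deterministic PRNG (xorshift32), so a failing seed can be replayed. */
static uint32_t next_rand(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Property: at no point in a write sequence is any block erased more than
 * max_spread times beyond the least-worn block. Returns 1 if it held. */
static int prop_wear_is_even(uint32_t seed, size_t writes, uint32_t max_spread) {
    struct flash f = { .erases = {0} };
    uint32_t rng = seed ? seed : 1;  /* xorshift state must be nonzero */
    for (size_t n = 0; n < writes; n++) {
        /* Skewed input: roughly 3 in 4 writes target logical block 0. */
        uint32_t r = next_rand(&rng);
        uint32_t logical = (r & 3) ? 0 : (r >> 2) % BLOCKS;
        flash_write(&f, logical);

        uint32_t lo = f.erases[0], hi = f.erases[0];
        for (size_t i = 1; i < BLOCKS; i++) {
            if (f.erases[i] < lo) { lo = f.erases[i]; }
            if (f.erases[i] > hi) { hi = f.erases[i]; }
        }
        if (hi - lo > max_spread) { return 0; }
    }
    return 1;
}

/* Run the property over many seeds; a failure names a reproducible case. */
static void check_many_seeds(void) {
    for (uint32_t seed = 1; seed <= 100; seed++) {
        assert(prop_wear_is_even(seed, 1000, 1));
    }
}
```

A real property-based testing library adds what this loop lacks, most notably shrinking a failing input down to a minimal counterexample.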

Read more on Property-Based Testing – Testing Assumptions You Don’t Know You’re Making…

Callaloo Radio System: Part 2 – Building a Homebrew USB Device

At the end of part 1, the radio link between the receiver and the bathroom doors’ transmitters was working, but how does the receiver get its data where someone else could see it? I could have put a couple red/green LEDs on the receiver board itself, or wired it to some sort of display, but that doesn’t give much room for future expansion. (We may be remodeling the downstairs floor in a couple months, and adding another bathroom is likely. Other sensors could also use the same radio link.)

Read more on Callaloo Radio System: Part 2 – Building a Homebrew USB Device…

Callaloo Radio System: Part 1 – Setting Up a Radio System from Scratch


The bathroom on the main floor of our office is down a short hallway, so we can’t see whether the bathroom is available without looking around the corner. To solve this problem, we made an Arduino-based monitor for a reed switch (a magnetic switch) on the door, setting an LED red or green to indicate whether the bathroom is occupied or available.

Read more on Callaloo Radio System: Part 1 – Setting Up a Radio System from Scratch…

Getting Started with MQTT

As more and more things around us become networked, the communication protocols tying them together need careful rethinking. This network of devices, sometimes called the “Internet of Things” or “Machine-to-Machine” network (though it could also just be called “the Internet”), includes many embedded devices with very limited resources.

Protocols designed for typical Ethernet networks, such as HTTP, are based around assumptions that no longer fit: they expect more bandwidth, processing power, and network reliability than may be available, and they assume that networked devices will be on most of the time.

There is a more appropriate alternative, however: MQTT. It’s much lower in overhead, with only a 2-byte header for many messages. Its design suits devices that are suspended most of the time, with only occasional network activity. It also has support for reliable delivery built into the protocol, so simple sensors can just flag an outgoing message as requiring confirmed delivery and let the message broker take care of delivery reattempts. Using a standard messaging protocol for all communication also greatly reduces the surface area for possible security vulnerabilities.

Read more on Getting Started with MQTT…
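To make the 2-byte header mentioned above concrete, here is a rough sketch of encoding the MQTT fixed header as laid out in the MQTT 3.1.1 specification: one byte holding the packet type and flags, followed by a variable-length "remaining length" that takes a single byte for values up to 127. The function name and signature are hypothetical and do not come from any particular MQTT library.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode an MQTT fixed header into buf.
 * Returns the number of bytes written (2 to 5), or 0 on error. */
static size_t mqtt_fixed_header(uint8_t *buf, size_t buf_size,
                                uint8_t packet_type, uint8_t flags,
                                uint32_t remaining_length) {
    if (packet_type > 15 || flags > 15) { return 0; }
    /* Four length bytes of 7 bits each can carry at most 128^4 - 1. */
    if (remaining_length > 268435455u) { return 0; }
    if (buf_size < 1) { return 0; }

    size_t n = 0;
    /* Byte 1: packet type in the high nibble, flags in the low nibble. */
    buf[n++] = (uint8_t)((packet_type << 4) | flags);

    /* Remaining length: 7 bits per byte, least significant group first;
     * the top bit of each byte means "another length byte follows". */
    do {
        if (n >= buf_size) { return 0; }
        uint8_t byte = (uint8_t)(remaining_length % 128);
        remaining_length /= 128;
        if (remaining_length > 0) { byte |= 0x80; }
        buf[n++] = byte;
    } while (remaining_length > 0);

    return n;
}
```

A PINGREQ (packet type 12, no payload) encodes to the two bytes `0xC0 0x00`, which is the whole packet. Larger payloads grow the length field one byte at a time.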

Lightweight Indexing for Small Strings

Lately, I have been investigating performance improvements for heatshrink, my data compression library for embedded systems. Since it may be processing sensor data in real-time, or compressing data as it transfers over a network, compression can directly impact overall throughput. After experiments with string search and indexing algorithms, I’ve settled on a particularly effective method, which I’m calling a “flattened linked-list index”.

Where’s the Time Going?

Before trying to speed it up, I measured where it was spending most of its time. Profiling revealed two hotspots:

  1. Detecting repeated patterns in recent history.
  2. Managing bit-buffering for its I/O.

Read more on Lightweight Indexing for Small Strings…

Comparing the Cost of Different Multiple-return Techniques in C

C’s design limits functions to directly returning at most one value. Unfortunately, there are many cases where returning more than one makes sense: returning a data buffer and its size, returning either a success code with the requested data or an error code, splitting a tree node into two, etc. This awkwardness around returning multiple values has likely led to many security problems over the years, when people forget to track (or check) the sizes associated with buffers. There still isn’t an obviously correct way to return multiple values, just a couple of methods with different trade-offs.

Three Common Techniques

The most common method involves mutation: the function returns one value, and additional values are written into pointers that were passed in by the caller. (They may or may not be NULL-checked.) While there are conventions for this, such as returning a status code and writing the result(s) into parameters, the language standard has no opinion. Passing around pointers through which values are returned can be a source of subtle errors, and (along with pointer arithmetic) also complicates static analysis.
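A minimal sketch of this first technique, following the status-code convention described above. The function, its enum, and the choice to treat NULL out-pointers as "not wanted" are illustrative assumptions, not code from the article.

```c
#include <stddef.h>

enum find_status { FIND_OK = 0, FIND_NOT_FOUND, FIND_BAD_ARG };

/* Find the largest element of data[0..len).
 * The status is the direct return value; on FIND_OK the index and value
 * are written through the out pointers. Either out pointer may be NULL
 * if the caller does not need that result. On any other status the out
 * parameters are left untouched. */
static enum find_status find_max(const int *data, size_t len,
                                 size_t *out_index, int *out_value) {
    if (data == NULL) { return FIND_BAD_ARG; }
    if (len == 0) { return FIND_NOT_FOUND; }

    size_t best = 0;
    for (size_t i = 1; i < len; i++) {
        if (data[i] > data[best]) { best = i; }
    }

    if (out_index != NULL) { *out_index = best; }
    if (out_value != NULL) { *out_value = data[best]; }
    return FIND_OK;
}
```

The caller has to remember to check the status before trusting the out parameters, which is exactly the kind of unenforced convention the paragraph above warns about.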

Read more on Comparing the Cost of Different Multiple-return Techniques in C…