Article summary
One of the chief concerns of a software development team is managing work. We even have our own jargon—user stories, tasks, chores, bugs, epics, sprints—terms we use to help juggle assignments and stay organized.
But even a smart, hard-working team full of disciplined developers can fall behind, failing to meet deadlines and feeling overwhelmed by everything that needs to be done. To understand why work piles up like this, it helps to look at a different but similar industry: manufacturing.
The Penny Game
The book Velocity describes an enlightening simulation, a model of a simple manufacturing line. The game uses pennies and dice to represent pieces of work flowing through stations in a factory. It may be simple, but the penny game can improve our understanding of how software teams work and of how the interaction of variable processes affects the system as a whole.
In the penny game, pennies come in at one end of the line, are processed by each station, then exit at the other end. This would be rather mundane but for one complication: each station does not always process the same number of pennies.
In the simulation, rolled dice indicate how many pennies each station is allowed to move. These dice represent reality’s irregularity. In a real manufacturing line, the productivity of each station varies over time. (In a software team, productivity varies a lot.) Because each station depends on the station before it, that variability adds up, eventually affecting the end result. Let’s see how.
We’ll start with four stations, each with four pennies. We’ll have five dice: one for each station to tell us how many pennies it may process, and a fifth to tell us how many new pennies to add to the first station.
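If you would rather script the game than roll by hand, here is a minimal sketch of the rules described so far. The function names are made up for illustration. The order of moves within a turn is also an assumption, since the rules above don't spell it out: the last station moves first, so a penny advances at most one station per turn, and new pennies arrive after the first station has moved.

```python
import random


def play_turn(stations, rng):
    """Advance the line one turn in place; return (pennies_in, pennies_out)."""
    rolls = [rng.randint(1, 6) for _ in stations]  # one die per station
    arrivals = rng.randint(1, 6)                   # the fifth die: new pennies
    finished = 0
    # Assumption: the last station moves first, so a penny advances at most
    # one station per turn.
    for i in reversed(range(len(stations))):
        moved = min(rolls[i], stations[i])  # cannot move more than is on hand
        stations[i] -= moved
        if i + 1 < len(stations):
            stations[i + 1] += moved
        else:
            finished += moved
    stations[0] += arrivals
    return arrivals, finished


def play_game(turns, seed=None):
    """Play from the starting position: four stations with four pennies each."""
    rng = random.Random(seed)
    stations = [4, 4, 4, 4]
    total_in = total_out = 0
    for _ in range(turns):
        pennies_in, pennies_out = play_turn(stations, rng)
        total_in += pennies_in
        total_out += pennies_out
    return stations, total_in, total_out


stations, total_in, total_out = play_game(turns=101, seed=1)
print(f"in progress: {sum(stations)}, in: {total_in}, out: {total_out}")
```

Changing the seed gives a different run. The exact totals will vary, and they depend on the move order assumed here.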
Let’s play in an online version of the Penny Game. Here’s how it starts:
Next, we’ll roll the dice and move pennies accordingly.
If we do that a hundred more times, we get something like this:
Do you notice what happened? We started out with four pennies in four stations, or 16 pennies in progress. After running the simulation for 101 steps, we ended up with 68 pennies in progress, more than four times our original amount! How did we get so many? Each station should be moving an average of 3.5 pennies every turn (three and a half pennies in, three and a half pennies out) for a net change of zero.
But that’s not what happened. Instead, more pennies went in than came out. In this case, 364 pennies went in, an average of 3.6 pennies per turn, pretty close to the 3.5 we expected. What came out, however, was less expected. Only 312 pennies were finished in 101 turns, or less than 3.1 pennies per turn. The dice at the end aren’t weighted, so something in our simulated manufacturing line is holding pennies back.
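The bookkeeping behind those figures is easy to check. This short sketch uses only the numbers reported for this run; the variable names are just for illustration.

```python
stations, pennies_each = 4, 4
turns = 101
went_in, came_out = 364, 312  # totals reported for this run

start = stations * pennies_each            # pennies in progress at the start
in_progress = start + went_in - came_out   # every penny is either in the line or finished
avg_in = went_in / turns
avg_out = came_out / turns
growth_per_turn = (in_progress - start) / turns

print(start)                       # 16
print(in_progress)                 # 68
print(round(avg_in, 2))            # 3.6
print(round(avg_out, 2))           # 3.09
print(round(growth_per_turn, 2))   # 0.51
```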
In fact, if we look at the number of pennies in progress turn-by-turn, there’s a lot more increasing than there is decreasing.
Watch the Backlog Grow
This isn’t just a fluke of how we rolled the dice. Most runs of the penny game turn out this way. The number of pennies in progress slowly increases, adding on average almost one extra penny every two turns.
This trend toward more and more pennies in progress is a problem. It’s not sustainable. On a manufacturing line, those pennies represent piles of partially finished goods. From this graph, we see that those piles would be growing, and carrying inventory isn’t free.
The same is true in a creative or software development setting. Work in progress costs something. It costs something to organize work, to switch between different tasks, to manage expectations. You can only handle so much before the work itself begins to suffer.
The problem of growing inventory translates from the penny game to real life. Eli Goldratt observed it in actual manufacturing lines more than 30 years ago, and he made it a central plot point in his business novel The Goal. It’s inherent in any system whose components have fluctuating productivity and depend on one another, including creative processes.
So what’s going on? Why does it happen?
Capacity vs. Output
If you play the game yourself and watch closely, you’ll see that there’s a difference between capacity and output. A station might roll a four but only have three pennies to move. Its capacity is four pennies, but its actual output is only three. Based on fair dice, we guessed each station could move three and a half pennies per turn, but that’s only if it has enough pennies to move. If a station doesn’t have enough work lined up, it’ll be under-utilized.
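To put a number on that gap, consider a deliberately simplified case that is not part of the game as played above: a station with no backlog at all, whose only available pennies are whatever a single fair die roll delivered from upstream. Its output each turn is then the smaller of two dice, and we can enumerate all 36 combinations exactly.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Capacity: the average of a single fair die.
capacity = Fraction(sum(faces), 6)

# Output in the simplified case: the station cannot move more than arrived,
# so it moves the smaller of (pennies supplied, its own roll).
outcomes = [min(supply, roll) for supply, roll in product(faces, repeat=2)]
expected_output = Fraction(sum(outcomes), len(outcomes))

print(capacity)                # 7/2
print(expected_output)         # 91/36
print(float(expected_output))  # about 2.53
```

In that extreme case, a station with a capacity of 3.5 averages only about 2.5 pennies per turn. A backlog in front of the station narrows the gap, which is why the stations in the game do better than this.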
In fact, in our 101 turns above, we only moved 92% of the pennies we could have. Some of the time, stations didn’t have enough work coming in to produce at 100% capacity. The first station was the busiest, having enough pennies to move in 19 out of every 20 turns, but the stations down the line had three times more free time. When the first station didn’t process enough pennies, it affected the rest of the system.
Something like this happened to my own team just a few weeks ago. Our project’s process has several dependent steps, and each is highly variable. Stakeholders describe the features they want, our designer creates mockups, stakeholders decide on feature priorities, code is written, designs are reviewed, and updates are integrated and deployed. It’s a typical agile process. But while planning for a recent sprint, we discovered we didn’t have enough work to keep everyone busy. We were under-utilized, not because the project was almost done, but because work was held up somewhere before our role in the process.
And, as always, more work was on its way, which would soon increase our team’s work in progress. Having more in-progress work adds overhead. Those tasks have to be juggled, creating delays. Worst of all, it’s easier to make mistakes while juggling, which means wasted work and lower reliability.
The client will eventually notice all that work piling up, especially on a long project. The designer might be begging the client to make decisions so he or she can create mockups, but the client is wondering why the features they prioritized six weeks ago aren’t live yet. New features and bug fixes are tied up somewhere in the system, slowing turnaround.
Let’s make one more tweak to the penny game. Take a marker and draw a dot on the last penny of the first station. Count how many turns it takes for that penny to make it out the end of the line. Then draw a new dot on the next penny going into the first station. Count how many turns that one takes to finish.
What you’ll see is that the longer you play, the more turns it takes a penny going into the system to come out the other end. The total turnaround time increases as the work in progress increases, another result we see in real-life software development.
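The marker experiment can also be sketched in code by stamping every penny with the turn it entered the line. This sketch makes assumptions the description above doesn't state: each station works first in, first out; the last station moves first in a turn; and the 16 starting pennies count as entering at turn 0. The function names are hypothetical.

```python
import random
from collections import deque


def turnaround_times(turns, seed=None):
    """Return (entry_turn, turns_in_line) for every penny that finishes."""
    rng = random.Random(seed)
    # Each penny is stored as the turn it entered; the starting 16 enter at turn 0.
    stations = [deque([0] * 4) for _ in range(4)]
    finished = []
    for turn in range(1, turns + 1):
        rolls = [rng.randint(1, 6) for _ in stations]
        arrivals = rng.randint(1, 6)
        # Assumption: last station moves first; each station is first in, first out.
        for i in reversed(range(len(stations))):
            for _ in range(min(rolls[i], len(stations[i]))):
                entered = stations[i].popleft()
                if i + 1 < len(stations):
                    stations[i + 1].append(entered)
                else:
                    finished.append((entered, turn - entered))
        stations[0].extend([turn] * arrivals)
    return finished


def average(values):
    return sum(values) / len(values)


finished = turnaround_times(turns=500, seed=1)
first = average([t for _, t in finished[:100]])
last = average([t for _, t in finished[-100:]])
print(f"first 100 pennies out: {first:.1f} turns; last 100 out: {last:.1f} turns")
```

Comparing the first hundred pennies to finish with the last hundred is a rough way to see whether turnaround is lengthening in a given run.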
Changing the Rules
The penny game is a simple yet meaningful simulation of processes that both vary and depend on each other. Now that we’ve seen it reproduce two real-world problems, we can experiment with the rules to get an idea of how to fix them. What if some stations are more productive than others? What if you start out with a different number of pennies in each station? What if you limit how much work is allowed into the system? Each change to the rules influences the behavior of the system, sometimes improving results, sometimes making them worse.
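As one example, here is a sketch of the last question. The limit rule is hypothetical, one possible reading among many: new pennies are admitted only while the total in progress stays at or below a cap. The move order (last station first) and the names `play_with_limit` and `wip_limit` are likewise assumptions.

```python
import random


def play_with_limit(turns, wip_limit=None, seed=None):
    """Play the game; return (peak pennies in progress, pennies finished)."""
    rng = random.Random(seed)
    stations = [4, 4, 4, 4]
    peak = sum(stations)
    finished = 0
    for _ in range(turns):
        rolls = [rng.randint(1, 6) for _ in stations]
        arrivals = rng.randint(1, 6)
        # Assumption: the last station moves first.
        for i in reversed(range(len(stations))):
            moved = min(rolls[i], stations[i])
            stations[i] -= moved
            if i + 1 < len(stations):
                stations[i + 1] += moved
            else:
                finished += moved
        if wip_limit is not None:
            # Hypothetical rule: admit only the pennies that fit under the limit.
            arrivals = min(arrivals, max(0, wip_limit - sum(stations)))
        stations[0] += arrivals
        peak = max(peak, sum(stations))
    return peak, finished


for limit in (None, 30, 20):
    peak, finished = play_with_limit(turns=500, wip_limit=limit, seed=1)
    print(f"limit={limit}: peak in progress {peak}, finished {finished}")
```

Running it with a few different limits shows the trade-off between pennies in progress and pennies finished. The outcome depends on the limit and on the rule chosen here.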
I encourage you to play with the penny game. Here’s an online version you can explore. You might discover something unexpected.
What do I need to do to roll the dice at http://exupero.org/pennygame/?
On the left side of the page below “0 steps”, you’ll see a pale blue line. Hover over that line, and the page controls will slide out.