The 10-Box Exercise: Where Estimate Rubber Hits the Calendar Road

For a few years now, many Atomic projects have been using an exercise we call “10-box” to help with our sprint planning meetings. The 10-box exercise helps the team visualize and think through the tactical execution of the sprint. It does this by forcing them to reconcile abstract estimates in points with actual calendars and availability. This simple but effective activity helps the team avoid easy misses like vacation and think through/commit to execution time boxes. The exercise also helps optimize for alignment and synergy between stories that might yield dividends if tackled holistically by the same pair.

Our sprint planning meetings are typical scrum. The team identifies stories in the backlog to load into the sprint for development during that two-week period. Our stories are usually all estimated ahead of time in story points on a Fibonacci scale. All very normal. But one thing that can get lost in the stories and points and work ahead is the actual calendar. Who has days off? Where do all-hands meetings cut back on development time? Do stories have dependencies that won’t be met until later in the sprint?

The 10-box exercise helps us navigate these questions. In a 10-box exercise, we draw a simple 5×2 grid representing the weekdays in a two-week sprint, one for each pair of developers. We block off days taken up by meetings, vacations, or holidays, and fill the rest with stories. The result looks something like this:

This pair is planning on three stories this sprint, expecting two of them to take about three days and the other to take two. Straightforward, right? It looks simple, but the process of getting there and some of the rules around how we use these are crucially important.

The First Rule of the 10-Box Exercise

The projection of stories onto days trips some people up. It’s a given among agile teams that Thou Shalt Not Estimate in Days. Estimating in real time causes all sorts of problems. For example, systematic error tends to mean that what you think will take “a day” may average out in the long run to take 1.273 days. You can’t know this factor ahead of time. It also changes as the team grows and shrinks, as new tailwinds ease future work, or as growing technical debt fights it. Estimating in days also communicates a false sense of precision: error bounds make the difference between a seven-day and an eight-day estimate, for example, not very useful. Estimating in days/time for longer-term planning and communicating expectations outside of the team is Bad.

Hence, the first rule of 10-box is you do not share 10-box outside of the core development team. It’s an internal scrum tool, not an external communication tool. Any external stakeholders not intimately involved with the team don’t need to know it’s happening. This is important not just for avoiding misunderstandings due to simplistic ideas about software estimation and the agile process. It’s also for creating a psychological “safe space” for the team. The 10-box exercise is a tool that aids in the planning process and helps us check our execution at the end of the sprint against our plans and projections, nothing more.

The Second Rule of 10-Box

One other key to keeping the 10-box exercise simple and manageable is not to sweat the actual calendar days or story order. Sure, 10-box models the sprint in terms of days, but we’re not creating a schedule here. We’re projecting work into workdays to make sure we aren’t overcommitting. This also prompts team members to think through the risk management and strategy they’ll use to approach this sprint’s work. Keep it simple:

• Only use half- and whole-day blocks.
• Coalesce little things into a half-day block to account for the distractions if needed.
• Ignore order: fit stories into the grid in the way that’s convenient to minimize fiddling. The focus is on the number of days, not which days or which order.

10-box is about the amount of time available for productive development work, not a schedule of when to do it within the sprint.

How to Implement the 10-Box Exercise

1. Start with one empty 10-box per track.

We begin sprint planning with one empty 10-box per track of work. What “track” means in practice varies from team to team. On some teams, a pair of developers will plan to collaborate throughout the entire sprint. On other teams, it might be an individual developer who will provide continuity through a track of related work, but whose pair changes from day to day as we rotate developers across the tracks to broaden exposure and system knowledge. Sometimes it’s an individual developer working alone, due to an odd number of team members or a more fragmented schedule (e.g. a tech lead). Whatever the team’s strategy, a single 10-box represents a single thread of continuity within the sprint.

2. Block out non-productive days.

An empty 10-box is usually prefilled with some unavailable days that emerge from the sprint cadence within the team or regular external obligations. You may find that sprint meetings, planning, go-lives, teach & learn activities, etc. will eat up somewhere around one to two days per sprint. These are often spread out throughout the sprint, but the exact schedule doesn’t matter. For 10-box, we’re focused on half-day granularity at most, so we’ll collapse all of these things into one or two boxes:

In addition, we ask the team to block out any vacation or other external time that will diminish their ability to deliver points. These get blocked off as well, like the sprint meetings. Vacation can add a bit of a wrinkle for the 10-box of a pair. If one developer in a pair is taking off a full day, we’d usually block off half a day to account for the reduced throughput.

Blocked off time is unproductive time from a velocity perspective. This helps force an honest conversation about team commitments and time prioritization in a way that doesn’t punish developers for factors outside of their control.
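The blocking step can be sketched in a few lines of Python. This is only an illustration, not a tool we use: the `TenBox` class, the `pair_days_lost` helper, and the track name are hypothetical. It assumes only the rules described above, namely ten weekdays per sprint, half- or whole-day blocks, and roughly half a day blocked when one developer in a pair is out for a full day.

```python
from dataclasses import dataclass, field

DAYS_PER_SPRINT = 10  # 5 weekdays x 2 weeks


@dataclass
class TenBox:
    """One track's 10-box. Only totals are tracked; order is ignored."""

    track: str
    blocked: dict = field(default_factory=dict)  # reason -> blocked days

    def block(self, reason: str, days: float) -> None:
        """Block time in half- or whole-day units only."""
        if days <= 0 or (days * 2) != int(days * 2):
            raise ValueError("block time in half- or whole-day units")
        if self.blocked_days() + days > DAYS_PER_SPRINT:
            raise ValueError("cannot block more than the whole sprint")
        self.blocked[reason] = self.blocked.get(reason, 0.0) + days

    def blocked_days(self) -> float:
        return sum(self.blocked.values())

    def available_days(self) -> float:
        return DAYS_PER_SPRINT - self.blocked_days()


def pair_days_lost(full_days_off_for_one_developer: int) -> float:
    """Rule of thumb from the text: one developer out for a full day
    costs the pair about half a day of throughput."""
    return full_days_off_for_one_developer * 0.5


box = TenBox(track="pair A")  # hypothetical track name
box.block("sprint meetings, planning, teach & learn", 1.5)
box.block("one developer on vacation for 1 day", pair_days_lost(1))
print(box.available_days())  # 8.0
```

The sketch deliberately stores totals rather than calendar positions, since the exercise cares about how many days are available, not which ones.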

3. Associate a story with a track.

As you prioritize stories into the sprint, they are assigned to individual pairs/tracks. Do this with an eye toward where there may be dependencies, synergies, or contingent next steps between stories. Also look for where there are opportunities to broaden knowledge within the team. That could look like proactively assigning good first stories to a pair in which one or both of the developers is new to that part of the application or tech stack.

Usually there’s a senior member of the team — normally the tech lead — who has broad and deep familiarity with the system as well as the capabilities and exposure of the team members. This person will take point on suggesting track assignments, but other members of the team are welcome to contribute ideas around productivity optimization, team learning, or infrastructure building.

4. The pair picks the days.

The pair gets to decide how many days to plan for based on their comfort level with that area of the codebase, the technologies used, and the techniques required. They should strive for allocating an amount of time appropriate for the effort and the value delivered. However, there should also be an emphasis on modeling a plan they think they can deliver completely within the time block. They should treat the number of days they have allocated to the story as a time box. And, they should do their best to execute within that time commitment and deliver a completed feature, including writing any automated tests and/or deployment that should happen as part of completing the feature.

“Time-box” in this sense is meant loosely. Our emphasis is on the amount of time needed to complete a set of stories, not how much time is spent per story. To that end, one story may often end up a little padded, since there’s uncertainty as to where that padding will be needed. As the pair executes on related stories, they may need to do refactoring or infrastructure work. The pair has the latitude to decide the time and context for doing that work. Maybe the “bigger” story gets marked as done early and the “smaller” story gets the extra time so it can be completed at the point of minimum ignorance.

While the pair picks the time allocation, the rest of the team should have input. Are there hidden complexities that make the allocation seem too optimistic? Point them out. Does the estimate seem sand-bagged too aggressively? Ask why, since it may be due to a misunderstanding. Or it may be accounting for time learning a new part of the system, which is a good reason to pad the estimate. This team dialogue helps the developers get input from the team that can help them approach a feature’s development in the smartest way possible.

Usually, some amount of this discussion has happened in advance during backlog refinement. However, last-minute reevaluation is still valuable. New infrastructure may provide tailwinds to the story that didn’t exist when the team originally assigned points to the story. Or, the team may need to account for collaboration overhead due to entanglements with other stories or missing context that needs to be sorted during the execution of the story. The 10-box exercise is a last opportunity to surface and account for issues like these before execution begins. Often valuable insights will emerge that we’ll note in the story description or the margins of the 10-box.

An important caveat about these estimates is they should only cover work within the developers’ control. Some parts of a process can be high or variable in latency, such as needing external stakeholders or QA teams to perform testing or approve PRs. Leave these sorts of activities out of the 10-box exercise.

5. Repeat #3 and #4 until all 10-boxes are full.

Bring up priority work and assign it to pairs, and then load it into the 10-boxes until they’re all full.

When working together in the same physical space around a whiteboard, you can dole out stories to individual pairs, who do their discussion and estimation in parallel while interested parties listen in on or participate in individual story discussions. This helps keep things moving smoothly. Remote 10-box estimation is more difficult in large groups, since these parallel conversations are harder to hold while videoconferencing.

As the 10-boxes fill up, you’ll often end up with awkward gaps where no stories fit. You can allocate this extra time to another story perceived as risky, or use it for addressing technical debt or quality-of-life improvements. Or, you can simply leave it open to see how much time is left at the end of the sprint and defer the decision on how to spend it.

6. Reconcile estimate and/or velocity mismatches.

Often at the end of a 10-box exercise, you’ll see some stories have a significant difference between the assigned time in the 10-box and the expected duration based on the points estimate and team velocity. It’s worth asking “why” in these circumstances. It could be one of several factors, such as:

1. A change in complexity due to new infrastructure since the story was originally estimated.
2. Team members accounting for ramp-in and/or learning time. Perhaps this is a junior team member who’s still learning the ropes or an experienced team member branching into a part of the system they’re unfamiliar with.
3. Additional time allocated for aiming to solve a class of upcoming problems, hopefully reducing time/effort needed on subsequent stories later this sprint or the next. This bends the cost curve, making #1 more likely for those subsequent stories.
4. Accounting for time to address technical debt in a problematic area of the codebase.
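As a rough illustration of how such mismatches could be spotted, the Python sketch below converts points into expected days using the team's velocity and flags stories whose boxed time differs by a day or more. The function names, story names, one-day tolerance, and the conversion itself (velocity divided by the productive track-days in a sprint) are assumptions made for this example, not a formula from our process.

```python
def expected_days(points, velocity, productive_track_days):
    """Days a story would take if points converted to time at the
    team's average rate (an assumed, deliberately crude conversion)."""
    points_per_track_day = velocity / productive_track_days
    return points / points_per_track_day


def find_mismatches(stories, velocity, productive_track_days, tolerance_days=1.0):
    """Return (name, expected, boxed, delta) for stories worth asking 'why' about.

    stories: iterable of (name, points, boxed_days) tuples.
    """
    flagged = []
    for name, points, boxed_days in stories:
        expected = expected_days(points, velocity, productive_track_days)
        delta = boxed_days - expected
        if abs(delta) >= tolerance_days:
            flagged.append((name, round(expected, 1), boxed_days, round(delta, 1)))
    return flagged


# Hypothetical sprint: velocity of 30 points, three tracks with 8 productive days each.
stories = [
    ("Export report", 5, 4.0),
    ("Payment webhook", 2, 4.0),
    ("Search filters", 8, 3.0),
]
for name, expected, boxed, delta in find_mismatches(stories, velocity=30, productive_track_days=24):
    print(f"{name}: expected ~{expected} days, boxed {boxed} days ({delta:+} days)")
```

A flagged story is only a prompt for conversation. The reasons listed above are all legitimate explanations for a gap.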

What to do about these mismatches is up to the team, particularly team leads. For example, in #1 cases, the difference in complexity may point to old estimates that might be considered invalid in light of new knowledge. If the difference is significant, the team leads may choose to treat the 10-box exercise as a re-estimation of that effort and adjust story points before the sprint starts to avoid distorting velocity with out-of-date estimates.

In a #3 scenario, it may be worth reallocating points from a set of stories to account for infrastructure building, so the backlog better models the anticipated time spent. Doing so helps force the question of whether the infrastructure building is worth the investment. Ideally, you’d see a net reduction in total points across that class of problem, or at least gain confidence that it will streamline build-out and increase flexibility when adapting to future changes of plans.

Another option is to look for ways to reduce the scope from lower-impact stories to capitalize on strategic investment or learning. Think of this as fine-tuning where the team’s attention is paid to help align value delivery with time spent.

The team doesn’t necessarily need to take action, but it’s at least worth the team leads mentally noting these situations and potentially having a discussion about trade-offs and approach before committing.

As a final step, the scrum master or delivery lead should review the total points loaded into the 10-boxes/sprint against the team’s velocity. These things should be pretty closely aligned, aside from individual stories already accounted for. But a remaining mismatch between points loaded into the sprint and typical team velocity is a useful additional data point for communicating expectations externally or fine-tuning process.

Is the sprint load lower than is typical? What is the impact of vacations, meetings, and other distractions? Or is the team accounting for ramp-in time to new parts of the domain, new technologies, etc.? Or is the team padding for uncertainty? In that case, it can be helpful to explicitly plan to de-risk that uncertainty as early as possible within the sprint.

Is the sprint higher than usual? If this isn’t explained by known factors that led to larger stories being packed into fewer days (and not re-estimated), it may be a sign the team is being too optimistic. What risks may have been missed? It may be worth lightening the load by moving a story back into the backlog to account for surprises or distractions. If surprises happen frequently, consider accounting for typical amounts of distraction in step 2.
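The final load review can be sketched the same way. The Python example below scales typical velocity by the share of productive days actually available this sprint and compares the loaded points against that figure. The scaling rule, the 15% tolerance, and the function name are assumptions chosen for illustration. A real team would rely on judgment rather than a threshold.

```python
def check_sprint_load(loaded_points, typical_velocity,
                      available_days, typical_available_days,
                      tolerance=0.15):
    """Compare loaded points with velocity adjusted for this sprint's capacity.

    Assumes velocity scales linearly with productive days, which is a
    simplification made only for this sketch.
    """
    expected = typical_velocity * available_days / typical_available_days
    ratio = loaded_points / expected
    if ratio > 1 + tolerance:
        verdict = "higher than usual"
    elif ratio < 1 - tolerance:
        verdict = "lower than usual"
    else:
        verdict = "in line"
    return expected, verdict


# Hypothetical numbers: 34 points loaded, velocity of 30 over a typical
# 24 productive track-days, but only 21 productive track-days this sprint.
expected, verdict = check_sprint_load(34, 30, 21, 24)
print(f"capacity-adjusted expectation: {expected} points; load is {verdict}")
```

Either verdict is a cue to ask the questions above, not a reason to change the plan automatically.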

7. Get to work!

At this point, the team’s ready to go. They know what they’re working on, who they’re working with, and have thought through their approach to the sprint at a high level. It’s time to get to work and execute!

Throughout the sprint, teams try to hold one another accountable for hitting their estimates, or at least invalidating them as early as possible. The 10-box assignment is a sort of internal team handshake agreement for how we’re investing our time and therefore our client’s money. So, we strive to keep that investment aligned with the agreed-upon value. That said, the team executes as well and as efficiently as possible, doing their work as usual.

Accounting for Surprises

Things don’t always go according to plan, and ultimately this exercise is about supporting the team in thinking through their execution and learning how to do that better. It’s about internal accountability, but missing the 10-box shouldn’t be considered a job performance issue in and of itself. Things happen — external interruptions, unanticipated lurking complexity, or simply a little bit too much optimism (or pessimism). The team needs to be able to respond and adapt, and a 10-box plan shouldn’t hinder that. This is part of the reason for the first rule of 10-box (keep it team-internal).

Usually once the 10-box is created, we don’t worry about updating it in response to changing circumstances unless doing so would be useful. However, deviating from the plan warrants communication. Treat the 10-box as a handshake agreement within the team, and talk about it if the developers need to renege. I encourage pairs to de-risk stories early and surface issues as soon as possible to maximize damage control. There are many options:

• Spend the additional time it takes to finish the story, forcing other stories out.
• Adapt the design/functionality to obviate significant complexity.
• Break a complex component into a separate story to address later.
• Abort the story and reschedule it for a future sprint when time is available or once a decision on the options has been made.

These calls have a strategic impact on the project. That’s because they are ultimately about how we’re spending the client’s money. If the pair working on the story doesn’t have enough insight into the goals and desired trade-offs, they should involve the tech lead or delivery lead to help inform the decision.

Postscript – In Retro: How’d we do?

After the sprint is over, in retro, it’s worth revisiting the 10-box and asking, “How did we do?” Which stories went over? Which under? Did we have to abort a 10-box plan due to last-minute challenges? Is there anything to learn that will impact how we define or estimate future stories or risks we should consider in future 10-box exercises? This is helped by Atomic’s practice of punching our time, meaning tracking time spent at 15-minute granularity. This makes it easy to run a report for the sprint and see how many hours were spent where.
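A retro comparison of this kind might look like the Python sketch below. It sums tracked hours per story, converts them to the nearest half day, and sets them beside the days planned in the 10-box. The data shapes, the story names, and the eight-hour track-day are assumptions for illustration. For a pair, you would need to decide whether to count one developer's hours or halve the combined total.

```python
from collections import defaultdict


def retro_report(planned_days, time_entries, hours_per_track_day=8.0):
    """Compare 10-box plans with tracked time.

    planned_days: {story: days planned in the 10-box}
    time_entries: iterable of (story, hours) rows from a time report
    Returns {story: (planned, actual, delta)} with actuals in half-day units.
    """
    hours = defaultdict(float)
    for story, spent in time_entries:
        hours[story] += spent

    report = {}
    for story, planned in planned_days.items():
        # Round to the nearest half day to match 10-box granularity.
        actual = round(hours[story] / hours_per_track_day * 2) / 2
        report[story] = (planned, actual, actual - planned)
    return report


# Hypothetical plan and 15-minute-granularity time entries.
planned = {"Export report": 3.0, "Payment webhook": 2.0}
entries = [
    ("Export report", 6.5), ("Export report", 7.75), ("Export report", 8.0),
    ("Payment webhook", 8.0), ("Payment webhook", 7.5),
    ("Payment webhook", 8.25), ("Payment webhook", 6.0),
]
for story, (plan, actual, delta) in retro_report(planned, entries).items():
    print(f"{story}: planned {plan}, actual {actual} ({delta:+} days)")
```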

That said, not every missed estimate is a teachable moment. Most are not, in fact. Monitoring and reflecting on the misses can be useful, but no one appreciates being micromanaged and second guessed. When in doubt, be skeptical of 10-box uses that stray too far from “useful planning tool” and into job performance territory. Doing so encourages teams to sandbag their estimates to protect themselves in a low-trust environment, defeating the purpose of the exercise and robbing it of its value.

Conversation
• Sepideh says:

That’s very interesting! Would you please tell me which tools you use for this practice on your projects? And what do you recommend if we want to build and use this practice for our project?