Article summary
Update: This post was written prior to the GSD repository being abandoned by its maintainer. The future of GSD is uncertain, but the lessons from this can be applied to other spec-driven development frameworks.
My Intro To GSD
I first used the Get Shit Done spec-driven development system for Claude on a client project. This was my team’s first experiment with spec-driven development. My team leads had done the discuss and plan steps, and my job was to run gsd-execute-phase, review the output, and ship the work. From that seat, GSD looks like magic. Phases that would have taken me multiple days to code were down to only one.
So I tested it on a side project where I’d own the whole loop. Lunchboxd is a Letterboxd-inspired food review app where users review individual meals instead of restaurants. Web on Next.js, Postgres on Neon. Starting from scratch — just me, GSD, and an empty repo.
The Speed
GSD’s execution loop is genuinely impressive once you’ve experienced it. You shape the implementation in gsd-discuss-phase, the system researches and writes plans, then gsd-execute-phase runs those plans with a pause to check in and validate.
This wasn’t my first time using Claude to build a project from scratch. In my previous Lunchboxd attempt, I wrote one multi-paragraph prompt and sent my AI agent to the races. A web app that looked like it took weeks was up within hours. But when I tested it, bugs upon bugs made it unusable. From the start, GSD showed a stronger foundation that cut those usability issues down significantly.
The Pattern
Then I noticed where the wheels came off. The first plan in a phase — the one running on the freshest context with the cleanest spec — was almost perfect. By Plan 03 or Plan 04, quality dropped visibly. Subtle bugs, misread conventions, decisions that contradicted earlier ones in the same phase. The further I got into a workstream, the more mistakes appeared.
Early on, my CLAUDE.md noted that the “follows” table needed two indices: one on follower_id and one on following_id, because the feed query joins through both. Plan 01 of the social phase set that up correctly. By Plan 04, an executor writing a related migration quietly redefined the table without one of the indices. The verifier didn’t catch it — the plan itself never mentioned indices. Those upstream decisions had evaporated.
The client project team noticed the same drift. Working in parallel streams, deviations from the plan compounded fast. We reconciled it through communication and reworking stories to realign everyone.
The Time Investment
My favorite part of GSD is gsd-discuss-phase. You shape the implementation by answering questions about layout, error handling, edge cases, and all the gray areas the system surfaces before the planner touches anything. That conversation can run for 15 minutes or more for a single phase.
gsd-plan-phase adds research and plan verification on top. Before any of that, you write a CLAUDE.md. For Lunchboxd, mine covered the tech-stack rationale, a “what NOT to use” table drawn from the failed vibe-coded attempt, and confidence levels on each technology choice. By the time gsd-execute-phase ran, I’d already spent more time talking to Claude than Claude was about to spend writing code.
At first, that felt like a problem. But the more phases I shipped, the more I realized that time was an asset. The edge cases surfaced in discuss were ones I hadn’t considered. Building with them in mind from the start prevented the vibe-coded messiness I’d seen before — rather than patching problems later.
The Result
Owning the whole loop changed how I think about GSD. The version I saw on the client project — just running gsd-execute-phase and watching things work — was a partial view. The real GSD is the discuss conversation that took me cumulative hours.
On small side projects, small phases are the way to go. I do the discuss step thoroughly. I keep phases small enough that Plan 01 does most of the meaningful work and later plans fill in details I’ve already verified. When I run gsd-verify-work, I do it honestly. When I catch drift in Plan 03 or Plan 04, I stop and re-plan rather than letting the system burn context patching mistakes on top of mistakes.
GSD made me faster — but not how I expected. It didn’t shortcut the thinking. It moved the thinking earlier and made it more important than ever. The hours I used to spend implementing are now hours I spend in gsd-discuss-phase. The quality of that conversation determines whether the execution goes well.
I was blown away by how much faster things moved on the client project with a large existing codebase for context. Lunchboxd showed me how capable spec-driven development is at starting from scratch.