Your Coding Agents Are Only as Good as Their Feedback Loop

My current software project has had a problem that has only grown as coding agents have gotten better: running parallel agents in our codebase while giving them strong ways to validate their work. The codebase was started in 2021. That’s before agentic coding was a thing, let alone something to design a codebase around.

At least in my flow, there was one checkout of the repo, running one test suite at a time. Now we hand larger and more complex pieces of work to autonomous agents, and, increasingly, we run several of them at once. A codebase that assumes a single operator will limit your agent’s ability to perform larger tasks, as it will struggle to validate its outputs. The size of the task you can hand to an agent is bound by the quality of its feedback loop and its ability to validate its work.

Your agent should be able to do everything you can, mostly

An agent that can write code but can’t run the tests or otherwise validate that the task is complete isn’t very helpful. An agent that can build, lint, run tests, and generate a migration on its own can iterate. It can self-correct without needing you to monitor it while it works toward its goal. This validation loop is why autonomous agents can increasingly take on longer, more complex tasks.

So the goal I had for this work was simple: my coding agent should be able to do everything I would do by hand. Run the unit tests, run the integration tests, auto-lint and format, and generate migrations through our ORM. As the models and coding harnesses have become better, they’ve started reaching for these validation tools on their own, without me asking. At least in the codebase I was working on, these things were not possible. Because of this, the agent would do one of two things. It would burn tokens grinding away, trying everything it could think of to finish a task it had no way of finishing, until I noticed and told it to stop, or it would eventually work out that it was stuck and tell me it couldn’t run the tests, which I already knew it couldn’t.

We all know coding agents can go awry sometimes. So, no, my agent can’t actually do everything I need to do for my job. It can’t access my local database, let alone the production one. If it needs information from the database, I’ll manually provide that context. But as much as is reasonable, allowing the agent to validate its own work drastically improves outcomes. So, in this example, it’s mostly about local development, but when debugging an issue in a deployed environment, my agent can validate the error by reviewing logs or looking at our error-monitoring stack.

A quick map of the repo

Our project lives in two repositories. The first is a monolithic backend that serves as a platform for three separate products with overlapping usage. The second is a monorepo that contains three web frontends and a mobile app, all of which communicate with our backend. Many people work on our project, and they have their own preferences for their coding harnesses and models. The three I had to account for were Claude Code, Codex, and Cursor. Whatever I built had to work across all of them.

Worktrees isolate git

All three harnesses have landed on the same primitive for running agents in parallel: git worktrees. Each agent gets its own checkout of the same repository, so two agents can sit on two branches at once without affecting each other’s diffs. That solves the most obvious conflict, but it leaves several gaps:

  • gitignored files like .env, which don’t exist in a fresh worktree,
  • build artifacts, which haven’t been generated yet,
  • and machine-global resources, host port bindings being the big one.

Those gaps fall into two buckets, which map cleanly to the two halves of the work I ended up doing.

The first bucket is shared resources: when two worktrees try to use the same host port or the same database, you can’t run them simultaneously. The second bucket is missing artifacts: a fresh worktree can’t compile the app because some build output was never generated, so even a single agent can’t get started without me holding its hand and telling it to generate a build artifact first. Most of what follows is closing one of those two gaps. But all of it relies on the repo knowing it’s in a worktree in the first place.

Teach the repo to recognize a worktree

The detail that makes worktree detection possible is that a linked worktree’s git directory and the repository’s common git directory are different paths. In a normal checkout, they’re the same. Using this, we can write a small function to determine whether we are in a git worktree:

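_is_git_worktree() {
  local git_dir common_dir
  # Resolved path of this checkout's git directory; empty outside a repository.
  git_dir="$(git rev-parse --absolute-git-dir 2>/dev/null || true)"
  # Resolved path of the shared git directory; pwd -P resolves symlinks so both paths compare cleanly.
  common_dir="$(cd "$(git rev-parse --git-common-dir 2>/dev/null || echo .)" 2>/dev/null && pwd -P || true)"
  # Equal in the main checkout, different in a linked worktree. Outside a repository this returns false.
  [ -n "$git_dir" ] && [ -n "$common_dir" ] && [ "$git_dir" != "$common_dir" ]
}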
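As a usage sketch, a setup script could pair the detection with a lookup of the main checkout, which git lists first in its worktree listing. The function names below are hypothetical and this is one possible arrangement, not the project's actual script.

```shell
#!/usr/bin/env bash
# Sketch: report whether the current directory is a linked worktree
# and, if so, which checkout it belongs to. All names are illustrative.

is_git_worktree() {
  local git_dir common_dir
  git_dir="$(git rev-parse --absolute-git-dir 2>/dev/null)" || return 1
  common_dir="$(cd "$(git rev-parse --git-common-dir)" 2>/dev/null && pwd -P)" || return 1
  [ "$git_dir" != "$common_dir" ]
}

# The first record of `git worktree list --porcelain` is the main checkout.
main_checkout_path() {
  git worktree list --porcelain | sed -n '1s/^worktree //p'
}

describe_checkout() {
  if is_git_worktree; then
    echo "worktree of $(main_checkout_path)"
  else
    echo "main checkout or not a git repository"
  fi
}
```

Calling describe_checkout from a worktree prints the path of the main checkout, which is the directory the later provisioning steps copy or link from.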

Backend: getting the env files into a worktree

Our backend reads many environment variables from gitignored env files, and a fresh worktree has none of them. Each harness has its own way to copy gitignored files when it creates a worktree.

Claude Code Desktop reads a .worktreeinclude file that lists what to copy:

.env.docker
# other files to copy on worktree creation

Cursor uses a worktree.json that runs setup commands:

{
  "setup-worktree": [
    "cp $ROOT_WORKTREE_PATH/.env.docker .env.docker 2>/dev/null || true"
  ]
}

Codex, at the time, had no native worktree provisioning. This pushed me toward a more general solution that doesn’t care which harness you use. The entry point for running commands in this project is a Makefile, so I made it handle setting up the environment files. It reuses the worktree detection from above and symlinks the env files from the main checkout, so the agent has access to those values as soon as the first make command runs in the worktree.

# 1 in a linked git worktree, 0 in the main checkout or outside a repository.
_is_git_worktree := $(shell \
  git_dir="$$(git rev-parse --absolute-git-dir 2>/dev/null || true)"; \
  common_dir="$$(cd "$$(git rev-parse --git-common-dir 2>/dev/null || echo .)" 2>/dev/null && pwd -P || true)"; \
  [ -n "$$git_dir" ] && [ -n "$$common_dir" ] && [ "$$git_dir" != "$$common_dir" ] && echo 1 || echo 0)

ifeq ($(_is_git_worktree),1)
  # Auto-symlink env files from the main repo if missing.
  # The first entry of `git worktree list` is the main checkout.
  _main_repo := $(shell git worktree list --porcelain | head -1 | sed 's/^worktree //')
  $(foreach f,.env.docker .env.local .env.staging .env.hybrid .env.langgraph,\
    $(if $(wildcard $(f)),,$(shell ln -sf "$(_main_repo)/$(f)" "$(f)" 2>/dev/null)))
endif

I went with symlinks rather than copies on purpose. A symlink keeps a single source of truth, so rotating a credential in the main checkout fixes every worktree at once. The trade-off is that worktrees aren’t insulated from env edits, which is exactly what I want for my day-to-day use. If you specifically wanted to test a different configuration in a worktree, you’d copy the files into the new worktree instead.
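The link-or-copy choice can be sketched outside of make as a small shell function. This is a minimal sketch, assuming a hypothetical provision_env_files helper and a hypothetical mode argument ("link" or "copy"); neither exists in the project described here.

```shell
#!/usr/bin/env bash
# Sketch: provision gitignored env files in the current worktree.
# Usage: provision_env_files <main_repo_path> <link|copy> <file>...
# The function name and the mode argument are illustrative assumptions.

provision_env_files() {
  local main_repo="$1" mode="$2" f
  shift 2
  for f in "$@"; do
    # Skip files the main checkout does not have.
    [ -e "$main_repo/$f" ] || continue
    # Never clobber an existing file or symlink in the worktree.
    if [ -e "$f" ] || [ -L "$f" ]; then continue; fi
    if [ "$mode" = "copy" ]; then
      cp "$main_repo/$f" "$f"      # isolated: later edits in main are not seen
    else
      ln -s "$main_repo/$f" "$f"   # shared: edits in main show up immediately
    fi
  done
}
```

With "link", rotating a credential in the main checkout is visible in every worktree right away. With "copy", the worktree keeps its own snapshot, which suits testing a different configuration.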

Backend: running tests in parallel without collisions

Getting the env files in place meant each agent could now theoretically run the tests. The next problem was that the tests have real dependencies. Postgres, among other services, is brought up through Docker. Two agents running unit tests at the same time would both reach for the same Postgres instance, and that goes about as well as you’d expect.

Luckily for us, Docker Compose already does most of the isolation for free. It namespaces containers, volumes, and networks by project name, which defaults to the base name of the project directory, and each worktree is its own directory. The only machine-global resource left was host port bindings, which were hardcoded in our Docker Compose files. First, I parameterized every binding with a default that matches the old hardcoded value:

services:
  postgres:
    ports:
      - "${POSTGRES_HOST_PORT:-5435}:5432"

Second, in a worktree, the Makefile sets every host port to 0, which tells Docker to publish the container port on whatever free host port is available:

ifeq ($(_is_git_worktree),1)
  export POSTGRES_HOST_PORT ?= 0
endif

Outside a worktree, behavior is identical to before. But when spinning up parallel work using worktrees, the ports no longer collide, and multiple agents can run their suites side by side in isolation.
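With the host port assigned at runtime, anything outside the Compose network has to look it up before connecting. The post doesn’t show how the test suite learns the port, so the following is only a sketch of one way to do it, using the docker compose port subcommand. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: discover the host port Docker picked for a published container port.
# Function names are illustrative assumptions.

# Extract the port from a binding such as "0.0.0.0:49153" or "[::]:49153".
host_port_from_binding() {
  local binding="$1"
  echo "${binding##*:}"
}

# Ask Compose which host address a service's container port is published on.
# Requires the service to be running in the current Compose project.
published_port() {
  local service="$1" container_port="$2" binding
  binding="$(docker compose port "$service" "$container_port")" || return 1
  host_port_from_binding "$binding"
}
```

For example, a test target could run published_port postgres 5432 after the containers are up and pass the result to the suite through whatever variable the tests already read for the database port.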

Frontend: making a fresh worktree actually runnable

The frontend goal was the same as the backend. I wanted an agent to land in a brand new worktree, run pnpm install, and have the dev server, typecheck, and tests all just work, with no manual setup the agent needs to perform.

Our existing install already got most of the way there. It pulls the dependencies, and a postinstall step regenerates our API client codegen when needed. The two gaps were the gitignored .env file and a set of dist/ builds that the app imports at module-eval time. To close them, I added a small bootstrap script at the end of the root postinstall and made it deliberate about when it runs. It acts only in a git worktree, and never in a CI environment or when building for our hosted deployment. There, it symlinks the env files if they’re absent and builds the workspace dependencies if their dist/ outputs are missing. As on the backend, I chose symlinks for the env files.
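The gating logic of such a bootstrap can be sketched in shell. This is a sketch under assumptions, not the real script: should_bootstrap and packages_missing_dist are hypothetical names, CI is the variable most CI systems set, and HOSTED_BUILD is a placeholder for whatever flag the hosting platform sets during a build.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a postinstall bootstrap should act, and find
# workspace packages that still need a build. All names are illustrative.

# Run only in a worktree, and never in CI or a hosted build.
# The worktree flag is passed in ("1" or "0") to keep the function testable.
should_bootstrap() {
  local in_worktree="$1"
  [ "$in_worktree" = "1" ] && [ -z "${CI:-}" ] && [ -z "${HOSTED_BUILD:-}" ]
}

# Print each package directory whose dist/ output is missing.
packages_missing_dist() {
  local pkg
  for pkg in "$@"; do
    [ -d "$pkg/dist" ] || echo "$pkg"
  done
}
```

A postinstall hook could call should_bootstrap first and exit early when it fails, so a normal checkout, CI, and deployment builds pay no extra cost. Only packages printed by packages_missing_dist would then need building.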

A note on Claude Code Desktop Preview

While I was chasing more autonomy and better self-validation from the agent, I tried Claude Code’s Desktop Preview. I hoped the Claude Code agent harness could validate its own changes by using its embedded browser to navigate the app and take screenshots. The preview booted the dev server fine. However, much of our app’s functionality requires authenticating against a third-party auth provider, and Desktop Preview only renders localhost URLs. The redirect to the provider was blocked:

⚠️ Link to <auth provider url> was blocked. Preview only supports localhost URLs.

I couldn’t find a workaround at the time. Maybe someday.

Wrap Up

One question ran through both repos: does the checkout know it’s in a worktree, and can it automatically fix what that worktree is missing? Answering it improves the experience for developers and agents alike. It lets you hand bigger work to agents as their feedback loops improve, because they can validate their output with the same tools you’d use yourself.
