AI Manager Styles: From Full Visibility to Pure Outcomes

You’re managing the thing that writes code. So what’s your management style?

Most people start the same way: watching every action, approving every tool call. Then they trust more, review less, scale up. It’s an evolution. The management style that works on day one isn’t the one that works on day ninety.

Vincent van den Tol wants to keep track of how his application is built. Not because he doubts Claude Code. Because on larger tasks, you need to see where the agent is headed so you can correct course before it’s gone too far. He’s reaching for tools like fp, Backlog.md, and Vibe Kanban. The tools people are building look remarkably like the tools we already know: kanban boards, task breakdowns, story maps, sprint planning. The same management practices that worked for human teams are being adapted for agent teams.

The From Producer to Manager pattern predicted this. What I didn’t predict was that it’s not one management style. It’s a progression.

Stage 1: The micromanager

This is where everyone starts. You sit in a tight loop with the agent: watch the terminal scroll, approve every tool call, review every file edit before it lands. No external dashboard needed because you are the dashboard. Every action goes through you.

It works for learning, for high-stakes changes, and for building trust with a new tool. The cost is obvious: you’re the bottleneck. The agent waits for you between every step. But when you’re touching production infrastructure or security-sensitive code, that bottleneck is a feature, not a bug.

Claude Code already has built-in task management. It persists tasks across sessions, supports dependencies and blockers, and coordinates between multiple sessions via a shared task list. But the work stays inside the terminal. It’s like a coworker who always answers “I’m busy” when you ask what they’re working on. They might be doing great work. But you can’t steer what you can’t see.

Once you’ve seen the agent handle a hundred file edits correctly, you stop watching every one. That’s when you’re ready for the next stage.

Stage 2: The spec writer

The first leap of trust. You stop reviewing every action and start reviewing deliverables. You write the spec, the agent implements against it, one task per session. Each task has acceptance criteria and a definition of done baked in. The dashboard is a kanban with columns (todo, in progress, in review, done), but the real management tool is the spec itself. If the spec is clear, the agent delivers. If it isn’t, you see it in the output. The discipline is single-task-per-session: no context window overload, every run is reviewable. Backlog.md and fp.dev by Fiberplane take this approach.
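To make "the spec is the management tool" concrete, here is a minimal sketch of a task that carries its own acceptance criteria and moves across the four columns. The names (TaskSpec, Column, advance) are hypothetical and only for illustration; this is not how Backlog.md or fp.dev model tasks.

```python
from dataclasses import dataclass, field
from enum import Enum


class Column(Enum):
    TODO = "todo"
    IN_PROGRESS = "in progress"
    IN_REVIEW = "in review"
    DONE = "done"


@dataclass
class TaskSpec:
    """Hypothetical one-task-per-session spec; not a real tool's schema."""
    title: str
    acceptance_criteria: list[str]
    definition_of_done: str
    column: Column = Column.TODO
    # Criteria the reviewer has checked off so far.
    met: set[str] = field(default_factory=set)

    def is_reviewable(self) -> bool:
        """A spec without criteria or a definition of done cannot be reviewed."""
        return bool(self.acceptance_criteria) and bool(self.definition_of_done.strip())

    def mark_met(self, criterion: str) -> None:
        if criterion not in self.acceptance_criteria:
            raise ValueError(f"unknown criterion: {criterion!r}")
        self.met.add(criterion)

    def advance(self) -> Column:
        """Move one column to the right; 'done' requires every criterion met."""
        if self.column is Column.DONE:
            return self.column
        order = list(Column)
        nxt = order[order.index(self.column) + 1]
        if nxt is Column.DONE and self.met != set(self.acceptance_criteria):
            raise ValueError("cannot mark done: unmet acceptance criteria")
        self.column = nxt
        return self.column
```

The one design choice worth noting: a task can reach "in review" freely, but it cannot reach "done" until every criterion is checked off. That keeps the review step tied to the spec rather than to a gut feeling about the diff.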

Vincent’s follow-up comment captures the compounding value: “You can easily look up completed issues later, or refer them to Claude as extra context.” Every completed issue becomes institutional memory. You’re not just managing tasks. You’re building a knowledge base.

This stage teaches you what makes a good task. That skill becomes essential for what comes next.

Stage 3: The team lead

Once you can write good specs, one agent isn’t enough. You start running multiple agents in parallel. Each agent gets its own isolated workspace. The kanban board becomes a control room: you see which agents are working on what, review their output before merging, and reassign work when something stalls. The management challenge shifts from quality to coordination. When five agents run simultaneously on the same codebase, someone needs the overview. Vibe Kanban by BloopAI is built for this.

Windsurf already calls shared agent sessions “trajectories”, a word that implies direction you can see and correct. At this stage you’re no longer pair programming. You’re running a team.

Stage 4: The night shift

At some point you realize you don’t need to review every deliverable either. If the success criteria are fully automatable (tests pass, build succeeds, linter is clean), the test suite can be the reviewer.

You write a detailed plan, hand it to the agent, and let it loop autonomously until everything passes. A stop hook intercepts every attempt to quit and redirects the agent back to work if verification fails. The agent doesn’t stop when it thinks it’s done. It stops when the work is verified. The Ralph Wiggum technique is the clearest example.
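The shape of that loop can be sketched in a few lines. This is a generic illustration, not Claude Code's hook API or the Ralph Wiggum implementation: run_until_verified, command_passes, and the max_iterations budget are all hypothetical names and assumptions of mine.

```python
import subprocess
from typing import Callable, Sequence


def command_passes(cmd: Sequence[str]) -> bool:
    """Return True if the command exits with status 0 (e.g. a test runner)."""
    return subprocess.run(list(cmd), capture_output=True).returncode == 0


def run_until_verified(
    work: Callable[[int], None],
    verify: Callable[[], bool],
    max_iterations: int = 50,
) -> int:
    """Call work() repeatedly until verify() passes; return iterations used.

    The iteration budget is an assumption added here so that a bad plan
    cannot loop forever. Raises RuntimeError when the budget runs out.
    """
    for iteration in range(1, max_iterations + 1):
        work(iteration)   # the agent makes an attempt and then tries to stop
        if verify():      # the "stop hook": only verified work may exit
            return iteration
    raise RuntimeError(f"not verified after {max_iterations} iterations")
```

In a real setup, verify would wrap whatever check you trust, for example a call to command_passes with your test command, and work would hand the plan back to the agent. The point of the sketch is only that the exit condition belongs to the verifier, not to the agent.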

No visibility, no steering. Fire and forget. You get overnight productivity at ~$10/hour, but you can’t course-correct mid-run. If the plan was wrong, you find out in the morning. This stage only works if your earlier stages taught you how to write plans that are good enough to survive the night.

Stage 5: The swarm

The final stage: multiple agents running autonomously, escalating when stuck. Instead of you watching the work, the agent calls you when it needs you. You’re not monitoring; you’re on call. Course correction happens at the agent’s request, not on your schedule.

Vibe Kanban literally moos when an agent finishes a task. Claude Code has terminal bells and sound alerts when agents complete work or need input. The async ping replaces the dashboard. You do other work until the agent calls.

The trust and accountability question is real here. When you’re on call for five agents, each deciding independently when to escalate, the quality of those escalation decisions matters more than any dashboard. You’ve delegated not just the work, but the judgment about when to involve you.

Knowing when to apply which style

The progression isn’t always forward. The real skill isn’t reaching stage 5. It’s knowing which stage to use when.

Security-sensitive code? Back to stage 1, tight loop, every action reviewed. Customer-facing features? Stage 2 or 3, specs and boards. Low-stakes internal tools? Stage 4, overnight, no hesitation. A greenfield prototype? Maybe stage 5, let the swarm explore.

Good managers don’t pick one style. They switch styles depending on the task, the risk, and how well the agents know the codebase. The same person might micromanage a database migration in the morning and run a swarm on documentation in the afternoon.
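If you wanted to write that judgment down, it might look like the sketch below. The inputs and the exact mapping are my own assumptions, loosely following the examples above; treat choose_stage as a conversation starter, not a policy.

```python
from enum import IntEnum


class Stage(IntEnum):
    MICRO_MANAGER = 1
    SPEC_WRITER = 2
    TEAM_LEAD = 3
    NIGHT_SHIFT = 4
    SWARM = 5


def choose_stage(
    security_sensitive: bool,
    customer_facing: bool,
    checks_fully_automated: bool,
    parallelizable: bool,
) -> Stage:
    """Hypothetical rule of thumb: risk first, then automation, then scale."""
    if security_sensitive:
        # Highest risk wins: tight loop, every action reviewed.
        return Stage.MICRO_MANAGER
    if customer_facing:
        # A human reviews deliverables; a board helps once work runs in parallel.
        return Stage.TEAM_LEAD if parallelizable else Stage.SPEC_WRITER
    if checks_fully_automated:
        # The test suite can be the reviewer, so the run can go unattended.
        return Stage.SWARM if parallelizable else Stage.NIGHT_SHIFT
    # Assumption: without automated checks, fall back to reviewing deliverables.
    return Stage.SPEC_WRITER
```

The ordering of the checks is the real content: risk is evaluated before convenience, so a security-sensitive task never ends up in an unattended run just because its tests happen to be automated.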

All stages share the same loop: decompose, implement, review, persist. The difference is who does the reviewing: you, the spec, the board, the test suite, or the agent itself.
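One way to see this is to write the loop once and pass the reviewer in as a parameter. This is a minimal sketch with hypothetical names (run_loop, Reviewer); the four callables stand in for whatever your tooling actually does at each step.

```python
from typing import Callable, Iterable

# A reviewer takes the produced result and says whether it is acceptable.
Reviewer = Callable[[str], bool]


def run_loop(
    goal: str,
    decompose: Callable[[str], Iterable[str]],
    implement: Callable[[str], str],
    review: Reviewer,
    persist: Callable[[str, str], None],
) -> list[str]:
    """Run decompose -> implement -> review -> persist; return rejected tasks."""
    rejected: list[str] = []
    for task in decompose(goal):
        result = implement(task)
        if review(result):
            persist(task, result)   # accepted work becomes lasting context
        else:
            rejected.append(task)   # rejected work goes back to the manager
    return rejected
```

Swapping the review argument is what moves you between stages: a function that prompts you at the terminal is stage 1, one that checks acceptance criteria is stage 2, and one that runs the test suite is stage 4. The rest of the loop stays the same.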

The question isn’t which stage is best. It’s which stage fits this task, right now.