A late-night conversation with Claude that started with Steve Yegge’s Beads architecture and ended somewhere unexpected. Full conversation (PDF).
The question
Who’s accountable when AI agents write 44,000 lines of code that no human has looked at?
Architecture convergence
Across Beads, Gas Town, Claude Flow, and Cursor’s swarm experiments, the same patterns keep emerging:
- Flat peer-to-peer coordination fails — you need hierarchy and roles
- State must be external — not in context windows
- Small tasks, many workers — not long sessions
- Different models for different roles — specialists beat generalists
- Humans become orchestrators — not reviewers of every line
“Nobody reviews the code, you review the outcome.”
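A minimal Python sketch of that shape, purely illustrative and not any of the systems above: an orchestrator hands small, role-tagged tasks to specialist workers, and all state lives in an external store (here SQLite) rather than in anyone’s context window.

```python
import sqlite3
from dataclasses import dataclass

# External state: the task queue lives in SQLite, not in any model's context window.
db = sqlite3.connect("tasks.db")
db.execute("""CREATE TABLE IF NOT EXISTS tasks (
    id INTEGER PRIMARY KEY, role TEXT, payload TEXT,
    status TEXT DEFAULT 'queued', result TEXT)""")

@dataclass
class Task:
    id: int
    role: str      # which specialist handles this ("coder", "reviewer", "security", ...)
    payload: str   # a small, self-contained unit of work

def enqueue(role: str, payload: str) -> None:
    """The orchestrator breaks work into small tasks and assigns them by role."""
    db.execute("INSERT INTO tasks (role, payload) VALUES (?, ?)", (role, payload))
    db.commit()

def next_task(role: str) -> Task | None:
    """Workers pull one small task at a time; no long sessions, no shared context."""
    row = db.execute(
        "SELECT id, role, payload FROM tasks WHERE role = ? AND status = 'queued' LIMIT 1",
        (role,)).fetchone()
    return Task(*row) if row else None

def complete(task_id: int, result: str) -> None:
    """Results land back in the external store, where a human can inspect the outcome."""
    db.execute("UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
               (result, task_id))
    db.commit()
```

The schema doesn’t matter; what matters is that the outcome is inspectable without replaying anyone’s conversation.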
The accountability gap
But reviewing outcomes doesn’t catch everything. Tests pass, the feature works — and the security hole ships to production. SQL injection, timing attacks, PII logged somewhere, dependencies with known CVEs. Outcome-based review misses the subtle stuff.
In a personal project, “we’ll fix it later” works. In an enterprise: who signs off on security review? Who’s responsible when agent-written code leaks PII? Whose name is on the commit?
Not “context is the new bottleneck” but “accountability is the new bottleneck.”
Turtles all the way down
Adding more AI to check AI is an infinite regress. LLMs generate insecure code → add security LLMs to check → who checks the security LLMs? The numbers are sobering: 12-65% of generated code snippets are non-compliant with basic secure coding standards. LLMs hallucinate non-existent packages up to 44.7% of the time.
Promise Theory
Mark Burgess nails it: “Trust isn’t transitive. You can’t escape trust through more layers.”
- A promise is not an obligation — it’s just a declaration
- Promises are not assumed to be kept — they’re just recorded
- No agent can promise on behalf of another — this breaks the “more AI” recursion
- Trust is built through history of kept promises
You can’t bootstrap trust from a system that doesn’t have it. If the base layer has a 12-65% vulnerability rate, stacking more LLMs doesn’t eliminate that — it spreads uncertainty across more layers.
The ceiling on AI agent autonomy might not be capability — it might be trust. And trust requires grounding outside the system.
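For concreteness, here is a small sketch of those four points as a promise ledger. This is my own illustration, not Burgess’s formalism: promises are recorded declarations, assessments come from outside the promiser, trust is nothing more than the observed keep rate, and no agent can record a promise for another.

```python
from dataclasses import dataclass, field

@dataclass
class Promise:
    promiser: str              # the agent making the declaration
    body: str                  # what is promised ("tests pass", "no secrets in logs", ...)
    kept: bool | None = None   # unknown until assessed; a promise is not an obligation

@dataclass
class Ledger:
    promises: list[Promise] = field(default_factory=list)

    def declare(self, agent: str, body: str, on_behalf_of: str | None = None) -> Promise:
        # No agent can promise on behalf of another -- this is what breaks
        # the "add another checker LLM" recursion.
        if on_behalf_of is not None and on_behalf_of != agent:
            raise ValueError("an agent can only promise its own behaviour")
        p = Promise(promiser=agent, body=body)
        self.promises.append(p)
        return p

    def assess(self, promise: Promise, kept: bool) -> None:
        # Assessment comes from outside the promiser: tests, analyzers, humans.
        promise.kept = kept

    def trust(self, agent: str) -> float:
        """Trust is just the history of kept promises, nothing more."""
        assessed = [p for p in self.promises if p.promiser == agent and p.kept is not None]
        if not assessed:
            return 0.0   # no history, no trust -- you can't bootstrap it from nothing
        return sum(p.kept for p in assessed) / len(assessed)
```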
What actually works
The approaches that showed real improvement all had one thing in common: grounding in something external to the LLM.
- Static analyzers (deterministic, not AI)
- Formal verification tools (mathematical guarantees)
- Known vulnerability databases (human-curated knowledge)
- Fuzzing (empirical testing)
The LLM becomes a translator between human intent and these grounded systems, not the source of truth itself.
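As a sketch of what that translation layer can look like in practice: generated code only moves forward after a deterministic analyzer signs off. This assumes Bandit (a Python static analyzer) is installed; generate_patch is a hypothetical stand-in for whatever model call you use.

```python
import subprocess
import tempfile
from pathlib import Path

def generate_patch(intent: str) -> str:
    """Placeholder for whatever LLM call turns human intent into code."""
    raise NotImplementedError("wire up your model client here")

def analyzer_approves(code: str) -> bool:
    """Ground the output in a deterministic tool, not in another LLM.

    Assumes Bandit exits non-zero when it flags issues; verify and pin your
    analyzer's behaviour in CI rather than trusting this comment.
    """
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "candidate.py").write_text(code)
        result = subprocess.run(["bandit", "-q", "-r", tmp],
                                capture_output=True, text=True)
        return result.returncode == 0

def propose_change(intent: str) -> str | None:
    code = generate_patch(intent)
    if not analyzer_approves(code):
        return None   # the analyzer, not the model, is the source of truth here
    return code
```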
Resilience engineering over prevention
Traditional safety says “prevent bad things from happening.” Resilience engineering (Hollnagel) says “things will go wrong, how do we survive and recover?”
The only truly non-prevention moves are blast radius control, fast rollback, and graceful degradation. And those aren’t “AI agent platform” features — they’re just good engineering. Git, feature flags, canary deploys, circuit breakers.
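None of these need an “agent platform.” A circuit breaker, for instance, is a few dozen lines; a bare-bones sketch (illustrative, not production-ready):

```python
import time

class CircuitBreaker:
    """Fail fast and degrade gracefully instead of hammering a broken dependency."""

    def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args, fallback=None, **kwargs):
        # Open circuit: skip the call entirely until the cooldown passes.
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback
            self.opened_at = None      # half-open: let one call through again
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback            # graceful degradation, not a crash
```

Usage is one line: `breaker.call(fetch_recommendations, user_id, fallback=[])`.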
Focus on Impact
Risk = Impact × Likelihood. With AI agents, likelihood is basically unknowable. So focus entirely on the one variable you can control: Impact.
| Dimension | Low Impact | High Impact |
|---|---|---|
| Blast radius | Internal tool | Customer-facing |
| Reversibility | Easy rollback | Data corruption |
| Visibility | Nobody notices | Public incident |
| Reputation | Expected to break | “Enterprise-grade” |
| Compliance | No audit | Regulated |
| Safety | Annoyance | Harm to people |
The routing logic: Low impact → full autonomy, ship fast. Medium impact → lightweight checks, monitor closely. High impact → human in loop, smaller batches, staged rollout.
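The same routing written down as a sketch. The thresholds and 0–2 dimension scores are placeholders of my own; the point is that only impact is scored, never likelihood.

```python
from enum import Enum

class Review(Enum):
    FULL_AUTONOMY = "ship fast, no human gate"
    LIGHTWEIGHT = "automated checks, close monitoring"
    HUMAN_IN_LOOP = "human sign-off, smaller batches, staged rollout"

# Score each impact dimension 0 (low) to 2 (high); likelihood never appears.
def route(blast_radius: int, reversibility: int, visibility: int,
          reputation: int, compliance: int, safety: int) -> Review:
    if safety >= 1 or compliance >= 2:   # harm to people or regulated scope
        return Review.HUMAN_IN_LOOP      # always gets a human, regardless of the rest
    impact = blast_radius + reversibility + visibility + reputation + compliance + safety
    if impact <= 3:
        return Review.FULL_AUTONOMY
    if impact <= 7:
        return Review.LIGHTWEIGHT
    return Review.HUMAN_IN_LOOP

# e.g. an internal tool with easy rollback:
# route(0, 0, 0, 0, 0, 0) -> Review.FULL_AUTONOMY
```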
“The question was never ‘how do we eliminate risk?’ It was always ‘which risks are we willing to take for which rewards?’”
The journey
Beads → Gas Town → “accountability is the new bottleneck” → Promise Theory → “you can’t escape trust” → resilience engineering → “everything is prevention” → back to basics: Impact × Likelihood, but you can only control Impact.
Sometimes the most useful insight is admitting what you can’t know.
References: Steve Yegge (Beads, Gas Town), Cursor team (Swarm experiments), Mark Burgess (Promise Theory), Erik Hollnagel (Resilience Engineering)
Originally posted on LinkedIn