Hiring Agents
How we define roles, write AGENTS.md files, and onboard new agents into the chain of command.
Case Study · Directive D4
The AI Agent Company Playbook is a 9-chapter operations guide for building software with AI agents. We wrote it while doing exactly that.
The Problem
In early 2026, no operational playbook existed for running a real software company with AI agents. Academic papers described the concept. Demo repositories showed toy examples. But no organization had published a step-by-step record of what actually works — the governance structures, the quality gates, the handoff protocols, the failure modes.
Board of Directors Company was designed from day one to produce that record: not by writing a book first and then attempting to apply it, but by running the operation first and publishing the playbook as each practice was validated.
The Approach
We structured the work as a series of directives — high-level tasks issued by the human board and executed by the agent organization. Each directive produced real artifacts: code, documentation, APIs, decisions.
The playbook itself became Directive 4 (D4). The writing agents — operating under CMO direction — drew on the full issue thread history, agent heartbeat logs, incident reports, and retrospectives from D1 through D3 to produce a chapter-by-chapter account that practitioners could actually use.
No ghost-writing rule.
Every claim in the playbook was validated against a real issue, commit, or approved artifact. No chapter was published without traceable evidence.
Failure is content.
Chapters include what didn’t work, what we rearchitected, and where the agent model hit its limits.
Living document model.
The playbook was published chapter-by-chapter as directives completed — not held for a final release.
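The no-ghost-writing rule above can be read as a simple publication gate. The following is a minimal sketch of that idea, not the tooling we actually ran: the `Claim` and `Chapter` types, the `can_publish` function, and the evidence reference format are all hypothetical names introduced for illustration. The only part taken from the rule itself is the set of accepted evidence kinds (issue, commit, approved artifact).

```python
from dataclasses import dataclass, field

# Evidence kinds named in the rule: "a real issue, commit, or approved artifact".
EVIDENCE_KINDS = {"issue", "commit", "artifact"}


@dataclass
class Claim:
    """One statement in a chapter (hypothetical structure)."""
    text: str
    # Each entry is (kind, reference). The reference format is an assumption.
    evidence: list = field(default_factory=list)


@dataclass
class Chapter:
    """A chapter as a titled list of claims (hypothetical structure)."""
    title: str
    claims: list = field(default_factory=list)


def untraceable_claims(chapter):
    """Return the claims that have no non-empty reference of an accepted kind."""
    return [
        claim
        for claim in chapter.claims
        if not any(kind in EVIDENCE_KINDS and ref for kind, ref in claim.evidence)
    ]


def can_publish(chapter):
    """A chapter passes the gate only if every claim is traceable."""
    return not untraceable_claims(chapter)


# Usage with placeholder references (not real issue or commit identifiers).
draft = Chapter(
    title="Example chapter",
    claims=[
        Claim("We did this.", [("issue", "PLACEHOLDER-ISSUE-REF")]),
        Claim("We think this would work."),  # no evidence attached
    ],
)
print(can_publish(draft))                             # False
print([c.text for c in untraceable_claims(draft)])    # ['We think this would work.']
```

In this sketch a claim with an unrecognized evidence kind or an empty reference is treated the same as a claim with no evidence at all, which mirrors the intent that "what we think would work" never passes as "what we did."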
The Result
“The playbook is not aspirational. It describes decisions we actually made, in the order we made them, with the evidence attached.”
What we learned
The hardest part was not technical — it was epistemic. Writing a playbook about AI agent operations while the operations are still running requires constant discipline to distinguish “what we did” from “what we think would work.” We defaulted to the former. That is what makes it useful.
From board directive to shipped deliverable: the full execution model.
How we ensure agents don’t ship broken work without human micromanagement.
The A2A protocol: how agents hand off work, escalate blockers, and operate across teams.
Why we publish the log, what goes in it, and how we handle failure reports.