Operations

3/8/2026 seed

Preamble

Operations defines what the Factory needs from long-running autonomous work: context packs, PRD/spec gates, confidence ledgers, verification evidence, maturity states, writeback targets, budgets, pauses, replays, and kill controls.

Autonomy Needs A Brake

The Factory becomes autonomous only when agents know when to stop. Operations is the branch that decides what counts as a run, what counts as work, what counts as proof, and when the machine has to stop. It owns the venture-system requirements for long-running mission execution, even if the reusable runtime implementation later belongs in Orion or Harness. The false positive is a glowing control plane: tasks completed, agents active, logs full, budget spent, and no honest answer about maturity.

This is the governance problem in its plainest form. A founder can hand over execution and still keep judgment. A founder can also hand over the rhythm of judgment itself: when to continue, when to escalate, when to spend, when to kill, when to call a result good enough.

Operations has to make those handoffs visible. An approval button is weak if the human lacks the context or authority to reject the move.

Maturity States Are Claims

The Factory has to keep its maturity language strict. "Runs" means a command executed. "Works" means behavior appeared on representative cases. "Tests pass" means automated checks cover known failure modes. "Quality passes" means domain gates passed. "Polished" means the operator surface is clear and hard to misuse. "Launch-ready" means monitoring, rollback, and constraints exist. "Launched" means activated. "Landed" means the intended outcome happened.

Those states cannot collapse. A scraper that runs has not produced trusted evidence. A product that deploys has not landed. A founder packet that exists has not improved a decision.
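
One way to keep the states from collapsing is to treat each one as a separate claim that needs its own evidence. The sketch below is illustrative only: the names Maturity, MaturityClaim, and promote are hypothetical, and the rule that a state can only be claimed directly after the one before it is an assumption drawn from the order in which the states are listed, not a stated requirement.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Maturity(IntEnum):
    """The eight maturity states, in the order the text lists them."""
    RUNS = 1
    WORKS = 2
    TESTS_PASS = 3
    QUALITY_PASSES = 4
    POLISHED = 5
    LAUNCH_READY = 6
    LAUNCHED = 7
    LANDED = 8


@dataclass
class MaturityClaim:
    """Holds one evidence reference per claimed state (hypothetical shape)."""
    evidence: dict = field(default_factory=dict)  # Maturity -> evidence reference

    @property
    def state(self):
        # Highest state claimed so far; None until RUNS has evidence.
        return max(self.evidence) if self.evidence else None

    def promote(self, target: Maturity, evidence_ref: str) -> None:
        """Claim the next state. Assumed rule: no skipping, no empty evidence."""
        if not evidence_ref.strip():
            raise ValueError(f"{target.name} needs an evidence reference")
        current = self.state
        expected_value = 1 if current is None else current + 1
        if target != expected_value:
            origin = current.name if current is not None else "no state"
            raise ValueError(f"cannot claim {target.name} directly from {origin}")
        self.evidence[target] = evidence_ref
```

Under these assumptions, a run that has only executed cannot be recorded as launched: promote(Maturity.LAUNCHED, ...) raises until every earlier state carries its own evidence reference.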

The Runtime Has To Leave Evidence

Operations needs queues, retries, logs, approvals, blocks, escalations, replays, pauses, budget ceilings, token ceilings, and kill/scale rules. Every run should say what object was acted on, what action was proposed, who owned it, what could go wrong, whether the action was reversible, what external obligations existed, what verification route applied, and what writeback happened. Without that evidence, the Factory cannot tell useful autonomy from expensive drift.
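
The per-run evidence listed above can be sketched as a record whose unanswered questions are easy to detect. This is a minimal illustration, not a schema: the class name RunRecord, the field names, and the helper unanswered are all hypothetical. A value of None means the run never answered the question, while an empty list means it answered "none".

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RunRecord:
    """One field per question a run should answer (hypothetical names)."""
    target_object: Optional[str] = None         # what object was acted on
    proposed_action: Optional[str] = None       # what action was proposed
    owner: Optional[str] = None                 # who owned it
    risks: Optional[list] = None                # what could go wrong
    reversible: Optional[bool] = None           # whether the action was reversible
    external_obligations: Optional[list] = None # what external obligations existed
    verification_route: Optional[str] = None    # what verification route applied
    writeback: Optional[str] = None             # what writeback happened


def unanswered(record: RunRecord) -> list:
    """Return the names of fields the run left unanswered, in declaration order."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]
```

A gate could refuse to count a run as evidence while unanswered(record) is non-empty, which makes "logs full" and "evidence complete" two separate checks.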

The loop should be accountable after the fact. Which agent moved? Which source did it trust? Which gate let it pass? Which budget did it spend? Which human could have stopped it? Which later outcome proved the run useful or wasteful?

The First Real Proof

Operations lands when a Factory run reports maturity honestly and changes future execution through verified writeback. The branch succeeds when the runtime can pause or kill work for a better reason than exhaustion. It fails when completed tasks become the only proof that the system is working.
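
Pausing "for a better reason than exhaustion" can be sketched as a brake that checks for drift as well as for ceilings. Everything here is an assumption made for illustration: the names Decision, Ceilings, RunStatus, and decide, the choice of a currency-denominated budget, and the idea of counting steps since the last verified evidence as the drift signal.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    PAUSE = "pause"
    KILL = "kill"


@dataclass(frozen=True)
class Ceilings:
    budget_usd: float                # hypothetical budget ceiling
    tokens: int                      # hypothetical token ceiling
    max_steps_without_evidence: int  # hypothetical drift threshold


@dataclass(frozen=True)
class RunStatus:
    spent_usd: float
    tokens_used: int
    steps_since_verified_evidence: int


def decide(status: RunStatus, ceilings: Ceilings):
    """Return (decision, reason) so every stop carries an explanation."""
    # Hard ceilings come first: these are the "exhaustion" stops.
    if status.spent_usd >= ceilings.budget_usd:
        return Decision.KILL, "budget ceiling reached"
    if status.tokens_used >= ceilings.tokens:
        return Decision.KILL, "token ceiling reached"
    # Drift check: pause while budget remains if nothing has been verified lately.
    if status.steps_since_verified_evidence >= ceilings.max_steps_without_evidence:
        return Decision.PAUSE, "no verified evidence in recent steps"
    return Decision.CONTINUE, "within ceilings and producing verified evidence"
```

The design choice worth noting is the third branch: it can stop a run that still has money and tokens left, which is the case a ceiling-only brake never catches.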