Recursion Lab

3/10/2026 seed

Preamble

Recursion Lab tests whether agent loops can improve from failure without overfitting to the evaluator, flattering their own reports, or promoting a brittle repair.


The Loop Wants To Praise Itself

Self-improving systems have a simple failure mode: they learn how to satisfy their own judge. The score rises, the loop gets cleaner, and the operator inherits a machine that optimized the test while the work stayed weak.

Recursion Lab keeps the risk inspectable through loop designs, replay methods, repair policies, evaluator behavior, and promotion rules. A repair should survive outside the run that created it before it becomes doctrine.

Replay Before Promotion

Observed failure starts as raw material. A bad handoff, weak routing decision, lazy verifier, missing source, or overconfident model answer needs replay before it changes the system.

The lab should keep the trace, the hypothesis, the patch, the evaluator result, and the keep-or-discard decision. Without that chain, recursive improvement becomes self-editing with a better name.
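One way to picture that chain is as a single record per observed failure. The sketch below is illustrative only: the class name RepairRecord, its field names, and the helper functions are assumptions made for this example, not an existing Recursion Lab schema. It shows how a missing link could block a patch from changing the system.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class RepairRecord:
    """One observed failure and what was done about it (hypothetical schema)."""
    trace: str                        # raw record of the failing run
    hypothesis: str                   # suspected cause of the failure
    patch: str                        # proposed change to the system
    evaluator_result: Optional[bool]  # did the replay pass? None = not replayed yet
    decision: Optional[str]           # "keep" or "discard"; None = undecided


def missing_links(record: RepairRecord) -> List[str]:
    """Return the names of chain links that are empty or still undecided."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or value == "":
            missing.append(f.name)
    return missing


def can_change_system(record: RepairRecord) -> bool:
    """Allow a change only with a full chain, a passing replay, and a keep decision."""
    return (
        not missing_links(record)
        and record.evaluator_result is True
        and record.decision == "keep"
    )
```

A record with an empty hypothesis or no replay result would be reported by missing_links and rejected by can_change_system, which is the point: the patch cannot outrun its own evidence.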

A Better Score Can Be A Trap

The sharpest tension is that Orion needs recursion and should also distrust it. The system has to learn from failures quickly enough to compound, then slow down enough to avoid promoting accidental lessons.

Holdout failures, regression checks, bounded experiments, and explicit promotion gates keep the loop honest. If a repair improves one scenario and damages another, the lab should preserve the wound instead of sanding it down.
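A promotion gate of this kind might look like the following minimal sketch. Everything here is assumed for illustration: the function name promotion_gate, the per-scenario score dictionaries, the tolerance parameter, and the rule that at least one holdout scenario must improve. The one property it is meant to show is that regressions are returned in the result rather than averaged away.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GateResult:
    promote: bool
    regressions: List[str]  # scenarios the repair made worse, kept on the record
    reason: str


def promotion_gate(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    holdout: List[str],
    tolerance: float = 0.0,
) -> GateResult:
    """Hypothetical gate: no regressions anywhere, and a gain on held-out failures."""
    # Regression check: a scenario missing from the candidate counts as a regression.
    regressions = sorted(
        name
        for name, before in baseline.items()
        if candidate.get(name, float("-inf")) < before - tolerance
    )
    if regressions:
        return GateResult(False, regressions, "repair damages existing scenarios")

    # Holdout check: scenarios that were not used while building the repair.
    improved = [
        name
        for name in holdout
        if name in baseline and candidate.get(name, float("-inf")) > baseline[name]
    ]
    if not improved:
        return GateResult(False, [], "no holdout improvement")

    return GateResult(True, [], "holdout improved with no regressions")
```

Under these assumptions, a repair that lifts a handoff scenario while lowering a routing scenario is blocked, and the routing scenario stays named in the result for later inspection.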

The First Real Proof

Recursion Lab is proven when an observed orchestration failure becomes a replayed repair that works in a later run with different context. The proof is boring on purpose: fewer repeated failures, cleaner handoffs, narrower routing rules, and evals that catch the same weakness before the operator does.
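The proof condition can be stated as a small check. This is a sketch under stated assumptions: RunOutcome, its fields, and the convention that later runs carry larger run_id values are hypothetical names chosen for the example, and "different context" is reduced to a simple label comparison.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunOutcome:
    run_id: int             # assumed to increase over time
    context: str            # label for the conditions the run executed under
    failure_recurred: bool  # did the originally observed failure show up again?


def is_proven(origin: RunOutcome, later_runs: List[RunOutcome]) -> bool:
    """True if the repair held in at least one later run with a different context."""
    return any(
        run.run_id > origin.run_id
        and run.context != origin.context
        and not run.failure_recurred
        for run in later_runs
    )
```

A repair that only passes a replay in the same context that produced it would not count as proven under this check.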