Observatory

3/13/2026 seed

Preamble

Observatory records agent runs so Orion can tell a one-off mistake from a pattern, a trace from evidence, and a clean-looking pass from a claim worth trusting.


The Row Chooses What Can Be Learned

Telemetry feels objective because it arrives as rows, counts, costs, and timestamps. A trace is still a shadow of the work, shaped by what the runtime can observe and what the schema knows how to keep.

The row decides what Orion can learn. If the schema records cost and status but loses override, evidence quality, failure class, or repair, the future system learns the wrong lesson cleanly. Observatory needs canonical events, attribution, append-only raw truth, replayable projections, quarantine for unresolved capture, repair receipts, and public-safe exports.
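
A minimal sketch of how three of those requirements could fit together: an append-only raw log, a projection rebuilt by replay, and quarantine for events whose attribution did not resolve. Every name here (Event, RawLog, project, the run_id field, the payload keys) is a hypothetical stand-in, not Observatory's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    seq: int                 # position in the raw log
    kind: str                # e.g. "tool", "failure", "repair" (assumed kinds)
    run_id: Optional[str]    # attribution; None means capture could not resolve it
    payload: dict


class RawLog:
    """Append-only: events are added, never edited or removed."""

    def __init__(self):
        self._events = []

    def append(self, kind, run_id, payload):
        event = Event(len(self._events), kind, run_id, dict(payload))
        self._events.append(event)
        return event

    def replay(self):
        # Return an immutable snapshot so callers cannot rewrite history.
        return tuple(self._events)


def project(events):
    """Rebuild per-run summaries from raw events; quarantine unattributed ones."""
    runs, quarantine = {}, []
    for e in events:
        if e.run_id is None:
            quarantine.append(e)
            continue
        summary = runs.setdefault(e.run_id, {"events": 0, "failures": 0, "repairs": 0})
        summary["events"] += 1
        if e.kind == "failure":
            summary["failures"] += 1
        elif e.kind == "repair":
            summary["repairs"] += 1
    return runs, quarantine


# Made-up events for illustration only.
log = RawLog()
log.append("tool", "run-1", {"cost": 0.02})
log.append("failure", "run-1", {"failure_class": "stale_memory"})
log.append("tool", None, {"cost": 0.01})  # attribution missing, so it is quarantined

runs, quarantine = project(log.replay())
```

Because the projection is derived only from the raw log, it can be thrown away and rebuilt when the schema learns to keep something new. The quarantined event stays in the raw log until its attribution is repaired.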

The Trace Has To Change The System

The current substrate has proof paths: wrapped command capture, Codex App imports, replay, export, doctor, and eval. The launch blocker is ambient capture. Normal Codex and OMX work still has to emit session, prompt, tool, stop, attribution, and failure events without manual wrapping.
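
One way to keep that blocker measurable is a check that reports which event kinds have never arrived through ambient capture. This is a sketch under assumptions: the "source" field, its "ambient" and "wrapped" values, and the function name are invented for illustration and are not part of the existing doctor command.

```python
# Event kinds named in the text above; the tuple order is only for stable output.
EXPECTED_KINDS = ("session", "prompt", "tool", "stop", "attribution", "failure")


def ambient_gaps(events):
    """Return the expected kinds that no ambient event has covered yet."""
    seen = {e["kind"] for e in events if e.get("source") == "ambient"}
    return [kind for kind in EXPECTED_KINDS if kind not in seen]


# Made-up sample: the tool event was only captured through manual wrapping.
events = [
    {"kind": "session", "source": "ambient"},
    {"kind": "prompt", "source": "ambient"},
    {"kind": "tool", "source": "wrapped"},
    {"kind": "stop", "source": "ambient"},
]

gaps = ambient_gaps(events)
print(gaps)  # ['tool', 'attribution', 'failure']
```

The check is meant to run over many sessions rather than one, since a single clean session has no failure event to emit.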

If the operator has to remember to log the interesting parts, the log will flatter the system. Observatory becomes serious when capture is ordinary enough to record boring work, failed work, abandoned work, and work that felt successful before the eval disagreed.

One Mistake Or A Pattern

Orion should treat different failures differently. A bad prompt, missing context, weak routing, broken workflow, stale memory, low-quality source, wrong model, and human override should land in different places. Otherwise every failure becomes a vague lesson.

The useful record names the class, the locus, the repair, and the next test. It separates project outcome from orchestration quality so a single row cannot smear blame across Factory, 2nd Brain, Harness, and Orion.
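
A sketch of what such a record might look like, with the failure classes taken from the list above. The field names, the two outcome flags, and the repeat threshold are assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class FailureClass(Enum):
    BAD_PROMPT = "bad_prompt"
    MISSING_CONTEXT = "missing_context"
    WEAK_ROUTING = "weak_routing"
    BROKEN_WORKFLOW = "broken_workflow"
    STALE_MEMORY = "stale_memory"
    LOW_QUALITY_SOURCE = "low_quality_source"
    WRONG_MODEL = "wrong_model"
    HUMAN_OVERRIDE = "human_override"


@dataclass(frozen=True)
class FailureRecord:
    failure_class: FailureClass
    locus: str                 # where the fault sits, e.g. "Harness" or "Orion"
    repair: str                # what was changed in response
    next_test: str             # what should confirm the repair
    project_outcome_ok: bool   # did the project get what it needed?
    orchestration_ok: bool     # did the orchestration behave well?


def patterns(records, threshold=2):
    """Group by (class, locus); anything seen `threshold` times or more is a pattern."""
    counts = Counter((r.failure_class, r.locus) for r in records)
    return {key: n for key, n in counts.items() if n >= threshold}


# Made-up records for illustration only.
records = [
    FailureRecord(FailureClass.STALE_MEMORY, "2nd Brain", "refresh index", "replay the run", True, False),
    FailureRecord(FailureClass.STALE_MEMORY, "2nd Brain", "refresh index", "replay the run", False, False),
    FailureRecord(FailureClass.BAD_PROMPT, "Orion", "rewrite prompt", "rerun the eval", False, True),
]

recurring = patterns(records)
```

Keeping project_outcome_ok apart from orchestration_ok lets the first record say that the project came out fine while the orchestration did not, which a single status column would hide.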

The First Real Proof

Observatory is proven by a repair that survives replay. A captured failure becomes a score history, a routing change, a prompt update, a skill warning, a quarantine repair, or a public receipt that changes what the garden can honestly claim.

Until ambient capture works on real sessions without manual wrapping, Observatory has a substrate and a gap. The gap is what this page has to keep visible.