Lab / public-safe proofs

A public lab for real system proofs.

Selected Pume experiments, pilots, and operating proofs that can be inspected without exposing private runs or internal state.

Public snapshot packaged May 5, 2026, 07:56 PM. Raw runs continue privately until a milestone is ready for public review.

public-safe proofs · selected milestones · operating evidence · queued next probes

Lead public proof

Pume Cockpit v2

Operational state becomes reviewable in one place instead of spreading across chats, tickets, local notes, and private run traces.

Command surface active in April 2026 · Agents Discovery

Operating evidence

Product Idea Intake Coach burst (Mar 19, 2026)

Multiple related runs are visible as one cluster, making variation and rerun behavior legible.

Cluster from Mar 19, 2026 · Product agents

Codex-backed work trail

Lab Trail: Apr 17-May 5

Between the headline proof points, the lab kept turning field work into reusable operating patterns.

  1. Apr 17

    Recovery Map Becomes a Playbook

    A publishing-platform investigation turned scattered failure signals into a reusable recovery map: symptoms, trace points, and checks that move repair from guesswork to verified action.

    field pattern

  2. Apr 20-21

    Command Surfaces Get Sharper

    Agent and product-control work tightened how requests are routed, reviewed, and shaped into public proof instead of disappearing into one-off sessions.

    routing work

  3. Apr 22-24

    Evidence Loops Under Load

    Browser checks, build feedback, and session evidence were stress-tested so UI changes and operational findings could be inspected with cleaner proof trails.

    inspection loop

  4. Apr 27-28

    Field Signals Become Patterns

    A dense run of publishing-platform work converted repeated operational questions into reusable patterns for import health, editorial recovery, and delivery visibility.

    pattern work

  5. Apr 29-May 5

    Runtime Drift and Lab Packaging

    Runtime checks, reconciliation work, and the strongest field findings were distilled into public-facing Lab proof without raw prompts, paths, tickets, or private system names.

    public proof

Live proofs

What Pume has made inspectable

These are the strongest public-safe stories right now: real systems work with visible routes, concrete outcomes, or operating proofs attached.

Delivery method

Codex Delivery Lane maturity

Hardens the Slack-to-Codex delivery lane into a repeatable operating method with clearer gates, evidence, approvals, and recovery points.

Public proof

The lane is moving beyond single successful runs toward a system that can explain what happened, where it paused, and what needs approval.

Ops / VPS · Maturing pilot in April 2026

Publishing platform

Publishing telemetry pattern store

Turns recurring publishing-platform incidents into reusable investigation patterns across logs, APM, rollout state, and host-level skew.

Public proof

A field guide turns rollout failures, stale service targets, host skew, and latency tails into reusable investigation patterns.

Field Patterns · Field evidence from May 2026

Publishing platform

Editorial import recovery map

Maps how complex editorial imports can pause, retry, recover, and prove state without exposing customer data or platform internals.

Public proof

Legacy content, queues, state mapping, and import feedback become observable stages instead of a one-way batch operation.

Field Patterns · Field evidence from May 2026
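One way to picture "observable stages instead of a one-way batch operation" is a small state machine that records every transition. The sketch below is illustrative only: the stage names, the transition table, and the `ImportJob` class are assumptions made for this example, not Pume's actual import model.

```python
from enum import Enum


class Stage(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    PAUSED = "paused"
    FAILED = "failed"
    RETRYING = "retrying"
    DONE = "done"


# Hypothetical transition table: which stages may follow which.
ALLOWED = {
    Stage.QUEUED: {Stage.RUNNING},
    Stage.RUNNING: {Stage.PAUSED, Stage.FAILED, Stage.DONE},
    Stage.PAUSED: {Stage.RUNNING},
    Stage.FAILED: {Stage.RETRYING},
    Stage.RETRYING: {Stage.RUNNING, Stage.FAILED},
    Stage.DONE: set(),
}


class ImportJob:
    """Tracks one import and keeps every stage it has passed through."""

    def __init__(self, job_id: str) -> None:
        self.job_id = job_id
        self.stage = Stage.QUEUED
        self.history = [Stage.QUEUED]

    def move_to(self, target: Stage) -> None:
        # Reject transitions the table does not allow, so state stays provable.
        if target not in ALLOWED[self.stage]:
            raise ValueError(f"{self.stage.value} -> {target.value} is not allowed")
        self.stage = target
        self.history.append(target)


# Illustrative run: pause, resume, fail, retry, then finish.
job = ImportJob("demo-import")
for step in (Stage.RUNNING, Stage.PAUSED, Stage.RUNNING,
             Stage.FAILED, Stage.RETRYING, Stage.RUNNING, Stage.DONE):
    job.move_to(step)

print(" -> ".join(stage.value for stage in job.history))
```

Keeping the full history, rather than only the current stage, is what lets a reviewer see where an import paused or retried without opening the underlying data.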

Operating evidence

Signals that keep the work honest

Representative run clusters and infrastructure checks belong here so the lab shows how Pume validates reality, not only polished surfaces.

System boundary

Backend boundary extraction

Separates internal backend responsibilities from public web delivery so operator workflows can evolve without weakening the static site boundary.

Why it matters

Private control-plane behavior stays out of the exported website while preserving a path for richer internal operator tooling.

Ops / Backend · In progress · Boundary work from April 2026

Capability control

Agent capability gate

Scores which agents can act, which tools they may use, and what evidence is required before larger operational work is promoted.

Why it matters

Capability control makes agent behavior inspectable before it becomes operational authority.

Agents Discovery · Gate active in April 2026
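A gate of this kind can be sketched as a check that combines a score threshold, a tool allow-list, and a set of required evidence. This is a minimal sketch under stated assumptions: the `AgentProfile` fields, the 0 to 1 score range, the threshold, and every tool and evidence name are hypothetical and are not taken from Pume's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    name: str
    score: float  # hypothetical capability score in the range 0..1
    allowed_tools: set[str] = field(default_factory=set)
    evidence: set[str] = field(default_factory=set)


@dataclass
class GateDecision:
    allowed: bool
    reasons: list[str]


def check_gate(agent: AgentProfile, tool: str, min_score: float,
               required_evidence: set[str]) -> GateDecision:
    """Collect every reason the request fails; allow only if there are none."""
    reasons = []
    if agent.score < min_score:
        reasons.append(f"score {agent.score:.2f} below threshold {min_score:.2f}")
    if tool not in agent.allowed_tools:
        reasons.append(f"tool '{tool}' not permitted")
    missing = sorted(required_evidence - agent.evidence)
    if missing:
        reasons.append("missing evidence: " + ", ".join(missing))
    return GateDecision(allowed=not reasons, reasons=reasons)


# Illustrative data only.
agent = AgentProfile(
    name="intake-coach",
    score=0.82,
    allowed_tools={"read_logs", "open_ticket"},
    evidence={"dry_run_report"},
)

blocked = check_gate(agent, "deploy", 0.75, {"dry_run_report", "review_approval"})
granted = check_gate(agent, "read_logs", 0.75, {"dry_run_report"})

print(blocked.allowed, blocked.reasons)
print(granted.allowed, granted.reasons)
```

Returning the reasons alongside the decision is the point of the design: a refusal then carries its own explanation and can be reviewed like any other piece of evidence.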

Browser evidence

Browser-debug evidence loop

Connects browser inspection, screenshots, traces, console findings, network evidence, and fix verification into one repeatable debugging trail.

Why it matters

Visual and runtime issues become trace-backed work items instead of subjective observations from a local preview.

Ops / QA · Active loop · Evidence loop active in April 2026

Infra reconciliation

Pinned-state VPS drift reconciliation

Compares intended repository state against what is actually pinned on the VPS, then classifies drift before operational sign-off.

Why it matters

Pinned state, expected state, and drift status are visible together before operational sign-off.

Ops / VPS · Under reconciliation · Reconciliation active in April 2026
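The compare-then-classify step can be illustrated with two name-to-version mappings, one for the state the repository expects and one for what is pinned on the host. The sketch below assumes that representation; the four drift categories, the function names, and the sample components are hypothetical and only show the shape of the check.

```python
from enum import Enum


class Drift(Enum):
    IN_SYNC = "in_sync"      # pinned version matches the expected one
    MISMATCH = "mismatch"    # both sides know the component, versions differ
    MISSING = "missing"      # expected by the repository, absent on the host
    UNTRACKED = "untracked"  # present on the host, unknown to the repository


def classify_drift(expected: dict[str, str], pinned: dict[str, str]) -> dict[str, Drift]:
    """Label every component seen on either side, in alphabetical order."""
    report = {}
    for name in sorted(expected.keys() | pinned.keys()):
        if name not in pinned:
            report[name] = Drift.MISSING
        elif name not in expected:
            report[name] = Drift.UNTRACKED
        elif expected[name] == pinned[name]:
            report[name] = Drift.IN_SYNC
        else:
            report[name] = Drift.MISMATCH
    return report


def ready_for_sign_off(report: dict[str, Drift]) -> bool:
    """Sign-off is allowed only when no component has drifted."""
    return all(status is Drift.IN_SYNC for status in report.values())


# Hypothetical sample data; component names and versions are illustrative.
expected = {"web": "1.4.0", "worker": "2.1.0", "proxy": "0.9.3"}
pinned = {"web": "1.4.0", "worker": "2.0.5", "cron": "0.3.0"}

report = classify_drift(expected, pinned)
for name, status in report.items():
    print(f"{name}: {status.value}")
print("ready for sign-off:", ready_for_sign_off(report))
```

Classifying drift before deciding anything keeps the report useful on its own: a version mismatch, a missing component, and an untracked one usually call for different follow-up.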

Run cluster

Product Idea Intake Coach burst (Mar 19, 2026)

Seven related runs were logged together, so the page shows a repeatable pattern instead of one isolated success.

Why it matters

Multiple related runs are visible as one cluster, making variation and rerun behavior legible.

Product agents · Success · Cluster from Mar 19, 2026

Visible failure

Product Idea Intake Coach run (Feb 24, 2026)

Failures stay visible here so the public lab reads like a proof surface, not only a highlight reel.

Why it matters

A failed run remains reviewable in the same system instead of being edited out after the fact.

Product agents · Failed · Visible since Feb 24, 2026

Upcoming experiments

What the lab is exploring next

A short preview of what may become the next public experiment, kept lighter than the live proof surface above.

Web Discovery

Accessibility and performance baseline

A public check on whether the site stays fast, clear, and accessible as the operating surface keeps growing.

Web Discovery · In exploration

Web Discovery

Systems story for the control plane

A clearer public walkthrough of how the website, VPS, approvals, and operator tooling fit together.

Web Discovery · Under evaluation

Agent Discovery

Multi-agent pilot

A closer look at whether structured handoffs between roles produce better outcomes than a single-pass assistant flow.

Agent Discovery · In exploration

Coverage at a glance

Where the current proof surface has depth

The public lab should show breadth across web, agent, and ops work before anyone opens the full ledger.

Ops / VPS

4

Latest packaged proof Apr 17, 2026

Web Discovery

4

Latest packaged proof Mar 20, 2026

Field Patterns

2

Latest packaged proof May 5, 2026

Agents Discovery

2

Latest packaged proof Apr 17, 2026

Experiment ledger

Active probes and outcomes

The ledger keeps public-safe proof, representative run clusters, and next moves visible without exposing private transcripts.

Loaded 5 of 15 grouped cards (480 total runs)

Available history starts Feb 21, 2026

Need a closer read?

Ask for the proof trail behind a specific system.

Pume can walk through the public surface, the operating constraints, and the milestones that still remain private.