Merged
2 changes: 1 addition & 1 deletion .claude-plugin/plugin.json
@@ -1,7 +1,7 @@
{
"name": "ooda-loop",
"displayName": "OODA-loop",
- "version": "1.11.0",
+ "version": "1.12.0",
"description": "An autonomous operations layer for your live side project. It watches, re-orients from which PRs you merge and reject, and opens small revertible PRs — bounded by a HALT file, protected paths, and a hard cost cap. Built on Boyd's OODA loop. You stay in command.",
"author": {
"name": "Taeil Ma",
23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,29 @@ independently. Bump there signals migration work for downstream projects.

---

## [v1.12.0] — 2026-06-21

### Added — capture fidelity (the feedback-coverage fix)

Dogfooding v1.11.0 on the f1 game (10 research-grounded grade→ground→regrade
rounds, honest re-grade 0.31 → ~0.41 stills / ~0.58 true) surfaced the next maze
variant: **measurable improvement is bounded by FEEDBACK FIDELITY.** The 5-G
critic captured ONE low-speed first-person cockpit frame (the car kept stopping at
the start gantry), so a chase camera (where the clearcoat hero car + HDRI
reflections live), the HDRI sky, speed motion-blur/FOV, and slip-angle physics —
all real, gated, merged — were INVISIBLE to the grader. The grade stalled and the
critic kept steering effort toward only-what-was-in-that-one-frame.

- **`capture_states` (evolve 5-G + config).** A dimension is only gradeable in the
state(s) where it MANIFESTS; capture each dimension by driving the artifact INTO
those states (chase view for paint, high-speed for sense-of-speed, pitched-up for
sky) and pass the critic ALL frames. A state that can't be reached is a
capture_failure (null), never a low score — don't grade a dimension from a frame
that structurally can't show it.

Spec/config only (verify.py unchanged at 64). plugin 1.11.0→1.12.0.
Builds on v1.8.0 per-dimension `capture_method` + v1.11.0 reference grounding.

## [v1.11.0] — 2026-06-21

### Added — Research-Grounded OODA (the anti-maze methodology)
2 changes: 2 additions & 0 deletions config.example.json
@@ -281,6 +281,8 @@
"name": "visual_fidelity",
"weight": 0.25,
"capture_method": "screenshot",
"__capture_states_doc__": "v1.12.0 (feedback fidelity) — a dimension is only gradeable in the state(s) where it MANIFESTS; a single default frame under-credits + misdirects the loop. List the states to drive the artifact into before capturing; the critic gets ALL frames. The f1 probe earned this: a chase camera / HDRI sky / speed blur / slip-angle physics were all added but every capture was one low-speed cockpit frame, so the work was invisible and the grade stalled. A state that can't be reached = capture_failure (null), not a low score.",
"capture_states": ["chase view, car stationary on a straight", "chase view at ~80% top speed", "camera pitched up to show the sky/horizon"],
"description": "3D visual quality vs SHIPPED games. Score against the reference anchors, not the artifact's past.",
"reference": {
"score_0.10": "flat-shaded primitive meshes, solid-colour sky, no shadows/post-processing (a 1990s look)",
20 changes: 19 additions & 1 deletion skills/evolve/SKILL.md
@@ -1908,7 +1908,25 @@ for dim in rubric.dimensions:
if not independent: dim_artifact[dim] = null (capture_failure); continue
dim_artifact[dim] = run_protected_harness(dim.gameplay_metrics_command) -- metrics JSON
else if method == "screenshot":
- dim_artifact[dim] = the shared screenshot (captured once)
-- CAPTURE COVERAGE (v1.12.0 — feedback fidelity). A SINGLE default frame
-- under-credits and MISDIRECTS: a dimension is only gradeable in the state(s)
-- where it MANIFESTS. The f1 probe proved this — 10 grounded rounds added a
-- chase camera (where the clearcoat hero car + HDRI reflections live), real
-- HDRI sky, speed motion-blur/FOV, and slip-angle physics, but every capture
-- was one LOW-SPEED COCKPIT frame facing a wall, so NONE of that work was
-- visible to the critic → the still-grade stalled (~0.41) while the true
-- quality (~0.58) was far higher, and the critic kept steering effort toward
-- only-what-was-in-that-frame. Rule: capture each dimension in the state(s)
-- where it shows. dim.capture_states (list) enumerates them, e.g.
-- visual_paint: ["chase_view at speed 0", "chase_view at speed 0.8"]
-- sense_of_speed:["cockpit at speed 0.9 on a straight"]
-- sky/lighting: ["camera pitched up to the horizon"]
-- Drive the artifact INTO each state (set the view/speed/scenario) before the
-- frame; pass the critic ALL of a dimension's frames. A dimension whose state
-- can't be reached is a CAPTURE_FAILURE (null), NOT a low score — never grade
-- a dimension from a frame that structurally cannot show it.
states = dim.capture_states or rubric.capture_states or ["default"]
dim_artifact[dim] = [ capture_in(state) for state in states ] -- 1+ frames
else:
dim_artifact[dim] = run dim.capture_command (or rubric.capture_command)
-- If a dimension's harness is MISSING entirely, score it null (capture_failure) +
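The capture-coverage rule in the `skills/evolve/SKILL.md` hunk (fall back from `dim.capture_states` to `rubric.capture_states` to `["default"]`, capture one frame per state, and record a capture failure as null rather than a low score) could be sketched in Python roughly as follows. This is an illustration, not the plugin's implementation: `capture_frames`, the `Frame` alias, and the `capture_in` callback signature are hypothetical names. The sketch also assumes that a single unreachable state fails the whole dimension; the spec does not say how a partially reachable state list should be handled.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical stand-in for whatever a real screenshot capture returns.
Frame = bytes


def capture_frames(
    dim_states: Optional[Sequence[str]],
    rubric_states: Optional[Sequence[str]],
    capture_in: Callable[[str], Optional[Frame]],
) -> Optional[List[Frame]]:
    """Capture one frame per state, or return None on capture failure.

    capture_in(state) is assumed to drive the artifact into the named
    state and return a frame, or None when the state cannot be reached.
    """
    # Same fallback order as the spec: dimension, then rubric, then default.
    states = list(dim_states or rubric_states or ["default"])

    frames: List[Frame] = []
    for state in states:
        frame = capture_in(state)
        if frame is None:
            # Assumption: any unreachable state makes the dimension a
            # capture_failure (null), never a low score.
            return None
        frames.append(frame)
    return frames
```

A caller would pass every returned frame to the critic and skip grading the dimension when the result is `None`.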