feat(v1.8.0): drive quality to good — fix thrashing guard, per-dimension capture, dimension-lock, remainder by mataeil · Pull Request #72 · mataeil/OODA-loop

mataeil · 2026-06-19T06:05:31Z

The F1 probe stayed crap after three v1.7.0 leaps (artifact 0.394 → 0.447 → 0.472 → 0.522, never reaching bar 0.65). A 13-agent adversarially-verified diagnosis (.claude/ooda-evolution-v1.8.0.md) found the loop detects a quality gap but isn't built to close it.

Root cause

~45% of the rubric is frozen — driving_feel + fun_challenge were scored from a still screenshot (unmeasurable) and sat unchanged across all 25 cycles.
Settles for +0.05 and rotates targets instead of driving one dimension to bar.
Thrashing guard silently broken — read a nonexistent leap_delta field → fails always 0 → HALT never fired → could thrash forever.
Accepts partial implementation of its own leap plans (leap 3's materials/lighting silently vanished).

The devil's-advocate agent confirmed the leap routing is fine — the binding constraint is perception, not more leap machinery.

The 4 fixes (ranked)

Thrashing-guard bug fix (prerequisite) — count leap_attempts[].delta_score on leap_target (rubric_score.failed_leaps()).
Per-dimension capture_method (5-G) — experiential axes use a human-authored, hash-verified, protected gameplay_metrics harness; missing → null + skill_gap, never faked.
Dimension lock until bar (2-G) — keep leaping the same below-bar target (rubric_score.lock_target(), config.leap.lock_until_bar).
Auto-queue remainder (5-G) — critic-driven, so dropped scope can't be orphaned.

Rejected (devil's-advocate-validated): raising the bar to 0.80 yet; an inner refine loop; an LLM-component-coverage gate; multi_probe.

Validation (leap 4, separate game PR)

Ran a materials/lighting leap under v1.8.0: visual_fidelity 0.59 → 0.63, shipping the previously-dropped shadows/tone-mapping. It hit the screenshot-critique ceiling (~0.63) — proving the bottleneck has moved to perception. The loop's next target (fun_challenge) is unmeasurable by screenshot, so the fixed guard now HALTs requesting a human metrics harness instead of thrashing.

tests/verify.py 59 → 61. plugin 1.7→1.8, config schema 1.3→1.4.

🤖 Generated with Claude Code

…nsion capture, dimension-lock, remainder The F1 probe stayed crap after 3 v1.7.0 leaps (artifact 0.394→0.447→0.472→0.522, never reaching bar 0.65). A 13-agent adversarially-verified diagnosis found the loop DETECTS a quality gap but isn't built to CLOSE it. - 2-G thrashing-guard BUG FIX: counted a nonexistent `leap_delta` on weakest_dimension → fails was ALWAYS 0, HALT never fired. Now counts leap_attempts[].delta_score on leap_target (rubric_score.failed_leaps()). - 5-G per-dimension capture_method: experiential axes (driving_feel + fun_challenge = 45% of weight, frozen across all 25 cycles) use a human-authored, hash-verified, protected gameplay_metrics harness; missing → null + skill_gap, never a faked/silent-screenshot score. - 2-G dimension lock until bar: a successful leap below bar keeps the plateau on the SAME target (drive-to-bar, not detect-and-nudge+rotate). lock_target(); config.leap.lock_until_bar; tolerance band + working max-attempts HALT. - 5-G auto-queue remainder: a gate-passing leap still below bar queues a high-RICE remainder, triggered by the independent critic (not self-report). Rejected: raising bar to 0.80 yet, inner refine loop, LLM-coverage gate, multi_probe. verify.py 59 → 61. plugin 1.7→1.8, config schema 1.3→1.4. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ttleneck moved to perception Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

mataeil and others added 2 commits June 19, 2026 14:55

docs(v1.8.0): fill validation — leap 4 hit the screenshot ceiling; bo…

47c958a

…ttleneck moved to perception Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

mataeil merged commit 4fd791c into main Jun 19, 2026
2 checks passed

mataeil deleted the feat/v1.8.0-quality-loop branch June 19, 2026 06:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(v1.8.0): drive quality to good — fix thrashing guard, per-dimension capture, dimension-lock, remainder#72

feat(v1.8.0): drive quality to good — fix thrashing guard, per-dimension capture, dimension-lock, remainder#72
mataeil merged 2 commits into
mainfrom
feat/v1.8.0-quality-loop

mataeil commented Jun 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mataeil commented Jun 19, 2026

Root cause

The 4 fixes (ranked)

Validation (leap 4, separate game PR)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant