
[codex] add theory-modeling vertical#17

Draft
FuZhiyu wants to merge 39 commits into main from superRA-model-skill


@FuZhiyu
Owner

@FuZhiyu FuZhiyu commented Apr 22, 2026

Summary

Adds a first-class superRA:theory-modeling domain vertical for mathematical-modeling work, organized around intuition and interpretability as the through-line:

  • New skill skills/theory-modeling/ with a restructured Iron Law ("NO MANIPULATION WITHOUT DEFINED OBJECTS, INTERPRETABLE ASSUMPTIONS, AND STATED INTUITION") and a four-gate shared checklist — Objects & Notation, Assumptions, Derivations, Verification & Rendering.
  • Stage-scoped references planning.md, integrate-drift-tests.md, integration.md wiring the vertical into the PLAN → IMPLEMENT → INTEGRATE workflow.
  • Discovery + runtime wiring: skills/using-superRA/SKILL.md manifest add-on, skills/planning-workflow/SKILL.md routing, skills/refactor-and-integrate/SKILL.md integration pointer, skills/handoff-doc/references/plan-anatomy.md generalization, hooks/exit-plan-mode reminder, .agents/skills/theory-modeling symlink, and tests/check-harness-compatibility.sh assertions.
  • Notation discipline: narrative-order-of-introduction rule; PLAN.md's Notation Conventions table is authoritative and must be inline-edited atomically with any new symbol used in derivation.
  • Intuition/interpretability blocking gates: every symbol carries stated intuition; every assumption carries a plain-language interpretation a researcher can defend; scattered weak assumptions should be synthesized into stronger interpretable primitives when a clearly cleaner synthesis is available; every non-trivial derivation step carries a one-sentence reason.
  • Docs: README.md Domain Skills table, skills/CATEGORIES.md, CLAUDE.md updated to describe the vertical as implemented.
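The narrative-order-of-introduction rule in the Notation discipline bullet can be illustrated with a small check. This is a minimal sketch, not the skill's actual tooling: the function name and the `(introduces, uses)` block representation are assumptions made for illustration only.

```python
def symbols_used_before_introduction(blocks):
    """Return (block_index, symbol) pairs for symbols used too early.

    `blocks` is a hypothetical representation of a document in narrative
    order: each entry is a pair (introduces, uses) of symbol collections.
    A symbol introduced by a block counts as defined within that block.
    """
    introduced = set()
    violations = []
    for index, (introduces, uses) in enumerate(blocks):
        introduced |= set(introduces)
        # Sort for a deterministic report order.
        for symbol in sorted(uses):
            if symbol not in introduced:
                violations.append((index, symbol))
    return violations


# Made-up example: "gamma" appears in block 1 but is only introduced in block 2.
example_blocks = [
    ({"beta"}, {"beta"}),
    (set(), {"beta", "gamma"}),
    ({"gamma"}, {"gamma"}),
]
early_uses = symbols_used_before_introduction(example_blocks)
```

An empty result would mean every symbol was introduced at or before its first use; a non-empty one names the block where the ordering breaks.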

Workflow trail

Tasks 1–6 all carry Review status: APPROVED and Integration status: APPROVED per in-branch review cycles. The post-sync integration review was re-run against current main (BASE 886fda8): all six tasks are APPROVED on the four-gate restructure, with two stale-row currency fixes (README.md, skills/CATEGORIES.md) landed in commit d6fc9ce. Sync APPROVED via semantic-merge workflow-sync mode.

Validation

  • tests/check-harness-compatibility.sh — passes on rebased HEAD.
  • Full drift-test suite — skipped per 2026-04-22 user decision (no results-bearing code; skill/workflow changes only).

Deferred

  • Document phase (RESULTS.md maturation + relocation under docs/plans/) deferred per 2026-04-24 user decision; can land as a follow-up commit on this branch if reviewers need the permanent record before merge.

🤖 Generated with Claude Code

FuZhiyu and others added 30 commits April 23, 2026 18:52
Existing Define §"defined before first use" check was weak on narrative
ordering (a symbol could appear in algebra before its introducing
paragraph) and silent on how implementers should evolve the canonical
PLAN.md Notation Conventions table once planning is done. Task 4 closes
both gaps: strengthens the ordering check in SKILL.md and adds an
explicit inline-edit rule requiring the Notation Conventions table be
updated BEFORE any newly introduced symbol appears in a derivation.

Rolls back Execution-complete milestone and logs the researcher's
scope decision in ## Decisions per handoff-doc User Decisions Log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Strengthen the Define ordering check to require narrative-order
introduction (no symbol appears before the paragraph/table that
introduces it) and name PLAN.md's Notation Conventions table as
the authoritative cross-task source for reused symbols.

Add a new Documentation and handoff [BLOCKING] item requiring
implementers to inline-edit the Notation Conventions table before
using any newly introduced symbol in algebra, committed atomically
with the derivation work. Mirror the update mechanism in
references/planning.md Principles so the table reads as a living
record, not a one-time planning artifact. Add one row to Common
Rationalizations for the late-update failure mode.

Minimum net diff: only the one pre-existing Define bullet is
rewritten; one bullet is inserted into Documentation and handoff;
one Common Rationalizations row is appended; one Principles bullet
is added in the planning reference.
All four tasks APPROVED; tests/check-harness-compatibility.sh passes
(including Codex generated-agent sync). Integration REVISE items on
Tasks 2 and 3 are pre-existing and unaffected by Task 4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ents

Rebase onto current main inherited main's data-analysis-only direct-mode
reference files. Regenerate from the rebased `agents/implementer.md` and
`agents/reviewer.md` so the Domain Discipline section covers both data
analysis and theory/modeling in line with the theory-modeling vertical.
- Log the 2026-04-23 rebase-onto-main decision in the Decisions section
  (dropped 34 objective-first commits; kept 17 theory-modeling commits).
- Update Project Conventions HEAD pointer from the removed `7d8f123`
  to the current base `b6e0640`.
- Remove the stale Tasks 2 and 3 integration review-notes blockquotes:
  both flagged scope creep from the now-dropped objective-first commits
  (missing files and release-ledger entry no longer exist on the
  rebased tree). Reset Integration status on both tasks to the
  pre-integration default so Phase B re-runs against the new base.
- Refresh RESULTS.md header with the post-rebase status.
Cumulative integration review of the rebased theory-modeling branch (b6e0640..HEAD) against current main. All four tasks' integration status flipped to APPROVED: the cumulative diff is tightly scoped to theory-modeling wiring with no residue from the dropped objective-first commits, the derived-artifact coherence checks pass (codex custom agents and direct-mode refs both in sync with agents/*.md), harness compatibility passes end to end, and PLAN.md + RESULTS.md accurately describe the rebased-branch reality.
Phase B integration reviewer returned APPROVE on the cumulative
b6e0640..HEAD diff (all four tasks' Integration status flipped to
APPROVED in e0c18bb). Flip the `Refactored` workflow milestone
accordingly, and fix the minor wording nit the reviewer flagged in
the 2026-04-23 rebase decision entry (three → four conflict stops,
matching the four resolved areas listed).
…lity

Add Tasks 5 and 6. Task 5 rewrites the Iron Law to name stated
intuition, replaces the Define-Derive-Validate section with a
four-gate structure (Objects & Notation / Assumptions / Derivations /
Verification & Rendering), and adds BLOCKING items for new-notation
intuition, assumption interpretability, assumption synthesis,
reason-per-derivation-step, and economic interpretation of limiting
cases. Task 6 mirrors the framing into references/planning.md
(Interpretation column on the Assumptions table, non-optional "Why this
notation", intuition-failure mistakes and red flags) and
references/integration.md (BLOCKING items requiring that intuition
artifacts survive refactor).

Roll back the project-level Refactored milestone so Phase B re-runs
once Task 6 lands. Tasks 1-4 remain APPROVED because Task 4's
narrative-ordering and atomic Notation-Conventions-update rules compose cleanly
under the new structure.

Researcher feedback: intuition and interpretability are the top
modeling requirement and were treated as implicit in the D-D-V frame;
the frame itself was a mechanical mirror of describe-analyze-validate.
…etability

Replace Define-Derive-Validate with a four-gate structure (Objects &
Notation / Assumptions / Derivations / Verification & Rendering) built
around the reader's trust chain. Rewrite the Iron Law to name stated
intuition and interpretable assumptions alongside defined objects. Add
five new BLOCKING items (intuition/mnemonic per new symbol,
plain-language assumption interpretation, prefer synthesis over
scattered weak restrictions, reason per derivation move, economic
interpretation of limiting cases) and four Common Rationalizations rows
for the matching failure modes. All prior load-bearing BLOCKING /
ADVISORY items relocated verbatim under the new gates; Task 4's
narrative-ordering rule and atomic Notation-Conventions-table-update
rule preserved intact.
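
The four-gate structure with BLOCKING and ADVISORY items described above could be modeled roughly as follows. This is only a sketch: the gate names come from the commit message, but the `Item` class, the `verdict` function, and the rule that any unmet BLOCKING item yields REVISE are assumptions, not the skill's documented protocol.

```python
from dataclasses import dataclass

# Gate names as listed in the commit message.
GATES = (
    "Objects & Notation",
    "Assumptions",
    "Derivations",
    "Verification & Rendering",
)


@dataclass(frozen=True)
class Item:
    """One checklist item (hypothetical shape)."""
    gate: str
    text: str
    blocking: bool   # True for BLOCKING, False for ADVISORY
    satisfied: bool


def verdict(items):
    """Assumed rule: any unmet BLOCKING item means REVISE, else APPROVE."""
    for item in items:
        if item.gate not in GATES:
            raise ValueError(f"unknown gate: {item.gate}")
    unmet = [item for item in items if item.blocking and not item.satisfied]
    return ("REVISE", unmet) if unmet else ("APPROVE", [])


# Made-up checklist state for illustration.
checklist = [
    Item("Assumptions", "plain-language interpretation stated", True, False),
    Item("Derivations", "reason given per derivation move", True, True),
    Item("Objects & Notation", "mnemonic is memorable", False, False),
]
status, unmet_items = verdict(checklist)
```

Under this assumed rule, an unmet ADVISORY item never changes the verdict on its own; only BLOCKING items do.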
…ng planning and integration references

Task 5 restructured the SKILL.md around four gates and added five new
[BLOCKING] items on intuition, assumption interpretability, assumption
synthesis, per-step reason, and economic interpretation of special
cases. The stage-scoped references still reflected the old frame.

planning.md now records interpretability at planning time. The
Assumptions table gains an Interpretation column with one concrete
example row showing the target one-short-phrase shape. The Notation
Conventions section explicitly requires "Why this notation" for
non-conventional symbols. A new Principles bullet flags
interpretability as blocking and synthesis as preferred, pointing at
SKILL.md rather than restating its checklist. Common mistakes and
Red Flags gain the three intuition-failure modes the implementer must
not defer.

integration.md §"Derivation discipline preserved through refactoring"
gains three [BLOCKING] items verifying each new-symbol intuition,
assumption interpretation, and per-step reason present in the
original work survives refactor, pointing at the matching SKILL.md
gate.

Cleans up the two remaining Define-Derive-Validate mentions:
planning.md §Handoff to Implementation now names the intuition +
interpretability + stated reason through-line, and integration.md's
verdict-protocol pointer now references ## The Four Gates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
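
The Interpretation column added to the Assumptions table lends itself to a mechanical completeness check. The sketch below assumes a pipe-delimited Markdown table with an identifier in the first column; only the "Interpretation" column name comes from the commit message, while the function name, the other column names, and the sample rows are made up.

```python
def rows_missing_interpretation(table_markdown):
    """Return first-column IDs of rows whose Interpretation cell is empty.

    Assumes a Markdown table: header row, separator row, then data rows.
    """
    lines = [ln.strip() for ln in table_markdown.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    column = header.index("Interpretation")
    missing = []
    for line in lines[2:]:  # skip the header and separator rows
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # A short row or an empty cell both count as missing.
        if column >= len(cells) or not cells[column]:
            missing.append(cells[0])
    return missing


# Hypothetical table; row A2 has no interpretation yet.
sample_table = """
| ID | Assumption | Interpretation |
|----|------------|----------------|
| A1 | u is concave | agents dislike risk |
| A2 | beta < 1 | |
"""
incomplete = rows_missing_interpretation(sample_table)
```

A check like this only confirms that a cell is filled in; whether the interpretation is one a researcher can defend still needs a human reviewer.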
Step 4 completion menu answered after Tasks 5 and 6 APPROVED —
dispatch integration-workflow to re-run Phase B on the restructured
skill, then docs finalization, then merge/PR. Drift tests remain
skipped per the 2026-04-22 decision.
Merge main's 132-commit tighten-integration-rules restructure into the
theory-modeling branch. Main split semantic-merge into core + three
mode references, split result-protection out of refactor-and-integrate,
folded codebase-integration and merge-quality content into the new
owning skills, and reoriented refactor-and-integrate around minimum net
diff. All theory-modeling artifacts survive; cross-skill pointers in
theory-modeling/references/integrate-drift-tests.md and integration.md
were retargeted to the new locations
(result-protection/references/drift-test-quality.md;
refactor-and-integrate/SKILL.md). Agent bodies adopted main's shorter
teach-the-protocol wording — theory-modeling routing is preserved
through the manifest. tests/check-harness-compatibility.sh updated to
read the surviving files.

Sync Map and task-local Sync impact annotations recorded in PLAN.md.
Add 872c4d8 (the SHA-recording commit) to the Sync Map's Sync commits
field so the chain reflects every commit this sync landed, per
semantic-merge §workflow-sync-author ("update Sync commits to list the
full commit chain this mode landed"). Pure bookkeeping; no code or skill
edits. All tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FuZhiyu and others added 9 commits April 24, 2026 16:08
…t mismatch

Record sync-review REVISE with one MAJOR and two MINOR findings against
the main-restructure sync cluster: Task 6's target integration.md was
touched by the merge but has no task-local Sync impact annotation; Task
5 is listed under "affects Tasks" without a corresponding Sync impact
or diff evidence; and the Sync commits field omits the chain-extension
bookkeeping commit itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…from cluster, reject chain self-reference finding

Address sync-review findings:
- [MAJOR] Add Task 6 **Sync impact:** pointing at the two sync-origin hunks
  in skills/theory-modeling/references/integration.md (preamble retarget +
  verdict-protocol trailer) so the integration reviewer can classify them
  as sync-origin, not task work.
- [MINOR] Drop Task 5 from the main-restructure cluster's affects list;
  sync diff did not touch Task 5's target SKILL.md and no task-local
  pointer is warranted.
- [MINOR] Reject — self-reference cliff is accepted; chain now lists
  4b3f9c1, 872c4d8, ecdd09d and stops, matching the
  tighten-integration-rules precedent.

Flip Sync review status to IMPLEMENTED for narrow re-review.
…README/CATEGORIES; Tasks 1/3/5/6 APPROVED

Post-sync integration review of the theory-modeling branch. Task 2
flagged REVISE for two `[BLOCKING]` Documentation Currency findings:
`README.md:67` and `skills/CATEGORIES.md:25` still describe the
theory-modeling flagship discipline as `Define–Derive–Validate`,
which was superseded by the four-gate structure in the 2026-04-23
restructure (Tasks 5 and 6). Task 2 added those rows in commit
`31bb0c2` and they were never refreshed when Task 5 landed, so
contributor/runtime docs disagree with `using-superRA/SKILL.md:56`
and `theory-modeling/SKILL.md`. Tasks 1, 3, 5, 6 APPROVED with
dates stamped; Task 4 unchanged (no sync impact).
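
Stale-wording findings like the two above can be caught by scanning for the superseded phrase. The following is a minimal sketch under stated assumptions: the function name and the in-memory sample texts are invented for illustration and do not reproduce the real README.md or skills/CATEGORIES.md contents.

```python
import re

# Matches the superseded name with either en dashes or hyphens.
STALE_PHRASE = re.compile(r"Define\s*[–-]\s*Derive\s*[–-]\s*Validate")


def find_stale_rows(files):
    """Return (path, line_number) for each line containing the stale phrase.

    `files` maps a path to its text; line numbers are 1-based.
    """
    hits = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if STALE_PHRASE.search(line):
                hits.append((path, number))
    return hits


# Made-up file contents for illustration only.
sample_files = {
    "README.md": "# Domain Skills\n| theory-modeling | Define–Derive–Validate |\n",
    "skills/CATEGORIES.md": "| theory-modeling | four gates |\n",
}
stale_hits = find_stale_rows(sample_files)
```

In practice the mapping would be built by reading the tracked documentation files, and the pattern would be updated whenever a discipline is renamed.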
…ne rows to four-gate framing

Addresses the two accepted integration-review findings (stale Define–Derive–Validate wording in README.md:67 and skills/CATEGORIES.md:25) by rewriting the theory-modeling row in both tables to match the four-gate structure already authoritative in skills/using-superRA/SKILL.md and skills/theory-modeling/SKILL.md.

Also records Final Diff Self-Check trails against 886fda8..HEAD inside each in-scope task block (Tasks 1, 2, 3, 5, 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… verified, all in-scope tasks APPROVED

Narrow re-review of commit d6fc9ce against 886fda8..HEAD:

- Task 2 items 1 and 2 (stale Define-Derive-Validate rows in README.md:67 and
  skills/CATEGORIES.md:25) verified fixed: both rows now use the four-gate
  framing (Objects & Notation, Assumptions, Derivations, Verification &
  Rendering) consistent with skills/using-superRA/SKILL.md:56 and
  skills/theory-modeling/SKILL.md. Row shape matches the adjacent
  econ-data-analysis row in both tables. Removed the now-empty
  Integration review notes blockquote and flipped Task 2 to
  Integration status: APPROVED (post-sync re-review 2026-04-24).
- Final Diff Self-Check trails on Tasks 1, 2, 3, 5, 6 each name the
  governing command/range (git diff 886fda8..HEAD -- <paths>), record
  surviving-change classes (task objective, Sync impact, row rewrites
  this round), and state no suspicious hunks. Task 4 is out of scope
  (pre-sync APPROVED) and correctly carries no trail.
- Branch-wide governing-diff sweep over git diff 886fda8..HEAD found no
  stray hunks — every touched file traces to a task objective, Sync
  impact, or generated-artifact regeneration.

Overall verdict: APPROVE. All six task blocks now carry
Integration status: APPROVED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k-local Sync impact

Integration reviewer APPROVED all tasks on narrow re-review (commit 09048bc).
Per integration-workflow §Integrate Step 5, close out Integrate by removing
the temporary Sync Map section and task-local `**Sync impact:**` fields
(no lasting task assumptions to preserve), and flipping the Refactored
milestone in §Workflow Status.
User elected to open the PR before running the Document-phase doc-writer.
RESULTS.md maturation and permanent relocation under docs/plans/ can land
as a follow-up on the PR branch if reviewers need the record before merge.
@FuZhiyu FuZhiyu force-pushed the superRA-model-skill branch from 16dcfe7 to e70a68e on April 24, 2026 22:47
