
Checkpoint config-driven env GRPO eval work #92

Draft
ProfSynapse wants to merge 3 commits into main from codex/eval-config-driven-assertions


@ProfSynapse
Owner

Summary:

  • Adds the current workspace multistep GRPO projection and refreshed lean SFT datasets/config.
  • Adds workspace multi-turn eval scenario coverage plus focused tests for agentic loops, environment search, response scoping, and stage gates.
  • Adds generic environment execution/scoring support for configured tool aliases so path scoring can match schema-facing commands without hardcoding a toolset.
  • Adds env generation diagnostics, SFT prompt alignment migration, local GRPO image, and PEFT merge helper.
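The tool-alias point above can be illustrated with a minimal sketch. This is not the code from this PR: `canonicalize`, `path_score`, and the `ALIASES` mapping are hypothetical names, and the in-order matching rule is an assumption about how path scoring might work once aliases come from config rather than a hardcoded toolset.

```python
from typing import Mapping, Sequence


def canonicalize(tool_name: str, aliases: Mapping[str, str]) -> str:
    """Map a schema-facing tool name to its configured canonical name.

    Names without a configured alias pass through unchanged.
    """
    return aliases.get(tool_name, tool_name)


def path_score(
    expected: Sequence[str],
    actual: Sequence[str],
    aliases: Mapping[str, str],
) -> float:
    """Return the fraction of expected steps matched, in order, after alias resolution."""
    if not expected:
        return 1.0
    exp = [canonicalize(name, aliases) for name in expected]
    act = [canonicalize(name, aliases) for name in actual]
    matched = 0
    for name in act:
        # Advance only when the next expected step appears; extra calls are ignored.
        if matched < len(exp) and name == exp[matched]:
            matched += 1
    return matched / len(exp)


# Hypothetical alias config: schema-facing command -> canonical tool name.
ALIASES = {"searchContent": "search", "readFile": "read"}

score = path_score(
    expected=["search", "read"],
    actual=["searchContent", "listDir", "readFile"],
    aliases=ALIASES,
)
print(score)
```

Because the alias table is plain data, the scorer itself stays toolset-agnostic: swapping environments would only require a different mapping in config, under the assumptions stated above.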

Validation:

  • python -m pytest tests/shared/test_agentic_loop.py tests/shared/test_local_environment_search.py tests/shared/test_workspace_multiturn_scenarios.py tests/synthchat/test_agentic_episode_messages.py tests/synthchat/test_response_scope_message_selection.py tests/synthchat/test_stage_gates.py
  • python .skills/scripts/sync_skill_trees.py --check
  • git diff --check

ProfSynapse and others added 3 commits May 5, 2026 12:48
Snapshot of in-progress work prior to merging origin/main (recipe system).
To be reorganized into proper commits later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ven-assertions

# Conflicts:
#	.agents/skills/fine-tuning/SKILL.md
#	.claude/skills/fine-tuning/SKILL.md
#	.skills/fine-tuning/SKILL.md
#	.skills/fine-tuning/reference/cloud-training.md
