feat: consolidate evolution gates stack (#55-#61) #70

Closed
steezkelly wants to merge 14 commits into NousResearch:main from steezkelly:consolidate/55-61-evolution-gates

@steezkelly

Summary

Consolidation of evolution-gates PR stack: #55, #56, #57, #58, #59, #60, #61.
Also supersedes the narrower consolidation PR #67 (which covered #60-#66, now fully absorbed here).

This is a lower-overhead alternative to reviewing 7 overlapping PRs individually. All seven branches were merged locally with conflicts resolved, and the full test suite passes.

What's included (18 files, 1441 additions, 95 deletions vs upstream origin/main)

From #55: ingestion reports and promotion gates

  • evolution/core/benchmark_gate.py (new)
  • evolution/core/dataset_builder.py (updated)
  • evolution/core/external_importers.py (updated)
  • evolution/core/pr_builder.py (new)
  • evolution/core/run_report.py (new)
  • evolution/skills/evolve_skill.py (updated — wiring for all gates)
  • tests/core/test_issue54_ingestion.py (new)
  • tests/core/test_issue54_promotion.py (new)

From #56: DSPy 3.x GEPA construction fix

  • evolution/skills/evolve_skill.py (GEPA optimizer helper)
  • tests/skills/test_evolve_skill_gepa.py (new)

From #57: LLM judge feedback for skill fitness

  • evolution/core/fitness.py (rubric-based LLM judge scoring)
  • evolution/skills/evolve_skill.py (wired into metric)
  • tests/core/test_fitness.py (new)
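
For reviewers who want a mental model of rubric-based judge scoring, here is a minimal sketch. It is not the code in evolution/core/fitness.py. The names rubric_fitness, Rubric, and JudgeFn are hypothetical, and the judge is an injected callable so the example runs without any LLM call.

```python
from typing import Callable, Dict, Tuple

# Hypothetical types: criterion name -> weight, and a judge that returns
# (score in [0, 1], textual feedback) for one criterion.
Rubric = Dict[str, float]
JudgeFn = Callable[[str, str], Tuple[float, str]]


def rubric_fitness(output: str, rubric: Rubric, judge: JudgeFn) -> Tuple[float, str]:
    """Return a weighted mean of per-criterion judge scores plus joined feedback."""
    total_weight = sum(rubric.values())
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")

    weighted_sum = 0.0
    notes = []
    for criterion, weight in rubric.items():
        score, feedback = judge(criterion, output)
        score = min(1.0, max(0.0, score))  # clamp out-of-range judge scores
        weighted_sum += weight * score
        notes.append(f"{criterion}: {feedback}")
    return weighted_sum / total_weight, "\n".join(notes)
```

Returning the feedback text alongside the score matters for reflective optimizers, which can use the judge's comments and not only the number.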

From #58: enforce run-tests evolution gate

  • evolution/core/constraints.py (pytest gate)
  • evolution/skills/evolve_skill.py (wired into pipeline)
  • tests/core/test_constraints.py (updated)
  • tests/skills/test_evolve_skill_gates.py (new)
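
The pytest gate can be pictured roughly as follows. This is a sketch under assumptions, not the implementation in evolution/core/constraints.py. The function name run_pytest_gate and its parameters are hypothetical, and the command is overridable so the gate logic can be exercised without a real test suite.

```python
import subprocess
import sys
from pathlib import Path
from typing import Optional, Sequence


def run_pytest_gate(
    repo_root: Path,
    command: Optional[Sequence[str]] = None,
    timeout_s: int = 600,
) -> bool:
    """Return True only if the test command exits with code 0 inside repo_root."""
    if command is None:
        # Default (assumed): run pytest quietly with the current interpreter.
        command = [sys.executable, "-m", "pytest", "-q"]
    try:
        result = subprocess.run(
            list(command),
            cwd=repo_root,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False  # a hung suite is treated as a failed gate
    return result.returncode == 0
```

Treating a timeout as a failure keeps the gate conservative: an evolved variant is promoted only when the suite positively reports success.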

From #59: reject empty holdout datasets

  • evolution/skills/evolve_skill.py (holdout gate)
  • tests/skills/test_evolve_skill_dataset_gates.py (new)
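
An empty-holdout check of this kind might look like the sketch below. The PR names a helper _require_non_empty_holdout() but does not show its signature, so the function, its parameters, and the EmptyHoldoutError class here are assumptions for illustration only.

```python
from typing import Sequence


class EmptyHoldoutError(ValueError):
    """Raised when the holdout split contains no examples (hypothetical name)."""


def require_non_empty_holdout(holdout: Sequence[object], source: str = "dataset") -> None:
    """Raise before evaluation if there is nothing to evaluate on."""
    if len(holdout) == 0:
        raise EmptyHoldoutError(
            f"{source}: holdout split is empty; refusing to score on zero examples"
        )
```

Without such a check, a score computed over zero holdout examples can look like a valid result while carrying no information.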

From #60: declare reportlab dependency

  • pyproject.toml (reportlab>=4.0)
  • tests/test_generate_report.py (new)

From #61: fail fast on invalid baseline skills

  • evolution/skills/evolve_skill.py (baseline constraint gate)
  • tests/skills/test_evolve_skill_constraint_gates.py (updated)
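
Failing fast on an invalid baseline means validating the unmodified skill before any optimization budget is spent. A minimal sketch, assuming constraints are simple callables that return an error message or None; validate_baseline, check_non_empty, and max_chars are hypothetical names, not the helpers in evolve_skill.py.

```python
from typing import Callable, List, Optional

# Hypothetical constraint type: returns an error message, or None if satisfied.
Check = Callable[[str], Optional[str]]


def check_non_empty(text: str) -> Optional[str]:
    return "skill text is empty" if not text.strip() else None


def max_chars(limit: int) -> Check:
    def check(text: str) -> Optional[str]:
        if len(text) > limit:
            return f"skill text exceeds {limit} characters"
        return None
    return check


def validate_baseline(skill_text: str, checks: List[Check]) -> None:
    """Raise ValueError listing every failed constraint; return None if all pass."""
    failures = [msg for check in checks if (msg := check(skill_text)) is not None]
    if failures:
        raise ValueError("baseline skill failed constraints: " + "; ".join(failures))
```

Collecting all failures before raising lets the author fix the baseline in one pass instead of rerunning once per violated constraint.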

Conflict resolution

Two conflict sites in evolution/skills/evolve_skill.py:

  1. #58 (fix: enforce run-tests evolution gate) vs #56 (fix: update GEPA construction for DSPy 3): both PRs added helper functions in the same region. The resolution keeps _create_gepa_optimizer(), _run_pytest_gate_if_requested(), and _save_failed_variant(), since each is needed independently.

  2. #59 (fix: reject empty holdout datasets) and #61 (fix: fail fast on invalid baseline skills) vs the earlier merges: these PRs added _require_non_empty_holdout(), _require_constraints_pass(), and _validate_baseline_constraints(); all three are preserved.

No other conflicts. Merge order used: #55 → #56 → #57 → #58 → #59 → #60 → #61.

Local test evidence

GitHub reports no configured checks on any of these PRs (statusCheckRollup length 0). Local verification:

Targeted tests (41 tests covering all changed modules):
  41 passed, 11 warnings in ~2.3s

Full suite:
  164 passed, 11 warnings in ~2.4s

Warnings are DSPy deprecation noise (prefix argument in InputField/OutputField) from installed dependency paths only — not from changed code.

Test environment: local venv .venv-review, installed via pip install -e '.[dev]' at commit 74f5da3.

Superseded PRs

PR Status Action
#55 feat: add ingestion reports and promotion gates → close (absorbed here)
#56 fix: update GEPA construction for DSPy 3 → close (absorbed here)
#57 fix: use LLM judge feedback for skill fitness → close (absorbed here)
#58 fix: enforce run-tests evolution gate → close (absorbed here)
#59 fix: reject empty holdout datasets → close (absorbed here)
#60 fix: declare reportlab dependency → close (absorbed here)
#61 fix: fail fast on invalid baseline skills → close (absorbed here)
#67 earlier #60-#66 consolidation → close (fully absorbed here)

The local evidence file is issue-55-61-superstack-local-evidence.md in the fork worktree.

Post-merge

Closes #54 (ingestion and promotion gates slice is now a single reviewable unit).

@steezkelly
Author

Closing — unintentional duplicate of the earlier #68 (already closed). Active development has moved to steezkelly/hermes-agent-self-evolution where PRs #8-#16 have been merged. See https://github.com/steezkelly/hermes-agent-self-evolution for current state.

@steezkelly steezkelly closed this May 9, 2026

Development

Successfully merging this pull request may close these issues.

Implement all-agent session ingestion and promotion gates

1 participant