Skip to content

Latest commit

 

History

History
51 lines (45 loc) · 13.6 KB

File metadata and controls

51 lines (45 loc) · 13.6 KB

Assistant

Findings

Intent

  • Hermes appears intended to be a runtime-only sibling of Codex/OpenCode: exact-prefix ooo or /ouroboros: intercepts backed by shared packaged skills, plus ouroboros setup --runtime hermes that writes Hermes config and installs the shared skill bundle.
  • The installed-wheel packaging side looks sound. uv build succeeded, the wheel contains ouroboros/skills/*, importlib.resources.files("ouroboros.skills") resolved from an installed wheel, and resolve_packaged_codex_skill_path("run") still worked after wheel install via the fallback path. The source-checkout/editable-install path is not sound because default Codex skill resolution falls into src/ouroboros/skills/ instead of the repo-root skills/ tree.

Open Questions

  • I treated “Codex/OpenCode setup no longer registers Claude integration” as intentional, not accidental, because src/ouroboros/cli/commands/setup.py, scripts/install.sh, and the updated tests all agree on that scoping. If users relied on the old side effect, it needs release-note or docs coverage.

Residual Risks

  • No regression test protects the editable-install Codex skill resolution path after the ouroboros.skills packaging change; that gap allowed the critical source-checkout breakage through.
  • No CLI or e2e coverage exists for ouroboros run ... --runtime hermes; that would have caught the first finding.
  • No Hermes test covers recoverable dispatcher fallthrough parity with Codex/OpenCode.
  • No Hermes test covers timeout or hung-subprocess behavior.
  • No Hermes test covers child-env stripping or recursion-depth protection parity with Codex/OpenCode.
  • No Hermes test covers non-mapping or unterminated SKILL.md frontmatter.
  • No installer integration tests cover scripts/install.sh; I only syntax-checked it with bash -n.
  • Hermes config tests cover scalar top-level corruption, but not nested bad shapes like mcp_servers: [] or a malformed existing mcp_servers.ouroboros.
  • Quiet-output tests cover happy-path banner and reasoning stripping, but not malformed partial quiet output, misleading embedded session_id: text, or content that appears after the session marker.
  • No installer test asserts that Hermes installs only skill bundles and excludes package scaffolding such as __init__.py.

Verification

  • Passed: uv sync --dev, ruff check, mypy, targeted Hermes tests, targeted CLI/runtime/provider tests, full tests/unit/orchestrator/, full tests/unit/cli/, uv build, installed-wheel packaging smoke, and a temp-home Hermes setup smoke.
  • Temp-home Hermes smoke behaved well: ~/.ouroboros/config.yaml switched to runtime_backend: hermes while preserving unrelated llm.backend; ~/.hermes/config.yaml preserved unrelated keys while adding mcp_servers.ouroboros; skills landed under ~/.hermes/skills/autonomous-ai-agents/ouroboros/; .claude/mcp.json stayed untouched.
  • Installed-wheel smoke confirmed the new packaging layout directly: the wheel contains ouroboros/skills/* and not ouroboros/codex/skills/*; resolve_packaged_codex_skill_path("run") still works from the installed wheel; and installed-wheel install_hermes_skills() copies the shared bundle, including __init__.py.
  • Not green in targeted editable-install Codex verification: env UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/test_codex_artifacts.py tests/integration/test_codex_skill_smoke.py tests/integration/test_codex_cli_passthrough_smoke.py tests/integration/test_codex_skill_fallback.py -q returned 8 failed, 17 passed, matching the critical packaging regression described above.
  • Not green: ruff format --check.
  • Not fully verified: uv run pytest tests/ --cov=src/ouroboros --cov-report=term-missing -v collected 4529 tests and started passing, but the exec harness never returned a trustworthy terminal summary after the pytest process disappeared. I would rerun that one in a normal shell before merge.
  • Local execution was on Python 3.14.2. python3.12 is present locally, python3.13 is not. The defects above are logic-level, so I would expect them to reproduce on 3.12 and 3.14 equally.

Recommendations

  • Move toward a backend-agnostic shared skill resolver. Hermes and OpenCode already depend on Codex-named helpers, and the validation behavior has drifted.
  • Add a common subprocess runtime base class. Hermes duplicated enough Codex/OpenCode logic to miss recoverable-dispatch handling and Codex’s safer frontmatter guards.
  • Introduce a declarative runtime registry. Backend lists are now duplicated across setup, config, runtime factory, docs, and tests; the broken --runtime hermes path is a direct symptom.
  • Unify installer and config-writer behavior. setup.py and install.sh both encode runtime-specific side effects and merge semantics.
  • Add a dedicated packaging/setup smoke harness covering source checkout, built wheel, and temp-home setup. This change touched wheel packaging, importlib.resources, runtime setup, and home-directory writes, and manual smokes were necessary to validate it.