Findings
- Critical: src/ouroboros/skills/__init__.py, src/ouroboros/codex/artifacts.py, pyproject.toml. Editable-install Codex skill resolution is broken. `_packaged_codex_skills_dir()` first fails to find `ouroboros.codex/skills`, then falls back to the first parent `skills/` directory, which is now `src/ouroboros/skills/` because of the new package marker. That directory contains only `__init__.py`, not the actual skill bundles, so default skill lookup and install flows fail from source checkouts. I reproduced the exact targeted blast radius: `env UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/test_codex_artifacts.py tests/integration/test_codex_skill_smoke.py tests/integration/test_codex_cli_passthrough_smoke.py tests/integration/test_codex_skill_fallback.py -q` returns `8 failed, 17 passed`.
- High: src/ouroboros/cli/commands/run.py, docs/runtime-guides/hermes.md. Hermes is setup/config-valid, but the `run` CLI still rejects `--runtime hermes`. `env HOME=/tmp/ouro-review-home UV_CACHE_DIR=/tmp/uv-cache uv run ouroboros run workflow examples/dummy_seed.yaml --runtime hermes --no-orchestrator` fails with an invalid-value error because the enum/help only list `claude`, `codex`, and `opencode`. This breaks the documented override path; only config-selected Hermes appears usable.
- High: src/ouroboros/orchestrator/hermes_runtime.py, src/ouroboros/orchestrator/runtime_factory.py, src/ouroboros/orchestrator/command_dispatcher.py, src/ouroboros/orchestrator/codex_cli_runtime.py. Hermes reuses `create_codex_command_dispatcher`, but unlike Codex it never recognizes the dispatcher’s recoverable-error tuple as “fall through to the CLI.” Because `execute_task()` returns on any truthy intercept result, exact-prefix skills hard-fail on transient MCP/dispatch errors instead of degrading gracefully. Minimal repro reasoning: feed Hermes the recoverable two-message tuple built by `CommandDispatcher._build_recoverable_failure_messages()`; Hermes yields it and returns, while Codex logs it and returns `None`.
- High: src/ouroboros/orchestrator/hermes_runtime.py, src/ouroboros/orchestrator/codex_cli_runtime.py, src/ouroboros/orchestrator/opencode_runtime.py. Hermes launches the child process with the ambient environment and no `_OUROBOROS_DEPTH` tracking. Codex and OpenCode both strip Ouroboros runtime env vars and enforce a nesting cap before spawning subprocesses; Hermes does neither. That leaves Hermes exposed to recursive Ouroboros->MCP->Hermes re-entry and makes its subprocess behavior inconsistent with the other runtimes.
- High: src/ouroboros/orchestrator/hermes_runtime.py, src/ouroboros/orchestrator/codex_cli_runtime.py, src/ouroboros/orchestrator/opencode_runtime.py. Hermes waits on a single unbounded `process.communicate()` call. Codex and OpenCode enforce startup and idle-output timeouts around stream reads; Hermes has no equivalent protection, so a hung CLI can stall the orchestrator indefinitely.
- Medium: src/ouroboros/orchestrator/hermes_runtime.py, src/ouroboros/orchestrator/codex_cli_runtime.py. Hermes frontmatter parsing is unsafe in two ways: unterminated frontmatter silently becomes `{}`, and valid YAML that is not a mapping crashes on `frontmatter.get(...)`. A `SKILL.md` containing `---\n- not\n- a\n- mapping\n---` raised `AttributeError` in Hermes in a direct repro; the same repro with Codex logged a validation warning and skipped the intercept. This makes corrupted or custom skill bundles a runtime failure mode.
- Medium: docs/runtime-guides/hermes.md, src/ouroboros/cli/commands/setup.py, tests/unit/cli/test_setup.py. The guide says Hermes will be used for “all execution phases,” but the implementation explicitly preserves `llm.backend` because Hermes is runtime-only today. My temp-home smoke confirmed `setup --runtime hermes` keeps unrelated LLM config intact. The guide currently overstates capability.
- Medium: src/ouroboros/hermes/artifacts.py. `install_hermes_skills()` copies the entire resolved source tree into the Hermes bundle root. In the installed wheel that includes `ouroboros/skills/__init__.py`, and I verified that `install_hermes_skills()` places that file under `~/.hermes/skills/autonomous-ai-agents/ouroboros/`. It is harmless clutter rather than a functional blocker, but it shows the installer is copying package scaffolding instead of just skill bundles.
- Medium: src/ouroboros/orchestrator/hermes_runtime.py. `_parse_quiet_output()` truncates content at the first `session_id:` line it finds. Direct repro: `_parse_quiet_output("alpha\nsession_id: 20260414_102135_d38d07\nomega")` returns `("alpha", "20260414_102135_d38d07")`, silently discarding the trailing `omega`. If Hermes ever emits additional text after the session marker, Ouroboros will lose it.
- Low: `ruff format` is not green for `src/ouroboros/orchestrator/hermes_runtime.py`, `tests/unit/cli/test_setup.py`, `tests/unit/hermes/test_artifacts.py`, and `tests/unit/orchestrator/test_hermes_runtime.py`. Repro: `env UV_CACHE_DIR=/tmp/uv-cache uv run ruff format --check src/ tests/`.
- Low: src/ouroboros/hermes/artifacts.py. `install_hermes_skills(prune=...)` accepts a `prune` flag but ignores it entirely. The function always removes and replaces the whole managed bundle, so the parameter is misleading API surface rather than active behavior.
- Low: src/ouroboros/cli/commands/setup.py. `_setup_hermes()` reads and writes `~/.ouroboros/config.yaml` without an explicit `encoding="utf-8"`, unlike the Hermes MCP config path. That inconsistency is low risk, but it should be normalized while touching the code.
- Low: repository hygiene only, not feature logic. Untracked `.codex` is an empty file, and I found untracked `*.pyc` artifacts under `src/ouroboros/hermes/__pycache__/`, `src/ouroboros/skills/__pycache__/`, and `tests/unit/hermes/__pycache__/`. These should not merge.
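
To make the quiet-output truncation finding concrete, here is a minimal sketch of a parser that drops only the marker line and keeps any text after it. It is not the existing `_parse_quiet_output()` implementation: the function name `parse_quiet_output_keep_tail`, the regex, and the rule "the last line consisting solely of `session_id: <token>` is the marker" are all assumptions made for illustration.

```python
from __future__ import annotations

import re

# Assumed marker shape: a whole line of the form "session_id: <token>".
_SESSION_LINE = re.compile(r"^session_id:\s*(\S+)\s*$")


def parse_quiet_output_keep_tail(raw: str) -> tuple[str, str | None]:
    """Split quiet output into (content, session_id) without truncating.

    Assumption: the last line that consists solely of "session_id: <token>"
    is the marker. Only that line is dropped; text on either side is kept.
    """
    lines = raw.splitlines()
    marker_index: int | None = None
    session_id: str | None = None
    for index, line in enumerate(lines):
        match = _SESSION_LINE.match(line)
        if match:
            marker_index, session_id = index, match.group(1)
    if marker_index is None:
        # No marker line at all: return the content unchanged.
        return raw.strip(), None
    kept = lines[:marker_index] + lines[marker_index + 1 :]
    return "\n".join(kept).strip(), session_id
```

With the repro input from the finding, this sketch returns `("alpha\nomega", "20260414_102135_d38d07")`. Because the pattern is anchored to the whole line, prose that merely mentions `session_id:` mid-sentence is not treated as a marker.
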
Intent
- Hermes appears intended to be a runtime-only sibling of Codex/OpenCode: exact-prefix `ooo` or `/ouroboros:` intercepts backed by shared packaged skills, plus `ouroboros setup --runtime hermes` that writes Hermes config and installs the shared skill bundle.
- The installed-wheel packaging side looks sound. `uv build` succeeded, the wheel contains `ouroboros/skills/*`, `importlib.resources.files("ouroboros.skills")` resolved from an installed wheel, and `resolve_packaged_codex_skill_path("run")` still worked after wheel install via the fallback path. The source-checkout/editable-install path is not sound because default Codex skill resolution falls into `src/ouroboros/skills/` instead of the repo-root `skills/` tree.
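
One way the source-checkout fallback could avoid landing on the package stub is to accept a parent `skills/` directory only when it actually contains bundles. The sketch below illustrates that idea; `find_skill_bundle_root` is a hypothetical helper, and the `skills/<name>/SKILL.md` layout is an assumption about how bundles are stored, not something confirmed by the code under review.

```python
from __future__ import annotations

from pathlib import Path


def find_skill_bundle_root(start: Path) -> Path | None:
    """Return the nearest ancestor "skills/" directory that holds real bundles.

    Assumed layout: each bundle is a child directory containing SKILL.md.
    A directory with only __init__.py (a package marker) does not qualify,
    so the walk continues upward past it.
    """
    for parent in [start, *start.parents]:
        candidate = parent / "skills"
        if candidate.is_dir() and any(candidate.glob("*/SKILL.md")):
            return candidate
    return None
```

Called with a start path under `src/ouroboros/codex/`, this would skip `src/ouroboros/skills/` when it holds only `__init__.py` and continue to a repo-root `skills/` tree that contains bundles.
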
Open Questions
- I treated “Codex/OpenCode setup no longer registers Claude integration” as intentional, not accidental, because src/ouroboros/cli/commands/setup.py, scripts/install.sh, and the updated tests all agree on that scoping. If users relied on the old side effect, it needs release-note or docs coverage.
Residual Risks
- No regression test protects the editable-install Codex skill resolution path after the `ouroboros.skills` packaging change; that gap allowed the critical source-checkout breakage through.
- No CLI or e2e coverage exists for `ouroboros run ... --runtime hermes`; that would have caught the first High finding.
- No Hermes test covers recoverable dispatcher fallthrough parity with Codex/OpenCode.
- No Hermes test covers timeout or hung-subprocess behavior.
- No Hermes test covers child-env stripping or recursion-depth protection parity with Codex/OpenCode.
- No Hermes test covers non-mapping or unterminated `SKILL.md` frontmatter.
- No installer integration tests cover scripts/install.sh; I only syntax-checked it with `bash -n`.
- Hermes config tests cover scalar top-level corruption, but not nested bad shapes like `mcp_servers: []` or a malformed existing `mcp_servers.ouroboros`.
- Quiet-output tests cover happy-path banner and reasoning stripping, but not malformed partial quiet output, misleading embedded `session_id:` text, or content that appears after the session marker.
- No installer test asserts that Hermes installs only skill bundles and excludes package scaffolding such as `__init__.py`.
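
As an illustration of what the missing child-env and recursion-depth coverage might look like, here is a small sketch of an environment builder with pytest-style tests. Only the `_OUROBOROS_DEPTH` variable name comes from the findings; `build_child_env`, `NestingLimitError`, the `OUROBOROS_` prefix rule, and the cap of 3 are hypothetical and do not describe how Codex or OpenCode actually implement this.

```python
from __future__ import annotations

from collections.abc import Mapping

DEPTH_VAR = "_OUROBOROS_DEPTH"  # name as reported in the findings
MAX_DEPTH = 3  # hypothetical cap, chosen for illustration
STRIPPED_PREFIX = "OUROBOROS_"  # hypothetical rule for "runtime env vars"


class NestingLimitError(RuntimeError):
    """Raised when spawning another child would exceed the nesting cap."""


def build_child_env(
    parent_env: Mapping[str, str], max_depth: int = MAX_DEPTH
) -> dict[str, str]:
    """Return a child environment with runtime vars stripped and depth bumped."""
    try:
        depth = int(parent_env.get(DEPTH_VAR, "0"))
    except ValueError:
        depth = 0  # treat a corrupted counter as top level
    if depth >= max_depth:
        raise NestingLimitError(f"nesting depth {depth} reached cap {max_depth}")
    child = {
        key: value
        for key, value in parent_env.items()
        if not key.startswith(STRIPPED_PREFIX)
    }
    child[DEPTH_VAR] = str(depth + 1)
    return child


def test_child_env_strips_runtime_vars_and_increments_depth() -> None:
    env = build_child_env({"PATH": "/usr/bin", "OUROBOROS_RUNTIME": "hermes"})
    assert env == {"PATH": "/usr/bin", DEPTH_VAR: "1"}


def test_child_env_refuses_to_exceed_cap() -> None:
    try:
        build_child_env({DEPTH_VAR: "3"})
    except NestingLimitError:
        return
    raise AssertionError("expected NestingLimitError")
```

A parity test of this shape could be parametrized across all subprocess runtimes so that a new backend cannot skip the guard unnoticed.
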
Verification
- Passed: `uv sync --dev`, `ruff check`, `mypy`, targeted Hermes tests, targeted CLI/runtime/provider tests, full `tests/unit/orchestrator/`, full `tests/unit/cli/`, `uv build`, installed-wheel packaging smoke, and a temp-home Hermes setup smoke.
- Temp-home Hermes smoke behaved well: `~/.ouroboros/config.yaml` switched to `runtime_backend: hermes` while preserving unrelated `llm.backend`; `~/.hermes/config.yaml` preserved unrelated keys while adding `mcp_servers.ouroboros`; skills landed under `~/.hermes/skills/autonomous-ai-agents/ouroboros/`; `.claude/mcp.json` stayed untouched.
- Installed-wheel smoke confirmed the new packaging layout directly: the wheel contains `ouroboros/skills/*` and not `ouroboros/codex/skills/*`; `resolve_packaged_codex_skill_path("run")` still works from the installed wheel; and installed-wheel `install_hermes_skills()` copies the shared bundle, including `__init__.py`.
- Not green in targeted editable-install Codex verification: `env UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/test_codex_artifacts.py tests/integration/test_codex_skill_smoke.py tests/integration/test_codex_cli_passthrough_smoke.py tests/integration/test_codex_skill_fallback.py -q` returned `8 failed, 17 passed`, matching the critical packaging regression described above.
- Not green: `ruff format --check`.
- Not fully verified: `uv run pytest tests/ --cov=src/ouroboros --cov-report=term-missing -v` collected 4529 tests and started passing, but the exec harness never returned a trustworthy terminal summary after the pytest process disappeared. I would rerun that one in a normal shell before merge.
- Local execution was on Python 3.14.2. `python3.12` is present locally, `python3.13` is not. The defects above are logic-level, so I would expect them to reproduce on 3.12 and 3.14 equally.
Recommendations
- Move toward a backend-agnostic shared skill resolver. Hermes and OpenCode already depend on Codex-named helpers, and the validation behavior has drifted.
- Add a common subprocess runtime base class. Hermes duplicated enough Codex/OpenCode logic to miss recoverable-dispatch handling and Codex’s safer frontmatter guards.
- Introduce a declarative runtime registry. Backend lists are now duplicated across setup, config, runtime factory, docs, and tests; the broken `--runtime hermes` path is a direct symptom.
- Unify installer and config-writer behavior. `setup.py` and `install.sh` both encode runtime-specific side effects and merge semantics.
- Add a dedicated packaging/setup smoke harness covering source checkout, built wheel, and temp-home setup. This change touched wheel packaging, `importlib.resources`, runtime setup, and home-directory writes, and manual smokes were necessary to validate it.
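
The declarative registry recommendation could look roughly like the sketch below, where the CLI choice enum is derived from one table instead of being maintained by hand. Everything here is hypothetical: `RuntimeSpec`, `RUNTIME_REGISTRY`, `RuntimeChoice`, and `resolve_runtime` are illustrative names, and the `runtime_only` values for backends other than Hermes are placeholders rather than facts about the project.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class RuntimeSpec:
    """One row of the hypothetical registry."""

    name: str
    runtime_only: bool = False  # placeholder default, see lead-in


# Single source of truth: adding a backend here updates every consumer below.
RUNTIME_REGISTRY: dict[str, RuntimeSpec] = {
    spec.name: spec
    for spec in (
        RuntimeSpec("claude"),
        RuntimeSpec("codex"),
        RuntimeSpec("opencode"),
        RuntimeSpec("hermes", runtime_only=True),
    )
}

# CLI choices are generated from the registry, so help text cannot drift.
RuntimeChoice = Enum(
    "RuntimeChoice",
    {name.upper(): name for name in RUNTIME_REGISTRY},
    type=str,
)


def resolve_runtime(name: str) -> RuntimeSpec:
    """Look up a runtime by name, listing valid names on failure."""
    try:
        return RUNTIME_REGISTRY[name]
    except KeyError:
        valid = ", ".join(sorted(RUNTIME_REGISTRY))
        raise ValueError(f"unknown runtime {name!r}; expected one of: {valid}") from None
```

With this shape, setup, config validation, the runtime factory, and the `run` command's `--runtime` option would all read from `RUNTIME_REGISTRY`, so a backend that is valid for setup could not be rejected by `run`.
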