Release hermesd 2026.6.15 — drift fixes + integrations by mudrii · Pull Request #7 · mudrii/hermesd

mudrii · 2026-06-14T18:40:38Z

hermesd 2026.6.15 — drift fixes + hermes-agent integrations

Brings hermesd back into sync with the current hermes-agent on-disk shapes and adds new read-only panels/metrics. Read-only invariant preserved throughout.

Highlights

Fixes (producer drift): credential pool list-vs-dict (auth.json), desktop build stamp camelCase, pr-monitor live keys + all naming families, included/exact cost authoritative, session billing/end fields surfaced.
Integrations: Gateway platform errors + agent counts; new Curator panel (13) with per-tool breakdown + state transitions; Sessions Billing & Context (model context limit); Tokens by-endpoint breakdown + cost-status reconciliation; Operations response-store stats; extra log streams; Kanban task workflow fields + decomposition tree.
Hardening: markup-escape of all untrusted ~/.hermes free-text (incl. formatted timestamps), symlink/path-escape guards on file reads, SQL LIKE escaping, cache-preservation on every source.
Infra: opt-in live contract test (HERMESD_CONTRACT_TEST=1) guarding against the next drift; CI hardened (3.11–3.13 matrix, -W error::ResourceWarning, pinned actions, concurrency, uv cache).

Quality gates (all green)

777 tests pass + 1 opt-in skip under -W error::ResourceWarning
ruff check + format, mypy, pip-audit, uv build (sdist + wheel) all clean
99% coverage (remaining lines are accepted-defensive TOCTOU/unreachable guards)
Reviewed across 4 axes (standards, spec/PLAN, invariants, tests) over multiple converged rounds; test suite TDD-reviewed for quality + coverage over converged rounds

Release

Version bumped to 2026.6.15 (calver) in pyproject.toml; CHANGELOG dated section updated; __version__ is dynamic from package metadata.
After merge, publish by creating a GitHub Release with tag v2026.6.15 — python-publish.yml runs the full gate → builds → trusted-publishes to PyPI.

Closes the validated drift-fix + integration plan (all loop-doable items; Part-3 future-features deferred pending live data).

…% coverage

Audit-fix wave (review of HEAD, 2026-06-10):

Bugs fixed:
- collector: route _collect_sessions/_collect_runtime_status through the _CollectionHealth boundary so a bad SQLite row can no longer crash startup or stale the whole refresh; NULL session id no longer fails tool stats
- app: G/k scroll deadlock (stored offset now clamped to effective max), console-swap race in copy_current_view (snapshot renders into a local Console), theme load moved outside the state lock, signal handlers installed before initial collect, health dot red at zero healthy sources
- db: stale flag now set when the DB file disappears; LIKE search escapes %/_ wildcards with an ESCAPE clause
- panels: ~$/$ cost prefix reflects estimated vs reported in compact, detail rows, and aggregate tables; unknown minlevel: filter shows all lines instead of none; SOUL.md empty vs missing distinguished; multi message: filter tokens resolve last-wins consistently

Performance:
- message search no longer serializes against the collect pass
- per-profile session counts cached by DB mtime (no WAL re-snapshot per tick)
- log tails skip re-reads when mtime+size unchanged
- session summaries memoized while underlying rows are unchanged
- cron output excerpts respect --log-tail-bytes

Standards/cleanup:
- all locks documented; ~/.hermes default centralized in paths.py; from __future__ import annotations across tests; dead code removed (cycle_log_view, _tail_log, status_bar_style, context_color, unreachable scroll hint, unused default-arg branches)

Tests:
- 409 -> 548 tests; coverage 93% -> 99% (app.py and all panels at 100%)
- implementation-coupled tests reworked to public seams; duplicates removed
- pytest-cov added to dev extras

Docs: CHANGELOG Unreleased section; README cost-prefix and g/G wording.
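The LIKE-escaping fix in the db item above can be illustrated with a minimal sketch. It assumes a hypothetical `messages` table and hypothetical helper names (`escape_like`, `search_messages`); hermesd's actual query and schema are not shown in this PR, so treat this only as one way the technique might look.

```python
import sqlite3


def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so the search term is matched literally."""
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def search_messages(conn: sqlite3.Connection, term: str) -> list[str]:
    # Hypothetical table and column names, used only for illustration.
    pattern = f"%{escape_like(term)}%"
    rows = conn.execute(
        "SELECT content FROM messages WHERE content LIKE ? ESCAPE '\\' ORDER BY rowid",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (content TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?)",
    [("100% done",), ("100 percent done",), ("snake_case",), ("snakeXcase",)],
)
```

Without the escaping step, a search for `snake_case` would also match `snakeXcase`, because `_` matches any single character in a LIKE pattern.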

- panels/logs.py: single-source the detail clamp via public max_detail_scroll_offset(); app.py no longer imports logs-private names
- collector: derived() closure replaces 5 repeated memoization lambdas; _read_tail_text() replaces 3 copies of the tail-read idiom; LogStream construction collapsed to one call
- app.py: OSC52 write falls back to an unguarded write if Rich ever renames the private Console._lock (no hard dependency on internals)
- __main__.py: pragma no cover on the __name__ guard (covered by the subprocess test)
- tests: render_to_str consolidated into conftest (was defined 7x with drifted widths), restore_signal_handlers fixture replaces hand-rolled try/finally in 4 tests, collector-level tests moved out of panel test files into test_collector_extended.py

No behavior changes; 548 tests green, ruff and mypy clean.

… test - test_app_close_idempotent / test_close_idempotent now assert a post-close observable instead of passing vacuously - removed test_db_file_deleted_marks_cached_reads_stale (strict subset of test_db_file_deleted_marks_all_cached_reads_stale)

Round 1 audit (standards/spec/quality/coverage, 4 parallel agents): - test_gateway_resilience.py asserted PID-liveness but never the cache-preservation invariant its filename implies; add a test that forces a gateway collection error and asserts last-good gateway survives (running/pid/version unchanged). - Cover the reachable skip-branches in _tail_latest_cron_output (stray non-dir entry in the output root, non-file inside a job dir). Tests-only; no source behavior change. 549 passing, coverage 99%.

…p injection

Round 3 adversarial audit found a HIGH-severity defect: panels passed free-text from ~/.hermes/ (skill descriptions, session source/model/cwd/title, task titles, error excerpts, provider/plugin/MCP names, memory file names, etc.) directly into Rich Table.add_row(...), a markup-parsed context. Data containing '[xy]' was silently stripped, and an unbalanced '[/]' raised rich.markup.MarkupError at render time. That error escaped the app loop's KeyboardInterrupt-only handler and crashed the whole TUI, violating the never-blank-the-display / cache-preservation invariant.

Escape every untrusted free-text cell with rich.markup.escape(). Numbers, fmt_* output, fixed labels, style tags, and the already-literal Text.append(...) compact views are left untouched.

Adds tests/test_markup_safety.py: all 12 panels x compact+detail rendered with a '[/] desc [xy] tag' payload, asserting no crash and literal-bracket survival (48 cases). 597 passing, coverage maintained, ruff/mypy clean.

Round 3 audit found 3 test files violating the project's documented ruff formatter (.claude/rules/python-idioms.md). Mechanical reformat only; no logic change. Whole repo now passes ruff format --check.

TDD-focused test-quality audit (4 parallel agents) findings, fixed test-first (no source changes; suite already 99% line coverage):

Fixes:
- Eliminate ResourceWarning: two app tests and one db test reopened a SQLite connection via a read-after-close/collect path and never closed it; close the reopened resource (suite now passes -W error::ResourceWarning).
- test_cost_estimation: replace self-derived expected (computed via the same _estimate_cost under test) with hand-computed literals (0.325, 0.315), and pin > 0 assertions to pytest.approx, so a regressed estimator now actually fails.
- Strengthen unchecked/vacuous assertions: ToolStats.call_count, exact today-token sum (21_500/26_500), full session id-set, len(sessions)==2 before all(is_active).

Coverage gaps (behavior, not just line):
- Cache-preservation force-error tests for the 6 untested safe_collect sources (cron/channels/kanban/operations/skills/memory): force each _collect_* to raise and assert last-good state survives and the source is marked failed. Invariant holds for all six.
- test_readonly_invariant: snapshot ~/.hermes manifest (path+size+mtime_ns) before/after a full collect and the snapshot paths; assert nothing is written (guards the read-only Critical Rule).
- test_import_hygiene: ast-scan every hermesd module; assert no hermes-agent import (guards the no-import Critical Rule).

606 passing (+9), ruff/format/mypy clean.
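The manifest comparison behind test_readonly_invariant can be sketched roughly as follows. This is an illustration under assumptions: `snapshot_manifest` and `read_only_pass` are hypothetical names, and the read-only pass is a stand-in for the real collector, which is not reproduced here.

```python
import tempfile
from pathlib import Path


def snapshot_manifest(root: Path) -> dict[str, tuple[int, int]]:
    """Map each file's relative path to (size, mtime_ns)."""
    manifest: dict[str, tuple[int, int]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            st = path.stat()
            manifest[path.relative_to(root).as_posix()] = (st.st_size, st.st_mtime_ns)
    return manifest


def read_only_pass(root: Path) -> int:
    """Stand-in for a collect pass: reads every file, writes nothing."""
    return sum(len(p.read_bytes()) for p in root.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / "auth.json").write_text("{}")
    (home / "logs").mkdir()
    (home / "logs" / "agent.log").write_text("line\n")

    before = snapshot_manifest(home)
    bytes_read = read_only_pass(home)
    after = snapshot_manifest(home)
    unchanged = before == after
```

Comparing size and mtime_ns per path catches both new files and in-place rewrites, while reads alone leave the manifest unchanged.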

TDD Round 2 (validate prior wave + close deferred behavior gaps):
- Empty-state sweep: render every panel (compact+detail) against a default DashboardState(), the real first-launch condition where ~/.hermes exists but the agent never ran (all collections empty, counts zero, optionals None at once). 6 panels previously had no empty-detail assertion; guards None-formatting / zero-division.
- Bracket-nav wrap: existing test stopped at panel 12 then stepped back; add the wrap-seam case (']' on 12 -> 1, '[' on 1 -> 12) where the modulo off-by-one would live.
- Strengthen test_build_header_with_custom_skin: asserted only 'is not None'; now asserts app._theme.skin_name == 'ares' so a broken skin load actually fails.

Deferred defensive cases (TOCTOU OSError guards, --log-tail-bytes<=0, malformed-TOML, db FTS-probe error) assessed and skipped: already covered or unreachable without fault injection; forcing them tests except-mechanics, not behavior.

631 passing (+25), ruff/format/mypy clean, -W error::ResourceWarning clean.
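The wrap-seam case is easy to see with 1-based panel numbers. The sketch below uses hypothetical helper names (`next_panel`, `prev_panel`) and is not taken from app.py; it only shows where an off-by-one in the modulo arithmetic would surface.

```python
PANEL_COUNT = 12  # panel count at the time of this commit (13 once Curator landed)


def next_panel(current: int, count: int = PANEL_COUNT) -> int:
    """']' key: advance one panel, wrapping from the last panel to panel 1."""
    return current % count + 1


def prev_panel(current: int, count: int = PANEL_COUNT) -> int:
    """'[' key: go back one panel, wrapping from panel 1 to the last panel."""
    return (current - 2) % count + 1
```

A test that only steps within 1..12 never exercises the two wrap branches, which is why the seam cases were added.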

TDD Cycle 2 (fresh full re-audit) findings: - Determinism: two today-token assertions seeded session started_at from raw time.time() then asserted an exact today total. Between local midnight and the collector's _today_epoch() re-read, the row could fall before the cutoff and flake (test_collect_tokens_today_filters_by_date had up to a 1-hour post-midnight window — a regression from the prior cycle's exactness tightening). Pin hermesd.collector._today_epoch via monkeypatch so the cutoff is deterministic regardless of wall clock. - Extend the force-error cache-preservation harness with config and logs (both non-default under populated_hermes_home, so the equality is meaningful). Skipped skin: its populated value equals the model default ('default'), which would make the assertion vacuous. 633 passing (+2), ruff/format/mypy clean, -W error::ResourceWarning clean.

TDD Cycle 3 (final fresh re-audit) findings: - file_cache: LastGoodFileCache only had invalid-shape reuse tests; the core cache-hit (unchanged mtime returns cached value without re-reading) and invalidation (newer mtime with valid content yields the new value) paths were untested — a regression making _cached_read always-stale or always-reload would have passed. Add a deterministic test using os.utime (corrupt-but-same-mtime proves a hit skips the reload; bumped-mtime proves invalidation). - logs detail: the unfiltered-empty branch ("No log lines") was untested; only the filtered-empty branch ("No matching log lines") was. Add the unfiltered-empty assertion. 635 passing (+2), ruff/format/mypy clean, -W error::ResourceWarning clean.

Final confirmation pass found the cache-hit assertion over-claimed: the corrupt-but-same-mtime trick returned the cached value under BOTH a real cache hit and an always-reload regression (the reload would fail on the corrupt bytes and fall back to last-good), so it did not prove the reload was skipped. Strengthen it with the file's established counting_open pattern: assert open_calls == 0 across the unchanged-mtime read (reload truly skipped) and == 1 across the invalidation read (reload happened). 635 passing, ruff/format/mypy clean.
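The difference between "returned the cached value" and "skipped the reload" can be shown with a small mtime-keyed cache that counts real reads. `MtimeCache` below is a hypothetical stand-in, not the LastGoodFileCache source; the counter plays the role of the counting_open pattern mentioned above.

```python
import json
import os
import tempfile
from pathlib import Path


class MtimeCache:
    """Hypothetical last-good cache keyed on file mtime."""

    def __init__(self) -> None:
        self._entries: dict[Path, tuple[int, object]] = {}
        self.reloads = 0  # counts real file reads, like counting open() calls

    def read_json(self, path: Path) -> object:
        mtime = path.stat().st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # cache hit: the file is not read at all
        self.reloads += 1
        try:
            value = json.loads(path.read_text())
        except (OSError, ValueError):
            # Reload failed: fall back to the last-good value if there is one.
            return cached[1] if cached is not None else None
        self._entries[path] = (mtime, value)
        return value


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "state.json"
    cache = MtimeCache()
    base_ns = 1_700_000_000 * 10**9

    target.write_text('{"v": 1}')
    os.utime(target, ns=(base_ns, base_ns))
    first = cache.read_json(target)
    reloads_after_first = cache.reloads

    second = cache.read_json(target)  # unchanged mtime
    reloads_after_hit = cache.reloads

    target.write_text('{"v": 2}')
    os.utime(target, ns=(base_ns + 10**9, base_ns + 10**9))
    third = cache.read_json(target)  # bumped mtime
    reloads_after_bump = cache.reloads
```

Asserting on the reload counter, rather than only on the returned value, is what separates a true cache hit from an always-reload regression that happens to fall back to last-good.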

Read end_reason, billing_base_url, and billing_mode from state.db (columns existed in schema but were unread) and surface them in the Sessions detail Runtime sub-table. Placed in the Runtime sub-table rather than the main sessions table because 15 columns over-squeezed the main table at width 120. Unblocks C2 (context/limit gauge) and NF1 (billing-endpoint breakdown), which join on billing_base_url. PLAN.md: FIX B.

Live auth.json maps each provider to a LIST of credential entries, but hermesd fed that list to _as_dict (-> {}), blanking every credential field in the pool view. Add _select_pool_entry to collapse the list to one representative entry (lowest priority = next credential used; ties keep list order), while still accepting the legacy single-dict shape. Provider name remains the outer dict key. PLAN.md: FIX A1.
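A minimal sketch of the selection rule described above, assuming each credential entry carries a numeric `priority` key (the key name is an assumption; the PR text only says "lowest priority"). The function name mirrors `_select_pool_entry` but this is not the hermesd source.

```python
from typing import Any


def _priority(entry: dict[str, Any]) -> float:
    # Assumed field name; entries without a usable priority sort last.
    value = entry.get("priority")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)
    return float("inf")


def select_pool_entry(raw: Any) -> dict[str, Any]:
    """Collapse a provider's credential value to one representative dict."""
    if isinstance(raw, dict):
        return raw  # legacy single-dict shape
    if isinstance(raw, list):
        entries = [e for e in raw if isinstance(e, dict)]
        if not entries:
            return {}  # caller can still show the provider name alone
        # min() returns the first minimal element, so ties keep list order.
        return min(entries, key=_priority)
    return {}
```

Returning an empty dict for an empty list or a list of non-dicts lets the caller degrade to a name-only row, since the provider name comes from the outer dict key.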

gateway_state.json carries per-platform error_code/error_message plus top-level active_agents and restart_requested, none of which hermesd read. Surface them: a warning marker in the compact view, an Error column and restart/active-agent indicators in the detail view. Error text is escaped before reaching Rich markup. PLAN.md: C1.

The live hermes-agent producer never emits cost_status "reported"; it uses unknown/estimated/exact/included. So the authoritative-cost branch was never taken on live data and every cost rendered with the estimated "~$" prefix. Map exact and included (alongside legacy reported) to authoritative via a shared AUTHORITATIVE_COST_STATUSES set used by the collector's estimated-flag logic, per-row cost resolution, and the tokens panel prefix. Subscription-included rows (cost 0.0) now render an authoritative $0.00 rather than a token-based estimate. PLAN.md: A4.
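The mapping can be sketched as a shared set plus a prefix helper. The set name comes from the commit message; `cost_prefix` and `render_cost` are hypothetical helpers for illustration and may not match the panel code.

```python
from typing import Optional

# Statuses whose cost figure is treated as authoritative rather than estimated.
AUTHORITATIVE_COST_STATUSES = frozenset({"reported", "exact", "included"})


def cost_prefix(cost_status: Optional[str]) -> str:
    """'$' for authoritative costs, '~$' for estimated or unknown ones."""
    return "$" if cost_status in AUTHORITATIVE_COST_STATUSES else "~$"


def render_cost(cost_status: Optional[str], cost_usd: float) -> str:
    return f"{cost_prefix(cost_status)}{cost_usd:.2f}"
```

Keeping the set in one place means the collector flag, per-row resolution, and panel prefix cannot disagree about which statuses count as authoritative.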

Live desktop-build-stamp.json uses builtAt/contentHash/sourceMode, but the operations collector looked only for version/stamp/built_at/created_at, so the build stamp always rendered blank. Add builtAt to the lookup chain and fall back to a 12-char contentHash prefix. PLAN.md: A2.
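One way to picture the lookup chain is shown below. The position of builtAt within the chain is an assumption (the commit message only says it was added), and `resolve_build_stamp` is a hypothetical name.

```python
from typing import Any


def resolve_build_stamp(data: dict[str, Any]) -> str:
    """Return the first usable stamp field, else a short content-hash prefix."""
    # Key order is assumed for illustration; only the key names come from the PR.
    for key in ("version", "stamp", "built_at", "builtAt", "created_at"):
        value = data.get(key)
        if isinstance(value, str) and value:
            return value
    content_hash = data.get("contentHash")
    if isinstance(content_hash, str) and content_hash:
        return content_hash[:12]  # 12-char prefix as the last-resort stamp
    return ""
```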

Live pr-monitor JSON uses prs/tracked_numbers/author_prs/checked_at, but the collector counted the retired monitored/tracked keys (stuck at 0) and tried a dead camelCase checkedAt before the snake_case key. Read prs -> monitored count and tracked_numbers -> tracked count (legacy keys kept as fallback), and drop the dead checkedAt first-guess. Fixture realistic-ified to the live shape to expose the drift. PLAN.md: A3 + A5.

The agent writes PR-monitor state under several families: flat pr-monitor-*.json and pr_monitor_*.json, plus per-repo files inside pr-monitor/ and pr_monitor/ subdirs. hermesd globbed only the flat hyphen family, missing large active state files. Glob all four families and collapse the same repo (seen across families) to its newest checked_at so the panel isn't flooded with near-duplicate rows; repo-less files stay distinct. PLAN.md: A6.
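The glob-and-collapse step might look roughly like this. Several details are assumptions made for the sketch: the `repo` key name, comparing checked_at values as ISO-8601 strings, and the function name `collect_pr_monitor`. The `::filename` key for repo-less files follows the dedupe key mentioned elsewhere in this PR.

```python
import json
import tempfile
from pathlib import Path

# The four naming families: two flat, two per-repo subdirectories.
PATTERNS = ("pr-monitor-*.json", "pr_monitor_*.json", "pr-monitor/*.json", "pr_monitor/*.json")


def collect_pr_monitor(root: Path) -> list[dict]:
    newest: dict[str, dict] = {}
    for pattern in PATTERNS:
        for path in sorted(root.glob(pattern)):
            try:
                data = json.loads(path.read_text())
            except (OSError, ValueError):
                continue  # unreadable or malformed file: skip it
            if not isinstance(data, dict):
                continue
            repo = data.get("repo")  # assumed key name
            key = repo if isinstance(repo, str) and repo else f"::{path.name}"
            checked = str(data.get("checked_at") or "")
            current = newest.get(key)
            if current is None or checked > str(current.get("checked_at") or ""):
                newest[key] = data
    return list(newest.values())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pr_monitor").mkdir()
    (root / "pr-monitor-a.json").write_text(
        json.dumps({"repo": "o/r", "checked_at": "2026-06-01T00:00:00Z"})
    )
    (root / "pr_monitor" / "a.json").write_text(
        json.dumps({"repo": "o/r", "checked_at": "2026-06-02T00:00:00Z"})
    )
    (root / "pr-monitor-x.json").write_text("{}")
    (root / "pr_monitor_y.json").write_text("{}")
    rows = collect_pr_monitor(root)
```

In this example the two `o/r` files collapse to the newer one, while the two repo-less files remain separate rows.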

Add CuratorRun model and Collector._collect_curator reading the newest ~/.hermes/logs/curator/<stamp>/run.json (model/provider, before/after/delta counts, archived/pruned/added/consolidated, tool-call total, llm summary and error). Registered as a new 'curator' safe_collect source with last-good cache preservation. Panel rendering and registration follow in the next step. PLAN.md: C3 (data layer).

Render the newest memory-curation run as panel 13: a compact summary (last run stamp, skill before->after delta, archived/pruned, error marker) and a detail view (full counts, model/provider, duration, tool calls, and the LLM summary or error). All free-text from run.json is markup-escaped. Registered in PANEL_NAMES/_RENDERERS and the wide/compact layouts (tall-narrow and snapshot derive automatically); reachable via ] cycling or --snapshot-panel 13. README/CLAUDE.md panel counts and CHANGELOG updated. PLAN.md: C3 (panel).

Join each session's model@billing_base_url against context_length_cache.yaml to surface the model's context-window size (SessionInfo.context_limit). Split the sessions detail Runtime table into a lean Runtime table (operational fields) and a new Billing & Context table (end reason, endpoint, mode, context limit) so neither over-squeezes at typical widths. Labeled as the context limit vs lifetime cumulative tokens, not live occupancy. Base-url join normalizes a trailing slash; model names are not lowercased. PLAN.md: C2.
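The join key normalization can be sketched as below, assuming the cache has already been loaded into a plain dict of `model@base_url` keys to integer limits (YAML parsing is omitted). The helper names and the example model and URL are hypothetical.

```python
from typing import Optional


def context_key(model: str, base_url: str) -> str:
    """Build the join key; only a trailing slash on the URL is normalized."""
    return f"{model}@{base_url.rstrip('/')}"


def normalize_cache_key(key: str) -> Optional[str]:
    model, sep, url = key.rpartition("@")
    if not sep:
        return None  # key without '@' cannot be joined on endpoint
    return context_key(model, url)


def lookup_context_limit(cache: dict[str, int], model: str, base_url: str) -> Optional[int]:
    if not model or not base_url:
        return None
    normalized = {}
    for key, limit in cache.items():
        norm = normalize_cache_key(key)
        if norm is not None:
            normalized[norm] = limit
    return normalized.get(context_key(model, base_url))
```

Model names are compared as-is, so a lookup with different casing deliberately misses rather than guessing.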

Aggregate token usage and spend per billing endpoint (billing_base_url) and render it as a By Endpoint sub-table in the Tokens / Cost detail panel. This is finer than the existing per-provider breakdown — the same provider can bill through several base URLs. Reuses the existing breakdown summarizer and table renderer. PLAN.md: NF1.

Count sessions by cost_status and render a one-line distribution in the Tokens / Cost detail panel (e.g. unknown N · included N · estimated N) so the user can see how much spend is authoritative versus unknown. NULL cost_status is counted as 'unknown'. PLAN.md: NF2.
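A rough sketch of the distribution line follows. The display order is an assumption modelled on the example in the commit message, and `cost_status_line` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, Optional

# Assumed display order; statuses outside this list are appended alphabetically.
DISPLAY_ORDER = ("unknown", "included", "estimated", "exact", "reported")


def cost_status_line(statuses: Iterable[Optional[str]]) -> str:
    counts = Counter(status if status else "unknown" for status in statuses)
    ordered = [name for name in DISPLAY_ORDER if name in counts]
    ordered += sorted(name for name in counts if name not in DISPLAY_ORDER)
    return " · ".join(f"{name} {counts[name]}" for name in ordered)
```

Folding NULL (and empty) statuses into `unknown` keeps the counts summing to the total number of sessions.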

Read ~/.hermes/response_store.db read-only (mode=ro&immutable=1, reusing the kanban read-only connector) and surface conversation/response row counts and file size as a Response Store row in the Operations detail panel. Missing tables count as zero; an absent DB shows nothing. PLAN.md: NF3.
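The read-only open can be sketched with SQLite's URI filename parameters. The table names `conversations` and `responses` are assumptions for the example, and `open_read_only` / `count_rows` are hypothetical helpers rather than the shared kanban connector.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open a SQLite file so that no write can reach it."""
    uri = f"{db_path.resolve().as_uri()}?mode=ro&immutable=1"
    return sqlite3.connect(uri, uri=True)


def count_rows(conn: sqlite3.Connection, table: str) -> int:
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    try:
        return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    except sqlite3.OperationalError:
        return 0  # a missing table counts as zero


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "response_store.db"
    writer = sqlite3.connect(db_path)
    writer.execute("CREATE TABLE conversations (id INTEGER PRIMARY KEY)")
    writer.executemany("INSERT INTO conversations (id) VALUES (?)", [(1,), (2,)])
    writer.commit()
    writer.close()

    reader = open_read_only(db_path)
    conversations = count_rows(reader, "conversations")
    responses = count_rows(reader, "responses")
    try:
        reader.execute("INSERT INTO conversations (id) VALUES (3)")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    reader.close()
    size_bytes = db_path.stat().st_size
```

With `mode=ro` the connection rejects writes, and `immutable=1` additionally tells SQLite not to take locks or create sidecar files, which fits a dashboard that must never touch the producer's data.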

Add audit.log, mcp-stderr.log, workspace.log, and workspace.error.log to the log-stream specs so they appear as Tab sub-views in the Logs panel when present (existence-gated, so absent files are simply skipped). PLAN.md: NF4.

Read task completed_at, workspace_path, goal_mode, and current_step_key from kanban.db (additive SELECT *), show branch_name in the worker/problem task tables, and add guarded Decomposition Links / Attachments summary rows from task_links / task_attachments (rendered only when populated). PLAN.md: NF5.

Add HERMESD_CONTRACT_TEST=1-gated test that runs the collector against the real Hermes home and asserts drift-sensitive fields are populated wherever their source data exists (credential labels, session billing fields, gateway platforms, desktop stamp, pr-monitor counts, curator run). Skipped by default so CI never couples to a machine's data; catches the next producer-side schema drift the moment it lands. PLAN.md: Part 4 contract test.

Update the Features section header (12 -> 13 Dashboard Panels) and the keyboard-shortcuts row to include panels 11-13 (Curator).

- actions/checkout v4 -> v5, astral-sh/setup-uv v5 -> v6 in CI and the publish workflow (verified current majors). - Add [tool.hatch.build.targets.wheel] packages = ["hermesd"] so the wheel target is explicit rather than relying on hatchling auto-detection.

Add tests flagged by the branch audit for production branches that lacked direct coverage: - tokens 'By Endpoint' sub-table actually renders (NF1 was collector-only) - credential_pool empty-list / list-of-non-dicts degrade to a name-only entry - repo-less pr-monitor files stay distinct (the ::filename dedupe key path)

Round-1 audit fixes: - Add an end-to-end test that seeds real audit.log/mcp-stderr.log and asserts both streams collect and are reachable via the Tab cycle (was only proven with synthetic stream names). - Assert kanban link_count/attachment_count == 0 explicitly when the task_links/task_attachments tables are absent (was implicit via no failed source). The pr-monitor checkedAt finding was reviewed and dismissed: PLAN A5 intentionally dropped the dead camelCase guess, all live families use snake_case checked_at, and the newest-timestamp dedupe is correct as-is.

Round-2 audit fixes: - tokens panel: format all cost figures via fmt_usd (with a ~ estimate marker) so negative costs render -$0.50 / ~-$0.50 instead of $-0.50; factored into a shared _fmt_cost helper used by the session, window, and breakdown tables and the compact view. - collector: _collect_curator now skips symlinked run dirs and paths that escape the Hermes home, and bails on a symlinked run.json — matching the symlink hardening already applied to the cron/checkpoint readers. - tests: negative-cost prefix render; curator run-dir-without-run.json degrades to empty (not failed); curator symlinked-run-dir is skipped. Dismissed: tokens per-session prefix ignoring estimated_cost_usd-None (the field is float on SessionInfo, never None at the panel layer); _billing_table [:10] cap (mirrors _runtime_table).
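The sign placement described for the tokens panel can be illustrated as follows. `fmt_usd` and `fmt_cost` here are simplified stand-ins written for this example; the real helpers' signatures are not shown in the PR.

```python
def fmt_usd(amount: float) -> str:
    """Format dollars with the minus sign before the currency symbol."""
    sign = "-" if amount < 0 else ""
    return f"{sign}${abs(amount):.2f}"


def fmt_cost(amount: float, estimated: bool) -> str:
    """Prefix estimated figures with '~', keeping the sign next to the '$'."""
    return ("~" if estimated else "") + fmt_usd(amount)
```

Naive formatting such as `f"${amount:.2f}"` would produce `$-0.50`, which is the rendering the fix replaces.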

Raise coverage within 99% (49 -> 14 missed lines) with tests that assert real observable behavior, not line execution:
- db.py 96->99%: message-search cache-preservation (error -> last-good + stale flag), consecutive-error reset, FTS-unavailable graceful degrade.
- collector.py 97->99% (new test_collector_coverage.py): available-tools mtime cache hit, log-stream OSError -> last-good, curator empty/symlinked run, _path_resolves_under guard, context-length key-without-@, unreadable checkpoint/cron/soul files, skill-dir filters, dashboard-process matching.
- curator_panel.py 97->100%: compact warning marker, long-summary truncation.
- Strengthen the by-endpoint render assertion (was URL-echo only).

Remaining 14 misses are documented accepted-defensive (2 unreachable db guards; 12 collector TOCTOU/empty-source-factory branches not deterministically reproducible without brittle stat/open mocks). chmod-based tests skip cleanly when run as root.
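A path-escape guard of the kind named above (_path_resolves_under) could be sketched like this. The body is an assumption written for illustration, not the collector's implementation; it relies on resolving symlinks before comparing against the home directory.

```python
import os
import tempfile
from pathlib import Path


def path_resolves_under(path: Path, root: Path) -> bool:
    """True only if path, after following symlinks, stays inside root."""
    try:
        resolved = path.resolve()
        resolved_root = root.resolve()
    except (OSError, RuntimeError):
        return False
    return resolved == resolved_root or resolved_root in resolved.parents


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    outside = Path(tmp) / "outside"
    home.mkdir()
    outside.mkdir()
    (home / "run.json").write_text("{}")
    (outside / "secret.txt").write_text("x")
    os.symlink(outside, home / "link")  # symlink that escapes the home

    inside_ok = path_resolves_under(home / "run.json", home)
    symlink_escape = path_resolves_under(home / "link" / "secret.txt", home)
    dotdot_escape = path_resolves_under(home / ".." / "outside", home)
```

Checking the resolved path rather than the literal one is what catches both the symlinked directory and the `..` traversal.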

Round-2 test-quality fixes: - Skip the WAL snapshot-failure test when running as root (chmod 000 does not block reads for root, which would make the assertion vacuous in CI). - Use the imported _RECONNECT_ERROR_THRESHOLD constant instead of a magic 3 in the dead-handle reconnect test so it tracks the source threshold.

gemini-code-assist · 2026-06-14T18:40:59Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Copilot

Pull request overview

Prepares the 2026.6.14 hermesd release by expanding dashboard functionality (new Curator panel + richer Kanban/Operations/Tokens/Sessions/Gateway details), hardening read-only + markup-safety behavior, and aligning docs/tests/CI with the new release gates.

Changes:

Add panel 13 (Curator) and extend multiple panels (Tokens, Sessions, Gateway, Kanban, Operations) with new surfaced fields and summaries.
Improve resilience/security: escape untrusted text before Rich markup parsing; tighten read-only invariants; strengthen DB/WAL + cache behavior and message search.
Modernize CI/release workflows and test suite (shared render_to_str, expanded coverage/guards, updated docs/changelog/version).

Reviewed changes

Copilot reviewed 62 out of 63 changed files in this pull request and generated 1 comment.


File	Description
tests/test_tools_panel.py	Switch to shared Rich render helper; add fallback label coverage for watch/checkpoint rows.
tests/test_theme.py	Update theme assertions to match refactored Theme API.
tests/test_skills_panel.py	Use shared render helper; keep skills panel expectations stable under no-color rendering.
tests/test_session_active.py	Extend session schema mapping coverage for new billing/end fields.
tests/test_readonly_invariant.py	New invariant tests asserting collector/app read paths do not mutate `~/.hermes`.
tests/test_profiles.py	Add symlink-escape hardening tests for profiles discovery + HermesPaths validation.
tests/test_profiles_panel.py	Panel rendering tests updated; add size/mtime formatting coverage.
tests/test_panels.py	Broaden panel rendering tests; add coverage for new/expanded panel behaviors.
tests/test_models.py	Small model assertions update (ToolStats call_count).
tests/test_memory_panel.py	Extend memory panel compact/detail behavior (SOUL present/empty/missing).
tests/test_markup_safety.py	New cross-panel Rich markup injection hardening tests.
tests/test_main.py	Extend snapshot panel validation/output tests; add entrypoint/version fallback coverage.
tests/test_import_hygiene.py	New guard test preventing `hermesd` importing `hermes-agent` modules.
tests/test_gateway_resilience.py	Minor import/annotation alignment.
tests/test_formatting.py	Add `fmt_iso_timestamp` empty-input dash behavior tests.
tests/test_file_cache.py	Add mtime cache-hit behavior test for `LastGoodFileCache`.
tests/test_db.py	Make DB session assertions order-independent; remove outdated caching/URI tests.
tests/test_db_resilience.py	New DB backoff/WAL failure/cache-staleness/message-search resilience coverage.
tests/test_db_extended.py	Expand DB/WAL/LIKE-escaping/read-after-close behaviors; add symlink-sidecar protection coverage.
tests/test_curator.py	New collector+panel tests for curator run parsing and edge cases.
tests/test_curator_resilience.py	New curator cache-preservation test on corruption.
tests/test_curator_panel.py	New Curator panel rendering tests (compact/detail/error/truncation).
tests/test_cron_panel.py	Update cron panel tests for new metadata and shared render helper.
tests/test_cost_estimation.py	Expand cost authority/estimation semantics and collector/token analytics tests.
tests/test_contract.py	New opt-in live `~/.hermes` drift/contract test (env-gated).
tests/test_collector.py	Major expansion of collector fallback/caching/health-redaction/search concurrency tests.
tests/test_collector_coverage.py	New targeted coverage for collector IO-error fallbacks and read-only guards.
tests/test_app.py	Update key handling API; add run/signal/loop behavior tests.
tests/conftest.py	Add shared `render_to_str` helper and signal handler restore fixture; refactor kanban DB setup.
tests/__init__.py	Add future-annotations package marker.
README.md	Update docs for 13 panels, new logs, cost semantics, snapshot behavior, and commands.
pyproject.toml	Bump version to `2026.6.14`; add `pytest-cov`; ensure wheel packaging config.
hermesd/theme.py	Remove unused style helpers; expose `status_bar_bg` only.
hermesd/paths.py	Add `default_hermes_home`; harden profiles root against symlink escape.
hermesd/panels/tools.py	Escape untrusted tool/process/checkpoint strings for Rich markup safety.
hermesd/panels/tokens.py	Add `~$` vs `$` formatting based on estimation; add endpoint + status reconciliation sections.
hermesd/panels/sessions.py	Escape untrusted fields; add Billing & Context table; improve filter and token-sort semantics.
hermesd/panels/profiles.py	Escape untrusted profile names in tables.
hermesd/panels/overview.py	Escape untrusted skills/integrations fields; tweak skills header behavior.
hermesd/panels/operations.py	Escape untrusted fields; add response_store summary; adjust empty-state logic.
hermesd/panels/memory_panel.py	Escape untrusted fields; distinguish SOUL present/empty/none in compact view.
hermesd/panels/logs.py	Factor log view resolution; fix unknown minlevel behavior; expose max scroll offset helper.
hermesd/panels/kanban.py	Escape untrusted fields; add links/attachments + metadata tables; extend task/run rendering.
hermesd/panels/gateway.py	Escape untrusted fields; add per-platform error column and compact warning marker.
hermesd/panels/curator_panel.py	New Curator panel renderer (compact/detail, tool counts, transitions, summary/error).
hermesd/panels/cron.py	Escape untrusted cron job fields; surface more cron metadata in detail.
hermesd/panels/config_panel.py	Escape untrusted config fields; improve empty-config detail output.
hermesd/panels/__init__.py	Register panel 13 renderer and name.
hermesd/models.py	Add new model fields (cost authority, endpoints, kanban links, curator run, ops response store, etc.).
hermesd/file_cache.py	Document lock purpose; clarify cast rationale.
hermesd/db.py	Add stale marking on db disappearance; safe sidecar copying; LIKE escaping; improved message search caching semantics.
hermesd/app.py	Add panel 13 to layouts; improve signal handling + view locking; clamp scroll offsets; safer OSC52 write.
hermesd/__main__.py	Use `default_hermes_home`; expand snapshot panel range; improve help text/coverage marker.
CONTRIBUTING.md	Align contributor commands with CI/release gates (adds `uv build`).
CLAUDE.md	Update docs to reflect 13 panels.
CHANGELOG.md	Add `2026.6.14` release notes section with new features/fixes.
.github/workflows/python-publish.yml	Modernize Actions versions; enable uv cache; update test invocation flags.
.github/workflows/ci.yml	Add workflow_dispatch + concurrency; modernize Actions/uv setup; matrix settings updates.


    for s in state.sessions:
+        estimated = s.cost_status not in AUTHORITATIVE_COST_STATUSES
        table.add_row(


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e2a3fbc747


chatgpt-codex-connector · 2026-06-14T18:48:45Z

-      - uses: actions/checkout@v4
-      - uses: astral-sh/setup-uv@v5
+      - uses: actions/checkout@v6
+      - uses: astral-sh/setup-uv@v8


Pin setup-uv to an existing immutable tag

This workflow now references astral-sh/setup-uv@v8, but setup-uv v8 intentionally stopped publishing major/minor tags: the upstream v8.0.0 release notes say users will not be able to use @v8 or @v8.0 and recommend astral-sh/setup-uv@v8.0.0 instead (https://github.com/astral-sh/setup-uv/releases/tag/v8.0.0). As written, each CI job fails during action resolution before install/test can run; the same invalid pin also appears in .github/workflows/python-publish.yml, so release publishing is blocked too.


Review-cycle findings: - _task_metadata_table deduped by task_id (a task in both active_tasks and recent_tasks rendered duplicate rows in the Task Metadata table). - Document the Curator panel's per-tool call breakdown and state-transition trail in CHANGELOG and README (were implemented but undocumented).

fmt_iso_timestamp returns its input verbatim for non-ISO strings, so an untrusted ~/.hermes/gateway.json updated_at value containing Rich markup (e.g. [red]) reached a markup-parsed table cell unescaped and could crash the TUI. Escape the formatted value (the sibling name cell was already escaped). Extend the markup-safety panel-1 builder to inject into updated_at/error_code/error_message so the parametrized injection sweep covers these cells.

…asserts (TDD round 1) - Add a collector test for the alternate state_transitions entry shape ({"state": ...} with no from/to) -> labelled by state + timestamp (collector.py:2194 was uncovered). - Replace bare-digit substring assertions in the curator detail panel test with label-bound regex (Added 2 / Consolidated 1 / Tool Calls 67 / read_file 12) so a wrong count can no longer pass on a stray digit.
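The label-bound assertion idea can be sketched with a small regex helper. `has_labelled_count` is a hypothetical name, and the sample string assumes label and value are separated only by whitespace; real rendered tables may contain border characters that would need a looser separator.

```python
import re


def has_labelled_count(rendered: str, label: str, expected: int) -> bool:
    """True if the label is immediately followed by exactly the expected number."""
    pattern = rf"{re.escape(label)}\s+{expected}(?!\d)"
    return re.search(pattern, rendered) is not None


rendered = "Added 2   Consolidated 1   Tool Calls 67   read_file 12"
```

A bare `"1" in rendered` check passes here for any label, whereas the label-bound form fails when the count next to `Added` or `read_file` is wrong.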

…DD round 2) - overview no-providers: bind zero counts to labels (Skills: 0 / 0 pools / 0 plug) instead of a vacuous '0' substring. - available-tools cache-hit: count real Path.open calls on the per-session file instead of wrapping the private _read_json_cached method. - gateway active-agents, overview skill count, tools call-count: bind each digit to its label (3 active agents / Skills: 70 (28 cat) / shell_exec 23) so a wrong value can't pass on a stray digit.

Bump version 2026.6.14 -> 2026.6.15 (calver, today) and roll the dated CHANGELOG section forward, folding in the two fixes that landed after the prior version stamp: - gateway platform timestamp markup-escape (injection guard) - kanban Task Metadata table task-id dedup The 2026.6.14 stamp was never tagged/published; 2026.6.15 is the first release of this branch's drift-fix + integration work.

Replace the outdated 2026.4 screenshots with a current 2026.6.15 set: overview plus a detail shot for every panel 1-13, named clearly (overview.png, panel-NN-<name>.png). Refresh captions to call out the new surfaces — gateway per-platform Error column, sessions Billing & Context table, tokens Cost-Status + By-Endpoint breakdown, extra log streams, and the new Curator panel. Removed the stale SCR-* images. No personal data in the new shots: credential pools show env-var names and presence only (no secret values); no emails, private chats, or non-local IPs.

mudrii added 30 commits June 10, 2026 09:17

style: apply ruff format to drifted test files

56444fc


fix: resolve final review findings

1f5aaf3

test: harden coverage and resilience checks

6363cce

docs: loop-state — all loop-doable PLAN items complete

6b3fe22

mudrii added 10 commits June 14, 2026 22:12

docs: fix remaining stale 12-panel references in README

8b6a723


fix: close release review findings

0c1cfd6

test: close TDD coverage audit gaps

1ef4018

chore: prepare 2026.6.14 release

e2a3fbc

Copilot AI review requested due to automatic review settings June 14, 2026 18:40

Copilot started reviewing on behalf of mudrii June 14, 2026 18:41

ci: pin verified action versions

cd59d56

github-advanced-security AI found potential problems Jun 14, 2026


Comment thread tests/test_main.py Fixed

Comment thread tests/test_panels_extended.py Fixed

ci: create uv virtualenv before install

f3ef142

Copilot AI reviewed Jun 14, 2026


Comment thread hermesd/panels/tokens.py

Comment on lines 73 to 75

for s in state.sessions:

estimated = s.cost_status not in AUTHORITATIVE_COST_STATUSES

table.add_row(

mudrii added 3 commits June 15, 2026 02:45

test: avoid url substring assertions

bc97344

test: make config panel assertion ansi-stable

2f3ec3c

test: make dashboard config assertion ansi-stable

bf2ab89

chatgpt-codex-connector Bot reviewed Jun 14, 2026


mudrii added 5 commits June 15, 2026 09:51

mudrii changed the title ~~Prepare hermesd 2026.6.14 release~~ Release hermesd 2026.6.15 — drift fixes + integrations Jun 15, 2026

mudrii merged commit 100f3db into main Jun 15, 2026
6 checks passed

mudrii deleted the feature-fixes branch June 15, 2026 05:16

mudrii merged 53 commits into main from feature-fixes

3 participants
