Release hermesd 2026.6.15 — drift fixes + integrations#7

Merged: mudrii merged 53 commits into main from feature-fixes on Jun 15, 2026

mudrii (Owner) commented on Jun 14, 2026

hermesd 2026.6.15 — drift fixes + hermes-agent integrations

Brings hermesd back into sync with the current hermes-agent on-disk shapes and adds new read-only panels/metrics. Read-only invariant preserved throughout.

Highlights

  • Fixes (producer drift): credential pool list-vs-dict (auth.json), desktop build stamp camelCase, pr-monitor live keys + all naming families, included/exact cost authoritative, session billing/end fields surfaced.
  • Integrations: Gateway platform errors + agent counts; new Curator panel (13) with per-tool breakdown + state transitions; Sessions Billing & Context (model context limit); Tokens by-endpoint breakdown + cost-status reconciliation; Operations response-store stats; extra log streams; Kanban task workflow fields + decomposition tree.
  • Hardening: markup-escape of all untrusted ~/.hermes free-text (incl. formatted timestamps), symlink/path-escape guards on file reads, SQL LIKE escaping, cache-preservation on every source.
  • Infra: opt-in live contract test (HERMESD_CONTRACT_TEST=1) guarding against the next drift; CI hardened (3.11–3.13 matrix, -W error::ResourceWarning, pinned actions, concurrency, uv cache).

Quality gates (all green)

  • 777 tests pass + 1 opt-in skip under -W error::ResourceWarning
  • ruff check + format, mypy, pip-audit, uv build (sdist + wheel) all clean
  • 99% coverage (remaining lines are accepted-defensive TOCTOU/unreachable guards)
  • Reviewed across 4 axes (standards, spec/PLAN, invariants, tests) over multiple converged rounds; test suite TDD-reviewed for quality + coverage over converged rounds

Release

  • Version bumped to 2026.6.15 (calver) in pyproject.toml; CHANGELOG dated section updated; __version__ is dynamic from package metadata.
  • After merge, publish by creating a GitHub Release with tag v2026.6.15; python-publish.yml then runs the full gate → builds → trusted-publishes to PyPI.

Closes the validated drift-fix + integration plan (all loop-doable items; Part-3 future-features deferred pending live data).

mudrii added 30 commits June 10, 2026 09:17
…% coverage

Audit-fix wave (review of HEAD, 2026-06-10):

Bugs fixed:
- collector: route _collect_sessions/_collect_runtime_status through the
  _CollectionHealth boundary so a bad SQLite row can no longer crash startup
  or stale the whole refresh; NULL session id no longer fails tool stats
- app: G/k scroll deadlock (stored offset now clamped to effective max),
  console-swap race in copy_current_view (snapshot renders into a local
  Console), theme load moved outside the state lock, signal handlers
  installed before initial collect, health dot red at zero healthy sources
- db: stale flag now set when the DB file disappears; LIKE search escapes
  %/_ wildcards with an ESCAPE clause
- panels: ~$/$ cost prefix reflects estimated vs reported in compact,
  detail rows, and aggregate tables; unknown minlevel: filter shows all
  lines instead of none; SOUL.md empty vs missing distinguished;
  multi message: filter tokens resolve last-wins consistently
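
The LIKE fix in the db bullet above can be illustrated with a minimal sketch. The `messages` table, the `content` column, and both function names are hypothetical and are not taken from hermesd; only the technique (escaping `%` and `_`, then declaring an ESCAPE character) comes from the commit message.

```python
import sqlite3


def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so the term matches literally."""
    # Escape the escape character first, then the two LIKE wildcards.
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def search_messages(conn: sqlite3.Connection, term: str) -> list[str]:
    """Substring search over a hypothetical messages table."""
    pattern = f"%{escape_like(term)}%"
    rows = conn.execute(
        "SELECT content FROM messages WHERE content LIKE ? ESCAPE '\\'",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (content TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?)",
    [("100% done",), ("100 items done",), ("snake_case",), ("snakeXcase",)],
)
hits_percent = search_messages(conn, "100%")
hits_underscore = search_messages(conn, "snake_case")
conn.close()
```

Without the escaping, `100%` would also match `100 items done` and `snake_case` would also match `snakeXcase`, because `%` and `_` act as wildcards.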

Performance:
- message search no longer serializes against the collect pass
- per-profile session counts cached by DB mtime (no WAL re-snapshot per tick)
- log tails skip re-reads when mtime+size unchanged
- session summaries memoized while underlying rows are unchanged
- cron output excerpts respect --log-tail-bytes

Standards/cleanup:
- all locks documented; ~/.hermes default centralized in paths.py;
  from __future__ import annotations across tests; dead code removed
  (cycle_log_view, _tail_log, status_bar_style, context_color, unreachable
  scroll hint, unused default-arg branches)

Tests:
- 409 -> 548 tests; coverage 93% -> 99% (app.py and all panels at 100%)
- implementation-coupled tests reworked to public seams; duplicates removed
- pytest-cov added to dev extras

Docs: CHANGELOG Unreleased section; README cost-prefix and g/G wording.
- panels/logs.py: single-source the detail clamp via public
  max_detail_scroll_offset(); app.py no longer imports logs-private names
- collector: derived() closure replaces 5 repeated memoization lambdas;
  _read_tail_text() replaces 3 copies of the tail-read idiom; LogStream
  construction collapsed to one call
- app.py: OSC52 write falls back to an unguarded write if Rich ever
  renames the private Console._lock (no hard dependency on internals)
- __main__.py: pragma no cover on the __name__ guard (covered by the
  subprocess test)
- tests: render_to_str consolidated into conftest (was defined 7x with
  drifted widths), restore_signal_handlers fixture replaces hand-rolled
  try/finally in 4 tests, collector-level tests moved out of panel test
  files into test_collector_extended.py

No behavior changes; 548 tests green, ruff and mypy clean.
… test

- test_app_close_idempotent / test_close_idempotent now assert a
  post-close observable instead of passing vacuously
- removed test_db_file_deleted_marks_cached_reads_stale (strict subset
  of test_db_file_deleted_marks_all_cached_reads_stale)
Round 1 audit (standards/spec/quality/coverage, 4 parallel agents):
- test_gateway_resilience.py asserted PID-liveness but never the
  cache-preservation invariant its filename implies; add a test that
  forces a gateway collection error and asserts last-good gateway
  survives (running/pid/version unchanged).
- Cover the reachable skip-branches in _tail_latest_cron_output
  (stray non-dir entry in the output root, non-file inside a job dir).

Tests-only; no source behavior change. 549 passing, coverage 99%.
…p injection

Round 3 adversarial audit found a HIGH-severity defect: panels passed
free-text from ~/.hermes/ (skill descriptions, session source/model/cwd/
title, task titles, error excerpts, provider/plugin/MCP names, memory
file names, etc.) directly into Rich Table.add_row(...), a markup-parsed
context. Data containing '[xy]' was silently stripped, and an unbalanced
'[/]' raised rich.markup.MarkupError at render time. That error escaped
the app loop's KeyboardInterrupt-only handler and crashed the whole TUI,
violating the never-blank-the-display / cache-preservation invariant.

Escape every untrusted free-text cell with rich.markup.escape(). Numbers,
fmt_* output, fixed labels, style tags, and the already-literal
Text.append(...) compact views are left untouched. Adds
tests/test_markup_safety.py: all 12 panels x compact+detail rendered with
a '[/] desc [xy] tag' payload, asserting no crash and literal-bracket
survival (48 cases).

597 passing, coverage maintained, ruff/mypy clean.
Round 3 audit found 3 test files violating the project's documented
ruff formatter (.claude/rules/python-idioms.md). Mechanical reformat
only; no logic change. Whole repo now passes ruff format --check.
TDD-focused test-quality audit (4 parallel agents) findings, fixed
test-first (no source changes; suite already 99% line coverage):

Fixes:
- Eliminate ResourceWarning: two app tests and one db test reopened a
  SQLite connection via a read-after-close/collect path and never closed
  it; close the reopened resource (suite now passes -W error::ResourceWarning).
- test_cost_estimation: replace self-derived expected (computed via the
  same _estimate_cost under test) with hand-computed literals (0.325,
  0.315), and pin > 0 assertions to pytest.approx — a regressed estimator
  now actually fails.
- Strengthen unchecked/vacuous assertions: ToolStats.call_count,
  exact today-token sum (21_500/26_500), full session id-set,
  len(sessions)==2 before all(is_active).

Coverage gaps (behavior, not just line):
- Cache-preservation force-error tests for the 6 untested safe_collect
  sources (cron/channels/kanban/operations/skills/memory): force each
  _collect_* to raise and assert last-good state survives and the source
  is marked failed. Invariant holds for all six.
- test_readonly_invariant: snapshot ~/.hermes manifest (path+size+mtime_ns)
  before/after a full collect and the snapshot paths; assert nothing is
  written — guards the read-only Critical Rule.
- test_import_hygiene: ast-scan every hermesd module; assert no
  hermes-agent import — guards the no-import Critical Rule.
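
A rough sketch of the manifest idea behind test_readonly_invariant, using only the standard library. `read_only_pass` is a hypothetical stand-in for a real collect pass, and the file names are invented for the example; the actual test may be structured differently.

```python
import os
import tempfile
from pathlib import Path


def snapshot_manifest(root: Path) -> dict[str, tuple[int, int]]:
    """Map each file's relative path to (size, mtime_ns)."""
    manifest: dict[str, tuple[int, int]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            st = path.lstat()  # lstat: describe the entry, do not follow symlinks
            manifest[str(path.relative_to(root))] = (st.st_size, st.st_mtime_ns)
    return manifest


def read_only_pass(root: Path) -> int:
    """Hypothetical stand-in for a collect pass: reads files, writes nothing."""
    return sum(len(p.read_bytes()) for p in root.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / "auth.json").write_text("{}")
    (home / "logs").mkdir()
    (home / "logs" / "agent.log").write_text("started\n")

    before = snapshot_manifest(home)
    bytes_read = read_only_pass(home)
    after = snapshot_manifest(home)
    unchanged = before == after
```

Comparing path, size, and mtime_ns catches created, deleted, and rewritten files; it would not catch a write that restores both size and mtime.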

606 passing (+9), ruff/format/mypy clean.
TDD Round 2 (validate prior wave + close deferred behavior gaps):
- Empty-state sweep: render every panel (compact+detail) against a
  default DashboardState() — the real first-launch condition where
  ~/.hermes exists but the agent never ran (all collections empty,
  counts zero, optionals None at once). 6 panels previously had no
  empty-detail assertion; guards None-formatting / zero-division.
- Bracket-nav wrap: existing test stopped at panel 12 then stepped back;
  add the wrap-seam case (']' on 12 -> 1, '[' on 1 -> 12) where the
  modulo off-by-one would live.
- Strengthen test_build_header_with_custom_skin: asserted only
  'is not None'; now asserts app._theme.skin_name == 'ares' so a broken
  skin load actually fails.
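
The wrap-seam case can be shown with a small sketch of 1-based modular cycling. `next_panel` and `PANEL_COUNT` are illustrative names, not the app's own; 12 is the panel count at the time of this commit.

```python
PANEL_COUNT = 12  # panel count when this test was written


def next_panel(current: int, step: int, count: int = PANEL_COUNT) -> int:
    """Cycle 1-based panel numbers; step is +1 for ']' and -1 for '['."""
    # Shift to 0-based, wrap with modulo, then shift back to 1-based.
    return (current - 1 + step) % count + 1
```

The off-by-one risk is applying the modulo directly to the 1-based number, which would produce panel 0 at the seam.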

Deferred defensive cases (TOCTOU OSError guards, --log-tail-bytes<=0,
malformed-TOML, db FTS-probe error) assessed and skipped: already
covered or unreachable without fault injection — forcing them tests
except-mechanics, not behavior.

631 passing (+25), ruff/format/mypy clean, -W error::ResourceWarning clean.
TDD Cycle 2 (fresh full re-audit) findings:
- Determinism: two today-token assertions seeded session started_at from
  raw time.time() then asserted an exact today total. Between local
  midnight and the collector's _today_epoch() re-read, the row could fall
  before the cutoff and flake (test_collect_tokens_today_filters_by_date
  had up to a 1-hour post-midnight window — a regression from the prior
  cycle's exactness tightening). Pin hermesd.collector._today_epoch via
  monkeypatch so the cutoff is deterministic regardless of wall clock.
- Extend the force-error cache-preservation harness with config and logs
  (both non-default under populated_hermes_home, so the equality is
  meaningful). Skipped skin: its populated value equals the model default
  ('default'), which would make the assertion vacuous.

633 passing (+2), ruff/format/mypy clean, -W error::ResourceWarning clean.
TDD Cycle 3 (final fresh re-audit) findings:
- file_cache: LastGoodFileCache only had invalid-shape reuse tests; the
  core cache-hit (unchanged mtime returns cached value without re-reading)
  and invalidation (newer mtime with valid content yields the new value)
  paths were untested — a regression making _cached_read always-stale or
  always-reload would have passed. Add a deterministic test using os.utime
  (corrupt-but-same-mtime proves a hit skips the reload; bumped-mtime proves
  invalidation).
- logs detail: the unfiltered-empty branch ("No log lines") was untested;
  only the filtered-empty branch ("No matching log lines") was. Add the
  unfiltered-empty assertion.

635 passing (+2), ruff/format/mypy clean, -W error::ResourceWarning clean.
Final confirmation pass found the cache-hit assertion over-claimed: the
corrupt-but-same-mtime trick returned the cached value under BOTH a real
cache hit and an always-reload regression (the reload would fail on the
corrupt bytes and fall back to last-good), so it did not prove the reload
was skipped. Strengthen it with the file's established counting_open
pattern: assert open_calls == 0 across the unchanged-mtime read (reload
truly skipped) and == 1 across the invalidation read (reload happened).
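
A minimal sketch of why counting opens is the stronger assertion. `MtimeCache` is a hypothetical simplification and not the real LastGoodFileCache; the counting opener mirrors the counting_open idea named above.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


class MtimeCache:
    """Hypothetical mtime-keyed cache: reload only when mtime_ns changes."""

    def __init__(self, opener: Callable[[Path], str]) -> None:
        self._opener = opener
        self._entries: dict[Path, tuple[int, str]] = {}

    def read(self, path: Path) -> str:
        mtime_ns = path.stat().st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime_ns:
            return cached[1]  # cache hit: the file is not opened
        value = self._opener(path)
        self._entries[path] = (mtime_ns, value)
        return value


opened: list[Path] = []  # one entry per real read


def counting_open(path: Path) -> str:
    opened.append(path)
    return path.read_text()


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "state.json"
    target.write_text("v1")
    os.utime(target, ns=(10_000_000_000, 10_000_000_000))

    cache = MtimeCache(counting_open)
    first = cache.read(target)
    calls_after_first = len(opened)

    # Change the bytes but restore the same mtime: a hit must skip the reload.
    target.write_text("v2")
    os.utime(target, ns=(10_000_000_000, 10_000_000_000))
    hit = cache.read(target)
    calls_after_hit = len(opened)

    # Bump the mtime: the cache must reload exactly once.
    os.utime(target, ns=(20_000_000_000, 20_000_000_000))
    reloaded = cache.read(target)
    calls_after_reload = len(opened)
```

Asserting on the returned value alone cannot tell a hit from a failed reload that fell back to the cached value; the open count can.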

635 passing, ruff/format/mypy clean.
Read end_reason, billing_base_url, and billing_mode from state.db (columns
existed in schema but were unread) and surface them in the Sessions detail
Runtime sub-table. Placed in the Runtime sub-table rather than the main
sessions table because 15 columns over-squeezed the main table at width 120.

Unblocks C2 (context/limit gauge) and NF1 (billing-endpoint breakdown),
which join on billing_base_url.

PLAN.md: FIX B.
Live auth.json maps each provider to a LIST of credential entries, but
hermesd fed that list to _as_dict (-> {}), blanking every credential field
in the pool view. Add _select_pool_entry to collapse the list to one
representative entry (lowest priority = next credential used; ties keep list
order), while still accepting the legacy single-dict shape. Provider name
remains the outer dict key.
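
One way the list-to-entry collapse could look, as a sketch only. The `priority` field name, its numeric type, and the default of 0 are assumptions; the real _select_pool_entry may differ in all three.

```python
from typing import Any


def select_pool_entry(raw: Any) -> dict[str, Any]:
    """Collapse a provider's credential value to one representative dict."""
    if isinstance(raw, dict):
        return raw  # legacy single-dict shape is passed through
    if isinstance(raw, list):
        entries = [e for e in raw if isinstance(e, dict)]
        if entries:
            # min() returns the first minimal item, so ties keep list order.
            return min(entries, key=lambda e: e.get("priority", 0))
    return {}  # empty or malformed: caller can fall back to a name-only entry


pool = [
    {"label": "backup", "priority": 2},
    {"label": "primary", "priority": 1},
    {"label": "tie", "priority": 1},
]
chosen = select_pool_entry(pool)
```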

PLAN.md: FIX A1.
gateway_state.json carries per-platform error_code/error_message plus
top-level active_agents and restart_requested, none of which hermesd read.
Surface them: a warning marker in the compact view, an Error column and
restart/active-agent indicators in the detail view. Error text is escaped
before reaching Rich markup.

PLAN.md: C1.
The live hermes-agent producer never emits cost_status "reported"; it uses
unknown/estimated/exact/included. So the authoritative-cost branch was never
taken on live data and every cost rendered with the estimated "~$" prefix.
Map exact and included (alongside legacy reported) to authoritative via a
shared AUTHORITATIVE_COST_STATUSES set used by the collector's estimated-flag
logic, per-row cost resolution, and the tokens panel prefix. Subscription-
included rows (cost 0.0) now render an authoritative $0.00 rather than a
token-based estimate.
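
The status mapping can be sketched as follows. The set members come from the commit message; `cost_prefix` and `render_cost` are hypothetical helper names, and the two-decimal format is an assumption.

```python
from typing import Optional

# Statuses treated as authoritative: the live ones plus the legacy value.
AUTHORITATIVE_COST_STATUSES = frozenset({"reported", "exact", "included"})


def cost_prefix(cost_status: Optional[str]) -> str:
    """Return '$' for authoritative costs and '~$' for estimates."""
    return "$" if cost_status in AUTHORITATIVE_COST_STATUSES else "~$"


def render_cost(cost_usd: float, cost_status: Optional[str]) -> str:
    return f"{cost_prefix(cost_status)}{cost_usd:.2f}"
```

Sharing one set between the collector and the panel keeps the two from drifting apart the way the producer and consumer did here.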

PLAN.md: A4.
Live desktop-build-stamp.json uses builtAt/contentHash/sourceMode, but the
operations collector looked only for version/stamp/built_at/created_at, so the
build stamp always rendered blank. Add builtAt to the lookup chain and fall
back to a 12-char contentHash prefix.
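
A sketch of the lookup chain with the hash fallback. The key names come from the commit message, but the position of `builtAt` within the chain and the name `build_stamp_label` are assumptions.

```python
from typing import Any

# Lookup order is an assumption; only the set of keys comes from the commit.
STAMP_KEYS = ("version", "stamp", "built_at", "builtAt", "created_at")


def build_stamp_label(data: dict[str, Any]) -> str:
    """Pick the first usable stamp, else a short contentHash prefix."""
    for key in STAMP_KEYS:
        value = data.get(key)
        if isinstance(value, str) and value:
            return value
    content_hash = data.get("contentHash")
    if isinstance(content_hash, str) and content_hash:
        return content_hash[:12]
    return ""
```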

PLAN.md: A2.
Live pr-monitor JSON uses prs/tracked_numbers/author_prs/checked_at, but the
collector counted the retired monitored/tracked keys (stuck at 0) and tried a
dead camelCase checkedAt before the snake_case key. Read prs -> monitored
count and tracked_numbers -> tracked count (legacy keys kept as fallback),
and drop the dead checkedAt first-guess. Fixture updated to match the live
shape, which exposes the drift.

PLAN.md: A3 + A5.
The agent writes PR-monitor state under several families: flat
pr-monitor-*.json and pr_monitor_*.json, plus per-repo files inside
pr-monitor/ and pr_monitor/ subdirs. hermesd globbed only the flat hyphen
family, missing large active state files. Glob all four families and collapse
the same repo (seen across families) to its newest checked_at so the panel
isn't flooded with near-duplicate rows; repo-less files stay distinct.
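
The glob-and-collapse step might look roughly like this. The `repo` field name, the string comparison of `checked_at` (which assumes ISO-8601 timestamps in one timezone), and the directory the files live in are assumptions; the `::filename` key for repo-less files follows the dedupe key mentioned later in this PR.

```python
import json
import tempfile
from pathlib import Path
from typing import Any

# The four naming families: two flat, two per-repo subdirectories.
PATTERNS = (
    "pr-monitor-*.json",
    "pr_monitor_*.json",
    "pr-monitor/*.json",
    "pr_monitor/*.json",
)


def collect_pr_monitor(root: Path) -> dict[str, dict[str, Any]]:
    """Return one state dict per repo, keeping the newest checked_at."""
    newest: dict[str, dict[str, Any]] = {}
    for pattern in PATTERNS:
        for path in sorted(root.glob(pattern)):
            try:
                data = json.loads(path.read_text())
            except (OSError, ValueError):
                continue  # unreadable or malformed file: skip it
            if not isinstance(data, dict):
                continue
            repo = data.get("repo")  # assumed field name
            # Repo-less files get a per-file key so they never collapse.
            key = repo if isinstance(repo, str) and repo else f"::{path.name}"
            stamp = str(data.get("checked_at", ""))
            kept = newest.get(key)
            if kept is None or stamp > str(kept.get("checked_at", "")):
                newest[key] = data
    return newest


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pr_monitor").mkdir()
    files = {
        "pr-monitor-app.json": {"repo": "org/app", "checked_at": "2026-06-01T00:00:00Z"},
        "pr_monitor/app.json": {"repo": "org/app", "checked_at": "2026-06-10T00:00:00Z"},
        "pr_monitor_misc.json": {"checked_at": "2026-06-02T00:00:00Z"},
        "pr-monitor-other.json": {"checked_at": "2026-06-03T00:00:00Z"},
    }
    for name, payload in files.items():
        (root / name).write_text(json.dumps(payload))
    result = collect_pr_monitor(root)
```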

PLAN.md: A6.
Add CuratorRun model and Collector._collect_curator reading the newest
~/.hermes/logs/curator/<stamp>/run.json (model/provider, before/after/delta
counts, archived/pruned/added/consolidated, tool-call total, llm summary and
error). Registered as a new 'curator' safe_collect source with last-good
cache preservation. Panel rendering and registration follow in the next step.

PLAN.md: C3 (data layer).
Render the newest memory-curation run as panel 13: a compact summary (last
run stamp, skill before->after delta, archived/pruned, error marker) and a
detail view (full counts, model/provider, duration, tool calls, and the LLM
summary or error). All free-text from run.json is markup-escaped. Registered
in PANEL_NAMES/_RENDERERS and the wide/compact layouts (tall-narrow and
snapshot derive automatically); reachable via ] cycling or --snapshot-panel
13. README/CLAUDE.md panel counts and CHANGELOG updated.

PLAN.md: C3 (panel).
Join each session's model@billing_base_url against context_length_cache.yaml
to surface the model's context-window size (SessionInfo.context_limit). Split
the sessions detail Runtime table into a lean Runtime table (operational
fields) and a new Billing & Context table (end reason, endpoint, mode, context
limit) so neither over-squeezes at typical widths. Labeled as the context
limit vs lifetime cumulative tokens, not live occupancy. Base-url join
normalizes a trailing slash; model names are not lowercased.
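
The join rule can be sketched with a plain dict standing in for the parsed context_length_cache.yaml. The `model@base_url` key layout is inferred from the text; the function names, model name, and URL below are invented for illustration.

```python
from typing import Optional


def context_key(model: str, base_url: str) -> str:
    """Build the join key; only the URL's trailing slash is normalized."""
    return f"{model}@{base_url.rstrip('/')}"


def lookup_context_limit(
    cache: dict[str, int], model: str, base_url: str
) -> Optional[int]:
    wanted = context_key(model, base_url)
    for key, limit in cache.items():
        # Keys without '@' carry no endpoint and cannot be joined.
        if "@" in key and key.rstrip("/") == wanted:
            return limit
    return None


cache = {
    "example-model@https://api.example.com/v1/": 128_000,
    "bare-model": 4_096,
}
matched = lookup_context_limit(cache, "example-model", "https://api.example.com/v1")
case_miss = lookup_context_limit(cache, "Example-Model", "https://api.example.com/v1")
```

Leaving the model name case-sensitive means a differently cased name is a miss rather than a possibly wrong limit.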

PLAN.md: C2.
Aggregate token usage and spend per billing endpoint (billing_base_url) and
render it as a By Endpoint sub-table in the Tokens / Cost detail panel. This
is finer than the existing per-provider breakdown — the same provider can bill
through several base URLs. Reuses the existing breakdown summarizer and table
renderer.

PLAN.md: NF1.
Count sessions by cost_status and render a one-line distribution in the
Tokens / Cost detail panel (e.g. unknown N · included N · estimated N) so the
user can see how much spend is authoritative versus unknown. NULL cost_status
is counted as 'unknown'.
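
A sketch of the distribution line. The alphabetical ordering and the helper name are assumptions, and `status or "unknown"` also folds an empty string into 'unknown', which goes slightly beyond the NULL handling described above.

```python
from collections import Counter
from typing import Iterable, Optional


def cost_status_line(statuses: Iterable[Optional[str]]) -> str:
    """Render e.g. 'estimated 1 · unknown 2' from per-session statuses."""
    counts = Counter(status or "unknown" for status in statuses)
    return " · ".join(f"{name} {n}" for name, n in sorted(counts.items()))


line = cost_status_line(["exact", None, "estimated", None, "included"])
```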

PLAN.md: NF2.
Read ~/.hermes/response_store.db read-only (mode=ro&immutable=1, reusing the
kanban read-only connector) and surface conversation/response row counts and
file size as a Response Store row in the Operations detail panel. Missing
tables count as zero; an absent DB shows nothing.
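
Opening a SQLite file with `mode=ro&immutable=1` can be sketched with the standard sqlite3 module. The table names `conversations` and `responses` and the function name are assumptions; hermesd reuses its own kanban connector, which is not shown here.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import Optional


def count_rows(db_path: Path, tables: tuple[str, ...]) -> Optional[dict[str, int]]:
    """Count rows per table over a read-only, immutable connection."""
    if not db_path.is_file():
        return None  # absent DB: nothing to show
    uri = f"{db_path.resolve().as_uri()}?mode=ro&immutable=1"
    conn = sqlite3.connect(uri, uri=True)
    try:
        counts: dict[str, int] = {}
        for table in tables:
            try:
                # Table names come from a fixed tuple, never from user input.
                row = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
                counts[table] = row[0]
            except sqlite3.OperationalError:
                counts[table] = 0  # missing table counts as zero
        return counts
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    db_file = Path(tmp) / "response_store.db"
    writer = sqlite3.connect(db_file)
    writer.execute("CREATE TABLE conversations (id INTEGER PRIMARY KEY)")
    writer.executemany("INSERT INTO conversations (id) VALUES (?)", [(1,), (2,)])
    writer.commit()
    writer.close()

    counts = count_rows(db_file, ("conversations", "responses"))
    missing = count_rows(Path(tmp) / "absent.db", ("conversations",))
```

`immutable=1` tells SQLite the file will not change while open, so it skips locking and change detection; that is only safe when a possibly stale view is acceptable.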

PLAN.md: NF3.
Add audit.log, mcp-stderr.log, workspace.log, and workspace.error.log to the
log-stream specs so they appear as Tab sub-views in the Logs panel when
present (existence-gated, so absent files are simply skipped).

PLAN.md: NF4.
Read task completed_at, workspace_path, goal_mode, and current_step_key from
kanban.db (additive SELECT *), show branch_name in the worker/problem task
tables, and add guarded Decomposition Links / Attachments summary rows from
task_links / task_attachments (rendered only when populated).

PLAN.md: NF5.
Add HERMESD_CONTRACT_TEST=1-gated test that runs the collector against the real
Hermes home and asserts drift-sensitive fields are populated wherever their
source data exists (credential labels, session billing fields, gateway
platforms, desktop stamp, pr-monitor counts, curator run). Skipped by default
so CI never couples to a machine's data; catches the next producer-side schema
drift the moment it lands.

PLAN.md: Part 4 contract test.
mudrii added 10 commits June 14, 2026 22:12
Update the Features section header (12 -> 13 Dashboard Panels) and the
keyboard-shortcuts row to include panels 11-13 (Curator).
- actions/checkout v4 -> v5, astral-sh/setup-uv v5 -> v6 in CI and the
  publish workflow (verified current majors).
- Add [tool.hatch.build.targets.wheel] packages = ["hermesd"] so the wheel
  target is explicit rather than relying on hatchling auto-detection.
Add tests flagged by the branch audit for production branches that lacked
direct coverage:
- tokens 'By Endpoint' sub-table actually renders (NF1 was collector-only)
- credential_pool empty-list / list-of-non-dicts degrade to a name-only entry
- repo-less pr-monitor files stay distinct (the ::filename dedupe key path)
Round-1 audit fixes:
- Add an end-to-end test that seeds real audit.log/mcp-stderr.log and asserts
  both streams collect and are reachable via the Tab cycle (was only proven
  with synthetic stream names).
- Assert kanban link_count/attachment_count == 0 explicitly when the
  task_links/task_attachments tables are absent (was implicit via no failed
  source).

The pr-monitor checkedAt finding was reviewed and dismissed: PLAN A5
intentionally dropped the dead camelCase guess, all live families use
snake_case checked_at, and the newest-timestamp dedupe is correct as-is.
Round-2 audit fixes:
- tokens panel: format all cost figures via fmt_usd (with a ~ estimate
  marker) so negative costs render -$0.50 / ~-$0.50 instead of $-0.50;
  factored into a shared _fmt_cost helper used by the session, window, and
  breakdown tables and the compact view.
- collector: _collect_curator now skips symlinked run dirs and paths that
  escape the Hermes home, and bails on a symlinked run.json — matching the
  symlink hardening already applied to the cron/checkpoint readers.
- tests: negative-cost prefix render; curator run-dir-without-run.json
  degrades to empty (not failed); curator symlinked-run-dir is skipped.
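
The symlink and path-escape checks described for _collect_curator could be sketched like this. `resolves_under` and `safe_run_json` are hypothetical names (the real guard is called _path_resolves_under and its exact behavior is not shown in this PR text), and the run-directory stamps are invented. The demo creates a symlink, so it assumes a platform that allows that.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolves_under(path: Path, root: Path) -> bool:
    """True when path, after resolving symlinks and '..', stays inside root."""
    try:
        path.resolve().relative_to(root.resolve())
    except (OSError, RuntimeError, ValueError):
        return False
    return True


def safe_run_json(run_dir: Path, home: Path) -> Optional[Path]:
    """Return run.json only when neither it nor its directory is a symlink."""
    if run_dir.is_symlink() or not resolves_under(run_dir, home):
        return None
    run_json = run_dir / "run.json"
    if run_json.is_symlink() or not run_json.is_file():
        return None
    return run_json


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    home = base / "hermes"
    real_run = home / "logs" / "curator" / "20260610"
    real_run.mkdir(parents=True)
    (real_run / "run.json").write_text("{}")

    outside = base / "outside"
    outside.mkdir()
    (outside / "run.json").write_text("{}")
    linked_run = home / "logs" / "curator" / "20260611"
    linked_run.symlink_to(outside, target_is_directory=True)

    accepted = safe_run_json(real_run, home)
    accepted_name = accepted.name if accepted else None
    rejected = safe_run_json(linked_run, home)
    escapes = resolves_under(home / ".." / "outside", home)
```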

Dismissed: tokens per-session prefix ignoring estimated_cost_usd-None (the
field is float on SessionInfo, never None at the panel layer); _billing_table
[:10] cap (mirrors _runtime_table).
Raise coverage within 99% (49 -> 14 missed lines) with tests that assert real
observable behavior, not line execution:
- db.py 96->99%: message-search cache-preservation (error -> last-good + stale
  flag), consecutive-error reset, FTS-unavailable graceful degrade.
- collector.py 97->99% (new test_collector_coverage.py): available-tools mtime
  cache hit, log-stream OSError -> last-good, curator empty/symlinked run,
  _path_resolves_under guard, context-length key-without-@, unreadable
  checkpoint/cron/soul files, skill-dir filters, dashboard-process matching.
- curator_panel.py 97->100%: compact warning marker, long-summary truncation.
- Strengthen the by-endpoint render assertion (was URL-echo only).

Remaining 14 misses are documented accepted-defensive (2 unreachable db
guards; 12 collector TOCTOU/empty-source-factory branches not deterministically
reproducible without brittle stat/open mocks). chmod-based tests skip cleanly
when run as root.
Round-2 test-quality fixes:
- Skip the WAL snapshot-failure test when running as root (chmod 000 does not
  block reads for root, which would make the assertion vacuous in CI).
- Use the imported _RECONNECT_ERROR_THRESHOLD constant instead of a magic 3 in
  the dead-handle reconnect test so it tracks the source threshold.
Copilot AI review requested due to automatic review settings June 14, 2026 18:40
@gemini-code-assist

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Comment thread tests/test_main.py Fixed
Comment thread tests/test_panels_extended.py Fixed

Copilot AI left a comment

Pull request overview

Prepares the 2026.6.14 hermesd release by expanding dashboard functionality (new Curator panel + richer Kanban/Operations/Tokens/Sessions/Gateway details), hardening read-only + markup-safety behavior, and aligning docs/tests/CI with the new release gates.

Changes:

  • Add panel 13 (Curator) and extend multiple panels (Tokens, Sessions, Gateway, Kanban, Operations) with new surfaced fields and summaries.
  • Improve resilience/security: escape untrusted text before Rich markup parsing; tighten read-only invariants; strengthen DB/WAL + cache behavior and message search.
  • Modernize CI/release workflows and test suite (shared render_to_str, expanded coverage/guards, updated docs/changelog/version).

Reviewed changes

Copilot reviewed 62 out of 63 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/test_tools_panel.py Switch to shared Rich render helper; add fallback label coverage for watch/checkpoint rows.
tests/test_theme.py Update theme assertions to match refactored Theme API.
tests/test_skills_panel.py Use shared render helper; keep skills panel expectations stable under no-color rendering.
tests/test_session_active.py Extend session schema mapping coverage for new billing/end fields.
tests/test_readonly_invariant.py New invariant tests asserting collector/app read paths do not mutate ~/.hermes.
tests/test_profiles.py Add symlink-escape hardening tests for profiles discovery + HermesPaths validation.
tests/test_profiles_panel.py Panel rendering tests updated; add size/mtime formatting coverage.
tests/test_panels.py Broaden panel rendering tests; add coverage for new/expanded panel behaviors.
tests/test_models.py Small model assertions update (ToolStats call_count).
tests/test_memory_panel.py Extend memory panel compact/detail behavior (SOUL present/empty/missing).
tests/test_markup_safety.py New cross-panel Rich markup injection hardening tests.
tests/test_main.py Extend snapshot panel validation/output tests; add entrypoint/version fallback coverage.
tests/test_import_hygiene.py New guard test preventing hermesd importing hermes-agent modules.
tests/test_gateway_resilience.py Minor import/annotation alignment.
tests/test_formatting.py Add fmt_iso_timestamp empty-input dash behavior tests.
tests/test_file_cache.py Add mtime cache-hit behavior test for LastGoodFileCache.
tests/test_db.py Make DB session assertions order-independent; remove outdated caching/URI tests.
tests/test_db_resilience.py New DB backoff/WAL failure/cache-staleness/message-search resilience coverage.
tests/test_db_extended.py Expand DB/WAL/LIKE-escaping/read-after-close behaviors; add symlink-sidecar protection coverage.
tests/test_curator.py New collector+panel tests for curator run parsing and edge cases.
tests/test_curator_resilience.py New curator cache-preservation test on corruption.
tests/test_curator_panel.py New Curator panel rendering tests (compact/detail/error/truncation).
tests/test_cron_panel.py Update cron panel tests for new metadata and shared render helper.
tests/test_cost_estimation.py Expand cost authority/estimation semantics and collector/token analytics tests.
tests/test_contract.py New opt-in live ~/.hermes drift/contract test (env-gated).
tests/test_collector.py Major expansion of collector fallback/caching/health-redaction/search concurrency tests.
tests/test_collector_coverage.py New targeted coverage for collector IO-error fallbacks and read-only guards.
tests/test_app.py Update key handling API; add run/signal/loop behavior tests.
tests/conftest.py Add shared render_to_str helper and signal handler restore fixture; refactor kanban DB setup.
tests/__init__.py Add future-annotations package marker.
README.md Update docs for 13 panels, new logs, cost semantics, snapshot behavior, and commands.
pyproject.toml Bump version to 2026.6.14; add pytest-cov; ensure wheel packaging config.
hermesd/theme.py Remove unused style helpers; expose status_bar_bg only.
hermesd/paths.py Add default_hermes_home; harden profiles root against symlink escape.
hermesd/panels/tools.py Escape untrusted tool/process/checkpoint strings for Rich markup safety.
hermesd/panels/tokens.py Add ~$ vs $ formatting based on estimation; add endpoint + status reconciliation sections.
hermesd/panels/sessions.py Escape untrusted fields; add Billing & Context table; improve filter and token-sort semantics.
hermesd/panels/profiles.py Escape untrusted profile names in tables.
hermesd/panels/overview.py Escape untrusted skills/integrations fields; tweak skills header behavior.
hermesd/panels/operations.py Escape untrusted fields; add response_store summary; adjust empty-state logic.
hermesd/panels/memory_panel.py Escape untrusted fields; distinguish SOUL present/empty/none in compact view.
hermesd/panels/logs.py Factor log view resolution; fix unknown minlevel behavior; expose max scroll offset helper.
hermesd/panels/kanban.py Escape untrusted fields; add links/attachments + metadata tables; extend task/run rendering.
hermesd/panels/gateway.py Escape untrusted fields; add per-platform error column and compact warning marker.
hermesd/panels/curator_panel.py New Curator panel renderer (compact/detail, tool counts, transitions, summary/error).
hermesd/panels/cron.py Escape untrusted cron job fields; surface more cron metadata in detail.
hermesd/panels/config_panel.py Escape untrusted config fields; improve empty-config detail output.
hermesd/panels/__init__.py Register panel 13 renderer and name.
hermesd/models.py Add new model fields (cost authority, endpoints, kanban links, curator run, ops response store, etc.).
hermesd/file_cache.py Document lock purpose; clarify cast rationale.
hermesd/db.py Add stale marking on db disappearance; safe sidecar copying; LIKE escaping; improved message search caching semantics.
hermesd/app.py Add panel 13 to layouts; improve signal handling + view locking; clamp scroll offsets; safer OSC52 write.
hermesd/__main__.py Use default_hermes_home; expand snapshot panel range; improve help text/coverage marker.
CONTRIBUTING.md Align contributor commands with CI/release gates (adds uv build).
CLAUDE.md Update docs to reflect 13 panels.
CHANGELOG.md Add 2026.6.14 release notes section with new features/fixes.
.github/workflows/python-publish.yml Modernize Actions versions; enable uv cache; update test invocation flags.
.github/workflows/ci.yml Add workflow_dispatch + concurrency; modernize Actions/uv setup; matrix settings updates.

Comment thread hermesd/panels/tokens.py
Comment on lines 73 to 75
for s in state.sessions:
    estimated = s.cost_status not in AUTHORITATIVE_COST_STATUSES
    table.add_row(

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e2a3fbc747

Comment thread .github/workflows/ci.yml Outdated
- uses: actions/checkout@v4
- uses: astral-sh/setup-uv@v5
- uses: actions/checkout@v6
- uses: astral-sh/setup-uv@v8

P1: Pin setup-uv to an existing immutable tag

This workflow now references astral-sh/setup-uv@v8, but setup-uv v8 intentionally stopped publishing major/minor tags: the upstream v8.0.0 release notes say users will not be able to use @v8 or @v8.0 and recommend astral-sh/setup-uv@v8.0.0 instead (https://github.com/astral-sh/setup-uv/releases/tag/v8.0.0). As written, each CI job fails during action resolution before install/test can run; the same invalid pin also appears in .github/workflows/python-publish.yml, so release publishing is blocked too.

mudrii added 5 commits June 15, 2026 09:51
Review-cycle findings:
- _task_metadata_table deduped by task_id (a task in both active_tasks and
  recent_tasks rendered duplicate rows in the Task Metadata table).
- Document the Curator panel's per-tool call breakdown and state-transition
  trail in CHANGELOG and README (were implemented but undocumented).
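
The task_id dedupe can be sketched as a first-seen-wins merge. The `task_id` key follows the commit text; treating tasks as plain dicts and the function name are assumptions made for the example.

```python
from typing import Any, Iterable


def dedupe_by_task_id(*groups: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge task groups in order, keeping the first row seen per task_id."""
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for group in groups:
        for task in group:
            task_id = str(task.get("task_id", ""))
            if task_id in seen:
                continue
            seen.add(task_id)
            unique.append(task)
    return unique


active_tasks = [{"task_id": "t1", "title": "build"}, {"task_id": "t2", "title": "test"}]
recent_tasks = [{"task_id": "t2", "title": "test"}, {"task_id": "t3", "title": "ship"}]
rows = dedupe_by_task_id(active_tasks, recent_tasks)
```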
fmt_iso_timestamp returns its input verbatim for non-ISO strings, so an
untrusted ~/.hermes/gateway.json updated_at value containing Rich markup
(e.g. [red]) reached a markup-parsed table cell unescaped and could crash the
TUI. Escape the formatted value (the sibling name cell was already escaped).
Extend the markup-safety panel-1 builder to inject into updated_at/error_code/
error_message so the parametrized injection sweep covers these cells.
…asserts (TDD round 1)

- Add a collector test for the alternate state_transitions entry shape
  ({"state": ...} with no from/to) -> labelled by state + timestamp
  (collector.py:2194 was uncovered).
- Replace bare-digit substring assertions in the curator detail panel test
  with label-bound regex (Added 2 / Consolidated 1 / Tool Calls 67 /
  read_file 12) so a wrong count can no longer pass on a stray digit.
…DD round 2)

- overview no-providers: bind zero counts to labels (Skills: 0 / 0 pools /
  0 plug) instead of a vacuous '0' substring.
- available-tools cache-hit: count real Path.open calls on the per-session
  file instead of wrapping the private _read_json_cached method.
- gateway active-agents, overview skill count, tools call-count: bind each
  digit to its label (3 active agents / Skills: 70 (28 cat) / shell_exec 23)
  so a wrong value can't pass on a stray digit.
Bump version 2026.6.14 -> 2026.6.15 (calver, today) and roll the dated
CHANGELOG section forward, folding in the two fixes that landed after the
prior version stamp:
- gateway platform timestamp markup-escape (injection guard)
- kanban Task Metadata table task-id dedup

The 2026.6.14 stamp was never tagged/published; 2026.6.15 is the first
release of this branch's drift-fix + integration work.
@mudrii changed the title from "Prepare hermesd 2026.6.14 release" to "Release hermesd 2026.6.15 — drift fixes + integrations" on Jun 15, 2026
Replace the outdated 2026.4 screenshots with a current 2026.6.15 set:
overview plus a detail shot for every panel 1-13, named clearly
(overview.png, panel-NN-<name>.png). Refresh captions to call out the new
surfaces — gateway per-platform Error column, sessions Billing & Context
table, tokens Cost-Status + By-Endpoint breakdown, extra log streams, and the
new Curator panel. Removed the stale SCR-* images.

No personal data in the new shots: credential pools show env-var names and
presence only (no secret values); no emails, private chats, or non-local IPs.
@mudrii mudrii merged commit 100f3db into main Jun 15, 2026
6 checks passed
@mudrii mudrii deleted the feature-fixes branch June 15, 2026 05:16
3 participants