Skip to content

feat(sensor): Zuhn-native session-capture hook (Option A)#5

Merged
gorajing merged 2 commits into
mainfrom
feat/session-capture-hook
May 28, 2026
Merged

feat(sensor): Zuhn-native session-capture hook (Option A)#5
gorajing merged 2 commits into
mainfrom
feat/session-capture-hook

Conversation

@gorajing
Copy link
Copy Markdown
Owner

Summary

A Claude Code SessionEnd hook that captures the session transcript as a Zuhn session source. Mechanical only — no LLM in the hook. Insight extraction stays the deliberate, gated npm run autoknowledge step, so Phase 8's "no auto-extraction" still holds — only capture is automated.

Why this path: claude-mem (the alternative) digests via a paid LLM API (no key set + interactive install). Zuhn's own autoknowledge already uses the authenticated claude CLI — so this is zero new credentials, zero extra cost, no background daemon.

What it does

Claude Code SessionEnd
  └─ .claude/hooks/session-capture.sh   (bash, exit 0 always, errors → ~/.claude/session-capture.log)
        └─ scripts/capture-session.ts   (reads hook JSON from stdin; never throws)
              └─ lib/transcript.ts      (parse JSONL → drop noise → user/assistant turns → digest)
                    └─ writes one "session" source to sources/session/<slug>.md
                          │
                          │  later (manual / scheduled):
                          ▼
       ZUHN_GATE_BLOCKING_CHECKS=stance_present,stance_directional npm run autoknowledge
                          │
                          ▼
                       claude-CLI extraction → the gate → KB

Safety design

  • Hook never breaks the session. try/catch around every I/O in capture-session.ts; bash wrapper always exit 0; process.stdin.isTTY guard prevents blocking on manual runs.
  • Errors are LOGGED, not silenced. ~/.claude/session-capture.log (was a real bug-shaped design when I had 2>/dev/null; fixed).
  • Idempotent. Slug/id salted by session_id and date-independent → same session always maps to the same file; existsSync short-circuits re-captures.
  • Aggressive noise-stripping. Drops isSidechain (subagent), tool_use/tool_result, assistant thinking, attachment/system/custom-title lines, and framework-injection user turns (<system-reminder>/<command-*>).

Records a deliberate design supersession

docs/session-pipeline-setup.md updated: Phase 8's "explicit intent only / no passive scraping" existed because there was no automatic quality control. The gate now provides it, so passive transcript capture is safe. Capture is automated; extraction (and quality enforcement) stay gated.

Test plan

  • 8 parser tests (noise-stripping edge cases, cap/tail, idempotency, null on empty)
  • Full suite green (578 passed / 3 skipped) at commit time
  • Typecheck clean
  • Verified on the real 38 MB session transcript (→ 8,831-word clean digest, correct title)
  • Verified on bogus input (graceful no-op, no stray write)
  • Verified end-to-end via the logged hook chain with a synthetic transcript + temp KB (log shows fire + success; source written; exit 0)
  • CI

Note on review

Local codex exec review stalled twice (the harness backgrounds long commands; codex exec hangs on stdin/tty in that mode). Killed both. Relying on the GitHub codex App for PR review, plus the test coverage above.

🤖 Generated with Claude Code

gorajing and others added 2 commits May 27, 2026 22:15
…-free)

A Claude Code SessionEnd hook that captures the session transcript as a
Zuhn "session" source. MECHANICAL only — no LLM in the hook. Insight
extraction stays the deliberate, gated `npm run autoknowledge` step, so
Phase 8's "no auto-extraction" still holds; only capture is automated.

Why this and not claude-mem: claude-mem digests via a paid LLM API (no key
in this env, install is interactive). Zuhn's autoknowledge already uses the
authenticated `claude` CLI — so this path has zero new credentials, zero
extra API cost, no background daemon.

Components:
- scripts/lib/transcript.ts — parseTranscript (JSONL → user-prompt +
  assistant-text turns; drops isSidechain, tool_use/tool_result, assistant
  thinking, non-conversation types, and framework-injection user turns
  matching <system-reminder>/<command-*>); renderConversation (caps to
  60K chars keeping the TAIL with a truncation note); buildTranscriptSource
  (title=first user prompt; slug/id salted by session_id → idempotent;
  null when there's no usable conversation).
- scripts/capture-session.ts — the hook script. Reads hook JSON on stdin
  (guards process.stdin.isTTY to avoid blocking on manual runs), graceful
  no-op on missing transcript, idempotent file-exists skip, --dry-run, and
  NEVER throws out of the hook (try/catch around every I/O; always exit 0).
- templates/hooks/session-capture.sh — bash wrapper. Errors are LOGGED to
  ~/.claude/session-capture.log instead of silenced (a hook you can't watch
  fire is one where silent failure looks like "it works").
- docs/session-pipeline-setup.md — records the design supersession: the
  Phase-8 "explicit intent only" constraint existed because there was no
  automatic quality control; the gate now provides it, so passive transcript
  capture is safe.

Frontmatter: "session" was already added to SourceFrontmatter.type / the
health glob / the autoknowledge discovery glob in the claude-mem PR (#4),
so session sources flow through the existing extract→gate path unchanged.

Tests: 8 (noise-stripping edge cases, cap/tail, idempotency, null on empty).
Verified end-to-end on the real 38MB transcript (→ 8831-word digest), on
bogus input (graceful no-op), and via the logged hook chain with a
synthetic transcript + temp KB.

Local codex review stalled twice (the harness backgrounded `codex exec`,
which hangs on stdin/tty in that mode). Relying on the GitHub codex App
for PR review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's "Verify hook templates are executable" check caught that the Write
tool created the template at 0644. The local copy installed in
.claude/hooks/ was already chmod +x'd, but the template itself wasn't —
so a fresh clone would copy a non-executable template into .claude/hooks/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gorajing gorajing merged commit fdb368b into main May 28, 2026
1 check passed
@gorajing gorajing deleted the feat/session-capture-hook branch May 28, 2026 06:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant