feat(sensor): claude-mem → Zuhn session sync adapter (Phase 1)#4
Merged
Conversation
Layers claude-mem (auto-captures + digests every Claude Code session) into Zuhn as a "sensor": reads ~/.claude-mem/claude-mem.db READ-ONLY and writes one Zuhn "session" source per new session. It does NOT extract insights — the normal extract→gate flow authors stances and filters quality, so the epistemic layer stays the single gatekeeper for what enters the KB. - scripts/lib/claude-mem.ts: drift-tolerant reads (SELECT * + defensive coercion; warns on dropped rows), date-INDEPENDENT title → stable slug, latestPerSession (collapses multiple summaries per session, tie-broken by prompt_number), and the session-source builder. - scripts/sync-claude-mem.ts: graceful skip if claude-mem absent; read-only; watermark (meta/claude-mem-sync.json: last_epoch + synced ids); idempotent via file-exists + watermark; empty digests skipped WITHOUT poisoning the watermark (so a later fuller summary still syncs); --dry-run; fail-loud only on schema-read errors. - "session" added to SourceFrontmatter.type, health.ts validation glob, and autoknowledge.ts discovery glob (so synced sessions actually get extracted). - npm run sync-claude-mem. health.ts/autoknowledge.ts also adopt the pending KB_ROOT-import refactor (now resolvable — kb-root.ts is tracked on main since the gate PR). Tests: 13 (pure transforms + in-memory sqlite fixture covering reads, string coercion, dedup, prompt_number tie-break). Codex: 4 review rounds (caught the autoknowledge glob gap, slug date-drift, within-run dedup, epoch coercion, empty-watermark poisoning, prompt_number tie-break) → converged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
gorajing
added a commit
that referenced
this pull request
May 28, 2026
…-free) A Claude Code SessionEnd hook that captures the session transcript as a Zuhn "session" source. MECHANICAL only — no LLM in the hook. Insight extraction stays the deliberate, gated `npm run autoknowledge` step, so Phase 8's "no auto-extraction" still holds; only capture is automated. Why this and not claude-mem: claude-mem digests via a paid LLM API (no key in this env, install is interactive). Zuhn's autoknowledge already uses the authenticated `claude` CLI — so this path has zero new credentials, zero extra API cost, no background daemon. Components: - scripts/lib/transcript.ts — parseTranscript (JSONL → user-prompt + assistant-text turns; drops isSidechain, tool_use/tool_result, assistant thinking, non-conversation types, and framework-injection user turns matching <system-reminder>/<command-*>); renderConversation (caps to 60K chars keeping the TAIL with a truncation note); buildTranscriptSource (title=first user prompt; slug/id salted by session_id → idempotent; null when there's no usable conversation). - scripts/capture-session.ts — the hook script. Reads hook JSON on stdin (guards process.stdin.isTTY to avoid blocking on manual runs), graceful no-op on missing transcript, idempotent file-exists skip, --dry-run, and NEVER throws out of the hook (try/catch around every I/O; always exit 0). - templates/hooks/session-capture.sh — bash wrapper. Errors are LOGGED to ~/.claude/session-capture.log instead of silenced (a hook you can't watch fire is one where silent failure looks like "it works"). - docs/session-pipeline-setup.md — records the design supersession: the Phase-8 "explicit intent only" constraint existed because there was no automatic quality control; the gate now provides it, so passive transcript capture is safe. Frontmatter: "session" was already added to SourceFrontmatter.type / the health glob / the autoknowledge discovery glob in the claude-mem PR (#4), so session sources flow through the existing extract→gate path unchanged. Tests: 8 (noise-stripping edge cases, cap/tail, idempotency, null on empty). Verified end-to-end on the real 38MB transcript (→ 8831-word digest), on bogus input (graceful no-op), and via the logged hook chain with a synthetic transcript + temp KB. Local codex review stalled twice (the harness backgrounded `codex exec`, which hangs on stdin/tty in that mode). Relying on the GitHub codex App for PR review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Layers claude-mem (auto-captures + digests every Claude Code session) into Zuhn as a sensor — so Zuhn finally captures your own reasoning, not just ingested articles. It reads
~/.claude-mem/claude-mem.dbread-only and writes one Zuhnsessionsource per new session. It deliberately does not extract insights — the normal extract→gate flow authors stances and filters quality, so the epistemic layer stays the single gatekeeper.Division of labor = division of moats: claude-mem = capture (commodity); Zuhn = epistemic consolidation (the moat).
What's here
scripts/lib/claude-mem.ts— drift-tolerant DB reads (SELECT *+ defensive coercion, warns on dropped rows), date-independent title → stable slug,latestPerSession(collapses multiple summaries/session, tie-broken byprompt_number), source builder.scripts/sync-claude-mem.ts— graceful skip if claude-mem absent; read-only open; watermark (meta/claude-mem-sync.json); idempotent (file-exists + watermark); empty digests skipped without poisoning the watermark;--dry-run; fail-loud only on schema-read errors.sessionadded toSourceFrontmatter.type,health.tsglob, andautoknowledge.tsdiscovery glob.npm run sync-claude-mem.Usage
npx claude-mem install(your global env; starts capturing sessions).npm run sync-claude-mem→ writes newsessionsources.ZUHN_GATE_BLOCKING_CHECKS=stance_present,stance_directional npm run autoknowledge→ authors stances + runs the stricter gate (Step-3 env knob), so only sharp, novel session-insights land.Review
autoknowledgeglob gap (would've silently never extracted session sources), slug date-drift breaking idempotency, missing within-run dedup,created_at_epochcoercion, empty-digest watermark poisoning, and theprompt_numbertie-break.Test plan
sessionsource; dedup collapses multi-row sessions; empty session skipped + absent from watermark🤖 Generated with Claude Code