feat: replace JSONL file-tailing with hooks + OTEL observability#40

Open
SDS-Mode wants to merge 14 commits into main from feat/hook-otel-observability

Conversation

@SDS-Mode
Owner

Summary

  • Replace the bridge's brittle JSONL file-tailing for transcript and usage capture with three native Claude Code channels: plugin-manifest hooks (session discovery + tool events), native OpenTelemetry (token usage/cost), and a slim JSONL reader (assistant prose only)
  • Bridge transcript-watcher.ts reduced from ~470 LOC to ~140 LOC — all discovery heuristics (birthtime correlation, /proc PID lookup, cwd encoding, newest-JSONL guessing) deleted
  • Server gains two new HTTP endpoints: POST /hook for Claude Code hook events, POST /otlp/v1/* for OTEL telemetry ingestion
  • Plugin declares hooks in its manifest (hooks/hooks.json) — auto-registered on install, zero user config
  • scripts/setup.ts extended to write OTEL env vars to .claude/settings.local.json
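
A minimal sketch of the settings merge that setup could perform, not the PR's actual code. The PR does not list which variables are written; the names below are the telemetry variables Claude Code documents as far as I know, and `withOtelEnv`, `LocalSettings`, and the port are hypothetical.

```typescript
// Assumed shape of the relevant slice of .claude/settings.local.json.
interface LocalSettings {
  env?: Record<string, string>;
  [key: string]: unknown;
}

// Returns a new settings object with the OTEL variables merged into `env`.
// Unrelated settings and unrelated env entries are preserved.
function withOtelEnv(settings: LocalSettings, port: number): LocalSettings {
  const otelEnv: Record<string, string> = {
    CLAUDE_CODE_ENABLE_TELEMETRY: "1",
    OTEL_METRICS_EXPORTER: "otlp",
    OTEL_EXPORTER_OTLP_PROTOCOL: "http/json",
    // Base endpoint only: an OTLP/HTTP exporter appends /v1/metrics itself.
    OTEL_EXPORTER_OTLP_ENDPOINT: `http://127.0.0.1:${port}/otlp`,
  };
  return { ...settings, env: { ...(settings.env ?? {}), ...otelEnv } };
}
```

Returning a new object rather than mutating the input keeps the merge easy to test before anything is written to disk.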

Architecture

SessionStart hook  ──┐
PostToolUse hook   ──┼──► hook-relay.ts ──► POST /hook ──► Server
Stop hook          ──┘

OTEL SDK (built-in) ──► POST /otlp/v1/metrics ──► Server

JSONL tail (prose only) ──► Bridge ──► Server

Hook events provide the authoritative transcript path (via session-context WS message) and tool calls. OTEL provides token usage/cost. JSONL is retained only for assistant prose text.

Resolved issues

  • bridge-proc-linux-only — code deleted
  • bridge-birthtime-unreliable-fs — code deleted
  • bridge-duplicate-chunk-logic — code deleted
  • bridge-fswatch-persistent — fixed (persistent: false)
  • bridge-reconnect-transcriptwatcher-restart — fixed by design (bridge waits for session-context instead of replaying from offset 0)

Test plan

  • 537 unit/integration tests pass (0 fail)
  • Hook ingress: auth validation, SessionStart→session-context, PostToolUse→transcript-entry, PostToolUseFailure, Stop, malformed payload rejection
  • OTEL ingress: OTLP JSON metric parsing, usage-update broadcast, empty/malformed payload handling
  • Slim watcher: prose-only emission, tool/usage entry filtering, incremental reads, persistent: false
  • Integration: SessionStart delivers session-context to bridge, PostToolUse broadcasts to browser, OTEL metrics broadcast usage-update
  • Manual: install plugin, start Claude with channel flag, verify transcript tab shows tool calls, usage tab populates from OTEL, /clear picks up new session

🤖 Generated with Claude Code

SDS-Mode and others added 14 commits April 16, 2026 18:28
Introduces the SessionContextMessage interface and SessionContextSchema
for the new server→bridge message that delivers transcript path from a
SessionStart hook, replacing heuristic discovery in the bridge.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
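
For illustration, the message could look roughly like the sketch below. Only the `session-context` type and the transcript path come from the commit message; the `sessionId` field is an assumption, and a hand-written guard stands in for the Zod schema so the example has no dependencies.

```typescript
interface SessionContextMessage {
  type: "session-context";
  sessionId: string; // assumed field, not confirmed by the PR
  transcriptPath: string;
}

// JSON.parse wrapper that reports failure as undefined instead of throwing.
function parseJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

// Validates a raw WebSocket frame; returns null for anything malformed.
function parseSessionContext(raw: string): SessionContextMessage | null {
  const value = parseJson(raw);
  if (typeof value !== "object" || value === null) return null;
  const { type, sessionId, transcriptPath } = value as Record<string, unknown>;
  if (type !== "session-context") return null;
  if (typeof sessionId !== "string" || typeof transcriptPath !== "string") return null;
  if (transcriptPath.length === 0) return null;
  return { type: "session-context", sessionId, transcriptPath };
}
```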
…ntext replay

Add two public helpers on WebSocketHub (sendToBridge, broadcastToSession) and
replay a stored transcriptPath as a session-context message when a bridge
registers so the bridge can start its transcript watcher immediately even if
hook ingress delivered the path before the bridge connected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
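
The replay-on-register behaviour can be sketched as follows. `HubSketch` is a hypothetical stand-in for WebSocketHub: sockets are reduced to plain send callbacks, and the stored path is looked up through an injected function rather than the real session registry.

```typescript
type Send = (json: string) => void;

class HubSketch {
  private bridges = new Map<string, Send>();
  private watchers = new Map<string, Send[]>();

  // getStoredPath is an assumed hook into wherever the server keeps the path.
  constructor(private getStoredPath: (sessionId: string) => string | undefined) {}

  // Returns false when no bridge is connected for the session.
  sendToBridge(sessionId: string, message: object): boolean {
    const send = this.bridges.get(sessionId);
    if (!send) return false;
    send(JSON.stringify(message));
    return true;
  }

  // Sends to every browser watching the session; returns the delivery count.
  broadcastToSession(sessionId: string, message: object): number {
    const targets = this.watchers.get(sessionId) ?? [];
    for (const send of targets) send(JSON.stringify(message));
    return targets.length;
  }

  addWatcher(sessionId: string, send: Send): void {
    this.watchers.set(sessionId, [...(this.watchers.get(sessionId) ?? []), send]);
  }

  // On registration, replay a path that arrived before the bridge connected.
  registerBridge(sessionId: string, send: Send): void {
    this.bridges.set(sessionId, send);
    const transcriptPath = this.getStoredPath(sessionId);
    if (transcriptPath !== undefined) {
      this.sendToBridge(sessionId, { type: "session-context", sessionId, transcriptPath });
    }
  }
}
```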
Implements POST /hook handler via createHookIngress factory with
dependency injection, routing SessionStart/PostToolUse/PostToolUseFailure/Stop
events to the appropriate bridge/watcher calls. Includes 19 tests covering
auth, validation, all event routes, and graceful unknown-event handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
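
A dependency-injected handler along these lines might look like the sketch below. The payload field names (`hook_event_name`, `session_id`, `transcript_path`, `tool_name`) follow Claude Code's hook input format as I understand it; the bearer-token scheme, the status codes, and the `HookDeps` callbacks are assumptions, since the commit message only says that auth and validation are covered.

```typescript
interface HookDeps {
  token: string;
  onSessionStart(sessionId: string, transcriptPath: string): void;
  onToolEvent(sessionId: string, tool: string, failed: boolean): void;
  onStop(sessionId: string): void;
}

// Parses a JSON object body; returns null for invalid JSON or non-objects.
function parseObject(body: string): Record<string, unknown> | null {
  try {
    const parsed: unknown = JSON.parse(body);
    return typeof parsed === "object" && parsed !== null ? (parsed as Record<string, unknown>) : null;
  } catch {
    return null;
  }
}

function createHookIngressSketch(deps: HookDeps) {
  return (authHeader: string | undefined, body: string): { status: number } => {
    if (authHeader !== `Bearer ${deps.token}`) return { status: 401 };
    const payload = parseObject(body);
    if (payload === null) return { status: 400 };
    const { hook_event_name: event, session_id: sessionId, transcript_path: path, tool_name: tool } = payload;
    if (typeof event !== "string" || typeof sessionId !== "string") return { status: 400 };
    switch (event) {
      case "SessionStart":
        if (typeof path !== "string") return { status: 400 };
        deps.onSessionStart(sessionId, path);
        break;
      case "PostToolUse":
      case "PostToolUseFailure":
        if (typeof tool !== "string") return { status: 400 };
        deps.onToolEvent(sessionId, tool, event === "PostToolUseFailure");
        break;
      case "Stop":
        deps.onStop(sessionId);
        break;
      default:
        break; // unknown events are accepted and ignored
    }
    return { status: 200 };
  };
}
```

Injecting the callbacks keeps the router free of WebSocket and registry imports, which is what makes the auth and routing cases cheap to unit test.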
Adds POST /otlp/v1/metrics and /otlp/v1/logs handlers that parse
Claude Code's native OpenTelemetry JSON export, extract
claude_code.token.usage data points, and broadcast usage-update
messages to watching browsers. The OTLP endpoints do not require auth.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
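
The extraction step can be sketched as below, assuming the standard OTLP/JSON layout (resourceMetrics, scopeMetrics, metrics, sum.dataPoints, with int64 values encoded as strings). Grouping by a `type` attribute is an assumption about how the data points are labelled, and `extractTokenUsage` is a hypothetical name.

```typescript
interface OtlpAttr { key: string; value: { stringValue?: string } }
interface OtlpPoint { asInt?: string | number; asDouble?: number; attributes?: OtlpAttr[] }
interface OtlpMetric { name: string; sum?: { dataPoints?: OtlpPoint[] } }
interface OtlpMetricsPayload {
  resourceMetrics?: { scopeMetrics?: { metrics?: OtlpMetric[] }[] }[];
}

// Sums claude_code.token.usage data points, keyed by their "type" attribute.
// Missing levels of the payload are treated as empty rather than as errors.
function extractTokenUsage(payload: OtlpMetricsPayload): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const rm of payload.resourceMetrics ?? []) {
    for (const sm of rm.scopeMetrics ?? []) {
      for (const metric of sm.metrics ?? []) {
        if (metric.name !== "claude_code.token.usage") continue;
        for (const point of metric.sum?.dataPoints ?? []) {
          const attr = (point.attributes ?? []).find((a) => a.key === "type");
          const kind = attr?.value.stringValue ?? "unknown";
          // OTLP/JSON encodes int64 as a string, so coerce before adding.
          const amount = Number(point.asInt ?? point.asDouble ?? 0);
          if (Number.isFinite(amount)) totals[kind] = (totals[kind] ?? 0) + amount;
        }
      }
    }
  }
  return totals;
}
```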
The hook fires before the bridge registers, so setTranscriptPath was a
no-op when the session didn't exist yet. Add a pendingPaths map so the
path survives registry.add(), which creates a fresh Session object.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
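
One way to read this fix, as a sketch: paths for unknown sessions are parked and then applied when the session is created. `RegistrySketch` and the `Session` shape are hypothetical, and the real registry may apply pending paths differently.

```typescript
interface Session {
  id: string;
  transcriptPath?: string;
}

class RegistrySketch {
  private sessions = new Map<string, Session>();
  private pendingPaths = new Map<string, string>();

  setTranscriptPath(id: string, path: string): void {
    const session = this.sessions.get(id);
    if (session) {
      session.transcriptPath = path;
    } else {
      // Hook arrived first: park the path until the bridge registers.
      this.pendingPaths.set(id, path);
    }
  }

  // Always creates a fresh Session, then applies any parked path.
  add(id: string): Session {
    const session: Session = { id };
    const pending = this.pendingPaths.get(id);
    if (pending !== undefined) {
      session.transcriptPath = pending;
      this.pendingPaths.delete(id);
    }
    this.sessions.set(id, session);
    return session;
  }
}
```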
Add optional routes map to createHttpHandler so API routes take
precedence over static file serving, then wire /hook, /otlp/v1/metrics,
and /otlp/v1/logs into server/index.ts via createHookIngress and
createOtelIngress.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
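
The precedence rule can be illustrated with a simplified dispatcher. This is a sketch only: the real createHttpHandler works on request and response objects, while here handlers are reduced to functions from path to string, and keying the map by "METHOD path" is an assumption.

```typescript
type RouteHandler = (path: string) => string;

// Routes are checked first; anything unmatched falls through to static files.
function createHttpHandlerSketch(
  staticHandler: RouteHandler,
  routes: Map<string, RouteHandler> = new Map(),
): (method: string, path: string) => string {
  return (method, path) => {
    const route = routes.get(`${method} ${path}`);
    return route ? route(path) : staticHandler(path);
  };
}
```

Because the routes map is optional, existing callers that only serve static files keep working unchanged.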
Adds hooks.json declaring SessionStart, PostToolUse, PostToolUseFailure,
and Stop hook events, each invoking hook-relay.ts. The relay script reads
the hook payload from stdin, discovers the running server via state.json,
and POSTs the payload to /hook — exiting silently if the server is
unreachable so hooks never block Claude Code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
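
The fail-silent relay logic might be sketched like this. The shape of state.json is not given in the PR, so the `port` field is an assumption, and the HTTP call is injected as `post` so the sketch stays free of network and stdin code.

```typescript
// Hypothetical slice of state.json; the real file may differ.
interface ServerState {
  port?: number;
}

// Returns the hook URL, or null when no usable server state is available.
function hookUrl(state: ServerState | null): string | null {
  if (state === null || typeof state.port !== "number") return null;
  return `http://127.0.0.1:${state.port}/hook`;
}

type Post = (url: string, body: string) => Promise<unknown>;

// Never throws and never rejects: every failure is reported as `false`,
// so the calling script can always exit successfully.
async function relay(payload: string, state: ServerState | null, post: Post): Promise<boolean> {
  const url = hookUrl(state);
  if (url === null) return false;
  try {
    await post(url, payload);
    return true;
  } catch {
    return false;
  }
}
```

In a real script, `post` would presumably wrap `fetch` with a short timeout, and the process would exit with code 0 whatever `relay` returns, which is what keeps a stopped server from blocking Claude Code.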
Replace the ~470 LOC transcript-watcher with an ~120 LOC version that
takes a mandatory file path (no discovery), tails JSONL incrementally,
and emits only user/assistant text entries. Tool-use, tool-result, and
usage extraction are all removed — those now come from hooks and OTEL.

Deleted: discoverTranscriptByBirthtime, discoverTranscriptPath,
readSubagentEntries, rediscover, UsageData, onUsage, onFileSwitch, and
all cwd/birthtime/subagent state. Added { persistent: false } to
fs.watch to fix bridge-fswatch-persistent (blocked MCP stdio shutdown).
bridge/index.ts updated to the new 2-arg constructor and receives the
transcript path via session-context message instead of discovery.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
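
The two core steps of such a watcher, incremental line splitting and prose-only filtering, can be sketched as pure functions. The entry layout (`type` plus `message.content` as a string or an array of blocks) is an assumption about the transcript JSONL format, and the function names are hypothetical.

```typescript
interface ProseEntry {
  role: "user" | "assistant";
  text: string;
}

// Splits newly read text into complete lines; a trailing partial line is
// carried over and prepended to the next chunk.
function splitLines(carry: string, chunk: string): { lines: string[]; carry: string } {
  const parts = (carry + chunk).split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts.filter((line) => line.trim() !== ""), carry: rest };
}

function parseJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

// Keeps only text blocks; tool_use, tool_result, and usage data yield null.
function toProse(line: string): ProseEntry | null {
  const parsed = parseJson(line);
  if (typeof parsed !== "object" || parsed === null) return null;
  const { type, message } = parsed as Record<string, unknown>;
  const role = type === "user" ? "user" : type === "assistant" ? "assistant" : null;
  if (role === null || typeof message !== "object" || message === null) return null;
  const { content } = message as Record<string, unknown>;
  const blocks: unknown[] =
    typeof content === "string" ? [{ type: "text", text: content }] : Array.isArray(content) ? content : [];
  const text = blocks
    .map((block) => {
      if (typeof block !== "object" || block === null) return "";
      const { type: kind, text: value } = block as Record<string, unknown>;
      return kind === "text" && typeof value === "string" ? value : "";
    })
    .join("");
  return text === "" ? null : { role, text };
}

// Wiring (not executed here): fs.watch(path, { persistent: false }, onChange)
// lets the process exit while the watcher is still registered.
```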
Adds three integration test scenarios verifying end-to-end flows through
the running server: SessionStart delivers session-context to bridge,
PostToolUse broadcasts transcript-entry to watching browser, and OTLP
metrics broadcast usage-update to watching browser.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…status schema mismatch

- Delete resolveProjectDir() and /proc dependency
- Replace startTranscriptWatcher with startWatcherForPath (called on session-context)
- Remove transcript-status messages that Zod rejects (available:bool vs status:enum)
- ws.onopen no longer starts watcher — waits for session-context from server
- Add hooks reference to plugin.json so Claude Code discovers hook manifest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>