Pebble robustness: ntfy feedback, multi-skill, KB-default cwd #70

Merged
max-tet merged 14 commits into main from clayde/pebble-robustness on May 14, 2026

Conversation

@ClaydeCode
Owner

Summary

  • ntfy completion notifications on every Pebble webhook terminal outcome (success, claude-reported failure, parse fallback, timeout, usage limit, CLI error, auth error, worker exception, queue full). 401 / 422 / 404 are deliberately silent (auth-fail flood vector, Pebble-side payload bugs, unknown routes).
  • Multi-skill freedom + KB-default cwd. New system prompt drops the single-skill cap and points Claude at /home/clayde/knowledge_base (RW mount, host-side Syncthing handles cross-device sync — no git inside the container). Built-in ping skill baked into the image at /skills/builtin/ lets you sanity-check the whole chain (watch → Traefik → Clayde → ntfy) from the watch.
  • New CliInvocationError so the worker distinguishes CLI failure from a successful no-summary run. Worker classifies every terminal path into a new pebble.outcome OTel span attribute: success | claude_fail | parse_fallback | timeout | rate_limited | cli_error | worker_error.
  • CLAYDE_PEBBLE_TIMEOUT default lowered 600 → 300 (pocket-dial guard). New env vars: CLAYDE_NTFY_TOPIC, CLAYDE_NTFY_BASE_URL, CLAYDE_NTFY_TIMEOUT_S, CLAYDE_KB_PATH.
  • ~25 new tests (notify dispatcher, JSON-tail parser, expanded worker branch coverage, e2e webhook → worker → fake CLI → fake ntfy via respx). 363/363 pass on this branch.
  • Security caveat: the public ntfy.sh topic is unauthenticated, so anyone who knows the topic string can read voice-command transcripts. This is acknowledged in the spec; switching to a self-hosted ntfy or a token-protected reserved topic is a separate change.
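
The outcome classification described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the exception classes are local stand-ins that reuse the names mentioned here, `classify_outcome` and the `ok` / `is_fallback` result keys are hypothetical, and the auth-error path is left out because the description does not say which `pebble.outcome` value it maps to.

```python
from typing import Callable


# Local stand-ins for the exception types named in this PR (assumed shapes).
class InvocationTimeoutError(Exception):
    pass


class UsageLimitError(Exception):
    pass


class CliInvocationError(Exception):
    pass


def classify_outcome(run: Callable[[], dict]) -> str:
    """Run one job and map its terminal path to a pebble.outcome value."""
    try:
        result = run()
    except InvocationTimeoutError:
        return "timeout"
    except UsageLimitError:
        return "rate_limited"
    except CliInvocationError:
        return "cli_error"
    except Exception:
        # Anything unexpected inside the worker itself. In this sketch a
        # plain RuntimeError also lands here; the real worker labels the
        # auth-failure RuntimeError separately.
        return "worker_error"

    # Hypothetical result shape: {"ok": bool, "is_fallback": bool}.
    if result.get("is_fallback"):
        return "parse_fallback"
    return "success" if result.get("ok") else "claude_fail"
```

Ordering matters in a chain like this: the specific exception types have to be caught before the generic `Exception` handler, otherwise every failure would collapse into `worker_error`.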

Spec / plan

  • Spec: `docs/superpowers/specs/2026-05-13-pebble-robustness-design.md`
  • Plan: `docs/superpowers/plans/2026-05-13-pebble-robustness.md`

Known follow-ups (not in this PR)

  • Worker-dead notification dropped. The spec listed a "worker dead" notification path. In practice `worker_loop` catches every `Exception` so the only way the worker task can die is a `BaseException` (e.g. cancellation), which is already routed by `asyncio.gather`. Decided to monitor via OTel (`pebble.outcome`) rather than wire a separate watchdog. Reconsider if the worker ever proves fragile in production.
  • `RuntimeError` catch is broad. The worker catches `RuntimeError` to label the existing runner's plain-RuntimeError auth-failure path as "Pebble: auth error". Any other `RuntimeError` raised inside the runner or its callees would be mislabeled. Fix: introduce a dedicated `CliAuthError(Exception)` in `clayde.claude` and raise it from the two `_is_auth_error` sites. Three-line follow-up.
  • Fallback-title string coupling. The worker classifies `parse_fallback` outcomes by string-equality against `extract_notification_payload`'s fallback title (`"Pebble: done (no summary)"`). If the parser's title ever changes, classification silently drifts. Worth replacing with a sentinel field on `NotificationPayload` or an explicit `is_fallback` flag from the parser.
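
One possible shape for the last two follow-ups is sketched below. It is an assumption-heavy illustration rather than the planned change: the JSON-tail format (a single JSON object with `title` and `body` on the last non-empty line of stdout) and the fields of `NotificationPayload` are guesses, and only the names `CliAuthError`, `NotificationPayload`, `extract_notification_payload`, and the fallback title come from this PR.

```python
import json
from dataclasses import dataclass

FALLBACK_TITLE = "Pebble: done (no summary)"


class CliAuthError(Exception):
    """Dedicated auth failure, so the worker need not catch bare RuntimeError."""


@dataclass(frozen=True)
class NotificationPayload:
    title: str
    body: str
    # Explicit flag replaces string-equality checks against FALLBACK_TITLE.
    is_fallback: bool = False


def extract_notification_payload(stdout: str) -> NotificationPayload:
    """Parse the assumed JSON tail; return a flagged fallback if it is absent."""
    lines = [line for line in stdout.splitlines() if line.strip()]
    if lines:
        try:
            data = json.loads(lines[-1])
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict) and isinstance(data.get("title"), str):
            return NotificationPayload(data["title"], str(data.get("body", "")))
    return NotificationPayload(FALLBACK_TITLE, "", is_fallback=True)
```

With a flag like this, the worker could branch on `payload.is_fallback` and the fallback title could be reworded freely without affecting classification.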

Test plan

  • `uv run pytest` — full suite green locally (363 passed).
  • On staging: speak "ping" into the Pebble app → receive `pong` notification within seconds.
  • On staging: speak "remember to buy milk" → file appears under `~/knowledge_base/inbox/` (Claude judgement, no explicit skill) and a success notification fires.
  • On staging: force `CLAYDE_PEBBLE_QUEUE_MAX=0` and POST → observe `Pebble: queue full` notification + 503 response.
  • On staging: block ntfy.sh DNS (e.g. /etc/hosts) → the worker still records `outcome=success` in `traces.jsonl` with `pebble.notify_ok=false` (best-effort invariant).
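
The best-effort invariant in the last check can be illustrated with a small sender that never raises and only reports whether delivery worked. This is a sketch under stated assumptions: it uses the standard library's `urllib` instead of whatever HTTP client the project uses, the signature of `send_ntfy` and the `opener` parameter are hypothetical, and the defaults for the base URL and timeout are guesses. The environment variable names come from this PR, and the `Title` header is part of ntfy's HTTP publish API.

```python
import os
import urllib.request


def send_ntfy(title: str, body: str, *, opener=urllib.request.urlopen) -> bool:
    """POST a notification; return True on a 2xx response, False otherwise."""
    topic = os.environ.get("CLAYDE_NTFY_TOPIC")
    if not topic:
        return False  # notifications not configured

    # Assumed defaults; the PR only names the variables.
    base = os.environ.get("CLAYDE_NTFY_BASE_URL", "https://ntfy.sh").rstrip("/")
    timeout = float(os.environ.get("CLAYDE_NTFY_TIMEOUT_S", "5"))

    request = urllib.request.Request(
        f"{base}/{topic}",
        data=body.encode("utf-8"),
        headers={"Title": title},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # DNS failures, refused connections, timeouts, and HTTP errors are
        # all OSError subclasses; a failed send must not fail the job.
        return False
```

A worker following this pattern would record the return value as `pebble.notify_ok` and leave `pebble.outcome` untouched, which is what the blocked-DNS check is meant to confirm.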

🤖 Generated with Claude Code

ClaydeCode and others added 14 commits May 13, 2026 19:52
Phase 2 over the merged phase-1 webhook (PR #69). Three deltas:
- ntfy.sh notification on every terminal outcome, with Claude-emitted
  JSON tail for title/body; pre-Claude failures notify too.
- Multi-skill freedom: drop the single-skill cap, Claude composes.
- ~/knowledge_base mounted RW as default cwd; Syncthing handles sync,
  no git operations in the container.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Replace fictional TimeoutExpired/CalledProcessError with the actual
  phase-1 exception types (InvocationTimeoutError, UsageLimitError) and
  the new CliInvocationError, which the runner must raise when rc != 0
  and the failure matches neither the auth nor the usage-limit case.
- Auth-error path (RuntimeError from existing runner) added to failure
  matrix and worker try/except.
- Promote literal /home/clayde/knowledge_base in the system prompt
  (Claude reads the prompt as text — no ~ expansion).
- Clarify parser lives in runner.py as extract_notification_payload(),
  called from worker; remove conflicting duplicate mention.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eleven TDD bite-sized tasks covering ntfy notification on every terminal
outcome, multi-skill prompt, KB-default cwd, built-in ping skill, and
end-to-end integration coverage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The phase-1 test test_pebble_returns_503_when_full filled the queue and
let the 503 branch run unmocked, so the autouse call to send_ntfy hit
ntfy.sh on the real network during local test runs. Patch the dispatcher
on the app module before invoking the client so the test stays offline.
@max-tet max-tet merged commit 169e60c into main May 14, 2026
2 checks passed
@max-tet max-tet deleted the clayde/pebble-robustness branch May 14, 2026 06:05