
Dev #19

Merged
im4codes merged 91 commits into master from dev on May 14, 2026

Conversation

@im4codes
Owner

No description provided.

IM.codes and others added 30 commits May 10, 2026 10:32
The visual canvas editor was previously gated behind the Participants tab
with no entry point for new users — they would see only the agent grid
and never reach the workflow canvas. Move the canvas, allowed-executables
allowlist, migration banner, future-schema banner, and capability banners
into a dedicated "Advanced Workflow" tab. Auto-bootstrap a starter draft
when a user first enters the tab so the canvas is reachable from a cold
panel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Users can now save multiple named workflows per session and pick which one
P2P invokes. The advanced tab gains a workflow library section with new /
duplicate / delete buttons, an active badge, and a workflow name input
above the canvas. Legacy single-draft configs auto-migrate into a
single-entry library on load, with the legacy `workflowDraft` field kept
in sync as a mirror so older clients mid-rollout continue to launch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…scade

Production daemon on a self-hosted server (211, 2026-05-10) was hitting
OOM at the default 4 GB V8 heap every 1–9 hours: 4 ABRT crashes in 24h.
Each restart cost ~30 s of WS downtime, surfacing as the operator-
visible "always offline" symptom.

Diagnosis (in this order):

  1. journalctl confirmed `code=dumped, status=6/ABRT` cycles, not
     systemd lifecycle issues.
  2. /proc/PID/smaps showed 1.47 GB anon (V8) + 1.23 GB [heap] (glibc /
     onnxruntime + better-sqlite3). RSS 2.7 GB.
  3. SIGUSR2 (`--heapsnapshot-signal`) triggered V8 major GC and
     dropped RSS by 779 MB IN A SINGLE CYCLE. Heap snapshot showed
     only 218 MB of *live* objects.

  Conclusion: not a leak. V8's major GC is lazy by design — it lets the
  old generation accumulate garbage until heap pressure forces a sweep.
  With the default 4 GB ceiling, "force" came at ~3.5 GB live + ~3 GB
  pending garbage = OOM whenever a transient spike landed in that
  window. With the 12 GB ceiling we set on 211 as a runtime workaround,
  daemons survive but RSS bloats to many GB and major-GC pauses grow
  multi-second (also looks like "offline" to the UI).

Fix has two parts that ship paired:

  (a) `src/daemon/lifecycle.ts startGcPoller()` — calls `globalThis.gc()`
      every 5 min (tuneable via IMCODES_GC_POLL_MS). Logs only when GC
      freed >50 MB or took >200 ms so quiet daemons don't spam logs.
      Defensive: silent no-op if --expose-gc is not enabled.

  (b) `Environment="NODE_OPTIONS=--expose-gc --max-old-space-size=8192"`
      added to BOTH systemd unit templates (bind-flow.ts +
      setup-flow.ts) AND the macOS launchctl plist (bind-flow.ts).
      Without this, (a) is dead code.

Pinned with a contract test that scans both source-code anchors so a
future refactor can't silently break the pair.
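
A minimal sketch of what such a poller could look like, assuming the shape described above (only startGcPoller, the 5 min default, IMCODES_GC_POLL_MS, and the >50 MB / >200 ms log gates come from this commit; the rest is illustrative):

    // Hedged sketch — not the shipped src/daemon/lifecycle.ts implementation.
    type GcFn = () => void;

    export function startGcPoller(log: (msg: string) => void): NodeJS.Timeout | undefined {
      // Defensive: silent no-op when the process lacks --expose-gc.
      const gc = (globalThis as { gc?: GcFn }).gc;
      if (typeof gc !== 'function') return undefined;

      const intervalMs = Number(process.env.IMCODES_GC_POLL_MS) || 5 * 60_000;
      return setInterval(() => {
        const before = process.memoryUsage().heapUsed;
        const started = performance.now();
        gc();
        const freedMb = (before - process.memoryUsage().heapUsed) / (1024 * 1024);
        const tookMs = performance.now() - started;
        // Log only noteworthy collections so quiet daemons stay quiet.
        if (freedMb > 50 || tookMs > 200) {
          log(`gc poller: freed ${freedMb.toFixed(0)} MB in ${tookMs.toFixed(0)} ms`);
        }
      }, intervalMs);
    }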

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Production observation (211, 2026-05-10): the server pushes
`daemon.upgrade` every time it sees a new dev tag on the npm registry.
With CI publishing every ~5 min during active dev work, four daemons
each restart for ~7 s on every tag, and the windows tile so the
operator perceives the fleet as "always offline" — even though each
individual upgrade is fast and correct.

Add a cooldown: handleDaemonUpgrade declines an AUTO upgrade
(no targetVersion specified, or `latest`) when a previous upgrade
completed within IMCODES_UPGRADE_COOLDOWN_MS (default 10 min). The
state is persisted to ~/.imcodes/last-upgrade-at, written by
upgrade.sh after a successful step 5 health check, so the cooldown
survives the very restart it is throttling.

Operator-pinned upgrades (`imcodes upgrade --version X`) bypass the
cooldown — explicit intent always wins. Same for missing /
unreadable / future-dated / NaN sentinel content. The cooldown can
be disabled entirely by setting IMCODES_UPGRADE_COOLDOWN_MS=0.

Logic extracted into `evaluateAutoUpgradeCooldown(input)` — pure
function, IO injected via `readSentinel`. 10/10 tests cover: missing
sentinel, garbage sentinel, in-window block + remainingMs report,
out-of-window pass, undefined / '' / 'latest' all treated as auto,
pinned target bypass, opt-out (cooldownMs<=0), NaN cooldown,
clock-skew future-dated sentinel, whitespace tolerance.
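
Assuming the inputs enumerated above, the pure function could look roughly like this (only evaluateAutoUpgradeCooldown and readSentinel are names from this commit; the field names and NaN-cooldown resolution are assumptions consistent with the listed cases):

    interface CooldownInput {
      targetVersion?: string;            // undefined / '' / 'latest' => auto upgrade
      cooldownMs: number;                // <= 0 (or NaN) disables the cooldown
      now: number;
      readSentinel: () => string | null; // injected IO (~/.imcodes/last-upgrade-at)
    }

    type CooldownVerdict = { allowed: true } | { allowed: false; remainingMs: number };

    export function evaluateAutoUpgradeCooldown(input: CooldownInput): CooldownVerdict {
      const { targetVersion, cooldownMs, now, readSentinel } = input;
      const isAuto = !targetVersion || targetVersion === 'latest';
      if (!isAuto) return { allowed: true };                 // pinned target bypass
      if (!(cooldownMs > 0)) return { allowed: true };       // opt-out + NaN cooldown

      const raw = readSentinel();
      if (raw === null) return { allowed: true };            // missing sentinel
      const lastAt = Number(raw.trim());                     // whitespace tolerance
      if (Number.isNaN(lastAt)) return { allowed: true };    // garbage sentinel
      if (lastAt > now) return { allowed: true };            // clock-skew / future-dated

      const elapsed = now - lastAt;
      if (elapsed >= cooldownMs) return { allowed: true };   // out-of-window pass
      return { allowed: false, remainingMs: cooldownMs - elapsed };
    }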

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The P2P quick-pick dropdown above the chat input now has a tab switcher
between the original combo presets list and the saved advanced workflow
library, so users can launch a saved workflow in one click without
opening Settings. The active tab persists globally across sessions and
reloads via a new userPref.

Also fixes a daemon-upgrade race: the gate previously only counted
'running' as in-progress for process agents, so a turn dispatched a few
hundred ms before a daemon.upgrade broadcast (still in 'queued' state)
would be silently killed by the upgrade restart. The gate now matches
the web client's isRunningSessionState and also blocks on 'queued'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…18n language

Three rounds of UX feedback addressed in one commit so the panel + canvas
+ orchestrator changes ship together:

PR-λ — Split Save into Save (keep open) + Save & Close so users can
persist mid-edit without dismissing the panel. Auto-align permissionScope
+ dispatchStyle when the user picks a new preset (fixes the
"implementation preset = invalid_workflow_graph" trap). Surface a
dispatchStyle dropdown so single_main vs multi_dispatch is editable. Show
the per-preset default prompt as the promptAppend placeholder. Widen the
desktop panel from 780 to 1400 px.

PR-μ — Workflow runs now auto-include a per-round summary for every
preset, matching the legacy combo system. New
P2P_PRESET_DEFAULT_SUMMARY_PROMPT covers all 10 workflow presets with
rich structured prompts; canvas inspector adds a per-node
summaryPromptOverride textarea. single_main rounds (implementation etc.)
that previously skipped the summary phase now also dispatch a summary
hop. Final-run synthesis prefers the round's resolved summary prompt
over the BUILT_IN_MODES fallback.

PR-ν — Replace the 79-char tail-of-prompt English language hint
("Use the user's selected i18n language ...") with a concise locale-native
one-liner sourced from p2p.discussion_language_instruction (e.g.
"请用中文回复。" / "日本語で回答してください。"). The line now sits right
after P2P_BASELINE_PROMPT in both the legacy combo and advanced workflow
prompt builders, and the daemon no longer pollutes user-supplied
extraPrompt with anything.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ate stale banner

PR-ξ — The allowed-executables UI was always visible after the bootstrap
auto-created an LLM-only draft, suggesting a configuration burden where
none existed. Now hide it entirely when the workflow has no script nodes
and no entries; when surfaced, default-collapse behind a disclosure with
a yellow "Required for script nodes" inline warning. Daemon enforcement
is unchanged — empty allowlist still rejects every script.

PR-ο — After PR-λ widened the panel to 1400 px, the canvas SVG's
width="100%" stretched to fill the parent, scaling every node ~80%
bigger. Cap the SVG at its native viewBox width so nodes render at the
authored 132×62 px size and the extra panel width becomes inspector
breathing room.

PR-π — Canvas now supports zoom via mouse wheel and Mac touchpad pinch
(both delivered as wheel events; pinch gets ctrlKey=true). Added
zoom-out / 100% reset / zoom-in toolbar buttons. Default node size also
shrunk ~21% so out-of-the-box density matches user expectation. Wheel
listener attached non-passively so preventDefault stops page-scroll.
Also rewrote the cryptic "Daemon workflow capability information is
stale." banner — it was hardcoded English in every locale despite
nominally being "translated". New text in all 7 locales explains what still works
(saved configs), what is paused (new advanced launches), and the
typical recovery window (<30s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…itive

PR-ρ — Composer attachments now carry a per-composer sequential `seq`
(1, 2, 3, ...). The badge UI surfaces the seq as a `#N` prefix on each
attachment chip, and the send-payload `text` field prepends a
`#${seq}: ${name}` mapping line for every attachment so the LLM sees
both the short reference tag AND the filename in the prompt. The user
can then reference `#1` / `#2` naturally in subsequent text. Counter
resets on send because `clearComposer` wipes the attachments array.
Removing a middle attachment renumbers the survivors consecutively.
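
A sketch of the renumbering rule (the ComposerAttachment shape here is an assumption, not the shipped type):

    interface ComposerAttachment { seq: number; name: string }

    // Removing a middle attachment renumbers the survivors consecutively.
    function removeAttachment(list: ComposerAttachment[], seq: number): ComposerAttachment[] {
      return list.filter((a) => a.seq !== seq).map((a, i) => ({ ...a, seq: i + 1 }));
    }

    // Send-payload mapping lines, one `#${seq}: ${name}` per attachment.
    const mappingLines = (list: ComposerAttachment[]) =>
      list.map((a) => `#${a.seq}: ${a.name}`).join('\n');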

PR-σ canvas — PR-ο capped the SVG at `CANVAS_VIEW_WIDTH` to stop nodes
auto-scaling but the side-effect was a permanent empty gutter to the
right of the canvas at the new 1400 px panel width. Replace the cap
with a `ResizeObserver` that tracks the parent container width;
viewBox extents are derived from the measured width divided by the
zoom level so 1 viewBox unit = 1 screen pixel at zoom=1. Canvas now
fills the full panel width AND nodes stay at their authored 132×62 px.
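
Roughly, the measurement side could look like this Preact hook (a sketch assuming the SVG sits in a measured wrapper div; the names are illustrative):

    import { useEffect, useRef, useState } from 'preact/hooks';

    function useMeasuredWidth(fallback: number) {
      const ref = useRef<HTMLDivElement>(null);
      const [width, setWidth] = useState(fallback);
      useEffect(() => {
        if (!ref.current) return;
        const observer = new ResizeObserver((entries) => {
          setWidth(entries[0].contentRect.width);
        });
        observer.observe(ref.current);
        return () => observer.disconnect();
      }, []);
      return { ref, width };
    }

    // viewBox extents derive from measured width / zoom so that
    // 1 viewBox unit === 1 screen pixel at zoom = 1:
    //   <svg width="100%" viewBox={`0 0 ${width / zoom} ${height / zoom}`}>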

PR-σ bridge — The `capability_stale` banner kept firing as a
false-positive even though the daemon was healthy. Root cause: the
daemon only sends `daemon.hello` on (a) WS connect/reconnect and (b)
capability change, and the server bridge never replayed cached state
to newly-connected browsers. Browsers that opened AFTER the daemon's
most recent hello never received one, so the 30 s `capability_stale`
TTL fired even though the daemon was fine. Fix: bridge
`handleBrowserConnection` now replays the cached
`daemonP2pWorkflowCapabilities` to every newly-connected browser as
part of the opening-state push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…; single_main never)

PR-μ over-generalised the summary contract by gating `synthesisStyle`
on the presence of an effective summary prompt. The correct rule is to
gate it purely on `executionMode`:

- `multi_dispatch` (N parallel workers, each writing to an isolated
  copy of the discussion file) → ALWAYS run an initiator-led synthesis
  hop afterward. Workers cannot see each other within the round; the
  synthesis hop is the ONLY place their outputs converge into one
  authoritative paragraph. Falls back to a generic prompt when no
  override / preset prompt is supplied (closes the previously-broken
  `custom` preset case where SUMMARY_PROMPTS had no entry and the
  round silently lost its synthesis hop).
- `single_main` (1 worker = the initiator itself) → NEVER run a
  synthesis hop. The worker's own output IS the round's authoritative
  segment; asking the same agent to summarise itself is wasteful +
  confusing. The resolved `summaryPrompt` is left populated so the
  FINAL-RUN synthesis (PR-μ chain) can still pick it up when this
  happens to be the last round.

The canvas inspector also hides the per-node summary-prompt textarea
when the node's effective `dispatchStyle` is `single_main` — it was
dead config there (the executor's single_main branch never dispatches
a synthesis hop) and showing it gave users a false signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six P0 fixes for the OOM regression introduced after a368875 (advanced
P2P workflow), and the related slow reconnect after server restart. Per
the round 1-3 audit decision in .imc/discussions/94b9b837-822.md, these
fixes form an unbreakable group: A3 alone is a dead fix because N1
prevents runs from reaching a terminal status, so the cleanup never runs.
Reconnect fixes ship together because reconnect storms are documented in
server-link.ts:111-125 as a secondary RSS pressure source.

A1 (discussion-orchestrator.ts) — discussions Map was append-only;
schedule a 60 s deferred delete on done/failed/stopDiscussion to match
the P2P activeRuns cadence.

A2 (p2p-orchestrator.ts + shared/p2p-workflow-constants.ts) — cap
P2pRun.routingHistory at P2P_ROUTING_HISTORY_RETENTION_COUNT = 500 via a
new pushRoutingHistory helper that mirrors helperDiagnostics' FIFO trim.
Long-running advanced workflows that loop through compiled-edge jumps
were growing routingHistory without bound.
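
The trim helper is presumably a one-liner in spirit (sketch; only the helper name and the 500 constant come from this commit):

    export const P2P_ROUTING_HISTORY_RETENTION_COUNT = 500;

    // FIFO trim mirroring helperDiagnostics: keep only the newest N entries.
    export function pushRoutingHistory<T>(history: T[], entry: T): void {
      history.push(entry);
      if (history.length > P2P_ROUTING_HISTORY_RETENTION_COUNT) {
        history.splice(0, history.length - P2P_ROUTING_HISTORY_RETENTION_COUNT);
      }
    }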

A3 (p2p-orchestrator.ts) — failRun / timed_out paths called
scheduleP2pRunTerminalCleanup but never deleted the P2pRun from
activeRuns; only completed and cancelled paths did. Move the
activeRuns.delete into scheduleP2pRunTerminalCleanup's 60 s timer so
every terminal status (completed/failed/timed_out/cancelled) hits a
single cleanup path, and remove the now-redundant explicit setTimeouts
on the success path.

A4 (p2p-orchestrator.ts) — the writer-queue onWriteFailure /
onSegmentDropped closures captured the full P2pRun, so even after the
60 s activeRuns delete the queue's callback still pinned the run object
in the heap. Stage primitives (runId, contextFilePath, attempt,
initiatorSession) before enqueue and look the run up via getP2pRun(runId)
inside the closure; the queue now retains only strings, and stale runs
swallow gracefully.

N1 (p2p-orchestrator.ts) — runP2pScriptNode was called without an
AbortSignal even though the runner already supports one. A script with
argv ['/bin/sleep','9999'] and no script.timeoutMs would block
executeAdvancedChain forever; ensureRunDeadline never fires because the
loop never advances; the run stays running, A3 cleanup never schedules.
Add an AbortController per dispatch, register an aborter into a
module-level currentScriptAborters map so cancelP2pRun can reach in,
schedule a setTimeout-based abort tied to run.deadlineAt (default 30 min
via shared/p2p-advanced.ts:DEFAULT_ADVANCED_RUN_TIMEOUT_MINUTES), and
clean up the timer + map entry in finally. Runner already escalates
SIGTERM -> 5 s -> SIGKILL via process group.
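
The abort wiring might look like this (hedged sketch; runP2pScriptNode's real signature takes more than a signal, and the dispatch helper name is hypothetical):

    declare function runP2pScriptNode(opts: { signal: AbortSignal }): Promise<void>;

    // Module-level so cancelP2pRun can reach in and abort a live script.
    const currentScriptAborters = new Map<string, AbortController>();

    async function dispatchScriptNode(runId: string, deadlineAt: number): Promise<void> {
      const aborter = new AbortController();
      currentScriptAborters.set(runId, aborter);
      // Fires even when script.timeoutMs is unset, so ['/bin/sleep','9999']
      // can no longer wedge executeAdvancedChain past run.deadlineAt.
      const timer = setTimeout(() => aborter.abort(), Math.max(0, deadlineAt - Date.now()));
      try {
        await runP2pScriptNode({ signal: aborter.signal });
      } finally {
        clearTimeout(timer);                  // clean up timer + map entry
        currentScriptAborters.delete(runId);
      }
    }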

A6 (server-link.ts) — three-pack reconnect fix. INITIAL_BACKOFF_MS
1_000 -> 500, MAX_BACKOFF_MS 60_000 -> 5_000 (server's IP rate limit is
5 attempts per 10 s, so 500 ms initial / 5 s ceiling stays inside
budget). Add an 8 s connect-timeout watchdog per attempt so a hung TCP
SYN cannot wedge the daemon for 75-127 s on Linux/macOS. Apply +/-20%
jitter to scheduleReconnect so multiple daemons behind one NAT don't
trip the rate limiter together.
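
The jittered schedule is small enough to sketch in full (constants from this commit; the helper itself is illustrative):

    const INITIAL_BACKOFF_MS = 500;
    const MAX_BACKOFF_MS = 5_000;

    // Exponential backoff capped at 5 s, with +/-20% jitter so multiple
    // daemons behind one NAT don't trip the rate limiter in lockstep.
    function nextReconnectDelay(attempt: number): number {
      const base = Math.min(INITIAL_BACKOFF_MS * 2 ** attempt, MAX_BACKOFF_MS);
      const jitter = 0.8 + Math.random() * 0.4;
      return Math.round(base * jitter);
    }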

Verification:
- npx tsc --noEmit (daemon), npx tsc -p server/tsconfig.json --noEmit,
  cd web && npx tsc --noEmit — all clean
- npm run test:unit — 315 files / 3356 tests pass
- npm run test:server — 40 files / 505 tests pass
- p2p-workflow-regression spec #59 regex relaxed to accept the A4
  primitive-closure variant (run.contextFilePath | logicContextFilePath
  | contextFilePath); intent of the assertion preserved

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…transport restore

restoreTransportSessions was rebuilding the qwen preset config
(env / settings / availableModels / preferred model) but dropping the
preset's contextWindow and systemPrompt onto the floor. After a daemon
restart, ccPreset='MiniMax' sessions came back with the runtime catalog
context window (and the qwen CLI's built-in identity), not the preset's
declared one — causing usage-pane numbers and the "I am MiniMax" runtime
facts to drift on every reconnect.

Persist presetContextWindow into the upserted record alongside the
preset's other rebuilt fields, and tighten the preferred-model selection
so an explicit user-requested model is preserved unless the preset's
catalog explicitly forbids it. Update the qwen-transport-flow e2e to
assert the new restored fields (presetContextWindow=200000, qwenModel,
qwenAuthType, qwenAvailableModels, systemPrompt containing the runtime
facts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…le (P0)

Five P0 fixes for the user-visible breakage in screenshot
c2642dfd955e6f525be4558408f1afb6.png (logic/script nodes both error out,
the "DAEMON 失联" ("daemon lost") banner never clears) plus
91a60a20400daec367041cf480c585cb.png (P2P review content displays
"(加载失败)", i.e. "load failed"). Audit trail in
.imc/discussions/e940d73f-a8e.md (3 rounds, 3 reviewers).

A1 (web/src/components/AdvancedWorkflowCanvasEditor.tsx) — nodeKind
onChange now calls a new `alignNodeForKind` helper that forces
preset='custom' (logic+script), permissionScope='analysis_only'
(logic), and dispatchStyle='single_main'. The forward direction (preset
onChange aligning scope/dispatch) shipped with R3 v2 PR-λ but the
reverse direction was missed, so picking nodeKind=logic on a default
llm+discuss+analysis_only node produced the cryptic
`invalid_workflow_graph (nodes[N])` error in the user screenshot. For
script nodes we deliberately do NOT auto-fill argv[0] — the executable
is a security boundary; let the validator surface a precise required-
field diagnostic instead.

N3 (web/src/components/AdvancedWorkflowCanvasEditor.tsx) — legacy
saved drafts that pre-date A1 still load with logic/script nodes in
violating combinations (the user's screenshot is one such draft).
A new pure helper `normalizeP2pWorkflowDraftForEditing` walks the
incoming draft and returns the repair list; the editor surfaces a
banner with Apply / Dismiss buttons. Per Cx1 R2-Cx1-1's design
constraint, normalize is NEVER triggered as a render side-effect; the
user must explicitly Apply before onChange fires. This preserves the
contract that loading old data does not silently rewrite it.

N5 (shared/p2p-workflow-validators.ts) — refine the
`validateNodeCombination()` diagnostic fieldPath so the inspector can
highlight the exact dropdown that's wrong. logic+non-custom-preset
points at `nodes[N].preset`; logic+non-analysis_only-scope points at
`nodes[N].permissionScope`; openspec_propose missing artifact points
at `nodes[N].artifacts`. Multiple simultaneous violations on a logic
node now produce two distinct diagnostics instead of a single
opaque `nodes[N]` entry.

N4 (web/src/ws-client.ts) — capability snapshot freshness now keys on
a new `daemonLastSeenAt` clock that is bumped only by daemon-originated
messages (whitelist: DAEMON_HELLO, RUN_UPDATE, daemon.stats,
timeline.event, transport.* deltas/status/tools, etc.).
Server-synthesized messages (`pong`, `session.event`) are explicitly
excluded so the UI cannot show "fresh" while the daemon is actually
down. Without this, healthy long-lived browser pages tripped the 30 s
TTL on the one-time `daemon.hello.observedAt` and showed
"DAEMON 失联" forever — the user's second screenshot symptom.

M7 (web/src/app.tsx + src/daemon/command-handler.ts) — DiscussionsPage
now passes `requestScope` derived from the active session, and the
daemon's read_discussion / list_discussions handlers fall back to
(a) active P2P run's contextFilePath → reverse-derive projectDir, and
(b) cross-project file sweep, before returning
`missing_or_invalid_scope`. Multi-project daemons used to fail every
read because the UI didn't pass scope; this surfaced as the
"(加载失败)" body in the second screenshot.

Verification:
- npx tsc --noEmit (daemon), npx tsc -p server/tsconfig.json --noEmit,
  cd web && npx tsc --noEmit — all clean
- npm run test:unit — 316 daemon files / 3365 tests pass
- npm run test:web — 107 files / 1324 tests pass
- npm run test:server — 40 files / 505 tests pass
- 3 new regression test files (15 fresh assertions): nodeKind onChange
  alignment, normalize banner, validator fieldPath specificity,
  daemonLastSeenAt whitelist
- 7 i18n locale files synchronized for the new normalize_banner /
  apply / dismiss keys (MANDATORY per CLAUDE.md)

Per the discussion's final plan (round 3 hop 2 §3, Cu1's grep
evidence), N7 (server bridge `receivedAt` semantic shift) is
intentionally OMITTED — `getDaemonP2pWorkflowCapabilities()` has no
production caller, so N4 UI-only is sufficient to fix the user's
"daemon stale" symptom without touching server-side semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (PR-φ follow-up)

User reported in screenshot 7c2570e96eeca1a9eefa3a92d3c7212e.png that the
"DAEMON 失联" banner still fires on a healthy long-lived browser page,
even after the N4 daemonLastSeenAt whitelist landed in c5dc1b1.

Root cause: my N4 fix updated `WsClient.isDaemonCapabilityStale()` and
the private `daemonLastSeenAt` clock, but `P2pConfigPanel.tsx:612-613`
computed staleness inline from `capabilitySnapshot.observedAt`:

    const capabilityStale = !capabilitySnapshot
      || (Date.now() - capabilitySnapshot.observedAt) > P2P_CAPABILITY_FRESHNESS_TTL_MS;

`observedAt` is set ONLY when `daemon.hello` arrives (WS connect or
capability change). On long-lived browser pages it never refreshed, so
the panel tripped the 30 s TTL and the banner stuck — exactly the
screenshot symptom. The N4 fix on the WS client was correct in its own
right but the panel never consumed it.

Fix is structural — single source of truth for "is the daemon stale":

  - `P2pConfigPanelCapabilitySource` now exposes optional
    `isStale(now?: number): boolean`. The panel's `capabilityStale`
    computation prefers `daemonCapabilitySource.isStale()` and falls
    back to the legacy `observedAt` check only when the source omits
    the method (preserves existing test fixtures that pass plain
    object sources).

  - `SessionControls.tsx` source object now wires
    `isStale: (now) => ws.isDaemonCapabilityStale(now)` so the panel
    and the WS client share one definition of staleness.

  - The panel's freshness re-evaluation switches from a single
    setTimeout pinned to `observedAt` (which never re-armed once
    snapshot stayed constant) to a steady setInterval at TTL/2.
    Worst-case lag between daemon going silent and banner appearing
    is now bounded by `TTL + TTL/2`.
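
In sketch form, the panel-side computation becomes (the TTL value and fallback semantics are from this commit; the helper itself is illustrative):

    const P2P_CAPABILITY_FRESHNESS_TTL_MS = 30_000; // 30 s TTL per the commit

    interface P2pConfigPanelCapabilitySource {
      capabilitySnapshot?: { observedAt: number };
      isStale?(now?: number): boolean; // wired to ws.isDaemonCapabilityStale
    }

    function computeCapabilityStale(
      source: P2pConfigPanelCapabilitySource,
      now = Date.now(),
    ): boolean {
      // Prefer the WS client's single source of truth when wired.
      if (typeof source.isStale === 'function') return source.isStale(now);
      // Legacy fallback for plain-object fixtures without the method.
      const snap = source.capabilitySnapshot;
      return !snap || now - snap.observedAt > P2P_CAPABILITY_FRESHNESS_TTL_MS;
    }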

Tests:

  - `web/test/components/P2pConfigPanel-stale-banner.test.tsx` (4
    tests): panel hides banner when isStale()=false even with ancient
    observedAt; panel shows banner when isStale()=true even with fresh
    observedAt; legacy fixture without isStale falls back to
    observedAt-based check correctly in both directions.

  - `web/test/components/P2pConfigPanel-stale-banner-e2e.test.tsx` (3
    tests): full WsClient ↔ panel chain integration. (a) healthy
    long-lived daemon with periodic daemon.stats keeps banner hidden
    across 90+ s (3× TTL); (b) silent daemon flips to stale past TTL;
    (c) server-only pong stream does NOT keep banner hidden — the key
    reverse assertion that prevents future regressions where someone
    "fixes" staleness by bumping on every WS message.

  - SessionControls test mock updated to include
    `isDaemonCapabilityStale: vi.fn(() => false)` so the panel's new
    `source.isStale()` call doesn't crash unrelated tests.

Verification:
  - npx tsc --noEmit (daemon), npx tsc -p server/tsconfig.json --noEmit,
    cd web && npx tsc --noEmit — all clean
  - npm run test:unit — 316 daemon files / 3365 tests pass
  - npm run test:server — 40 files / 505 tests pass
  - npm run test:web — 109 files / 1331 tests pass (added 7 fresh
    assertions across 2 new test files)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ct storm

Production logs on 116.62.239.78 showed a single daemon authenticating
~5 times per 10 seconds, with the daemon side reporting
`code:4001 reason:auth_required` every cycle. The user-visible symptom
was "server restart → daemon reconnect 极慢" and the persistent
"DAEMON 失联" banner that survived all earlier fixes (A6 reconnect
tuning, N4 daemonLastSeenAt whitelist, PR-φ panel.isStale routing).

Root cause: a race in `WsBridge.handleDaemonConnection`'s async message
handler.

The daemon sends two messages back-to-back on every WS open
(`server-link.ts:201-202`):

    ws.send(JSON.stringify({ type: 'auth', ... }));
    this.sendDaemonHello();   // sends `daemon.hello`

Both messages reach the server before the auth message handler's
`await db.queryOne(...)` settles. While the auth flow is parked at the
DB await, `this.authenticated` is still `false`. The `daemon.hello`
handler runs concurrently, sees `msg.type !== 'auth'`, and trips the
gate at line 894-896:

    if (!this.authenticated) {
      if (msg.type !== 'auth' || ...) {
        ws.close(4001, 'auth_required');
        return;
      }
      ...
    }

The auth handler then completes, flips `authenticated` to true, and
logs "Daemon authenticated" — but the WebSocket is already gone. The
daemon sees the 4001 close, reconnects (fast, thanks to A6's
500 ms initial / 5 s cap), races again, and so on. None of the earlier
fixes could break the cycle because they all live downstream of this
race.

Fix: capture the auth flow in `this.authPromise` and await it from
every subsequent message handler before evaluating
`this.authenticated`. Concurrent `daemon.hello` (or any other
post-auth message) now waits for the DB lookup to settle, then sees
the correct `authenticated === true` and proceeds normally.

Implementation details:

  - New private field `WsBridge.authPromise: Promise<void> | null`.
  - The auth message handler creates the promise, runs the DB lookup,
    and `resolveAuth()`s it on both success and failure paths. The
    promise ALWAYS resolves (never rejects) — failure is signaled via
    `ws.close()` + `this.daemonWs = null`, which awaiting handlers
    detect with their `daemonWs !== ws` bail-out check. Resolving (vs
    rejecting) avoids unhandled-rejection warnings when no concurrent
    handler is currently awaiting.
  - Awaiting handlers also re-check `this.daemonWs === ws` AFTER the
    await; if a different connection has replaced this one (or the
    socket closed during the await), they bail.
  - `authPromise` is reset to `null` on (a) new connection (so the
    next reconnect doesn't await a stale promise from a different
    `ws`), (b) `ws.on('close')`, and (c) `kickDaemon()`.
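
Condensed, the gate reads roughly like this (a sketch with simplified message and close handling; the real WsBridge does much more per message):

    import type WebSocket from 'ws';

    class WsBridgeAuthSketch {
      private authenticated = false;
      private authPromise: Promise<void> | null = null;
      private daemonWs: WebSocket | null = null;

      private async handleMessage(ws: WebSocket, msg: { type: string }): Promise<void> {
        if (msg.type === 'auth') {
          let resolveAuth!: () => void;
          this.authPromise = new Promise<void>((r) => { resolveAuth = r; });
          try {
            this.authenticated = await this.lookupToken(msg); // the DB await the race parked on
            if (!this.authenticated) { ws.close(4001, 'auth_required'); this.daemonWs = null; }
          } finally {
            resolveAuth(); // always resolve — failure is signaled via ws.close()
          }
          return;
        }
        if (this.authPromise) await this.authPromise;  // park until auth settles
        if (this.daemonWs !== ws) return;              // replaced/closed during the await
        if (!this.authenticated) { ws.close(4001, 'auth_required'); return; }
        // ... dispatch msg normally
      }

      private async lookupToken(_msg: unknown): Promise<boolean> { return true; } // db.queryOne stand-in
    }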

Regression test: `server/test/bridge.test.ts` adds
"does NOT 4001-close when auth and daemon.hello arrive back-to-back
during DB lookup". The test uses a deferred DB query so both messages
can land before auth resolves, then asserts:

  1. The socket is NOT closed during the in-flight auth window.
  2. Once the DB query resolves with a valid token, auth completes and
     the socket stays open.

Without this fix, the test fails immediately with `ws.closed === true`
and `closeCode === 4001`. With it, both assertions pass.

Verification:
  - npx tsc -p server/tsconfig.json --noEmit clean
  - npm run test:server — 506 tests pass (up from 505 with the new
    regression case)
  - Production daemon log on the dev box (118 KB before fix) showed
    18+ auth flap cycles per minute. After deploy, expected: a single
    `Daemon authenticated` per real reconnect (1× per server restart
    + 1× per network blip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Locks the PR-υ auth race fix (`42db4000`) against a real `ws` server +
client stack — the previous bridge.test.ts coverage was a mocked
EventEmitter, which can't reproduce the message-ordering semantics that
caused the production reconnect-storm on 78.

Three scenarios:

  1. **Single back-to-back handshake** (latency=0): daemon sends
     `auth + daemon.hello` synchronously after WS open; assert no
     4001-close and bridge.isAuthenticated flips to true within the
     observe window.

  2. **50ms-DB-latency window**: deferred DB query sleep guarantees
     BOTH messages reach the server before auth's `await db.queryOne`
     resolves. This is the exact production race window. Without the
     fix, hits `ws.close(4001, 'auth_required')` 100% of the time.

  3. **Burst of 10 back-to-back reconnect cycles** (latency=20):
     simulates the production reconnect cascade after a server
     restart. Asserts every single cycle authenticates cleanly with
     no 4001-close. Counting failures (rather than asserting a
     boolean) gives a clearer diagnostic when a flake creeps in.

Test rig:

  - Spins up an in-process `http.Server` + `WebSocketServer` with
    `noServer: true`, mirroring `server/src/index.ts`'s upgrade
    handler.
  - Each test/cycle uses a fresh `serverId` extracted from the URL.
    Reason: `WsBridge.maybeCleanup` deletes from the shared
    `WsBridge.instances` map by serverId, NOT by instance pointer; a
    stale-bridge close handler firing AFTER the next test's
    connection has registered evicts the new bridge from the map. In
    production each serverId hosts exactly one bridge so the path is
    harmless, but rapid-cycling the same id in tests exposes the
    eviction. Per-test serverIds sidestep it.
  - Polls `bridge.isAuthenticated` AFTER the observe window but
    BEFORE the test closes the socket — the bridge's ws.on('close')
    resets the flag, so checking after the local close would always
    observe false. Capture-during-window is the correct contract.

Verification: `npm run test:server` — 41 files / 509 tests pass
(up from 506 with the new 3-test file).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the synthetic auth+daemon.hello scenarios in
`server/test/bridge-auth-race-e2e.test.ts` with a true end-to-end test
that wires the production daemon `ServerLink`
(`src/daemon/server-link.ts`) against the production server `WsBridge`
(`server/src/ws/bridge.ts`) over a real `ws` server. The reason: a
synthesized handshake only covers the messages the test author thought
to send. If a future change adds a new "do X immediately after open
before auth" step on either side, the synthetic test continues to pass
while production breaks. Driving the real `ServerLink` makes the test
follow the daemon's real wire protocol.

Two scenarios:

  1. **Cold start** (50 ms DB latency — the worst-case race window):
     create a `ServerLink`, await auth, then sleep 1 s and assert
     EXACTLY ONE accepted WS connection + EXACTLY ONE successful auth.
     Pre-fix produces ≥2 connections within 1 s because the daemon
     reconnects immediately after every 4001 close.

  2. **Server restart** (20 ms DB latency, simulates
     `docker compose restart server`): connect, auth, then `wss.close`
     + `httpServer.close` (terminating live clients), wait 200 ms,
     `listen` on the same port. Assert the ServerLink reconnects
     cleanly with EXACTLY ONE post-restart auth and ≤2 reconnect
     attempts (the +1 allowance covers an ECONNREFUSED race when the
     port is still TIME_WAIT-free for a moment).

Test rig:

  - In-process `http.Server` + `WebSocketServer({ noServer: true })`
    matching `server/src/index.ts:setupWebSocketUpgrade`. The real
    upgrade path (URL parse → `WsBridge.get(serverId)` →
    `handleDaemonConnection`) runs unmodified.
  - `handleDaemonConnection` is invoked with an `onAuthenticated`
    callback so the rig can count successful auths without intercepting
    the bridge's logger.
  - Server restart uses `wss.clients.forEach(c => c.terminate())` +
    `httpServer.closeAllConnections()` to immediately drop in-flight
    sockets — without this, `wss.close` blocks waiting for clients and
    the test hangs ~30 s. This mirrors the ECONNRESET behaviour
    `docker compose stop` actually produces.
  - Each test uses a fresh `serverId` so stale-bridge cleanup
    (which deletes from the shared `WsBridge.instances` map by
    serverId) cannot evict the current bridge entry.

Verification:
  - `npx tsc --noEmit` clean (daemon + server)
  - `npm run test:server` — 41 files / 509 tests (unchanged; this test
    runs in the e2e workspace, not server)
  - `npx vitest run --project e2e test/e2e/daemon-server-real-handshake.test.ts`
    — 2 tests pass in ~6 s

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per CLAUDE.md "FORBIDDEN — Never `git add` these directories: openspec/
and docs/ are local-only planning/documentation directories. NEVER
stage, commit, or push any file under openspec/ or docs/ to git."

`.gitignore` already lists `openspec/` (line 56) and `docs/` (line 65),
but 12 openspec files + 1 docs/plan file were committed to git BEFORE
those gitignore rules existed. gitignore does not retroactively untrack
anything, so they continued to be tracked — visible in git status when
edited or deleted locally.

This commit removes them from the index via `git rm -r --cached`.
Local copies on disk are preserved for files the user still has;
already-deleted ones (the `daemon-file-preview-worker/` set the user
manually deleted on mobile) are simply unstaged.

After this commit, `openspec/` and `docs/` are fully out of version
control and stay ignored on every future change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User report: the P2P progress banner appeared in EVERY active session's
sub-session bar, even when the discussion had nothing to do with the
session the user was currently viewing. Cross-session noise.

Root cause: `app.tsx:3733` filtered the bar's discussions list only by
status (`d.state !== 'done'`) — there was no session-identity gate.
Every running P2P run rendered its banner everywhere.

The mapping in `p2p-run-mapping.ts` was also dropping the run's
`main_session` / `initiator_session` / hop participant identities
during `mapP2pRunToDiscussion`, so even if the bar wanted to filter,
it had nothing to filter on.

Two-part fix:

  1. **Preserve session identity in the mapping**
     (`web/src/p2p-run-mapping.ts`). Add `mainSession`,
     `initiatorSession`, and `participantSessions[]` (de-duplicated
     set of initiator + main + current target + every
     `hop_states[].session` + every `all_targets[].session`).
     Empty/missing aggregates degrade to `undefined` so legacy
     server payloads still round-trip cleanly.

  2. **Filter at the bar render** (`web/src/app.tsx:3765`). The
     SubSessionBar's discussions prop is now scoped:
       - `mainSession === activeRootSession` covers the common case
         (user viewing the session that launched the discussion or
         any of its sub-sessions, since `activeRootSession`
         resolves sub→parent).
       - `participantSessions.includes(activeSession || activeRootSession)`
         covers the cross-root case (user navigates into a sub-session
         that's a hop in another root's discussion).
       - Discussions with no scope info (legacy mid-rollout entries)
         fall through and show unscoped — preserves the previous
         behaviour for those rather than hiding them.
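
The predicate in part 2 might be sketched as (field names from this commit; the function itself is illustrative):

    interface ScopedDiscussion {
      state: string;
      mainSession?: string;
      participantSessions?: string[];
    }

    function isVisibleInBar(
      d: ScopedDiscussion,
      activeSession: string,
      activeRootSession: string,
    ): boolean {
      if (d.state === 'done') return false;
      // Legacy mid-rollout entries with no scope info fall through unscoped.
      if (!d.mainSession && !d.participantSessions) return true;
      if (d.mainSession === activeRootSession) return true; // common case
      const p = d.participantSessions ?? [];
      return p.includes(activeSession) || p.includes(activeRootSession); // cross-root hop
    }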

Also expand the local discussions state shape in `app.tsx` to declare
the three new fields so TypeScript pins the contract.

Tests:

  - 3 new cases in `web/test/p2p-run-mapping.test.ts`:
    - Advanced run with full session payload preserves all three fields
      and de-duplicates `participantSessions`.
    - Legacy run without session fields → all three undefined (caller
      treats as "show unscoped" via the legacy fallback).
    - Pre-dispatch run uses `all_targets` when `hop_states` is absent.

Verification:
  - `cd web && npx tsc --noEmit` clean
  - `npm run test:web` — 109 files / 1336 tests pass (added 3)

Note: this only changes the bar's filter — `DiscussionsPage`
(`liveDiscussions={discussions}` at app.tsx:4003) intentionally still
receives the full unfiltered set so the global "View all discussions"
panel keeps showing every run as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ns button

Follow-up to 6977512 (P2P bar scoping). After scoping the bar to the
active session, users lost visibility of P2P runs happening in OTHER
sessions — the bar correctly hid them from the current view but
nothing told the user they exist. Easy to forget about background
runs and switch sessions thinking nothing's going on.

Add a numeric badge to the 📋 View Discussions button that ALWAYS
shows the daemon-wide running discussion count, regardless of which
session the user is currently viewing. Click-through opens the
DiscussionsPage which already shows the unfiltered global list.

Implementation:
  - New `totalRunningDiscussions?: number` prop on SubSessionBar
    (defaults to 0 so existing callers don't break).
  - Absolute-positioned span on the button, blue circle with bold
    white digit. `99+` for runaway counts. Hidden when 0.
  - Tooltip switches to "{count} running discussions — view all"
    when count > 0, falling back to the original "P2P discussions"
    label otherwise.
  - `aria-label` provides screen-reader friendly count.
  - `data-running-discussions` attribute for tooling/test inspection.

Wiring:
  - `app.tsx:3779` passes `discussions.filter((d) => d.state !== 'done').length`
    — the UNFILTERED count, NOT the scoped subset that goes to
    `discussions={...}` above. The two numbers can legitimately
    differ:
      - `totalRunningDiscussions = 3` (daemon-wide)
      - shown banners = 1 (only one P2P run involves the active
        session) → user sees "3" badge AND knows 2 are elsewhere.

i18n — 7 locales (en/zh-CN/zh-TW/es/ru/ja/ko) get:
  - `subsessionBar.p2p_discussions_with_running` (+ `_one`/`_other`)
  - `subsessionBar.p2p_running_count_aria` (+ `_one`/`_other`)

Tests — 4 new cases in SubSessionBar.test.tsx:
  - badge hidden when count is 0
  - badge renders with the count when count >= 1
  - count caps at "99+" for runaway daemons
  - data-running-discussions attribute reflects the count

Verification:
  - `cd web && npx tsc --noEmit` clean
  - `npm run test:web` — 109 files / 1340 tests pass (added 4)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…matching file

User report (screenshot 71e2d014d9cf975f): on the discussions page, the
live P2P progress bar at the top and the discussion file list below
were two unrelated UIs. Clicking the live bar did nothing; users had
to manually find the matching entry in the list by id and click it.

Root cause: the P2pProgressCard rendered in the discussions page's
"live progress strip" was missing an `onClick` handler
(DiscussionsPage.tsx:305-311). The card already supports `onClick`
(used by SubSessionBar in app.tsx) but DiscussionsPage never wired
one. Additionally `P2pProgressDiscussion.fileId` wasn't declared on
the interface even though the run-mapping function populates it.

Fix:

  - Add `fileId?: string` to `P2pProgressDiscussion` so callers can
    rely on it without a `(d as any).fileId` cast. The mapping in
    `p2p-run-mapping.ts` already sets it from the run's
    `discussion_id` field; this just makes the type honest.

  - In DiscussionsPage's live-strip render, pass
    `onClick={d.fileId ? () => selectDiscussion(d.fileId!) : undefined}`.
    `selectDiscussion(fileId)` is the same function the file list
    uses, so:
      1. it sends `p2p.read_discussion` with the fileId,
      2. the daemon returns the file content,
      3. the right-pane (or full-screen on mobile) shows the
         discussion,
      4. the matching list entry gets the `active` class — visual
         link between the bar at top and the highlighted entry below.

  - Runs without a fileId (failed-bind / supervision-internal /
    pre-dispatch) get no onClick — a click is simply a no-op rather
    than crashing on `undefined.fileId`.

Tests — 2 new cases in `web/test/pages/DiscussionsPage.test.tsx`:

  - clicking a live progress card with `fileId` sends
    `p2p.read_discussion` for that fileId AND highlights the
    matching list entry as `.active`.
  - clicking a live progress card WITHOUT fileId is a no-op (no new
    `ws.send` calls).

The P2pProgressCard mock in the test file is upgraded from `() => null`
to a clickable button forwarding the `onClick` prop, so the click
contract can actually be exercised. Production rendering (SVG layout)
is unchanged.

Verification:
  - `cd web && npx tsc --noEmit` clean
  - `npm run test:web` — 109 files / 1342 tests pass (added 2)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…a tight loop

Production symptom (mobile screenshot aab0338a3d2bb6f5708f0ea5f): the
discussions page hung on "加载中…" ("loading…") forever — no live progress bar and
no file list ever appeared. Server logs revealed the cause:

    "p2p per-socket pending cap exceeded — dropped"
    type: "p2p.list_discussions"
    requestId: ...

The bridge enforces a per-socket cap on outstanding p2p workflow
requests. The web page was dispatching `p2p.list_discussions` faster
than the daemon could respond, so the bridge dropped them; with no
response ever returning, `loading` stayed true and nothing rendered.

Two compounding causes:

  1. **Inline `requestScope` literal in `app.tsx:4017`** — every
     parent render of `App` produced a fresh
     `{ sessionName, projectDir }` object with new identity. That made
     `DiscussionsPage`'s `useCallback(loadList, [requestScope])`
     re-identify, fired its mount-time `useEffect([loadList])`, and
     dispatched another list request — once per parent render.

  2. **`RUN_UPDATE` handler called `loadList()` synchronously**
     (DiscussionsPage.tsx:235). When many P2P runs update in quick
     succession (canvas projection at ~5 Hz × N runs), this fired
     several list requests per second on its own.

Three-layer fix:

  A. **`app.tsx`** — wrap the request scope object in `useMemo`
     keyed on `[activeSession, activeSessionInfo?.projectDir]`. Stable
     identity across parent renders.

  B. **`DiscussionsPage.tsx`** — defense-in-depth: even if a future
     refactor reverts the parent's `useMemo` (or a test/caller
     passes an inline literal), normalise `requestScope` internally
     via `useMemo(() => requestScope, [JSON.stringify(...)])` so the
     downstream `loadList` callback's dependency only changes when
     the SCOPE CONTENT changes, not its identity.

  C. **`DiscussionsPage.tsx`** — debounce the `RUN_UPDATE`-driven
     refresh. Bursts of run updates now coalesce into a single
     `loadList()` call after a 250 ms quiet window. Cleanup timer
     is cleared on unmount.
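
Layers A and C condense to roughly this hook shape (a sketch assuming Preact hooks; only the useMemo keying and the 250 ms window come from this commit, the hook name is hypothetical):

    import { useCallback, useEffect, useMemo, useRef } from 'preact/hooks';

    function useDiscussionListRefresh(
      activeSession: string,
      projectDir: string | undefined,
      loadList: (scope: { sessionName: string; projectDir?: string }) => void,
    ) {
      // A: identity stable across parent renders — keyed on content, not a literal.
      const requestScope = useMemo(
        () => ({ sessionName: activeSession, projectDir }),
        [activeSession, projectDir],
      );

      // C: coalesce RUN_UPDATE bursts into one loadList per 250 ms quiet window.
      const timer = useRef<ReturnType<typeof setTimeout> | undefined>(undefined);
      const scheduleRefresh = useCallback(() => {
        clearTimeout(timer.current);
        timer.current = setTimeout(() => loadList(requestScope), 250);
      }, [loadList, requestScope]);
      useEffect(() => () => clearTimeout(timer.current), []); // cleared on unmount

      return { requestScope, scheduleRefresh };
    }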

Tests — 2 new cases in `web/test/pages/DiscussionsPage.test.tsx`:

  - 5 parent rerenders with new-identity-but-content-equal
    `requestScope` literals → at most 2 `p2p.list_discussions`
    dispatches (covers initial mount + one tolerated retry). Pre-fix
    produced 6.

  - 10 rapid `RUN_UPDATE` messages → at most 1 coalesced
    `p2p.list_discussions` dispatch after the debounce window.
    Pre-fix produced 10.

Verification:
  - `cd web && npx tsc --noEmit` clean (the two pre-existing errors
    in `shared/session-group-clone.ts` and
    `web/src/components/CloneSessionGroupDialog.tsx` are from a
    concurrent in-progress feature branch in the working tree, NOT
    from this commit).
  - `npm run test:web` — 109 files / 1344 tests pass (added 2 net new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symptom (screenshot a8495587-...): a logic node could end up with
preset='implementation_audit' + scope='analysis_only' +
dispatch='single_main', producing a cryptic
`invalid_workflow_graph (nodes[N].preset)` diagnostic with no obvious
recovery path from the UI.

Root cause: the canvas editor's preset / permissionScope /
dispatchStyle dropdowns exposed the FULL constant array regardless of
the current nodeKind. The A1 fix (`alignNodeForKind`) covered the
forward path — switching nodeKind to logic auto-set preset=custom —
but a subsequent click on the preset dropdown could re-pick an
LLM-only preset while nodeKind stayed `logic`, putting the node back
into an invalid state with no banner trigger (since the user wasn't
loading legacy data).

Fix: filter each dropdown's option set against the validator's
`validateNodeCombination` rules so the user simply cannot click their
way into a rejected combination. Single-option dropdowns (e.g., logic
preset locked to `custom`) are rendered disabled to make the
constraint explicit; if the current value isn't in the legal subset
(legacy draft), it's preserved as a transient extra option so the
select still reflects what's stored and the normalize banner is the
unambiguous path forward.

Also adds the previously-missing `script.argv` textarea inline for
script nodes: one argv entry per line, first line is the executable.
Blank-line stripping keeps the argv array tight; clearing the
textarea drops `node.script` entirely so the validator surfaces a
clean required-field error instead of an opaque empty-array hit.

Tests: 24 new regression tests in
`AdvancedWorkflowCanvasEditor-dropdown-restrictions.test.tsx` cover
every nodeKind/preset combination + the argv edit/clear paths. All
46 canvas-editor tests + 1381 web tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symptom (screenshot 7f112b6e...): a script node with no
`script.argv` shows the diagnostic
`A workflow script contract is invalid. (nodes[1].script)`,
and there was no inspector UI to recover from it. Commit
f4e539b added the `script.argv` textarea; this commit closes
the OTHER half — switching nodeKind FROM script back to llm
(or to logic) left the `script` field dangling, and the
validator's `validateNodeDraft` then emits
`invalid_script_contract` on the llm node because it doesn't
allow a `script` field on non-script kinds.

`alignNodeForKind` could not express field deletion through its
`Partial<P2pWorkflowNodeDraft>` return shape, so the cleanup
now lives at the editor's nodeKind onChange call site: after
merging the aligned partial, we explicitly drop `script` when
the next kind is not `script`, and drop `logic` when the next
kind is not `logic`. Same shape used for both kind-specific
fields keeps the rule discoverable.

Regression tests added:
- `script node with no script.argv surfaces the
  nodes[N].script diagnostic` — reproduces the screenshot
  state (script node at index 1) and asserts the fieldPath
  appears in the inline Diagnostics list.
- `after filling argv via the textarea, the nodes[N].script
  diagnostic clears` — full recovery flow: starts broken,
  fills argv via the new textarea, asserts the diagnostic is
  gone AND the validator accepts the draft.
- `switching nodeKind from script to llm drops the lingering
  script field` — pins the new cleanup behaviour.
- `shows "Required for script nodes" warning when workflow
  has a script node but allowedExecutables is empty` — pins
  the P2pConfigPanel badge the user also saw at the bottom of
  the screenshot.
- `does NOT show "Required for script nodes" warning for
  LLM-only workflows` — symmetric guarantee.

All 1386 web tests pass (4 net new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
IM.codes and others added 29 commits May 12, 2026 08:12
…y, capability exports, and detail oracle

Fixes from OpenSpec audit against tasks.md:

1. server-link DATA_PLANE_SEND_QUEUE_CAP=256 with overflow telemetry
   - task 4.5: bounded queue capacity + observable overflow
   - Previously unbounded queue; now logs warning when depth exceeds cap

2. daemon hello includes TIMELINE_PROTOCOL_CAPABILITY in base capabilities
   - task 1.6: timeline protocol capability via daemon hello
   - Updated p2p-workflow-runtime.test.ts expectations to include
     timeline.protocol.v1 alongside existing P2P capabilities

3. detail store eventId/fieldPath mismatch returns MISSING (not UNAUTHORIZED)
   - task 2.5 / spec D6: non-enumerating error to avoid detailId oracle
   - Updated command-handler-transport-queue.test.ts assertion to
     expect MISSING instead of UNAUTHORIZED for field mismatch

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Copying assistant messages used to flatten paragraphs and list structure
because `Element.textContent` joins descendant text nodes with no
separator, and `Selection.toString()` is browser-defined at block
boundaries (Safari often collapses). Both copy paths now route through a
new `domNodeToPlainText` DOM walker that emits explicit newlines for
block elements, expands `<br>`, preserves `<pre>` content verbatim, and
prefixes list items / blockquotes — so what the user sees is what they
paste.
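
A walker in that spirit (a sketch; the shipped tag coverage and list/quote prefixes may differ):

    const BLOCK_TAGS = new Set(['P', 'DIV', 'LI', 'UL', 'OL', 'BLOCKQUOTE',
      'H1', 'H2', 'H3', 'H4', 'TABLE', 'TR']);

    export function domNodeToPlainText(node: Node): string {
      if (node.nodeType === Node.TEXT_NODE) return node.textContent ?? '';
      if (!(node instanceof Element)) return '';
      if (node.tagName === 'BR') return '\n';
      if (node.tagName === 'PRE') return node.textContent ?? ''; // verbatim
      const inner = Array.from(node.childNodes).map(domNodeToPlainText).join('');
      const prefixed = node.tagName === 'LI' ? `- ${inner}`
        : node.tagName === 'BLOCKQUOTE' ? `> ${inner}`
        : inner;
      // Explicit newline at block boundaries so structure survives the paste.
      return BLOCK_TAGS.has(node.tagName) ? `${prefixed}\n` : prefixed;
    }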

On touch devices the chat view disables `user-select` so long-press can
fire the custom Copy/Quote menu, which makes native selection of a
specific portion of a message impossible. Double-tapping a chat bubble
now opens a ZoomedTextDialog with selection re-enabled and a Copy-all
button, giving users a place to drag the iOS/Android handles and pick
out exactly the substring they want.

Also extracted shared `copyToClipboard()` so ZoomedTextDialog and the
existing CodeBlock copy button share one implementation of the
non-secure-context fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The mobile double-tap-to-zoom detector was pairing taps by HTMLElement
identity, which fails the common case where a streaming assistant block
re-renders between the two taps and Preact replaces the underlying DOM
node — `===` returns false even though the logical bubble is unchanged
and the user feels nothing happens.

Pair by `data-event-id` string instead. AssistantBlock now threads its
merged-block key onto the bubble (user messages already carry their
event id), so the comparator is stable across re-renders. Also widen the
double-tap window to 450ms (450ms reads as "forgiving" on a phone where
fingers are slower than mouse buttons), bump tap-vs-scroll tolerance to
15px, and add `touch-action: manipulation` on chat bubbles so iOS hands
the second touchend to JS without the 300ms double-tap-zoom probe.

Adds two regression tests covering full `.chat-event` extraction for the
right-click context-menu path, which surfaces the same format-preserving
text used by mobile zoom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The touchend-based double-tap pairing didn't fire on real iOS Safari
even after switching to event-id matching: the synthetic touchend after
a short tap is racy on touch devices (subject to scroll resolution and
the system's tap-vs-scroll decision), so the second tap was sometimes
missed entirely.

Move pairing to the synthetic `click` event, which iOS fires reliably
on every short tap once viewport `user-scalable=no` removes the 300 ms
zoom probe. Long-press still suppresses the click via the existing
`cancelEvent` preventDefault on touchend, so the menu and zoom paths
remain mutually exclusive. touchend now only clears the long-press
timer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`'ontouchstart' in window` was returning true on every device that has
any touch capability, which meant Surface-class laptops in mouse mode
landed on the mobile-gesture path (long-press menu, double-tap zoom) and
desktop selection felt broken. Worse, the predicate didn't help diagnose
why double-tap "isn't working" on Android — both iOS and Android Chrome
report touch support, but the chat gestures really need the phone-class
layout, which is more about a narrow viewport + coarse pointer
than about touch capability alone.

Switch to `matchMedia('(pointer: coarse), (max-width: 768px)')` and
react to viewport changes via the same media-query listener. CSS picks
up the matching predicate so the two stay in sync — a narrow desktop
window now also disables native text selection and gets the mobile
gesture set, while a 1080p touchscreen with a mouse falls through to
the desktop path. Threshold bumped to 500ms for extra forgiveness on
slow phones; the click-event detector from the previous commit still
fires for both iOS and Android.
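
The predicate itself is tiny (sketch; the CSS-class wiring below is an assumption):

    // Phone-class layout: coarse pointer OR narrow viewport, kept in sync
    // with the matching CSS media query.
    const phoneClassQuery = matchMedia('(pointer: coarse), (max-width: 768px)');

    export const isPhoneClassLayout = (): boolean => phoneClassQuery.matches;

    phoneClassQuery.addEventListener('change', (e) => {
      document.body.classList.toggle('mobile-gestures', e.matches);
    });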

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…gression in 42dfabe

Commit 42dfabe introduced two silent-drop paths on the daemon → server →
browser timeline link that together produced the user-reported symptoms:
"Message updates break off and only resume after a manual page refresh.
The typewriter effect is gone too."

1. Daemon ServerLink:
   - `scheduleDataPlaneFlush` shifted the queue head and then called
     `trySend()` without checking the return value. When the WS link was
     not OPEN (short Wi-Fi handoff, reconnect window) trySend returned
     false and the message was lost. Switched to peek-then-shift: leave
     the item on the queue until trySend confirms it landed, otherwise
     halt the drain and wait for reconnect (see the sketch after this
     list).
   - Added `flushDataPlaneAfterReconnect()` and wired it into the WS
     `open` handler so a queued backlog resumes draining without needing
     a fresh enqueue to kick the scheduler.
   - Bumped DEFAULT_DATA_PLANE_SEND_QUEUE_HARD_CAP 512 → 100_000 and
     DEFAULT_DATA_PLANE_SEND_STALE_MS 30s → 24h. With peek-then-shift the
     stale GC is now purely a memory-protection upper bound, not a
     primary correctness mechanism.

2. `timelineStore.readPreferred` (also from 42dfabe) now throws
   `TimelinePreferredReadError` when the SQLite projection is
   unavailable instead of returning []. Three callers had no per-call
   catch:
   - `lifecycle.ts:599-604` startup backfill loop — one bad session
     could abort the rest. Now per-session try/catch with JSONL fallback.
   - `subsession-manager.ts:474` `readSubSessionResponse` — projection
     blip would reject the RPC. Now falls back to JSONL.
   - `opencode-watcher.ts:115` — outer poll catch swallowed the throw at
     debug level. Now logs warn + falls back to JSONL.

3. Server bridge `timelineDataPlaneErrorResponse` emitted error frames
   without a `recoverable` flag, so the web `useTimeline` hook treated
   any errorReason as terminal via `hasExplicitTimelineOutcome`. Wired
   in `isRecoverableTimelineRequestErrorReason()` so transient reasons
   (queue_full, deadline_exceeded, timeout, unavailable) come back with
   `recoverable: true`. Also bumped the bridge cap 128 → 4096 and the
   job deadline 15s → 60s to match the daemon-side ceilings — the strict
   defaults were tripping on weak links well before any real problem.

4. Web `shouldRetryTimelineHistoryResponse` retries when either:
   - the server sent `recoverable: true`, OR
   - the server omitted `recoverable` AND `errorReason` is in the shared
     allow-list (`isRecoverableTimelineRequestErrorReason`).
   When the server explicitly sets `recoverable: false` we respect that
   positive "don't retry" signal — the allow-list only kicks in when the
   server didn't decide.

5. New shared allow-list `RECOVERABLE_TIMELINE_REQUEST_ERROR_REASONS` +
   `isRecoverableTimelineRequestErrorReason()` in
   `shared/timeline-history-errors.ts` so daemon, bridge, and web all
   agree on which reasons should auto-retry. No string literals for
   recoverable reasons are written outside the shared module.
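
The peek-then-shift drain from item 1 condenses to (sketch; the queue and trySend shapes are assumptions):

    // Leave the head on the queue until trySend confirms it landed.
    function drainDataPlaneQueue(queue: string[], trySend: (frame: string) => boolean): void {
      while (queue.length > 0) {
        const head = queue[0];       // peek — do NOT shift yet
        if (!trySend(head)) return;  // link not OPEN: halt, wait for reconnect
        queue.shift();               // confirmed sent — safe to drop
      }
    }

    // flushDataPlaneAfterReconnect(): wired into the WS `open` handler so a
    // queued backlog resumes draining without a fresh enqueue to kick it.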

Test updates:
- `server/test/bridge.test.ts` pins `deadlineMs: 15_000` on the one
  test that simulates wall-clock deadline expiry; production default
  is now 60s so a hardcoded `now = 16_000` would no longer trip it.

Verified: daemon unit (3641 pass), server (561 pass), web (1530 pass),
all three typechecks clean, web vite build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@im4codes im4codes merged commit 5218117 into master May 14, 2026
40 checks passed