refactor(sandbox): unified daemon across freestyle + docker + k8s #3178
Open
tlgimenes wants to merge 37 commits into tlgimenes/vm-start-hang from
…pts (#3145)

* fix(prompts): derive display title from prompt name when title is absent

Prompts registered via the old server.prompt() API don't carry a title field, causing the UI fallback (displayToolName) to display the raw namespaced slug, e.g. "H0jwredec58c… Self Writing Prompts" instead of "Writing Prompts". aggregatePrompts() now sets title to a human-readable Title Case string derived from the original (pre-namespace) prompt name when the upstream prompt has no title.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): fix TS2532 in titleFromName and stabilize flaky jwt expiry test

Use charAt(0) instead of [0] to avoid noUncheckedIndexedAccess error. Increase JWT expiry test from 1s/1.5s wait to 2s/3s to avoid false failures on loaded CI runners.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(prompts): add explicit verb-first titles to all guide prompts

Switch from server.prompt() to server.registerPrompt() so the title field is included in the MCP response. Each guide prompt now has a clear verb-first title (e.g. "Create Agents", "Update Connections") rather than the garbled fallback derived from the kebab-case name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
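A minimal sketch of the Title Case derivation and the charAt(0) fix described in this commit. The PR names titleFromName but does not show its body, so the splitting rule (hyphens, underscores, whitespace) and the exact signature are assumptions for illustration only.

```typescript
// Hypothetical reconstruction: only the function name and the charAt(0)
// detail come from the commit message; the rest is an assumed sketch.
function titleFromName(name: string): string {
  return name
    .split(/[-_\s]+/) // assumed separators for kebab/snake/space-delimited names
    .filter((word) => word.length > 0)
    // charAt(0) always returns a string, whereas word[0] is typed
    // `string | undefined` under noUncheckedIndexedAccess (TS2532).
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(" ");
}
```

Under these assumptions, a pre-namespace name such as "writing-prompts" would yield "Writing Prompts".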
…eation (#3176) Supabase has a DB trigger that auto-creates a profiles row when a new auth user is created. The explicit INSERT was hitting a unique constraint violation (profiles_user_id_key) on the first call, causing a 409/500. Now we check if the profile already exists before attempting to insert. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…acking (#3162) * feat(analytics): integrate PostHog for server-side and client-side event tracking Adds PostHog Node.js SDK (server) and posthog-js (client) with a no-op fallback when POSTHOG_KEY is unset, so self-hosted deployments are unaffected. Instruments key lifecycle events: org creation/join, user auth, connection/API key/automation CRUD, thread creation, topup URL, and AI streaming sessions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(analytics): expand PostHog event coverage and fix gaps Structured event taxonomy for chat, tools, credits, and settings. Chat hierarchy (renamed for consistency): - chat_started, chat_opened, chat_message_sent/started/completed/failed/ stopped/aborted — per-thread and per-completion granularity - chat_archived, chat_unarchived, chat_deleted — thread lifecycle - chat_picker_opened/closed/item_selected — @/slash picker with abandonment detection (outcome + duration) - chat_model_changed, chat_credential_changed - chat_voice_started (with outcome: started | unsupported | permission_denied) Tool calls: - tool_called fires for both MCP passthrough and built-in tools with tool_source discriminator, annotations (readOnly/destructive/ idempotent/openWorld), latency, and error status Credits & revenue: - credits_topup_clicked (intent), credits_topup_requested (server), credits_topped_up_detected (heuristic via balance delta), credits_exhausted_shown, credits_empty_state_shown/dismissed Organization/team: - organization_created now also fires from Better Auth default-org auto-creation hook (was only domain-setup); closes undercounting gap - organization_member_role_updated, organization_member_removed - ai_provider_key_created, ai_provider_key_deleted - chat_message_aborted for server-side abort visibility Navigation & UI: - nav_item_clicked, settings_nav_clicked, agent_toolbar_toggled - sidebar_agent_pin_clicked, agent_browser_opened, agent_create_new_clicked, agent_import_clicked, agent_template_clicked - 
mcp_app_opened (real MCP app renderer), vm_preview_loaded Privacy & session replay: - Session recording enabled at PostHog project level (10% sample, 10s min duration) - ph-no-capture class applied to AI provider API keys and connection secrets so they are fully blocked from replays - Frontend exception capture enabled ($exception events) Team analytics: - \$groupidentify fires on organization creation - All server events include groups: { organization: org_id } for team-level filtering and breakdowns Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): track home page events — tiles, tools popover, connections dialog, recruit modals Wires structured PostHog events for the home-page surface identified in the page-by-page audit: - Home agent tiles (template/existing/recent), Create agent, See all - Chat mode toggles from tools popover + pill-dismiss (plan/gen-image/web-search) - Image/search model selection from tools popover - Prompt insertion from tools popover - Connect-tools banner + dialog-opened (with source across all callers) - Connection add flows (use_existing / clone / connect_new) + OAuth boundaries - Recruit-modal confirmed/failed (site-diagnostics / ai-image / ai-research) - Deco.cx site import started/succeeded/failed * feat(analytics): track agent instructions/connect page events Adds structured PostHog events for the agent detail page (instructions / connections / layout) and the Connect share modal: - agent_subtab_changed — instructions/connections/layout switches - agent_instructions_template_inserted, agent_instructions_improve_clicked - agent_updated — on successful form save, lists dirty field roots and instructions length when dirty - agent_test_clicked, agent_delete_requested, agent_deleted - agent_connect_modal_opened + agent_connect_action (copy_url / install_cursor / install_claude_code / typegen_copy_command / typegen_copy_env) - agent_typegen_key_generated / _failed - agent_connection_removed, 
agent_connection_settings_opened, agent_connection_instance_switched, agent_connection_new_instance_requested - connection_oauth_succeeded / _failed on agent reauthenticate flow - main_panel_tab_clicked — top Instructions/Connections/Automations/ Layout/pinned-view tabs (with tab_kind + was_active) * feat(analytics): track tasks panel + chat message actions Tasks panel (left column): - tasks_panel_member_filter_changed — all/mine toggle - tasks_panel_filter_changed — all/manual/automation toggle - tasks_panel_new_clicked — pencil icon to create a new task - tasks_panel_task_clicked — row select (dedupes no-op re-clicks) - tasks_panel_task_archived — frontend intent (server-side chat_archived still fires through COLLECTION_THREADS_UPDATE) Chat message actions: - chat_message_copied — assistant message copy-to-clipboard, includes message_id + char count Chat input + model selector events on this surface were already wired in the home-page pass; nothing new to add there. * feat(analytics): track settings pages — general, connections, agents, automations, store, brand, AI providers, monitor, members/roles, SSO, profile Wires PostHog events for every settings screen: General: - organization_settings_updated (dirty fields) - organization_domain_claimed / _cleared - organization_auto_join_toggled Connections list: - connections_page_tab_changed, connections_custom_dialog_opened, connection_custom_created, connection_add_clicked (source=connections_page), connections_community_warning_confirmed, connection_oauth_succeeded/_failed (flow=connections_page_connect), connections_bulk_delete / _status_toggled / _add_to_agent Agents list: - agents_list_template_clicked, agent_create_clicked (source=agents_list/agents_list_empty, method), agent_deleted (source=agents_list) Automations: - automations_list_row_clicked, automations_empty_state_browse_agents_clicked - automation_improve_clicked, automation_updated, automation_test_clicked, automation_trigger_added (cron / event), 
automation_new_clicked Store: - store_private_registry_added / _removed - store_registry_toggled Brand Context: - brand_created, brand_extract_started / _succeeded - brand_updated, brand_archived / _restored, brand_set_as_default AI Providers: - ai_provider_connect_clicked (method) - ai_provider_oauth_succeeded / _failed - ai_provider_cli_activated / _activate_failed - ai_provider_provision_succeeded / _failed Monitor: - monitoring_tab_changed, monitoring_time_range_changed, monitoring_live_toggled Members: - member_invited, member_removed, member_role_updated, invitation_role_updated - role_created, role_updated, role_deleted, role_members_updated SSO: - sso_configured / _config_updated / _config_removed - sso_enforcement_toggled Profile & Preferences: - profile_updated - preferences_theme_changed, preferences_notifications_toggled / _permission_denied, preferences_sounds_toggled / _previewed, preferences_tool_approval_changed, preferences_experimental_vibecode_toggled * feat(analytics): patch recruit modal + oauth timeout + extract-failed gaps - agent_recruit_confirmed / _failed now also fire from lean-canvas-recruit-modal.tsx and studio-pack-recruit-modal.tsx - ai_provider_oauth_failed fires on the 2-minute OAuth timeout path (was previously silent) - brand_extract_failed fires on BRAND_CONTEXT_EXTRACT error - agent_deleted from virtual-mcp/index.tsx now passes source: "agent_detail" for consistency with agents_list * refactor(analytics): drop credits_topup_requested + session-based agent/automation_updated Removals: - credits_topup_requested: removed from AI_PROVIDER_TOPUP_URL tool handler. It was a near-duplicate of the frontend credits_topup_clicked in the standard UI flow, and neither is an authoritative payment event. Keep credits_topup_clicked as the intent signal. Session-based tracking for agent_updated and automation_updated: - Auto-saves still persist every ~1s (product behavior unchanged). - PostHog now emits one event per edit SESSION, not per save. 
- A session ends after 30s of quiet OR an explicit flush (sub-tab change / test / improve / delete). - New props on both events: save_count — how many auto-saves occurred during the session edit_duration_ms — Date.now() delta from first save in session 'fields' is now the union of all dirty fields during the session. - Cuts event volume ~10-15x for a typical instructions edit. * docs(analytics): PostHog events catalog, review, and dashboards proposal Temporary reference docs for the PostHog instrumentation review. Three files at repo root so they're easy to share and easy to delete later: - posthog-events-catalog.md — every tracked event with exact trigger + props + misleading- interpretation guards - posthog-events-review.md — T1/T2/T3 triage, trigger-correctness pass, fixed/open gaps - posthog-events-dashboards.md — 14 dashboard proposals + 17 correlation questions + "Do-NOT labels" guardrails These are NOT the Astro docs site — delete them once the dashboards are built and the catalog lives in a better home. * feat(analytics): track signed_out event from both sign-out call sites Fires before authClient.signOut() so the event still carries the user's distinct_id; PostHog reset() then clears identity for the next session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): track chat_tools_popover_opened on Tools button click Discovery signal — the inner items already track their own actions (chat_mode_changed, chat_prompt_inserted, chat_image_model_selected, chat_search_model_selected) but opening the popover itself was untracked, so we couldn't measure the open→action funnel. Fires only on the open transition, not on close. Carries chat_mode for segmenting by current mode. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): add app_name to connection_created/deleted events Lets you break down connection adoption and churn by provider (Linear, Slack, HubSpot, etc.) 
directly in PostHog without joining against the connections table. Nullable — STDIO/HTTP connections without a registry app will report null. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): track agent_connection_attached at 5 attach points Authoritative agent-scoped attach signal — fires whenever a connection becomes attached to an agent regardless of whether the connection was brand-new, cloned, or reused. Closes the gap where the existing connection_created (server) only fired for new rows. Modes: existing | clone | new | custom. Carries agent_id, connection_id, app_name (nullable). Threaded via a new agentId prop on AddConnectionDialog (add mode only — browse mode keeps it optional). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(analytics): split ai_provider_oauth_timeout from oauth_failed; suppress race Two bugs at the same site: 1. Race: if the popup posts back and exchangeOAuth (the async token swap) takes longer than the remaining 2-min timeout window, the timeout would fire ai_provider_oauth_failed{error:"timeout"} alongside the eventual ai_provider_oauth_succeeded. User saw an error toast and a failed event even though the connection worked. 2. Semantics: a 2-min "user never came back from popup" timeout is user abandonment, not an OAuth-protocol failure. Mixing both into oauth_failed inflates the failure rate and obscures real exchange failures. Fix: - Local exchangeStarted flag in the effect — set when the popup posts back, checked by the timeout. Once exchange begins, its own onError handler is the authoritative failure signal. - New event ai_provider_oauth_timeout for the popup-abandonment case. - ai_provider_oauth_failed now only fires for actual exchange failures. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): report React error boundary catches to PostHog PostHog's capture_exceptions: true only sees what bubbles to window.onerror / unhandledrejection. React error boundaries catch render- and commit-phase errors BEFORE they reach the window, so anything that hits a boundary (the "removeChild" class, render crashes, etc.) was previously invisible to PostHog. - Add captureException wrapper to posthog-client (try/catch so an analytics failure never blocks the fallback UI). - Wire both ErrorBoundary and ChunkErrorBoundary componentDidCatch to call it with route + componentStack + boundary tag. The boundary prop ("default" / "chunk_root") lets you split React-boundary catches from autocapture in PostHog dashboards. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(analytics): remove planning docs from branch Moved to local Downloads folder; these were working notes (events catalog, dashboards proposal, review) that don't belong in the shipped PR. The event changes themselves are in the preceding commits; nothing in the code or dashboards references these files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): track member_invite_failed on invite mutation error The success path fired member_invited; the error path only showed a toast, so invite failures were invisible in PostHog. Now captures count, role, and error message on failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): track failure counterparts for silent onError paths Several mutations fired success events but swallowed errors with a toast-only onError, making failures invisible in PostHog. Added matching _failed events mirroring the success event's props + an error field. 
Covers 8 gaps: - member_remove_failed - member_role_update_failed - invitation_role_update_failed - role_create_failed / role_update_failed / role_members_update_failed - role_delete_failed - organization_settings_update_failed - organization_domain_claim_failed - organization_domain_clear_failed - organization_auto_join_toggle_failed (Already-good paths like deco_site_import_failed, ai_provider_*_failed, brand_extract_failed, agent_recruit_failed are unchanged.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): track user_signup_failed + user_signin_failed The auth form's emailPasswordMutation had no tracking at all — neither success nor failure. The server-side user_signed_up fires only AFTER a DB row is created, so pre-insert failures (network, validation, email-already-exists, weak password) were completely invisible in PostHog. Since the same mutation handles both signup and signin, the onError branches on isSignUp to fire the right event: - user_signup_failed - user_signin_failed Success path intentionally left untracked: the authoritative signal is the server-side user_signed_up (signup) or presence of the session cookie on subsequent requests (signin). No client-side duplicate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(analytics): track password-reset and email-OTP auth flows Fills the tracking gap on the remaining 3 auth mutations in unified-auth-form.tsx. New events: - password_reset_requested + password_reset_request_failed - email_otp_sent + email_otp_send_failed - email_otp_verify_failed Success for sendOtp / password-reset is tracked because those are intermediate states (user stays on the form waiting for email). Success for verifyOtp is NOT tracked — it redirects on success, matching the signin pattern where the session cookie is authoritative. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(analytics): remove unused setOrganizationGroup export Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
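The session-based tracking for agent_updated and automation_updated described in this commit could be sketched roughly as follows. The property names save_count, edit_duration_ms, and fields and the 30s quiet window come from the commit message; the createEditSessionTracker helper, its signature, and the injected clock are hypothetical and not taken from the PR.

```typescript
type Capture = (event: string, props: Record<string, unknown>) => void;

interface EditSession {
  fields: Set<string>; // union of dirty fields seen during the session
  saveCount: number; // number of auto-saves in the session
  firstSaveAt: number; // timestamp of the first save in the session
  timer?: ReturnType<typeof setTimeout>;
}

// Hypothetical helper: coalesces many auto-saves into one analytics event.
function createEditSessionTracker(
  event: string,
  capture: Capture,
  quietMs: number = 30_000,
  now: () => number = Date.now,
) {
  let session: EditSession | null = null;

  // Emit one event for the whole session, then reset.
  function flush(): void {
    if (session === null) return;
    const done = session;
    session = null;
    if (done.timer !== undefined) clearTimeout(done.timer);
    capture(event, {
      fields: [...done.fields].sort(),
      save_count: done.saveCount,
      edit_duration_ms: now() - done.firstSaveAt,
    });
  }

  // Called on every auto-save; restarts the quiet-period timer.
  function recordSave(dirtyFields: string[]): void {
    const current: EditSession =
      session ?? { fields: new Set<string>(), saveCount: 0, firstSaveAt: now() };
    session = current;
    for (const field of dirtyFields) current.fields.add(field);
    current.saveCount += 1;
    if (current.timer !== undefined) clearTimeout(current.timer);
    current.timer = setTimeout(flush, quietMs);
  }

  return { recordSave, flush };
}
```

In this sketch an explicit flush (sub-tab change, test, improve, delete) would call flush() directly, while the timer covers the quiet-period case.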
…r bugs (#3177) * fix(simple-model-mode): gate on provider availability and fix selector bugs - Disable toggle when no AI provider is connected; clear stale draft when providers are removed so reconnecting a different provider doesn't carry over unavailable model selections - Auto-fill defaults reactively when models finish loading, clearing slots whose keyId no longer exists - Resolve correct provider logo via the key's actual providerId (was hardcoded to "deco") - Add claude-code to FAST_MODEL_PREFERENCES so Haiku is picked as default - Hide Image/Web research selector with "Not available" note when the current provider has no matching models - Fix modal credential switcher reverting selection — slot sync now runs only when slot.keyId actually transitions, not on every render Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style(simple-model-mode): clean up settings panel layout Remove row dividers, hide Save button when no provider is connected, move spacing so the toggle row has no padding when collapsed, and separate Chat/Other model sections with a single divider instead of per-row borders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(simple-model-mode): address review feedback - Fix dead guard in model-sync effect: compare chat tiers field-by-field instead of identity-comparing a freshly built object against state, which was always false and caused setDraft on every models/keys change. - Avoid flashing "Not available with current provider" on filtered rows while useAiProviderModels is still loading by gating on isLoading. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(org-settings): extract SimpleModeConfig zod schemas into shared schema module Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(org-settings): expose and accept simple_mode in ORGANIZATION_SETTINGS_GET/UPDATE Adds simple_mode to the generic org-settings tool schemas (input for UPDATE, output for both) so callers can read/write this field through the same pair of tools as sidebar_items, enabled_plugins, and registry_config. New tests cover round-trip behavior and verify that partial updates do not clobber unrelated fields. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(org-settings): add unified useOrganizationSettings hook with slice wrappers Introduces a single query + mutation hook targeting organization_settings, plus thin named wrappers (useSimpleMode, useUpdateSimpleMode, useRegistryConfig, useUpdateRegistryConfig, useEnabledPlugins) that share one query key and a setQueryData-based write path. Existing callers will migrate in subsequent commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(web): route simple-mode consumers through useOrganizationSettings Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(web): route registry-config consumers through useOrganizationSettings Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(web): route plugins form and shell layout through useOrganizationSettings Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(web): delete use-ai-simple-mode and use-registry-settings hooks Migrates the remaining three registry consumers (use-install-from-registry, use-enabled-registries, use-registry-connections) to the unified useOrganizationSettings hook and its useIsRegistryEnabled / useRegistryConfig wrappers, then deletes the two legacy hook files. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(query-keys): remove aiSimpleMode and registryConfig keys Both slices now share KEYS.organizationSettings; the dedicated keys are no longer referenced anywhere. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(simple-mode)!: delete AI_SIMPLE_MODE_GET/UPDATE tools Consolidated into ORGANIZATION_SETTINGS_GET/UPDATE in an earlier commit. Drops the dedicated tool files, their exports from the ai-providers registry, their CORE_TOOLS registration, and their entries in the registry-metadata name/description/category maps. BREAKING CHANGE: external callers of AI_SIMPLE_MODE_GET / AI_SIMPLE_MODE_UPDATE must switch to ORGANIZATION_SETTINGS_GET / ORGANIZATION_SETTINGS_UPDATE with the simple_mode field. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(org-settings): drop unused exports flagged by knip Unexports the internal-only ModelSlotSchema and useOrganizationSettings, and deletes useEnabledPlugins (shell-layout uses the suspense variant instead). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(simple-mode): migrate SimpleModeSection to react-hook-form Drops the useState<draft> + synced-boolean + JSON.stringify-isDirty state machine in favor of useForm({ values: simpleMode, resolver, mode: onChange }). Each model row is now wrapped in a react-hook-form Controller. The explicit Save button and behavior are preserved — autosave lands in a follow-up commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(simple-mode): autosave form changes with stateless status indicator Replaces the explicit Save button with a 250ms-debounced autosave effect watching form state. The debounce coalesces multi-field writes (toggle-on defaults, stale-key clearing) into a single mutation. 
A dumb AutosaveStatus component next to the card title shows "Saving…" or "Saved" as a pure derivation of mutation + form state — no local booleans, no timers. On mutation error the form reverts to the last-known-good server value and a toast surfaces the error. The success toast is removed to avoid spamming on every dropdown change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(chat-context): store ModelRef instead of full AiProviderModel Collapses five localStorage keys to four and drops hundreds of bytes of cached metadata per slot (title/description/logo/capabilities/limits/costs). Makes Simple Mode and regular chat-model resolution mutually exclusive: when Simple Mode is enabled the stored pick is not consulted, eliminating the silent-shadowing fallback chain the UI had no way to communicate to users. credentialId becomes session-only state — its only role is letting the picker browse a credential before the user commits. On commit (setModel) the session override clears and the model's keyId becomes the source of truth. All stored refs now flow through a single findModel validator that clears stale values from localStorage when they reference deleted keys or models. The main chat model used to skip this validation while image and deep-research did it; the asymmetry is gone. chatSimpleModeTier validation resolves the orphan case: a stored tier that is not configured on the server silently falls through to the first configured tier, eliminating stale reactivation when Simple Mode is re-enabled later. LOCALSTORAGE_KEYS.chatSelectedKeyId is removed — it was a pure duplicate of chatSelectedModel.keyId. Existing values in users' localStorage are harmless (~30 bytes, unreferenced from now on). 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(chat-context): don't write localStorage during render The initial ref-is-stale cleanup in the validation pass called setStoredChatRef(null) synchronously during render, which is a state-set-during-render anti-pattern the project avoids. Drop the on-read cleanup. Stale refs stay on disk harmlessly: validation returns null, resolution falls through to the default, and the next setModel call overwrites the ref cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(chat-context): Simple Mode slots synthesize model when key exists findModel was rejecting Simple Mode slots whose model didn't appear in the current credential's model list — which is the common case, since allKeyModels is only fetched for effectiveKeyId, and Simple Mode slots typically reference a different credential. That made selectedModel null and disabled the send button. Restore the old behavior of synthesizing a minimal AiProviderModel from the slot's { keyId, modelId, title } when the key still exists. Still enforce the key-existence check introduced in the refactor — admin-deleted providers still produce null. Pass the slot's title through to the synthesized object so the picker label reflects the configured tier. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(chat-context): fetch models per Simple Mode slot for real capabilities Previously, Simple Mode slots pointing at a different credential than effectiveKeyId resolved via synthesize-from-ref with capabilities: []. That broke UI gates like file upload: the picker thought Sonnet (as a Simple Mode Smart slot) had no file capability, so the attachment UI was disabled. Fetch models per slot keyId via useAiProviderModels — React Query's per- query cache keeps the additional fetches cheap, and each hook short-circuits when the slot is unset (enabled: false). 
findModel now receives the slot's own key's models list and returns the real AiProviderModel with full capabilities; the synthesize fallback only triggers if the slot's key still exists but that key's model list hasn't loaded yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(chat-context): match findModel by modelId only, attach keyId to hit The AiProviderModel objects returned by AI_PROVIDERS_LIST_MODELS don't carry a keyId field — it's a client-side-only marker injected downstream (see selectDefaultModel's withKey helper). My findModel was requiring m.keyId === ref.keyId, which always failed against real API responses, pushing every lookup into the synthesize fallback with capabilities: []. That's why Simple Mode's Sonnet slot resolved with no "vision" capability and the file-upload UI stayed locked out. Match by modelId only within the provided model list (list is already scoped to one credential), then spread the hit with ref.keyId attached. Synthesize only fires when the model truly isn't in the list — list still loading or the user manually corrupted localStorage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: gimenes <tlgimenes@gmail.com>
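The final findModel behavior described across these chat-context fixes (match by modelId only, attach the ref's keyId to the hit, synthesize only when the key exists but the model is not in the list) might look roughly like the sketch below. The ModelRef and AiProviderModel shapes are simplified assumptions, and the keyExists parameter is a hypothetical stand-in for however the real code checks that the credential still exists.

```typescript
// Simplified, assumed shapes; the real types carry more fields.
interface ModelRef {
  keyId: string;
  modelId: string;
  title?: string;
}

interface AiProviderModel {
  modelId: string;
  title: string;
  capabilities: string[];
  keyId?: string; // client-side marker, absent from API responses
}

function findModel(
  ref: ModelRef,
  models: AiProviderModel[], // assumed to be scoped to ref.keyId's credential
  keyExists: boolean,
): AiProviderModel | null {
  // A deleted credential always resolves to null.
  if (!keyExists) return null;

  // Match by modelId only: API responses do not include keyId.
  const hit = models.find((m) => m.modelId === ref.modelId);
  if (hit !== undefined) return { ...hit, keyId: ref.keyId };

  // Fallback while the list is still loading: minimal synthesized model.
  return {
    modelId: ref.modelId,
    title: ref.title ?? ref.modelId,
    capabilities: [],
    keyId: ref.keyId,
  };
}
```

The design point is that real capabilities survive whenever the model is in the list, so UI gates such as file upload are only locked out in the synthesize fallback.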
Port the in-VM daemon from JS-string scripts to a TypeScript package (@decocms/sandbox) running under Bun.serve with web-standard Request/Response. Drop the mesh-side proxy at /api/sandbox/:handle/_decopilot_vm/* in favor of direct previewUrl access from the web client. Bundle the daemon into the Docker image, remove bearer-token auth on _decopilot_vm/*, and route all traffic through the daemon port.
- Rename packages/mesh-plugin-user-sandbox → packages/sandbox
- Port daemon modules: config, paths, auth, events (sse/replay/broadcast),
process (run-process, dev-autostart, script-discovery), routes (bash, fs,
exec, kill, scripts, body-parser, events-stream, health), setup (clone,
identity, branch, install, resume, orchestrator), git (branch-status,
git-sync), probe, proxy, entry
- DaemonHealth contract with bootId; persist daemonBootId from /health
- Switch UI + vm-tools to /_decopilot_vm/* with base64 bodies
- Direct previewUrl wiring for VmEventsProvider and env.tsx exec/kill
- Auto-start owns dev lifecycle; drop explicit /dev/start, /dev/stop
- CI: bun-build step, docker smoke job, ripgrep install for e2e
- Drop translateDaemonPath, daemon-script.ts, dev-server.ts; tests relocate
to packages/sandbox/daemon/
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iterate on the GitTab UI for github-linked virtualmcps:
- PrOverview header: title with inline PR # link, author, base ← head
- PrSubTabs: text-style tab bar (Description / Changes {n} / Checks)
with sliding underline indicator that animates between triggers;
drop h-[52px] and border-b chrome
- DescriptionTab: drop duplicate h1 and bordered body card
- ChangesTab and ChecksTab: drop outer padding (owned by page container)
- Real usePrByBranch state machine drives State B / C / D — no mocks
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
e19b7d6 to
a5df48e
Compare
Folds in the original three vm-start-hang commits (port to Bun.serve, sync bun.lock, de-flake e2e + ripgrep). The same content is already present in our squashed feat(sandbox) commit, so this merge introduces no tree changes, only a history-graph link.
Conflict resolution:
- bun.lock, runner.ts: keep ours (post-bundling daemon)
- daemon-script.ts: keep deleted (replaced by packages/sandbox/daemon/)
- daemon-script.e2e.test.ts: drop (replaced by daemon.e2e.test.ts)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…llision
Two follow-ups for the slugified-sandbox handle work:
1. Add a test for the adopt-by-label path. `findExisting` now returns a
container *name* (not an ID) since it queries with `--format
{{.Names}}`. The new test asserts that a labeled, already-running
container is adopted via name and that `docker run` is not called
again.
2. Defensively recover from `--name` collisions in `provision()`.
`findExisting` only adopts *running* containers, so a stopped
same-name orphan left behind by a crash that bypassed `--rm`
cleanup will collide on the explicit `--name`. Detect the
"is already in use" error, force-remove the orphan, and retry
`startContainer` once. Covered by a new test.
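The recover-and-retry-once flow from item 2 can be sketched as below. Only the "is already in use" error text, the force-remove step, and the single retry of `startContainer` come from the commit message; the function names isNameCollision and startWithCollisionRecovery and the injected callbacks are hypothetical, chosen so the sketch runs without Docker.

```typescript
// Detects the Docker name-conflict error by its message text.
function isNameCollision(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return message.includes("is already in use");
}

// Hypothetical wrapper: the callbacks stand in for the real docker calls.
async function startWithCollisionRecovery(
  startContainer: () => Promise<string>,
  forceRemoveOrphan: () => Promise<void>,
): Promise<string> {
  try {
    return await startContainer();
  } catch (err) {
    // Any other failure propagates unchanged.
    if (!isNameCollision(err)) throw err;
    // Remove the stopped same-name orphan, then retry exactly once.
    await forceRemoveOrphan();
    return startContainer();
  }
}
```

Retrying only once keeps a persistent conflict from looping: a second collision surfaces as a normal error to the caller.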
…factor
Bundled commit of in-progress changes across:
- chat URL-state cleanup (chat-context, use-chat-navigation,
side-panel-chat, use-task-manager), preparing thread.branch as the
single source of truth
- agent-shell layout + main-panel tabs reshuffle
- vm preview/env panel polish
- sandbox docker runner / local-ingress refinements
- packages/sandbox README updates
Committed as a single checkpoint to keep the upcoming task-creation unification refactor (per docs/superpowers/plans/2026-04-25-task-creation-unification.md) on a clean base.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Look up the vMCP on create, derive branch from githubRepo metadata server-side, and set idempotentHint=true on the tool annotations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…urning existing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds an invariant check in COLLECTION_THREADS_UPDATE: if the thread's vMCP has metadata.githubRepo, setting branch=null is rejected with an error. Switching to a different non-null branch remains allowed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
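A rough sketch of that invariant, assuming a simplified shape for the vMCP and the update patch. `VirtualMcp`, `ThreadPatch`, `assertBranchInvariant`, and the error text are illustrative names, not the tool's real types.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface VirtualMcp {
  metadata?: { githubRepo?: unknown };
}
interface ThreadPatch {
  branch?: string | null; // undefined means "branch not part of this update"
}

function assertBranchInvariant(vmcp: VirtualMcp | null, patch: ThreadPatch): void {
  const repoBacked = vmcp?.metadata?.githubRepo != null;
  // Only an explicit null is rejected; omitting branch or switching to another
  // non-null branch stays allowed.
  if (repoBacked && patch.branch === null) {
    throw new Error("branch cannot be set to null for a thread whose vMCP has metadata.githubRepo");
  }
}
```

The sketch distinguishes `null` from `undefined` so that updates touching other fields are not affected; whether the real tool treats an omitted branch this way is an assumption.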
…issing fallback
- Remove fallback create path from createMemory; it now throws if thread_id is missing or thread not found
- Drop triggerId, virtualMcpId, branch from MemoryConfig (thread row already carries that data)
- Remove unused generatePrefixedId import from memory.ts
- Add guard in stream-core.ts to surface missing taskId early
- Add memory.test.ts covering success and not-found cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Thin collection-pattern wrappers backed by COLLECTION_THREADS_* tools, mirroring useConnection/useConnections/useConnectionActions. Task type adapts ThreadEntity to satisfy CollectionEntity (updated_by null→undefined).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire /$org/$taskId to a real component that calls useEnsureTask and renders a "Creating task…" boundary while the mutation is in flight, delegating to the surrounding layout once the task is ready.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the `virtualMcpOverride` URL search param and `setVirtualMcpOverride`/`setVirtualMcpId` ephemeral override mechanism. Navigation now uses a single `virtualmcpid` param; automation-detail passes the target agent via `createTaskWithMessage({ virtualMcpId })` instead of calling the now-deleted prefs setter.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… addTaskToCache
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e-only
Route loader now handles task creation; navigating to a fresh id is sufficient.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the "New" button and generateBranchName import from the branch picker. Users wanting a fresh branch should click "+ New Task" instead, which triggers server-side branch generation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The /$org/$taskId route loader's TaskRoute component never mounted because agent-shell-layout doesn't render <Outlet />; it composes the chat UI directly. Result: useEnsureTask never ran, threads were never created server-side, and the empty-state branch picker showed "Select branch…" instead of a real branch.
Move the create-on-404 gate into AgentInsetProvider (which actually renders) and short-circuit to a "Creating task…" boundary while the mutation is in flight. Drop the now-dead routes/orgs/task-route.tsx.
Verified: visiting /<org>/<random-uuid>?virtualmcpid=<vmcp-with-github> creates the thread server-side with branch=deco/<adj>-<noun> and the empty state shows that branch.
The empty-state branch picker reads from chat-context's tasks.find(t => t.id === effectiveTaskId).branch, where tasks comes from useTaskManager's useTasks hook (legacy KEYS.tasksPrefix query). useCollectionActions only invalidates queries shaped [client, scopeKey, "", "collection", ...]; KEYS.tasks doesn't match that predicate, so the list stays stale and the picker shows "Select branch…" until an unrelated SSE event happens to refetch.
After a successful create, also invalidate KEYS.tasksPrefix(locator) so the picker reflects the server-generated branch immediately.
Verified: clicking "New tasks" from an existing /$org/$taskId navigates to a fresh id, the route's create-on-404 fires, and the empty state shows the server-generated branch (e.g. deco/true-fern) without waiting on SSE.
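To illustrate why the legacy list was missed, here is a library-free sketch of the two matching rules. The collection key shape is the one quoted above; the `["tasks", locator]` shape returned by `tasksPrefix` is a hypothetical stand-in, since the real `KEYS.tasksPrefix` layout is not shown in this PR, and all function names are illustrative.

```typescript
type QueryKey = readonly unknown[];

// Shape quoted in the commit message: [client, scopeKey, "", "collection", ...]
function isCollectionKey(key: QueryKey): boolean {
  return key.length >= 4 && key[2] === "" && key[3] === "collection";
}

// Hypothetical stand-in for KEYS.tasksPrefix(locator); the real shape may differ.
function tasksPrefix(locator: string): QueryKey {
  return ["tasks", locator];
}

function isPrefixOf(prefix: QueryKey, key: QueryKey): boolean {
  return prefix.length <= key.length && prefix.every((part, i) => part === key[i]);
}

// After a successful create, invalidate both the collection cache and the legacy list.
function keysToInvalidate(cached: QueryKey[], locator: string): QueryKey[] {
  const prefix = tasksPrefix(locator);
  return cached.filter((key) => isCollectionKey(key) || isPrefixOf(prefix, key));
}
```

With only `isCollectionKey` in play, a key such as `["tasks", locator, "list"]` is never selected, which matches the stale-picker symptom described in the commit.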
AgentInsetProvider does not unmount across task navigations, so the hook (and the useTaskActions mutation it owns) persists. The boolean createStartedRef and the actions.create.status === "idle" guard both stuck after the first successful create: every subsequent "+ New task" click sat in the "Creating task…" boundary forever because the gate refused to re-fire for the new id.
Track the id we last fired for instead. Refs mutate synchronously so the gate stays Strict-Mode safe.
Verified: three consecutive "New tasks" clicks each produce a fresh server-generated branch (deco/thin-stone → deco/hollow-flint → …) and the empty-state branch picker reflects each one immediately.
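The difference between a boolean gate and an id-keyed gate can be shown without React. In this sketch `createEnsureGate` and `shouldFire` are hypothetical names, and a closure variable stands in for the ref.

```typescript
// A closure variable stands in for the React ref in this sketch.
function createEnsureGate(): (id: string, exists: boolean) => boolean {
  let lastFiredFor: string | null = null;
  return function shouldFire(id: string, exists: boolean): boolean {
    // Skip when the task already exists or when create was already fired for this id.
    if (exists || lastFiredFor === id) return false;
    // Record synchronously so a second render in the same tick sees the update.
    lastFiredFor = id;
    return true;
  };
}
```

A boolean flag would stay `true` after the first create and block every later id, whereas comparing against the last id lets each new task id fire exactly once.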
The previous version leaned on a useRef gate to dedupe the create
mutation across renders, which the React Compiler can't reason about
the way it can about effects. Refactor:
- Replace the render-time gate + ref with a single useEffect whose
dependency array re-runs on (id, query.isSuccess, query.data,
ensureCreate).
- Own the create mutation locally via useMutation instead of routing
through useTaskActions(). This drops the user-facing
"Item created successfully" toast for ensure-create (the user did
not initiate it) and lets the hook control its own onSuccess
invalidation: the canonical collection cache, the legacy
KEYS.tasksPrefix list, and the local ensure query refetch.
- React 19 Strict Mode dev double-mount stays silent because the
server's INSERT … ON CONFLICT DO NOTHING handles duplicate
requests with no row collision and the private mutation has no
toast.
- Remove the isNotFoundError helper (the COLLECTION_THREADS_GET tool
returns { item: null } on missing, never throws "not found").
Verified live with two back-to-back "+ New task" clicks: each spawns
a fresh server-generated branch (deco/lunar-anchor → deco/olive-sage)
and the empty-state branch picker reflects each one immediately.
What is this contribution about?
Consolidates the two daemon codebases (Freestyle `Bun.serve` generated-string + Docker/K8s `node:http` prebaked) into a single modular TypeScript source at `packages/sandbox/daemon/` that every runner bundles via `bun build` and ships. Renames the package from `mesh-plugin-user-sandbox` → `@decocms/sandbox`. Docker image base moves to `oven/bun:1.3.13-debian` (1.3.11 doesn't exist on Docker Hub; latest in the 1.3 series picked instead). Unifies the wire protocol on `/_decopilot_vm/*` paths + base64-wrapped JSON, and dev-server lifecycle on auto-start. Adds `/health` with `bootId` (captured + persisted by every runner for future restart-detection).
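As a rough illustration of "base64-wrapped JSON", the helpers below assume the request body is the base64 encoding of the UTF-8 JSON text. That reading is an assumption; the daemon's actual framing is not spelled out in this description, and `encodeBody` / `decodeBody` are hypothetical names.

```typescript
// Assumption: body = base64(UTF-8(JSON.stringify(payload))).
function encodeBody(payload: unknown): string {
  const bytes = new TextEncoder().encode(JSON.stringify(payload));
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

function decodeBody<T>(body: string): T {
  const binary = atob(body);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes)) as T;
}
```

Going through `TextEncoder` keeps non-ASCII payloads intact, since `btoa` alone only accepts Latin-1 input. The helpers use only web-standard globals, so the same code would run in the browser and in Bun.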
Stacked on top of PR #3175 — that PR's freestyle Bun daemon is what this refactor extracts into modular form. When #3175 lands this rebases onto main cleanly.
Full design + implementation plan in `docs/superpowers/{specs,plans}/2026-04-24-unified-sandbox-daemon*` (local only; `/docs` is gitignored in this repo).
How to Test
Migration Notes
Docker containers running the old `daemon.mjs` expose `/_daemon/*` and return `{ ok: true }` on `/health` (no `bootId`). The updated `probeDaemonHealth()` returns `null` for that shape, signalling incompatible-daemon → the adopt logic force-recreates. No-op for Freestyle (VMs are ephemeral, 1800s idle TTL).
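The shape check behind that migration path might look like the sketch below. `parseDaemonHealth` and `shouldRecreate` are hypothetical pure helpers written for illustration; the real `probeDaemonHealth()` presumably also performs the HTTP request, which is omitted here.

```typescript
interface DaemonHealth {
  bootId: string;
}

// Returns null for any body that lacks a usable bootId, including the legacy { ok: true }.
function parseDaemonHealth(body: unknown): DaemonHealth | null {
  if (typeof body !== "object" || body === null) return null;
  const bootId = (body as Record<string, unknown>).bootId;
  return typeof bootId === "string" && bootId.length > 0 ? { bootId } : null;
}

// Adopt logic: a null probe result means an incompatible daemon, so force-recreate.
function shouldRecreate(healthBody: unknown): boolean {
  return parseDaemonHealth(healthBody) === null;
}
```

Treating "no `bootId`" as incompatible means old containers are replaced without needing an explicit version field.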
Review Checklist
🤖 Generated with Claude Code
Summary by cubic
Unifies the freestyle, Docker, and k8s sandbox daemons into a single Bun-bundled TypeScript service and moves all callers to the unified `/_decopilot_vm/*` API with direct browser access via each VM’s `previewUrl`. Adds idempotent task create-on-404 with cache invalidation so server-generated branches appear immediately; ensure-create is now effect-based, Strict-Mode safe, and reliably handles back-to-back navigations.
Refactors
- `packages/sandbox/daemon` bundled to `daemon/dist/daemon.js`; package renamed to `@decocms/sandbox` and imports updated across `apps/mesh`. Old `mesh-plugin-user-sandbox` daemon/image code and the mesh passthrough route are removed.
- … `/_decopilot_vm/*` (SSE at `/_decopilot_vm/events`) with base64-wrapped JSON bodies. The web UI and VM tools now talk to the daemon via each VM’s `previewUrl` (no `/api/sandbox/<id>` proxy); control-plane endpoints are unauthenticated with CORS.
- … `--name`; adopt already-running containers by name and recover from `--name` collisions. Local dev domains use `<handle>.localhost:7070`.
- … `bootId`; branch-status and process/script snapshots stream over SSE; autostart discovers and runs `dev`/`start`.
- … `virtual_mcp_id`) with branch derived from GitHub metadata; route-level create-on-404 ensures tasks exist. Memory now requires an existing thread; `stream-core` hard-requires `taskId`.
- `packages/sandbox/image/Dockerfile` on `oven/bun:1.3.13-debian`.
Migration
- … `/_daemon/*` are incompatible; runners will recreate containers. Preview domains change to `<handle>.<root>` (local: `<handle>.localhost:7070`).
- … `@decocms/sandbox`, and call `previewUrl/_decopilot_vm/*` with base64 JSON bodies; bearer auth is no longer required for these endpoints.
Written for commit 01dbcc8. Summary will update on new commits.