refactor local docker sandbox provisioning #124

Open
markokraemer wants to merge 1070 commits into main from refactor/local-docker-multi-instance

markokraemer commented Mar 7, 2026

Summary

  • make local Docker sandbox provisioning instance-aware so multiple local sandboxes can coexist without container-name or port collisions
  • route preview, websocket, SSH, health, and frontend-managed sandbox actions through the selected local sandbox instead of assuming a single kortix-sandbox
  • allow sandbox/docker-compose.yml to be parameterized with unique project, container, volume, and port names for parallel local stacks

Verification

  • ran pnpm lint in apps/frontend
  • created two local Docker sandboxes concurrently and verified unique container names, unique base URLs, unique SANDBOX_ID values, and reachable health endpoints
  • validated sandbox/docker-compose.yml with distinct COMPOSE_PROJECT_NAME, SANDBOX_CONTAINER_NAME, and SANDBOX_VOLUME_NAME overrides via docker compose config
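
The parameterization described above can be sketched as a small helper that derives one set of overrides per local sandbox. This is only an illustration, not the PR's implementation: `COMPOSE_PROJECT_NAME`, `SANDBOX_CONTAINER_NAME`, and `SANDBOX_VOLUME_NAME` come from the description, while `composeEnvFor`, `SANDBOX_HOST_PORT`, the `kortix-sandbox-` naming scheme, and the port numbers are assumptions.

```typescript
// Hypothetical helper: derives per-instance overrides for sandbox/docker-compose.yml.
// The three *_NAME variables appear in the PR description; everything else
// (SANDBOX_HOST_PORT, the naming scheme, the port numbers) is assumed.
const BASE_PORT = 14000; // assumed first host port
const PORT_STRIDE = 10; // assumed gap so one instance can map several ports

type ComposeEnv = Record<string, string>;

function composeEnvFor(instanceId: string, index: number): ComposeEnv {
  // Compose project names allow only lowercase letters, digits, '-' and '_'.
  const slug = instanceId.toLowerCase().replace(/[^a-z0-9_-]/g, "-");
  const name = `kortix-sandbox-${slug}`;
  return {
    COMPOSE_PROJECT_NAME: name,
    SANDBOX_CONTAINER_NAME: name,
    SANDBOX_VOLUME_NAME: `${name}-data`,
    SANDBOX_HOST_PORT: String(BASE_PORT + index * PORT_STRIDE),
  };
}
```

Exporting the returned variables before running `docker compose config` would mirror the validation step listed under Verification.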

Ino-Bagaric and others added 30 commits February 27, 2026 17:25
…models only

- New get_tool_output tool lets agent retrieve full original output of pruned tool results
- Store raw outputs as files in /workspace/.kortix/tool-outputs/{obs_id}.txt (Anthropic only)
- Gate applySessionPruning in router to cachingStrategy === 'manual' (Anthropic)
- Inject pruning awareness into synthetic message only for Anthropic models
- 24h cleanup of stale output files on session start
- Add get_tool_output to SKIP_TOOLS in extract.ts
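
A minimal sketch of the storage path and the 24h cleanup rule from the list above. The directory and the `{obs_id}.txt` pattern are taken from the commit message; `toolOutputPath`, `staleOutputs`, and the `OutputFile` shape are hypothetical names, and the real cleanup may select files differently.

```typescript
// Path pattern from the commit message: /workspace/.kortix/tool-outputs/{obs_id}.txt
const TOOL_OUTPUT_DIR = "/workspace/.kortix/tool-outputs";
const MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24h cleanup window

function toolOutputPath(obsId: string): string {
  return `${TOOL_OUTPUT_DIR}/${obsId}.txt`;
}

// Assumed shape: a file path plus its last-modified time in epoch milliseconds.
interface OutputFile {
  path: string;
  modifiedAtMs: number;
}

// Returns the paths of files older than the 24h window at time nowMs.
function staleOutputs(files: OutputFile[], nowMs: number): string[] {
  return files
    .filter((file) => nowMs - file.modifiedAtMs > MAX_AGE_MS)
    .map((file) => file.path);
}
```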
Previously the health endpoint always returned 200 even when OpenCode
was unreachable, making Docker and API health monitors believe the
sandbox was healthy while every actual request failed with 502.

Now returns 503 with status:'starting' until OpenCode is reachable,
and 200 with status:'ok' once ready. Updated all consumers
(sandbox-health monitor, fetchMasterJson, Docker healthcheck,
e2e tests) to handle the new 503 starting state correctly.
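
The new contract can be sketched as a pure mapping from OpenCode reachability to an HTTP status and body. The status codes and the `status` values come from the commit message; `healthResult`, `isStarting`, and the type names are hypothetical, and the real handler may return additional fields.

```typescript
type HealthStatus = "starting" | "ok";

interface HealthResult {
  httpStatus: 200 | 503;
  body: { status: HealthStatus };
}

// Server side: 503 + 'starting' until OpenCode is reachable, then 200 + 'ok'.
function healthResult(openCodeReachable: boolean): HealthResult {
  return openCodeReachable
    ? { httpStatus: 200, body: { status: "ok" } }
    : { httpStatus: 503, body: { status: "starting" } };
}

// Consumer side: a 503 carrying status 'starting' means "alive but not ready",
// which a monitor can treat differently from any other failure.
function isStarting(httpStatus: number, body: { status?: string }): boolean {
  return httpStatus === 503 && body.status === "starting";
}
```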
…e isn't ready

When /global/health returns 502 (OpenCode unreachable), the frontend now
falls back to /kortix/health to check if the sandbox itself is alive.
If it responds, shows a 'Starting up' state with blue spinner instead of
the misleading 'Unreachable' error.

- New 'starting' status in SandboxConnectionStatus union type
- Full-screen StartingState with blue Loader icon for first connections
- ReconnectBanner shows 'Starting up' with blue dot when reconnecting
- Server selector shows blue 'Starting...' badge for instances
- Toast notification: 'OpenCode is restarting...' on connected->starting
- Polls every 5s while in starting state

The kortix-api proxy returns 500 (not 502) when the sandbox container
itself is unreachable (ECONNREFUSED). The frontend now treats both
500 and 502 as signals to check /kortix/health for the starting state,
covering both 'sandbox starting' and 'OpenCode starting' scenarios.

…e UX overhaul

- Proxy (local-preview + daytona) now injects X-Forwarded-Prefix header with
  the full public base URL (e.g. http://localhost:8008/v1/p/sandbox/8000)
- Sandbox /docs/openapi.json reads the header to set the correct servers URL
  dynamically; falls back to descriptive placeholder when accessed directly
- Removed hardcoded api.kortix.ai curl example from created key dialog
- Removed public key display from all UI surfaces (key cards, sandbox token,
  created dialog, auth guide)
- Rewrote API keys page: clean list rows, rounded-2xl brand styling, compact
  create modal, simplified sandbox token section, usage hint footer
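
The `X-Forwarded-Prefix` handling in the first two bullets can be sketched as follows. This assumes header names arrive lowercased (as Node's HTTP server delivers them); `openApiServerUrl` and the placeholder text are hypothetical, not the sandbox's actual code.

```typescript
// Assumed placeholder shown when the docs are accessed directly (no proxy header).
const PLACEHOLDER_SERVER_URL = "http://<sandbox-host>:<port>";

// Picks the OpenAPI "servers" URL from the proxy-injected X-Forwarded-Prefix
// header, falling back to a descriptive placeholder when the header is absent.
function openApiServerUrl(headers: Record<string, string | undefined>): string {
  const prefix = headers["x-forwarded-prefix"]?.trim();
  if (!prefix) {
    return PLACEHOLDER_SERVER_URL;
  }
  // Drop trailing slashes so paths can be appended without doubling them.
  return prefix.replace(/\/+$/, "");
}
```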
The starting state detection never fires reliably because by the time
the API proxy returns 500, /kortix/health is also unreachable through
the same proxy. The existing reconnect pill handles the boot-up case
fine and auto-recovers once the sandbox comes up.

Backend 503 health endpoint change remains (working correctly).

…tale config

- Add full .env validation with Zod schema in config.ts — server refuses to start
  if required vars (DATABASE_URL, SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY,
  API_KEY_SECRET) are missing, with conditional checks for Daytona/Docker/Pipedream
- Auto-derive CRON_API_URL from PORT + DOCKER_HOST instead of manual env var
- Remove dead env vars (DAYTONA_SNAPSHOT, SANDBOX_IMAGE, AWS_BEARER_TOKEN_BEDROCK)
- Add SANDBOX_VERSION + GITHUB_TOKEN to schema, update version.ts to use config
- Create services/kortix-api/.env.example documenting all 93 schema keys
- Delete root .env.example monolith
- Remove NX entirely — replace with native pnpm --filter commands (-60 packages)
- Delete stale bun.lock, empty services/opencode/, clean .gitignore/.dockerignore
- Update setup-env.sh and pnpm-workspace.yaml to match
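
The fail-fast behaviour from the first bullet can be illustrated with a dependency-free stand-in. The commit uses a Zod schema in config.ts; the sketch below deliberately swaps that for plain TypeScript so it runs on its own, and it covers only the four required variables named in the message. `missingRequired` and `assertValidEnv` are hypothetical names.

```typescript
// Required variables named in the commit message. The real schema is written
// with Zod and also has conditional checks; this is a plain-TypeScript stand-in.
const REQUIRED_VARS = [
  "DATABASE_URL",
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
  "API_KEY_SECRET",
] as const;

type Env = Record<string, string | undefined>;

// Lists every required variable that is unset or blank.
function missingRequired(env: Env): string[] {
  return REQUIRED_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Throws at startup so the server refuses to run with an incomplete config.
function assertValidEnv(env: Env): void {
  const missing = missingRequired(env);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertValidEnv(process.env)` before the server binds its port would give the "refuses to start" behaviour described above.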
Replace opaque kortix/basic and kortix/power aliases with real model names:
- anthropic/claude-opus-4.6, claude-sonnet-4.6, claude-haiku-4.5
- openai/gpt-5.3-codex, minimax/minimax-m2.5, z-ai/glm-5
- moonshotai/kimi-k2.5, x-ai/grok-4.1-fast

All 8 models verified against OpenRouter. Updated opencode.jsonc,
model registry, cron defaults, slack commands, tests, and plugin.

Ino-Bagaric and others added 28 commits March 5, 2026 22:44

setup/initialize now returns instantly after creating the Stripe
subscription and kicks off sandbox provisioning in the background.

Added GET /billing/setup/status for the frontend to poll sandbox
readiness instead of blocking the HTTP request for 20+ seconds
which caused 502s from Cloudflare proxy timeouts.
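
The polling side can be sketched as a pure decision function that a frontend loop calls after each `GET /billing/setup/status` response. The endpoint is from the commit message; the response shape (`state`), the attempt limit, and the delay are assumptions, since the message does not describe the payload.

```typescript
// Assumed payload for GET /billing/setup/status; the real response may differ.
type SetupStatus = { state: "provisioning" | "ready" | "failed" };

type PollStep =
  | { action: "done" }
  | { action: "fail"; reason: string }
  | { action: "retry"; delayMs: number };

// Decides what the poller does after one status response.
// attempt is zero-based; maxAttempts and delayMs are illustrative defaults.
function nextPollStep(
  status: SetupStatus,
  attempt: number,
  maxAttempts: number = 60,
  delayMs: number = 2000,
): PollStep {
  if (status.state === "ready") return { action: "done" };
  if (status.state === "failed") return { action: "fail", reason: "provisioning failed" };
  if (attempt + 1 >= maxAttempts) return { action: "fail", reason: "timed out" };
  return { action: "retry", delayMs };
}
```

Each request stays short because the decision to wait lives in the client, which is what avoids holding one HTTP request open past the proxy timeout.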
The setting-up page was hitting the raw Hetzner IP directly from the
browser. Fixed to use getSandbox() + getSandboxUrl() which routes
through the platform proxy (/p/{externalId}/{port}). Also removed
sandbox_url from the status endpoint since it's not needed.
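
A sketch of the proxy routing described above, assuming the URL is simply the platform base plus `/p/{externalId}/{port}`. `buildProxyPath` and `buildProxyUrl` are hypothetical helpers for illustration and are not the actual `getSandboxUrl()` implementation.

```typescript
// Builds the proxy path /p/{externalId}/{port} from the commit message.
function buildProxyPath(externalId: string, port: number): string {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`Invalid port: ${port}`);
  }
  return `/p/${encodeURIComponent(externalId)}/${port}`;
}

// Joins an assumed platform base URL with the proxy path, so the browser
// talks to the platform proxy rather than to the sandbox host's raw IP.
function buildProxyUrl(platformBase: string, externalId: string, port: number): string {
  return platformBase.replace(/\/+$/, "") + buildProxyPath(externalId, port);
}
```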
Render reasoning as a compact collapsible block that stays visible after completion/reload, and prevent sync-store hydration from clobbering reasoning text during streaming updates.

Make the session response code copy control feel cleaner by using a softer border treatment, explicit pointer cursor, and removing harsh focus ring and shadow-like emphasis.

- Use client.auth.remove({ providerID }) as primary disconnect method
- Fall back to auth.set with empty key for backward compatibility (404/405)
- Add confirmation dialog before disconnecting to prevent accidental disconnections
- Fix UI refresh by using refetchQueries instead of invalidateQueries
- Remove unnecessary 'as any' type assertion
- Display provider name in confirmation message
- Style disconnect button as destructive with white text for visibility

Changes:
- Import AlertDialog components for confirmation UX
- Add confirmDisconnect state to track pending disconnection
- Rename handleDisconnect to confirmAndDisconnect for clarity
- Show confirmation dialog with provider name and re-auth warning
- Force refetch providers list after successful disconnect
- Preserve all existing behavior (loading state, dispose, onDirty callback)
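
The remove-then-fall-back flow from the first two bullets can be sketched as below. `client.auth.remove({ providerID })` and the empty-key `auth.set` come from the commit message; the `AuthClient` interface, the argument shape of `set`, and the assumption that errors expose a numeric `status` are hypothetical.

```typescript
// Assumed minimal client surface; the real SDK types may differ.
interface AuthClient {
  remove(args: { providerID: string }): Promise<void>;
  set(args: { providerID: string; key: string }): Promise<void>;
}

// Older servers without the remove endpoint are assumed to answer 404 or 405.
function shouldFallBack(err: unknown): boolean {
  const status = (err as { status?: unknown } | null | undefined)?.status;
  return status === 404 || status === 405;
}

// Tries the primary disconnect; on 404/405 clears the key instead.
async function disconnectProvider(
  auth: AuthClient,
  providerID: string,
): Promise<"removed" | "cleared"> {
  try {
    await auth.remove({ providerID });
    return "removed";
  } catch (err) {
    if (!shouldFallBack(err)) throw err; // unrelated failures still surface
    await auth.set({ providerID, key: "" });
    return "cleared";
  }
}
```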
- Change AlertDialogDescription from text-sm to text-xs
- Makes the confirmation message more compact and better proportioned

- Careers page: match homepage max-w-2xl editorial layout, add Shackleton image as mantra, remove motion/card UI
- Position pages (AI, Design): full refactor to editorial style with section labels, remove motion/SimpleFooter
- Remove SRE engineer position
- Footer: full-width px-5 layout matching navbar, remove double footers from pricing/tutorials/app pages
- Theme toggle: remove accent color dot indicator

vercel bot commented Mar 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| computer-frontend | Ready | Preview, Comment | Mar 7, 2026 1:36pm |

6 participants