Merged
121 commits
7adb7ad
docs: add GSD2 planning baseline for SCC v1 architecture
CCimen Apr 3, 2026
b5f784b
docs: stabilize m005 roadmap and gsd isolation
CCimen Apr 4, 2026
7fb0a59
feat: adopt provider-neutral launch boundary
CCimen Apr 4, 2026
d6c352e
docs: register runtime and safety milestones
CCimen Apr 4, 2026
9705dca
test: Added RuntimeProbe protocol, DockerRuntimeProbe adapter, FakeRunt…
CCimen Apr 4, 2026
589f99e
feat: Replaced three docker.check_docker_available() calls with probe-b…
CCimen Apr 4, 2026
cda11ad
test: Added tokenizer-based guardrail test preventing stale docker.chec…
CCimen Apr 4, 2026
6a2d9cf
test: Add preferred_backend field to RuntimeInfo and rootless detection…
CCimen Apr 4, 2026
93d22a9
feat: Added frozen ImageRef dataclass with full_ref()/image_ref() round…
CCimen Apr 4, 2026
9945383
test: Implemented OciSandboxRuntime adapter using docker create/start/e…
CCimen Apr 4, 2026
a538a98
feat: Wire bootstrap backend selection and start_session image routing…
CCimen Apr 4, 2026
85e3060
test: Added build_egress_plan() and compile_squid_acl() pure functions…
CCimen Apr 4, 2026
a824489
test: Created Squid proxy sidecar image definition, NetworkTopologyMana…
CCimen Apr 4, 2026
35a8825
test: Wire egress topology into OciSandboxRuntime with network-enforcem…
CCimen Apr 4, 2026
fd2a347
test: Added provider destination registry with anthropic-core/openai-co…
CCimen Apr 4, 2026
7327cc1
feat: Wired provider destination sets from SandboxSpec through OCI adap…
CCimen Apr 4, 2026
8dc43c8
feat: Added check_runtime_backend() doctor check and effective_egress s…
CCimen Apr 4, 2026
b5339c2
docs: Updated all stale network-mode vocabulary (isolated→locked-down-w…
CCimen Apr 4, 2026
160b8ce
docs: Added 5 guardrail tests in test_docs_truthfulness.py preventing s…
CCimen Apr 4, 2026
2fdf54a
chore: auto-commit after complete-milestone
CCimen Apr 4, 2026
c7f5920
feat: Added CommandFamily enum, lifted shell tokenizer from plugin into…
CCimen Apr 4, 2026
38da211
feat: Lifted all git safety analyzers from plugin into core with typed…
CCimen Apr 4, 2026
f0d0b69
test: Implemented DefaultSafetyEngine orchestrating shell tokenization…
CCimen Apr 4, 2026
ac92ec0
test: Created standalone scc_safety_eval package (9 files, stdlib-only)…
CCimen Apr 4, 2026
a2debb1
test: Created 7 shell wrappers, updated scc-base Dockerfile with python…
CCimen Apr 4, 2026
5ab8282
test: Added SafetyCheckResult dataclass, SafetyAdapter protocol, and Cl…
CCimen Apr 4, 2026
6fe0f88
test: Wired ClaudeSafetyAdapter and CodexSafetyAdapter into DefaultAdap…
CCimen Apr 4, 2026
d1fc728
test: Added fail-closed typed SafetyPolicy loader in core, doctor safet…
CCimen Apr 4, 2026
9871672
test: Added safety audit reader filtering safety.check events from JSON…
CCimen Apr 4, 2026
9f73f72
feat: Produced ranked maintainability audit with 184 table rows coverin…
CCimen Apr 4, 2026
0b8666b
test: Added 87 characterization tests across 4 files covering top-4 man…
CCimen Apr 4, 2026
15153bb
test: Added 197 characterization tests across 8 new files covering all…
CCimen Apr 4, 2026
45adeb3
fix: Cataloged 63 defects (24 mutable globals, 19 subprocess handling,…
CCimen Apr 4, 2026
e6786d4
feat: Decomposed 1084-line dashboard.py into three focused modules (mod…
CCimen Apr 4, 2026
ac83266
feat: Decomposed three oversized modules (1044, 839, 914 lines) into fo…
CCimen Apr 4, 2026
4216b13
feat: Extracted sandbox runtime functions from docker/launch.py (874→49…
CCimen Apr 4, 2026
1663fa1
feat: Decomposed three HARD-FAIL/MANDATORY-SPLIT command modules (1447,…
CCimen Apr 4, 2026
6610ea0
test: Decomposed five HARD-FAIL/MANDATORY-SPLIT UI modules (1492, 968,…
CCimen Apr 4, 2026
df8800b
test: Decomposed setup.py (1336→794 lines) and eliminated all HARD-FAIL…
CCimen Apr 4, 2026
82f86ff
test: Added 6 frozen model types (ArtifactKind, ArtifactInstallIntent,…
CCimen Apr 4, 2026
8c81427
chore: Added SafetyNetConfig, StatsConfig models and NormalizedOrgConfi…
CCimen Apr 4, 2026
5472724
feat: Converted compute_effective_config and 4 helpers from dict[str,An…
CCimen Apr 4, 2026
cf7be75
feat: Typed StartSessionRequest.org_config as NormalizedOrgConfig | Non…
CCimen Apr 4, 2026
9b3176a
feat: Deferred safety_policy_loader typing per user override — triggeri…
CCimen Apr 4, 2026
75f1ae1
fix: Created pure core bundle resolver (resolve_render_plan), extended…
CCimen Apr 4, 2026
37efb24
test: Created claude_renderer.py with render_claude_artifacts() project…
CCimen Apr 4, 2026
a57f2ad
test: Created codex_renderer.py projecting ArtifactRenderPlan into Code…
CCimen Apr 4, 2026
06bb375
test: Add RendererError exception hierarchy and fail-closed error handl…
CCimen Apr 4, 2026
bc6d652
test: Wire render_artifacts into AgentProvider protocol and launch pipe…
CCimen Apr 4, 2026
6600b7a
test: Added 59 contract tests covering all 9 bundle_resolver.py behavio…
CCimen Apr 4, 2026
a40c5ae
test: Added 40 characterization tests for Claude renderer achieving 100…
CCimen Apr 4, 2026
6076769
test: Added 48 characterization tests for Codex renderer achieving 100%…
CCimen Apr 4, 2026
9107222
test: Added 44 cross-provider pipeline integration tests covering share…
CCimen Apr 4, 2026
4b44a68
feat: Added three doctor checks (team context, bundle resolution, catal…
CCimen Apr 4, 2026
e335295
fix: Fixed four truthfulness gaps: Codex capability_profile, portable-a…
CCimen Apr 4, 2026
314fee6
feat: Removed xfail from function-size guardrail, extracted 4 oversized…
CCimen Apr 4, 2026
25c57e5
feat: Ran full M005 verification gate and validated all exit criteria;…
CCimen Apr 4, 2026
de3e49f
chore: auto-commit after complete-milestone
CCimen Apr 4, 2026
1e2e427
Fix doctor OCI backend reporting
CCimen Apr 4, 2026
d05f790
chore: Added pure provider resolver with CLI > config > default precede…
CCimen Apr 4, 2026
50d2874
feat: Added scc provider show/set commands, --provider flag on scc star…
CCimen Apr 4, 2026
84bb11d
feat: Wired provider resolution into the launch path with dict-based ad…
CCimen Apr 4, 2026
bd12730
test: Created CodexAgentRunner adapter with codex argv and .codex setti…
CCimen Apr 4, 2026
25f80fc
test: Wired CodexAgentRunner into provider dispatch table, threaded run…
CCimen Apr 4, 2026
587944f
test: Made OCI runtime exec command and credential volume mount provide…
CCimen Apr 4, 2026
ad70cb8
feat: Added get_provider_display_name() helper and replaced "Sandboxed…
CCimen Apr 5, 2026
75841b6
feat: Added display_name parameter to show_launch_panel(), show_launch_…
CCimen Apr 5, 2026
c3a9809
test: Swept all 'Claude Code' and 'Sandboxed Claude' references from no…
CCimen Apr 5, 2026
2ef334d
feat: Added provider_id field to SessionRecord, SessionSummary, Session…
CCimen Apr 5, 2026
1e59512
feat: Added provider_id to dry-run JSON, support bundle manifest, sessi…
CCimen Apr 5, 2026
f695e95
feat: Added check_provider_image() doctor check that detects missing pr…
CCimen Apr 5, 2026
87a3040
test: Created 16 coexistence proof tests and passed zero-regression gat…
CCimen Apr 5, 2026
b6082d8
chore: auto-commit after complete-milestone
CCimen Apr 5, 2026
849747a
test: Added ProviderRuntimeSpec frozen dataclass, InvalidProviderError,…
CCimen Apr 5, 2026
d4ea08d
fix: Replaced all 5 scattered provider dicts with PROVIDER_REGISTRY loo…
CCimen Apr 5, 2026
7cecd83
fix: Renamed three Claude-hardcoded helpers to provider-parameterized v…
CCimen Apr 5, 2026
22d1613
test: Added 21 tests covering provider-parameterized session, audit, co…
CCimen Apr 5, 2026
1167b9c
test: Added ProviderNotReadyError, ProviderImageMissingError, AuthReadi…
CCimen Apr 5, 2026
6af9779
test: Wired --provider flag, category assignment, and grouped doctor ou…
CCimen Apr 5, 2026
9d0110c
feat: Localize 9 Claude-specific constants from core/constants.py into…
CCimen Apr 5, 2026
697aafa
feat: Renamed localized Claude constants to _CLAUDE_AGENT_NAME, _CLAUDE…
CCimen Apr 5, 2026
89f52ec
fix: Added guardrail test preventing Claude-specific constants in core/…
CCimen Apr 5, 2026
672fbf1
docs: Updated README title to 'SCC - Sandboxed Code CLI', made pyprojec…
CCimen Apr 5, 2026
5a4ad79
chore: Added workspace-scoped Codex settings path and git-exclude repo…
CCimen Apr 5, 2026
7a84681
feat: Refactored AgentSettings from content:dict to rendered_bytes:byte…
CCimen Apr 5, 2026
7eb253f
feat: CodexAgentRunner now launches `codex --dangerously-bypass-approva…
CCimen Apr 5, 2026
9f621f5
chore: CodexAgentRunner now always injects cli_auth_credentials_store='…
CCimen Apr 5, 2026
f638ab2
feat: Added auth_check() to AgentProvider protocol; Claude and Codex ad…
CCimen Apr 5, 2026
35e68e3
feat: Eliminated silent Claude fallback from all active launch paths; m…
CCimen Apr 5, 2026
31bd3fa
chore: Fresh launch now deterministically writes SCC-managed config lay…
CCimen Apr 5, 2026
13064f2
chore: Added runtime permission normalization step to OCI launch path:…
CCimen Apr 5, 2026
7b7e0e8
test: scc-base now pre-creates both .claude and .codex dirs with 0700/u…
CCimen Apr 5, 2026
8d326ff
test: Added 7 OCI runtime-layer tests proving config persistence is det…
CCimen Apr 5, 2026
236e80d
test: Added 10 decision-reconciliation guardrail tests (D033/D035/D037/…
CCimen Apr 5, 2026
1cab7f3
chore: auto-commit after complete-milestone
CCimen Apr 5, 2026
e5a7def
test: 43 characterization tests capture provider resolution behavior ac…
CCimen Apr 6, 2026
99bd3fb
test: Created commands/launch/preflight.py with typed LaunchReadiness m…
CCimen Apr 6, 2026
72fff1a
feat: Replaced inline _resolve_provider() and _allowed_provider_ids() i…
CCimen Apr 6, 2026
50dc21e
fix: Fixed 26 pre-existing test failures across guardrail, mock compati…
CCimen Apr 6, 2026
ed3d1ea
test: Created structural guardrail tests preventing inline provider res…
CCimen Apr 6, 2026
cc70fc7
fix: Fixed 6 misleading auth-status strings across provider_choice.py,…
CCimen Apr 6, 2026
60c6274
feat: Removed Docker Desktop from active user-facing paths; added lifec…
CCimen Apr 6, 2026
1b4c5f0
test: Consolidated provider adapter dispatch into shared get_agent_prov…
CCimen Apr 6, 2026
e355676
test: All 6 verification checks pass: ruff clean, mypy clean, 5008 test…
CCimen Apr 6, 2026
5844834
test: Added 17 tests verifying workspace provider persistence edge case…
CCimen Apr 6, 2026
391ac94
test: Added 22 resume-after-drift edge case tests and auth bootstrap ex…
CCimen Apr 6, 2026
3ccef53
test: Added 67 tests verifying setup idempotency and error message qual…
CCimen Apr 6, 2026
386cba1
docs: Added legacy-path documentation to all Docker Desktop sandbox mod…
CCimen Apr 6, 2026
f002f70
chore: auto-commit after complete-milestone
CCimen Apr 6, 2026
1f36cee
chore: auto-commit after complete-milestone
CCimen Apr 6, 2026
31397bc
feat: ensure_launch_ready() now calls provider.bootstrap_auth() after s…
CCimen Apr 6, 2026
3c9f8d9
feat: Replaced inline ensure_provider_image + ensure_provider_auth call…
CCimen Apr 6, 2026
eeee4ce
feat: Reduced auth_bootstrap.py to a deprecated redirect delegating to…
CCimen Apr 6, 2026
f8872b3
feat: Replaced inline two-tier status in _render_provider_status with _…
CCimen Apr 6, 2026
2ba7252
chore: auto-commit after complete-milestone
CCimen Apr 6, 2026
d3145f7
chore: ignore and untrack ephemeral GSD state
CCimen Apr 6, 2026
1094ac6
chore: trim generated GSD artifacts from branch
CCimen Apr 6, 2026
8e941f0
docs: refresh README and ignore generated workspace state
CCimen Apr 6, 2026
555aa02
docs: tighten README and improve package discovery metadata
CCimen Apr 6, 2026
251379f
fix: repair PR lint and test regressions
CCimen Apr 6, 2026
2658657
test: make launch preflight tests hermetic
CCimen Apr 6, 2026
38 changes: 38 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -62,3 +62,41 @@ safetypluginclone/.claude/.state/

# Local refactor checklist
refactor-plan.md

# GSD runtime/state noise (keep curated project docs + milestone docs committed)
.gsd/STATE.md
.gsd/activity/
.gsd/auto.lock
.gsd/completed-units.json
.gsd/completed-units-*.json
.gsd/event-log.jsonl
.gsd/gsd.db
.gsd/gsd.db-*
.gsd/journal/
.gsd/metrics.json
.gsd/OVERRIDES.md
.gsd/PREFERENCES.md
.gsd/recovery/
.gsd/reports/
.gsd/runtime/
.gsd/state-manifest.json
.gsd/worktrees/
.bg-shell/
coverage.json
READMEGSD.md

# ── GSD baseline (auto-generated) ──
.gsd-id
*.code-workspace
.env
.env.*
!.env.example
node_modules/
.next/
*.pyc
target/
vendor/
*.log
coverage/
.cache/
tmp/
168 changes: 168 additions & 0 deletions .gsd/DECISIONS.md

Large diffs are not rendered by default.

100 changes: 100 additions & 0 deletions .gsd/KNOWLEDGE.md

Large diffs are not rendered by default.

114 changes: 114 additions & 0 deletions .gsd/PROJECT.md
@@ -0,0 +1,114 @@
# Sandboxed Coding CLI (SCC)

## What the project is
SCC is a governed runtime for coding agents. It lets organizations run approved agents inside portable sandboxes with explicit policy, team-level configuration, safer defaults, and runtime-enforced controls that are explainable to security reviewers.

## What the project is not
- not a new general-purpose coding agent
- not a forever-Claude-only wrapper
- not a Docker Desktop-only product
- not a fake security story built on advisory naming
- not a proprietary skills ecosystem

## Current v1 product target
The v1 target is a clean architecture on top of `scc-sync-1.7.3` that supports Claude Code and Codex through the same provider-neutral core, portable OCI runtimes, enforced web egress, and a shared runtime safety engine.

## Strategic success condition
A security or platform team can approve SCC because its governance model, runtime enforcement, and diagnostics are understandable and inspectable, while developers can switch providers and team contexts without rebuilding their world. The implementation should also become easier to change over time, not more brittle.

## Cross-cutting engineering priority
- Maximize maintainability, clean architecture, and clean code while delivering milestones.
- Prefer smaller cohesive modules, typed seams, and composition-root boundaries over growing central orchestrators.
- When a slice touches a large or fragile file, plan the smallest safe extraction that improves testability and future changeability.
- Pair refactors with characterization or contract tests so maintainability work stays measurable.

## Milestone history

### M001 — Provider-Neutral Launch Boundary ✅
Established typed contracts (core/contracts.py), AgentProvider protocol, and provider-neutral seam for launch, runtime, network, safety, and audit planning.

### M002 — Provider-Neutral Launch Pipeline ✅
Made AgentProvider and AgentLaunchSpec part of the real launch path. Claude settings are adapter-owned. Codex is a first-class provider. Preflight validation, durable JSONL audit sink, and application-owned support-bundle converged. Launch wizard resume extracted to typed helpers.

### M003 — Portable Runtime And Enforced Web Egress ✅
Delivered portable OCI sandbox backend (no Docker Desktop dependency) with topology-enforced web egress via Squid proxy sidecar, provider destination validation, operator diagnostics, and docs truthfulness guardrails. +178 net new tests (3464 total).

### M004 — Cross-Agent Runtime Safety ✅
Delivered shared safety policy and verdict engine, runtime wrapper baseline, provider-specific safety adapters, fail-closed policy loader, safety audit reader, doctor safety-policy check, and `scc support safety-audit` CLI command. +289 net new tests (3790 total).
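
The fail-closed loader behaviour can be sketched as follows. This is a minimal illustration, not the project's loader: the `SafetyPolicy` fields, the `"safety"` config key, and the `load_safety_policy` name are assumptions made for the example. The only property taken from this document is that any parse failure yields the default block policy.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SafetyPolicy:
    """Hypothetical policy shape; the real fields are not shown in this diff."""
    action: str = "block"            # one of "block", "warn", "allow"
    rules: tuple[str, ...] = ()


DEFAULT_BLOCK_POLICY = SafetyPolicy()
_VALID_ACTIONS = {"block", "warn", "allow"}


def load_safety_policy(raw_org_config: Any) -> SafetyPolicy:
    """Return the configured policy, or the default block policy on any parse failure."""
    try:
        section = raw_org_config["safety"]          # assumed key name
        action = section["action"]
        rules = section.get("rules", [])
        if action not in _VALID_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        if not isinstance(rules, list) or not all(isinstance(r, str) for r in rules):
            raise TypeError("rules must be a list of strings")
        return SafetyPolicy(action=action, rules=tuple(rules))
    except (KeyError, TypeError, ValueError, AttributeError):
        # Fail closed: a malformed or missing policy never widens what is allowed.
        return DEFAULT_BLOCK_POLICY
```

The design point is that the error path and the strictest path are the same path, so a typo in org config cannot silently disable the safety engine.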

### M005 — Architecture Quality, Strictness, And Hardening ✅
Delivered architecture-quality hardening: module decomposition (15 files split), a typed governed-artifact model hierarchy, a provider-neutral bundle resolution/rendering pipeline, 100% branch coverage on pipeline modules, D023 portable artifact rendering, and 18 truthfulness guardrail tests. Final: 4486 tests.

### M006 — Provider Selection UX and End-to-End Codex Launch ✅
SCC became a genuine multi-provider runtime. Users choose Claude or Codex via config or CLI flag (`scc provider show/set`, `scc start --provider codex`), validated against org/team policy. Provider identity flows through container naming, volume naming, session identity, machine-readable outputs (dry-run JSON, support bundle, session list). CodexAgentRunner adapter with Codex-specific image, settings, and argv. Provider-aware branding ("Sandboxed Coding CLI"), doctor image check with exact build commands, and 16 coexistence proofs. 153 new tests, 4643 total, zero regressions.
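
Provider selection with policy validation might look like the sketch below. It assumes a CLI flag > stored config > default precedence and a default of `"claude"`; the function name, the error type, and the default value are hypothetical and are not taken from the SCC source.

```python
from typing import Collection, Optional

KNOWN_PROVIDERS = ("claude", "codex")
DEFAULT_PROVIDER = "claude"  # assumption: the diff does not state the default


class ProviderNotAllowedError(ValueError):
    """Hypothetical error for unknown or policy-forbidden providers."""


def resolve_provider(
    cli_flag: Optional[str],
    config_value: Optional[str],
    allowed_by_policy: Collection[str],
) -> str:
    # Precedence: explicit CLI flag, then stored config, then the default.
    if cli_flag is not None:
        requested = cli_flag
    elif config_value is not None:
        requested = config_value
    else:
        requested = DEFAULT_PROVIDER

    # Validate the winner, including the default, so nothing bypasses policy.
    if requested not in KNOWN_PROVIDERS:
        raise ProviderNotAllowedError(f"unknown provider: {requested!r}")
    if requested not in allowed_by_policy:
        raise ProviderNotAllowedError(
            f"provider {requested!r} is not permitted by org/team policy"
        )
    return requested
```

Because the function is pure (no I/O, no globals mutated), each precedence case can be covered by a one-line test.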

### M007 — Provider Neutralization, Operator Truthfulness, and Legacy Claude Cleanup ✅
Eliminated Claude assumptions from shared/core/operator paths. ProviderRuntimeSpec replaces 5 scattered dicts. Settings serialization is provider-owned (rendered_bytes, not dict). Config layering is provider-native (Claude home-scoped, Codex workspace-scoped). Unknown providers fail closed. Auth readiness is adapter-owned via auth_check() on AgentProvider. Runtime permissions are normalized on the OCI launch path. Every fresh launch deterministically writes SCC-managed config (config freshness guarantee). Doctor is provider-aware with --provider flag and categorized output. Core constants stripped to product-level only. 32 truthfulness guardrail tests. 166 net new tests, 4820 total.
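
A registry of frozen specs with a fail-closed lookup could be sketched like this. The field names (`display_name`, `image_ref`, `config_dir`) and the `get_runtime_spec` helper are illustrative assumptions; only the frozen-dataclass-plus-registry idea, the image names, and the `.claude`/`.codex` directories come from this PR.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class ProviderRuntimeSpec:
    """Hypothetical field set; the real spec's fields are not shown here."""
    provider_id: str
    display_name: str
    image_ref: str
    config_dir: str


class InvalidProviderError(LookupError):
    """Raised instead of falling back to any default provider."""


# Read-only mapping so call sites cannot mutate the registry at runtime.
PROVIDER_REGISTRY = MappingProxyType({
    "claude": ProviderRuntimeSpec("claude", "Claude Code", "scc-agent-claude", ".claude"),
    "codex": ProviderRuntimeSpec("codex", "Codex", "scc-agent-codex", ".codex"),
})


def get_runtime_spec(provider_id: str) -> ProviderRuntimeSpec:
    try:
        return PROVIDER_REGISTRY[provider_id]
    except KeyError:
        known = ", ".join(sorted(PROVIDER_REGISTRY))
        raise InvalidProviderError(
            f"unknown provider {provider_id!r}; known providers: {known}"
        ) from None
```

Replacing several per-site dicts with one lookup means a new provider is added in one place, and a misspelled id fails loudly rather than resolving to Claude.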

### M008 — Cross-Flow Consistency, Reliability, and Maintainability Hardening ✅
Consolidated five duplicated launch preflight sequences into one shared module. S01: shared preflight module with typed LaunchReadiness model, flow.py and flow_interactive.py migrated, 7 structural guardrail tests. S02: auth vocabulary truthfulness (three-tier distinction), Docker Desktop removed from active paths, provider adapter dispatch consolidated via shared get_agent_provider() helper, 15 new guardrail tests. S03: 106 edge-case and regression-guard tests covering workspace persistence, resume-after-drift, setup idempotency, and error message quality. Also added auth bootstrap exception wrapping and legacy-path documentation for the Docker Desktop modules. 294 net new tests (5114 total), zero regressions.

### M009 — Preflight Convergence and Auth Bootstrap Unification ✅
All five launch sites (flow.py, flow_interactive.py, worktree_commands.py, orchestrator_handlers.py, and the start command) now use collect_launch_readiness() + ensure_launch_ready() through the shared preflight module. ensure_launch_ready() actually calls bootstrap_auth() when auth is missing (silent gap closed). auth_bootstrap.py reduced to deprecated redirect. Auth messaging centralized in preflight._ensure_auth(). Setup's _render_provider_status uses _three_tier_status() so both onboarding panel and completion summary show identical four-state readiness vocabulary. D048 superseded by D049. 3 net new tests (5117 total).
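
The collect-then-ensure split, including the exception wrapping around `bootstrap_auth()`, can be sketched as below. The signatures, the `Provider` protocol, and the `LaunchReadiness` fields are assumptions for illustration; the real module's API is not shown in this diff, and provider resolution is left out.

```python
from dataclasses import dataclass
from typing import Protocol


class ProviderNotReadyError(RuntimeError):
    """Typed error carrying actionable guidance for the operator."""


@dataclass(frozen=True)
class LaunchReadiness:
    # Hypothetical fields: facts gathered before anything is changed.
    provider_id: str
    image_available: bool
    auth_present: bool


class Provider(Protocol):
    provider_id: str
    def image_available(self) -> bool: ...
    def auth_check(self) -> bool: ...
    def bootstrap_auth(self) -> None: ...


def collect_launch_readiness(provider: Provider) -> LaunchReadiness:
    """Read-only step: observe state, perform no side effects."""
    return LaunchReadiness(
        provider.provider_id, provider.image_available(), provider.auth_check()
    )


def ensure_launch_ready(provider: Provider, readiness: LaunchReadiness) -> None:
    """Side-effecting step: fix what can be fixed, otherwise raise a typed error."""
    if not readiness.image_available:
        raise ProviderNotReadyError(f"{readiness.provider_id}: image missing, build it first")
    if readiness.auth_present:
        return
    try:
        provider.bootstrap_auth()
    except ProviderNotReadyError:
        raise  # already typed: pass through unchanged
    except Exception as exc:
        raise ProviderNotReadyError(
            f"{readiness.provider_id}: auth bootstrap failed ({exc}); retry sign-in"
        ) from exc
```

Separating the read-only step from the side-effecting step lets every launch site share one sequence while tests drive it with a fake provider.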

## Next milestone order
1. ~~M001 — Provider-Neutral Launch Boundary~~ ✅
2. ~~M002 — Provider-Neutral Launch Pipeline~~ ✅
3. ~~M003 — Portable Runtime And Enforced Web Egress~~ ✅
4. ~~M004 — Cross-Agent Runtime Safety~~ ✅
5. ~~M005 — Architecture Quality, Strictness, And Hardening~~ ✅
6. ~~M006 — Provider Selection UX and End-to-End Codex Launch~~ ✅
7. ~~M007 — Provider Neutralization, Operator Truthfulness, and Legacy Claude Cleanup~~ ✅
8. ~~M008 — Cross-Flow Consistency, Reliability, and Maintainability Hardening~~ ✅
9. ~~M009 — Preflight Convergence and Auth Bootstrap Unification~~ ✅

## Requirement status
- **R001: maintainability in touched high-churn areas** — ✅ validated. Advanced through all nine milestones.

## Current verification baseline
- `uv run ruff check` ✅
- `uv run mypy src/scc_cli` ✅ (303 files, 0 issues)
- `uv run pytest -q` ✅ (5117 passed, 23 skipped, 2 xfailed)
- Zero files in src/scc_cli/ exceed 1100 lines
- One file remains in the 800–1100 line zone, with justification (compute_effective_config.py at 852 lines, 93% coverage)

## Known deferred items
- Wizard cast cleanup (23 casts in wizard.py/flow_interactive.py) — deferred per D018
- Legacy module coverage (docker_sandbox_runtime 30%, overall 74%) — deprioritized per D017/D021 user overrides
- Portable MCP stdio transport support — requires additional source metadata
- Live bundle registry integration — renderers write metadata references only
- Dashboard provider switching TUI feature (dashboard 'a' key)
- Container labels (scc.provider=<id>) for external tooling discovery
- Image build/push pipeline for scc-agent-codex
- Podman support on the same SandboxRuntime contracts
- `scc auth login/status/logout` commands — model supports them via auth_check()
- Fine-grained volume splitting (auth-only vs ephemeral) for enterprise data-retention (D036)
- start_claude parameter rename to start_agent in worktree_commands.py (deferred from M008/S01)
- WorkContext.provider_id threading through _record_session_and_context (deferred from M008/S01)
- Delete auth_bootstrap.py entirely after updating test consumers to use preflight directly

## Key architecture invariants
- `bootstrap.py` is the sole composition root for adapter symbols consumed outside `scc_cli.adapters`.
- `AgentLaunchSpec.env` stays empty for file-based providers; provider config travels via `artifact_paths`.
- The canonical provider-adapter characterization shape is: capability metadata, clean-spec, settings-artifact, and env-is-clean.
- Adding a provider to `DefaultAdapters` still requires the same four touch points: adapter file, bootstrap wiring, fake adapters factory, and inline test constructions.
- Provider-core destination validation belongs before launch, not as a runtime surprise.
- RuntimeProbe protocol is the canonical detection surface for runtime capabilities; no consumer outside the adapter layer should call docker.check_docker_available() directly.
- Bootstrap probes runtime at construction time and selects OciSandboxRuntime or DockerSandboxRuntime based on preferred_backend.
- OciSandboxRuntime is imported only in bootstrap.py; application layer uses SandboxRuntime protocol.
- Enforced web-egress uses internal Docker network + dual-homed Squid proxy sidecar as the hard enforcement boundary (D014).
- Safety engine is provider-neutral: DefaultSafetyEngine in core orchestrates shell tokenizer + git rules + network tool rules. Fail-closed semantics.
- SafetyPolicy loader is fail-closed: any parse failure → default block policy. Uses raw org config (not NormalizedOrgConfig).
- Provider safety adapters are pure UX/audit wrappers with zero verdict logic — the engine is the single source of safety truth.
- Import boundary guard (test_import_boundaries.py) mechanically enforces layer separation via AST scanning.
- **Launch preflight is fully unified via commands/launch/preflight.py (D046, D049):** resolve_launch_provider() → collect_launch_readiness() → ensure_launch_ready() is the canonical three-function sequence used by all five launch sites. ensure_launch_ready() calls bootstrap_auth() when auth is missing. Auth messaging lives in _ensure_auth() only.
- Renderers return fragment dicts for caller-owned merge — they do not write shared config files (settings.local.json, .mcp.json) directly.
- **ProviderRuntimeSpec** (frozen dataclass in `core/contracts.py`) is the single source of truth for provider runtime details. **PROVIDER_REGISTRY** in `core/provider_registry.py` maps provider_id → spec.
- Unknown, forbidden, or unavailable providers fail closed in active launch logic — never silently fall back to Claude.
- **AgentRunner owns settings serialization format**: `build_settings()` produces `rendered_bytes: bytes` + `path` + `suffix`, not dict.
- **Product name is 'SCC — Sandboxed Coding CLI'** consistently across README, pyproject.toml, CLI branding, D045, and all user-facing surfaces.
- **Auth vocabulary is three-tier truthful**: 'auth cache present' (file exists), 'image available' (container image present), 'launch-ready' (both). No surface uses 'connected' or standalone 'ready' to describe partial state. All setup surfaces (onboarding panel and completion summary) use the single _three_tier_status() helper.
- **Docker Desktop references** are confined to docker/, adapters/, core/errors.py, and doctor/ layers only. Active user-facing commands/ paths use 'Docker' or 'container runtime'.
- **Provider adapter dispatch** uses a shared `get_agent_provider(adapters, provider_id)` helper in dependencies.py — no hardcoded per-site dispatch dicts.
- **40+ guardrail tests** across test_docs_truthfulness.py, test_auth_vocabulary_guardrail.py, test_lifecycle_inventory_consistency.py, and test_launch_preflight_guardrail.py mechanically prevent regression.
- **Auth bootstrap exception wrapping** in ensure_launch_ready/_ensure_auth: raw exceptions from bootstrap_auth() become ProviderNotReadyError with actionable guidance; already-typed ProviderNotReadyError passes through unchanged.
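
The AST-based import boundary guard mentioned in the invariants above can be illustrated with a short sketch. This is not the contents of `test_import_boundaries.py`; the function name and the example module path are hypothetical, and relative imports are deliberately ignored to keep the example short.

```python
import ast


def find_forbidden_imports(source: str, forbidden_prefixes: tuple[str, ...]) -> list[str]:
    """Return absolute imports in `source` that fall under a forbidden package prefix."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            # Match the package itself or any submodule, but not look-alike names.
            if any(name == p or name.startswith(p + ".") for p in forbidden_prefixes):
                found.append(name)
    return found


# Example: a core module must not reach into the adapter layer.
core_source = "from scc_cli.adapters.oci import OciSandboxRuntime\nimport json\n"
violations = find_forbidden_imports(core_source, ("scc_cli.adapters",))
```

A guardrail test would typically run a scan like this over every file in a layer and assert that the list of violations is empty, which is what makes the layering rule mechanical rather than a convention.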
29 changes: 29 additions & 0 deletions .gsd/REQUIREMENTS.md
@@ -0,0 +1,29 @@
# Requirements

This file is the explicit capability and coverage contract for the project.

## Validated

### R001 — SCC changes must improve maintainability by keeping touched areas cohesive, testable, and easier to change, especially when work crosses oversized or high-churn files.
- Class: non-functional
- Status: validated
- Description: SCC changes must improve maintainability by keeping touched areas cohesive, testable, and easier to change, especially when work crosses oversized or high-churn files.
- Why it matters: Maintainability directly drives testability, consistency, and the long-term cost and safety of future provider/runtime changes.
- Source: user-feedback
- Primary owning slice: architecture
- Supporting slices: M002/S03, M002/S05
- Validation: Proof from M005: Zero files >1100 lines (from 3 at 1665/1493/1336), 15 MANDATORY-SPLIT files decomposed, 3 boundary violations repaired, 31 import boundary tests pass, typed governed-artifact model hierarchy adopted, provider-neutral bundle pipeline with 100% branch coverage (resolver + both renderers), D023 portable artifact rendering implemented, file/function size guardrails pass without xfail, 18 truthfulness tests, 4486 total tests passing. Exit gate: `uv run ruff check` (0 errors), `uv run mypy src/scc_cli` (289 files, 0 issues), `uv run pytest --rootdir "$PWD" -q` (4486 passed, 23 skipped, 2 xfailed).
- Notes: Validated by M002/S05, substantially strengthened by M005. M005 delivered: module decomposition (S02), typed config models (S03), governed-artifact pipeline (S04), 100% pipeline coverage (S05), diagnostics/truthfulness/guardrails (S06), D023 portable artifact rendering (S07). Wizard cast cleanup deferred (D018). Legacy module coverage targets deferred per D017/D021 user overrides directing work toward team-pack architecture.

## Traceability

| ID | Class | Status | Primary owner | Supporting | Proof |
|---|---|---|---|---|---|
| R001 | non-functional | validated | architecture | M002/S03, M002/S05 | Proof from M005: Zero files >1100 lines (from 3 at 1665/1493/1336), 15 MANDATORY-SPLIT files decomposed, 3 boundary violations repaired, 31 import boundary tests pass, typed governed-artifact model hierarchy adopted, provider-neutral bundle pipeline with 100% branch coverage (resolver + both renderers), D023 portable artifact rendering implemented, file/function size guardrails pass without xfail, 18 truthfulness tests, 4486 total tests passing. Exit gate: `uv run ruff check` (0 errors), `uv run mypy src/scc_cli` (289 files, 0 issues), `uv run pytest --rootdir "$PWD" -q` (4486 passed, 23 skipped, 2 xfailed). |

## Coverage Summary

- Active requirements: 0
- Mapped to slices: 0
- Validated: 1 (R001)
- Unmapped active requirements: 0
30 changes: 30 additions & 0 deletions .gsd/RUNTIME.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
# RUNTIME.md

## Canonical implementation root
- `scc-sync-1.7.3` is the only writable repo for this work.
- The original dirty `scc` tree is archival and rollback evidence only.

## Runtime assumptions for v1
- Plain OCI backend first.
- Docker Engine / OrbStack / Colima-style Docker CLIs are first runtime targets.
- Podman follows on the same contracts after the first Claude/Codex vertical slice is stable.
- Windows support is WSL-first if needed.

## Verification commands
- `uv run ruff check`
- `uv run mypy src/scc_cli`
- `uv run pytest`

## Expected runtime deliverables
- `scc-base`
- `scc-agent-claude`
- `scc-agent-codex`
- `scc-egress-proxy`

## Enforced egress topology
- agent container on internal-only network
- egress proxy as the only component with internal + external attachment
- no host networking
- deny IP literals by default
- deny loopback, private, link-local, and metadata endpoints by default
- proxy ACL evaluates requested host and resolved IP/CIDR
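
The default-deny rules above can be expressed as a small decision function. In SCC the enforcement point is the proxy ACL, so this Python sketch only illustrates the rule semantics; the function names and the allowlist shape are assumptions, and it relies on the standard-library `ipaddress` module.

```python
import ipaddress
from typing import Collection, Sequence


def is_ip_literal(host: str) -> bool:
    """True when the requested host is an IPv4/IPv6 literal rather than a name."""
    try:
        ipaddress.ip_address(host.strip("[]"))
        return True
    except ValueError:
        return False


def is_denied_address(addr: str) -> bool:
    """Loopback, private, and link-local ranges (including 169.254.169.254 metadata)."""
    ip = ipaddress.ip_address(addr)
    return ip.is_loopback or ip.is_private or ip.is_link_local


def egress_allowed(
    host: str, resolved_ips: Sequence[str], allowed_hosts: Collection[str]
) -> bool:
    if is_ip_literal(host):
        return False                      # deny IP literals by default
    if host.lower() not in allowed_hosts:
        return False                      # host must be on the allowlist
    if not resolved_ips:
        return False                      # nothing resolved: fail closed
    # Every resolved address must be outside the denied ranges.
    return not any(is_denied_address(ip) for ip in resolved_ips)
```

Checking both the requested host and the resolved addresses matters because an allowlisted name that resolves to an internal address would otherwise reach private or metadata endpoints.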
25 changes: 25 additions & 0 deletions .gsd/milestones/M001-CONTEXT.md
@@ -0,0 +1,25 @@
# M001-CONTEXT.md

# Locked decisions for M001

## Non-negotiables
- No long-term backward compatibility in core after the one-time migration.
- No Docker Desktop dependency in the architecture.
- No provider-specific logic in core contracts.
- No overclaimed enforcement language.
- No widening of effective egress outside org policy and delegated team policy.

## Primary objective
Create the cleanest possible foundation for later runtime and provider work. Do not rush into Podman, Pi, OpenCode, or enterprise dashboards before the baseline and typed architecture are sound.

## Canonical references
- `CONSTITUTION.md`
- `PLAN.md`
- `.gsd/REQUIREMENTS.md`
- `specs/01-repo-baseline-and-migration.md`
- `specs/02-control-plane-and-types.md`
- `specs/03-provider-boundary.md`
- `specs/07-verification-and-quality-gates.md`

## Notes
This milestone is intentionally quality-first. It should reduce ambiguity, provider leakage, and orchestration risk before any major feature expansion.
17 changes: 17 additions & 0 deletions .gsd/milestones/M001-RESEARCH.md
@@ -0,0 +1,17 @@
# M001-RESEARCH.md

# Baseline findings to preserve during refactor

## Codebase reality from prior review
- Provider abstraction is still too Claude-shaped.
- Error and exit-code contracts need alignment.
- Launch and flow orchestration remain larger than they should be.
- Application/config boundaries still rely too heavily on raw dictionaries.
- Runtime detection is still name-based instead of capability-based.
- Complexity guardrails exist but are not yet enforced strongly enough.
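
To make the name-based versus capability-based distinction concrete, here is a minimal sketch of a probe-driven backend choice. The `RuntimeInfo` fields, the `StubProbe` class, the `select_backend` function, and the backend labels are all assumptions for illustration; the diff names a `RuntimeProbe` protocol but does not show its shape.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class RuntimeInfo:
    # Hypothetical capability flags; the real RuntimeInfo fields are not shown.
    daemon_reachable: bool
    supports_internal_networks: bool
    rootless: bool


class RuntimeProbe(Protocol):
    def probe(self) -> RuntimeInfo: ...


class StubProbe:
    """Test double that returns a canned result instead of shelling out."""

    def __init__(self, info: RuntimeInfo) -> None:
        self._info = info

    def probe(self) -> RuntimeInfo:
        return self._info


def select_backend(probe: RuntimeProbe) -> str:
    """Choose a backend from what the runtime can do, not from what it is called."""
    info = probe.probe()
    if not info.daemon_reachable:
        raise RuntimeError("no reachable container runtime")
    return "oci" if info.supports_internal_networks else "docker-sandbox"
```

With this shape, Docker Engine, OrbStack, or Colima all take the same path as long as they report the same capabilities, and tests never need a real daemon.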

## Why Milestone 0 / M001 must come first
If the codebase moves directly into multi-runtime and multi-provider work without a green synced baseline and typed contracts, the product will accumulate more provider leakage and more misleading security surfaces.

## Research conclusion
The best first step is not new runtime code. It is repo truth, vocabulary cleanup, typed core seams, and characterization coverage.
28 changes: 28 additions & 0 deletions .gsd/milestones/M001-ROADMAP.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
# M001-ROADMAP.md

# Milestone M001 — Baseline Freeze And Typed Foundation

## Outcome
The project has a single authoritative repo root, a green migrated baseline, typed control-plane direction, and the first characterization/contract tests needed for safe refactoring.

## Slices
- [ ] Freeze the archived dirty `scc` tree and make `scc-sync-1.7.3` the only writable root
- [ ] Normalize local docs, configs, tests, and terminology to the new truthful network vocabulary
- [ ] Re-run the full verification gate on the synced repo and capture the baseline
- [ ] Add characterization tests around current Claude launch, resume, config inheritance, and safety-net behavior
- [ ] Define typed core contracts: `AgentProvider`, `AgentLaunchSpec`, `RuntimeInfo`, `NetworkPolicyPlan`, `SafetyPolicy`, `SafetyVerdict`, and `AuditEvent`
- [ ] Align `SCCError`, exit-code mapping, and human/JSON output contracts
- [ ] Record accepted decisions and update specs so follow-on work does not invent hidden compatibility or provider leaks

## Dependencies
- none

## Risk level
High

## Done when
- `scc-sync-1.7.3` is the only implementation root in active use
- no stale compatibility aliases remain in planned core surfaces
- the baseline is green
- characterization coverage exists for the most fragile current behavior
- the typed control-plane contracts are written down and accepted