From 8231bed0c4e4a23d1c2ea67d9ff4db46404b9042 Mon Sep 17 00:00:00 2001 From: Mohamed Saleh Zaied Date: Wed, 3 Jun 2026 05:18:23 +0300 Subject: [PATCH] docs(architecture): land rust-core architecture docs + validation reports (Slice 5a) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Brings the original rust-core-native-shells PRD/roadmap/test-spec/ADR, capability matrix, core boundaries, adapter contracts, validation reports + signoff template, and the runtime-validation script from skills-bridge-swift. Pure-additive docs. Deferred to a follow-up: skills/ -> Skills/ casing canonicalization (the lowercase name is hardcoded in pbxproj, the bundle build-phase script, SkillManager/SkillStore, and the ~/.skilly/skills/ user-install path) — needs a scoped decision, not a blind rename. Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/architecture/adapter-contracts.md | 82 ++++++++++++ .../adr-001-rust-core-native-shells.md | 36 +++++ docs/architecture/capability-matrix.md | 25 ++++ docs/architecture/core-boundaries.md | 40 ++++++ .../final-phase-validation-report.md | 53 ++++++++ .../realtime-production-like-traces-report.md | 29 ++++ .../runtime-validation-report-2026-04-18.md | 83 ++++++++++++ .../runtime-validation-signoff-template.md | 75 +++++++++++ .../rust-core-native-shells-prd.md | 78 +++++++++++ .../rust-core-native-shells-roadmap.md | 124 ++++++++++++++++++ .../rust-core-native-shells-test-spec.md | 123 +++++++++++++++++ scripts/README.md | 106 +++++++++------ scripts/create-runtime-validation-report.sh | 36 +++++ 13 files changed, 849 insertions(+), 41 deletions(-) create mode 100644 docs/architecture/adapter-contracts.md create mode 100644 docs/architecture/adr-001-rust-core-native-shells.md create mode 100644 docs/architecture/capability-matrix.md create mode 100644 docs/architecture/core-boundaries.md create mode 100644 docs/architecture/final-phase-validation-report.md create mode 100644 docs/architecture/realtime-production-like-traces-report.md create mode 100644 docs/architecture/runtime-validation-report-2026-04-18.md create mode 100644 docs/architecture/runtime-validation-signoff-template.md create mode 100644 docs/architecture/rust-core-native-shells-prd.md create mode 100644 docs/architecture/rust-core-native-shells-roadmap.md create mode 100644 docs/architecture/rust-core-native-shells-test-spec.md create mode 100755 scripts/create-runtime-validation-report.sh diff --git a/docs/architecture/adapter-contracts.md b/docs/architecture/adapter-contracts.md new file mode 100644 index 00000000..1eb3ea2e --- /dev/null +++ b/docs/architecture/adapter-contracts.md @@ -0,0 +1,82 @@ +# Adapter Contracts: Native Shell Boundaries + +## Purpose +Define the interface boundary between the shared Rust core and platform-native shells. +This contract keeps policy and orchestration deterministic while allowing each OS shell +to implement its own UI, permissions, and device capabilities. + +## Adapter Surfaces + +### 1. Auth Adapter +- Acquire and refresh authenticated user session from platform shell credentials. +- Required fields: + - `workos_user_id` + - `session_token` + - `expires_at` +- Failure must be explicit and recoverable by host UI. + +### 2. Entitlement Adapter +- Fetch entitlement snapshot from worker API and map to `EntitlementState`. +- Required fields: + - entitlement status (`none`, `trial`, `active`, `canceled`, `expired`) + - period boundaries + - cap metadata + +### 3. Capture Adapter +- Capture current screen context for active turn. +- Contract output: + - `display_id` + - image bytes + encoding metadata + - capture timestamp +- Capability gating: + - if unavailable, shell must surface a permission/status message before turn start. + +### 4. Hotkey Adapter +- Listen for global push-to-talk start/end signals. +- Contract events: + - `hotkey_pressed` + - `hotkey_released` + +### 5. Overlay Adapter +- Render pointer and guidance affordances for `[POINT:x,y:label:screen]` directives. +- Contract methods: + - show pointer + - animate to coordinate + - hide pointer + +### 6. Audio Adapter +- Stream input audio chunks to realtime transport and play output audio deltas. +- Contract events: + - microphone stream started/stopped + - playback started/stopped + +### 7. Permission Adapter +- Expose OS permission status used by shell readiness checks. +- Minimum statuses: + - screen capture permission + - accessibility/hotkey permission + - microphone permission + +## Core-to-Shell Handshake +1. Shell acquires auth session. +2. Shell fetches entitlement snapshot. +3. Shell asks Rust policy if turn can start. +4. If allowed, shell starts capture/hotkey/audio and emits realtime events. +5. Rust realtime state machine governs turn lifecycle transitions. + +## Bootstrap Scope (Phase 4) +- Current shell bootstrap crates use mocked host adapters to validate: + - auth session acquisition + - entitlement-driven turn permission + - turn-start lifecycle via `core/realtime` replay + +## Development Status +- Windows shell now routes through explicit adapter modules for: + - capture + - hotkey + - overlay + - audio input/output + - permissions +- Linux shell now routes through explicit adapter modules for the same surfaces. +- Capability-aware gating is enforced before turn start; critical adapter blockers abort flow with explicit reasons. +- Final-phase work focuses on runtime validation on real platform environments. diff --git a/docs/architecture/adr-001-rust-core-native-shells.md b/docs/architecture/adr-001-rust-core-native-shells.md new file mode 100644 index 00000000..ae42b7ce --- /dev/null +++ b/docs/architecture/adr-001-rust-core-native-shells.md @@ -0,0 +1,36 @@ +# ADR-001: Adopt Rust Core + Native Platform Shells + +## Status +Accepted + +## Date +2026-04-16 + +## Decision +Adopt a shared Rust core for deterministic logic (policy, skills, orchestration) while keeping native UI/capability shells per platform. + +## Drivers +1. Reduce logic duplication across future desktop/mobile clients. +2. Preserve platform-specific performance and capability integrations. +3. Minimize release risk by incrementally migrating from existing macOS host. + +## Alternatives Considered +1. Full cross-platform UI rewrite first. +- Rejected because it introduces high migration risk and delays policy/orchestration reuse. + +2. Keep all logic inside each platform shell. +- Rejected because policy/orchestration drift and maintenance cost increase over time. + +3. Backend-only policy with thin clients. +- Rejected because core turn orchestration and offline-ish local behavior still require deterministic client logic. + +## Consequences +1. FFI/versioning contracts become a first-class maintenance concern. +2. Platform shell teams remain responsible for native capture/hotkey/overlay behavior. +3. Migration can proceed in phases with fallback paths to keep shipping continuity. + +## Follow-ups +1. Keep ABI mapping docs in sync between Rust and Swift. +2. Expand fixture-driven parity tests as new modules move to Rust. +3. Define Linux desktop environment support scope before adapter phase. + diff --git a/docs/architecture/capability-matrix.md b/docs/architecture/capability-matrix.md new file mode 100644 index 00000000..095ecaa2 --- /dev/null +++ b/docs/architecture/capability-matrix.md @@ -0,0 +1,25 @@ +# Capability Matrix + +Status key: +- `ready`: implemented in current host +- `planned`: defined in migration plan, not yet implemented +- `unknown`: requires discovery spike + +| Capability | macOS (current Swift host) | Windows shell target | Linux shell target | Rust core role | +| --- | --- | --- | --- | --- | +| Auth session lifecycle | ready | planned | planned | shared contracts + validation | +| Entitlement checks | ready | planned | planned | deterministic policy engine | +| Trial/cap accounting | ready | planned | planned | deterministic transitions | +| Admin allowlist policy | ready | planned | planned | stable WorkOS-user-id decision path | +| Realtime orchestration | ready | planned | planned | shared state machine | +| Prompt composition | ready | planned | planned | shared skill/prompt module | +| Screen capture | ready | planned | planned | capability flag + result normalization | +| Global push-to-talk hotkey | ready | planned | planned | capability flag + event contracts | +| Overlay pointing UI | ready | planned | planned | event contracts only | +| App update channel | ready | planned | planned | release metadata conventions | + +## Notes +- Capture, hotkey, and overlay stay native because they depend on OS-specific primitives. +- Rust core should only define contracts, state transitions, and deterministic behavior. +- Platform adapters should report capability availability so UX can degrade gracefully. + diff --git a/docs/architecture/core-boundaries.md b/docs/architecture/core-boundaries.md new file mode 100644 index 00000000..714abf7d --- /dev/null +++ b/docs/architecture/core-boundaries.md @@ -0,0 +1,40 @@ +# Core Boundaries + +This document defines migration boundaries between the existing Swift host and the new shared Rust core. + +## Goal +- Keep shipping the macOS host while extracting deterministic, reusable logic to Rust. +- Preserve native platform capability integrations in host shells. + +## Ownership Split + +### Remains in Native Shells (Platform-specific) +- Screen capture API integration +- Global hotkey registration/listening +- Overlay/window composition and animation +- Permission prompts and OS dialogs +- Device-level audio session details + +### Moves to Rust Core (Platform-agnostic) +- Entitlement/trial/cap decision logic +- Admin policy behavior +- Skill prompt composition and budget logic +- Realtime lifecycle orchestration state machine +- Shared telemetry event schema and aggregations + +## Contract Model +- Native shell gathers runtime inputs (user ID, entitlement status, usage counters, current capabilities). +- Rust core returns deterministic decisions/events. +- Native shell executes side effects. + +## Migration Sequence +1. Policy extraction +2. Skills/prompt extraction +3. Realtime state extraction +4. New platform shell onboarding + +## Non-goals for Initial Migration +- Replacing macOS UI layer +- Replacing platform capture/hotkey/overlay with cross-platform abstractions in one step +- Forcing Linux/Windows full visual parity before core parity + diff --git a/docs/architecture/final-phase-validation-report.md b/docs/architecture/final-phase-validation-report.md new file mode 100644 index 00000000..5b80c8e5 --- /dev/null +++ b/docs/architecture/final-phase-validation-report.md @@ -0,0 +1,53 @@ +# Final Phase Validation Report + +Date: 2026-04-18 +Branch: `feature/skills-bridge-swift` + +## Scope +Final-phase validation of Rust core migration artifacts with adapter-backed shell binaries. + +Validated areas: +1. Workspace formatting/check/test integrity. +2. FFI build output. +3. Linux and Windows shell baseline turn flow. +4. Release-script static guard. +5. Adapter capability mapping unit tests. + +## Commands Executed +```bash +cargo fmt --all -- --check +cargo check --workspace +cargo test --workspace +cargo build -p skilly-core-ffi +cargo run -p skilly-linux-shell -- --smoke +cargo run -p skilly-windows-shell -- --smoke +bash -n scripts/release.sh +``` + +## Results +- `cargo fmt --all -- --check`: pass +- `cargo check --workspace`: pass +- `cargo test --workspace`: pass + - includes shell adapter unit tests: + - windows: 3 tests passed + - linux: 4 tests passed +- `cargo build -p skilly-core-ffi`: pass +- `cargo run -p skilly-linux-shell -- --smoke`: pass + - completed phase with capability snapshot reported +- `cargo run -p skilly-windows-shell -- --smoke`: pass + - completed phase with capability snapshot reported +- `bash -n scripts/release.sh`: pass + +## Key Outcomes +1. Adapter-backed shells now validate capability gating logic in tests and smoke execution. +2. Rust policy/skills/realtime/ffi modules pass full workspace validation. +3. CI workflow now includes all required gate categories: + - rust core check + - rust core test + - ffi smoke + - shell smoke + - mac release guard + +## Remaining Manual/Platform Work +1. Xcode runtime validation for Swift bridge behavior (policy/skills/realtime) in live app. +2. Native host-app runtime verification on real Windows/Linux desktop environments beyond CLI shell binaries. diff --git a/docs/architecture/realtime-production-like-traces-report.md b/docs/architecture/realtime-production-like-traces-report.md new file mode 100644 index 00000000..a5d89dfe --- /dev/null +++ b/docs/architecture/realtime-production-like-traces-report.md @@ -0,0 +1,29 @@ +# Realtime Production-Like Traces Expansion Report + +Date: 2026-04-18 +Branch: `feature/skills-bridge-swift` + +## Scope +Expanded deterministic replay fixtures to improve production-like lifecycle coverage for `core/realtime`. + +## Changes +- Updated `core/realtime/fixtures/replay_traces.json`. +- Trace corpus expanded from 3 traces to 20 traces. +- Added mixed-path coverage including: + - sequential multi-turn completion + - audio playback transition paths + - error and recovery loops + - reset-heavy flows + - late error after completed phase + - turn restart before commit + +## Verification +```bash +cargo test -p skilly-core-realtime +cargo test --workspace +``` + +Both commands passed with the expanded fixture corpus. + +## Remaining Work +1. Continue appending fixtures from newly observed production telemetry patterns over time. diff --git a/docs/architecture/runtime-validation-report-2026-04-18.md b/docs/architecture/runtime-validation-report-2026-04-18.md new file mode 100644 index 00000000..b3a3ad9d --- /dev/null +++ b/docs/architecture/runtime-validation-report-2026-04-18.md @@ -0,0 +1,83 @@ +# Runtime Validation Report + +Generated (UTC): 2026-04-18 +Branch: `feature/skills-bridge-swift` +Commit: `9b47766` + +Source template: `docs/architecture/runtime-validation-signoff-template.md` + +## Signoff Metadata +- Owner: +- Reviewer: +- Validation date: +- Branch: +- Commit SHA: +- Build identifiers (if applicable): + +## Decision +- Overall status: `approved` | `blocked` +- Blocking notes: +- Follow-up ticket(s): + +## Required Runtime Lanes +| Lane | Status (`pass`/`fail`/`n-a`) | Evidence link (video/log/screenshot) | Notes | +| --- | --- | --- | --- | +| macOS host app (Rust lane via `SKILLY_RUST_CORE_DYLIB_PATH`) | | | | +| macOS host app (Swift fallback lane; Rust dylib disabled) | | | | +| macOS host app (packaged Rust FFI artifact consumed) | | | | +| Windows native host app runtime (real desktop environment) | | | | +| Linux native host app runtime (real desktop environment) | | | | +| iOS host app simulator runtime (generated SDK integration) | | | | +| Android host app emulator runtime (generated SDK integration) | | | | + +## Policy Scenario Parity (Rust lane vs Swift lane) +| Scenario | Rust lane result | Swift lane result | Match (`yes`/`no`) | Notes | +| --- | --- | --- | --- | --- | +| Trial under cap -> allowed | | | | | +| Trial exhausted -> blocked (`trialExhausted`) | | | | | +| Active under cap -> allowed | | | | | +| Active over cap -> blocked (`capReached`) | | | | | +| Admin over cap/expired -> allowed | | | | | +| Canceled valid access under cap -> allowed | | | | | +| Canceled valid access over cap -> blocked (`capReached`) | | | | | + +## Skill Prompt Parity (Rust lane vs Swift lane) +| Scenario | Rust lane result | Swift lane result | Match (`yes`/`no`) | Notes | +| --- | --- | --- | --- | --- | +| Full vocabulary budget | | | | | +| Vocabulary trimming applied | | | | | +| Completed-stage history included | | | | | +| Missing-current-stage fallback | | | | | +| Pointing mode variants (`always`, `when-relevant`, `minimal`) | | | | | + +## Realtime Lifecycle Parity (Rust lane vs Swift lane) +| Scenario | Rust lane result | Swift lane result | Match (`yes`/`no`) | Notes | +| --- | --- | --- | --- | --- | +| `turn_started -> audio_capture_committed -> audio_playback_started -> response_completed` | | | | | +| `turn_started -> audio_capture_committed -> session_error` | | | | | +| `completed -> session_reset` | | | | | +| Invalid ordering rejection (`turn_started -> response_completed`) | | | | | + +## Windows/Linux Host-App Runtime Checklist +- Windows auth + entitlement + turn-start flow validated in native host app: +- Windows capture/hotkey/overlay/audio/permissions behavior validated: +- Linux auth + entitlement + turn-start flow validated in native host app: +- Linux capture/hotkey/overlay/audio/permissions behavior validated: +- Known platform-specific degradations documented: + +## iOS/Android Host-App Runtime Checklist +- iOS host app compiles with generated Swift SDK: +- iOS runtime exercise of policy and realtime API completed: +- Android host app compiles with generated Kotlin SDK: +- Android runtime exercise of policy and realtime API completed: +- ABI/library loading behavior documented for both platforms: + +## Packaged Artifact Consumption Checklist +- `dist/rust-ffi/*.tar.gz` unpack + load test executed: +- `dist/mobile-sdk/*.tar.gz` unpack + sample integration test executed: +- Checksum verification completed: + +## Evidence Attachments +- Command transcript links: +- Screen recordings: +- Crash logs / diagnostics: diff --git a/docs/architecture/runtime-validation-signoff-template.md b/docs/architecture/runtime-validation-signoff-template.md new file mode 100644 index 00000000..acfc11b8 --- /dev/null +++ b/docs/architecture/runtime-validation-signoff-template.md @@ -0,0 +1,75 @@ +## Signoff Metadata +- Owner: +- Reviewer: +- Validation date: +- Branch: +- Commit SHA: +- Build identifiers (if applicable): + +## Decision +- Overall status: `approved` | `blocked` +- Blocking notes: +- Follow-up ticket(s): + +## Required Runtime Lanes +| Lane | Status (`pass`/`fail`/`n-a`) | Evidence link (video/log/screenshot) | Notes | +| --- | --- | --- | --- | +| macOS host app (Rust lane via `SKILLY_RUST_CORE_DYLIB_PATH`) | | | | +| macOS host app (Swift fallback lane; Rust dylib disabled) | | | | +| macOS host app (packaged Rust FFI artifact consumed) | | | | +| Windows native host app runtime (real desktop environment) | | | | +| Linux native host app runtime (real desktop environment) | | | | +| iOS host app simulator runtime (generated SDK integration) | | | | +| Android host app emulator runtime (generated SDK integration) | | | | + +## Policy Scenario Parity (Rust lane vs Swift lane) +| Scenario | Rust lane result | Swift lane result | Match (`yes`/`no`) | Notes | +| --- | --- | --- | --- | --- | +| Trial under cap -> allowed | | | | | +| Trial exhausted -> blocked (`trialExhausted`) | | | | | +| Active under cap -> allowed | | | | | +| Active over cap -> blocked (`capReached`) | | | | | +| Admin over cap/expired -> allowed | | | | | +| Canceled valid access under cap -> allowed | | | | | +| Canceled valid access over cap -> blocked (`capReached`) | | | | | + +## Skill Prompt Parity (Rust lane vs Swift lane) +| Scenario | Rust lane result | Swift lane result | Match (`yes`/`no`) | Notes | +| --- | --- | --- | --- | --- | +| Full vocabulary budget | | | | | +| Vocabulary trimming applied | | | | | +| Completed-stage history included | | | | | +| Missing-current-stage fallback | | | | | +| Pointing mode variants (`always`, `when-relevant`, `minimal`) | | | | | + +## Realtime Lifecycle Parity (Rust lane vs Swift lane) +| Scenario | Rust lane result | Swift lane result | Match (`yes`/`no`) | Notes | +| --- | --- | --- | --- | --- | +| `turn_started -> audio_capture_committed -> audio_playback_started -> response_completed` | | | | | +| `turn_started -> audio_capture_committed -> session_error` | | | | | +| `completed -> session_reset` | | | | | +| Invalid ordering rejection (`turn_started -> response_completed`) | | | | | + +## Windows/Linux Host-App Runtime Checklist +- Windows auth + entitlement + turn-start flow validated in native host app: +- Windows capture/hotkey/overlay/audio/permissions behavior validated: +- Linux auth + entitlement + turn-start flow validated in native host app: +- Linux capture/hotkey/overlay/audio/permissions behavior validated: +- Known platform-specific degradations documented: + +## iOS/Android Host-App Runtime Checklist +- iOS host app compiles with generated Swift SDK: +- iOS runtime exercise of policy and realtime API completed: +- Android host app compiles with generated Kotlin SDK: +- Android runtime exercise of policy and realtime API completed: +- ABI/library loading behavior documented for both platforms: + +## Packaged Artifact Consumption Checklist +- `dist/rust-ffi/*.tar.gz` unpack + load test executed: +- `dist/mobile-sdk/*.tar.gz` unpack + sample integration test executed: +- Checksum verification completed: + +## Evidence Attachments +- Command transcript links: +- Screen recordings: +- Crash logs / diagnostics: diff --git a/docs/architecture/rust-core-native-shells-prd.md b/docs/architecture/rust-core-native-shells-prd.md new file mode 100644 index 00000000..ee69a616 --- /dev/null +++ b/docs/architecture/rust-core-native-shells-prd.md @@ -0,0 +1,78 @@ +# PRD: Rust Core + Native Shells + +## Summary +Skilly will evolve from a macOS-only implementation into a cross-platform architecture with: +- a shared Rust core for deterministic business and orchestration logic +- native shell apps per platform for OS-specific capability integrations + +This PRD defines the product requirements, constraints, success metrics, and release strategy for that migration. + +## Problem +Today, critical behavior is implemented only in the macOS Swift host. This creates: +- slower Windows/Linux enablement due to duplicated logic work +- risk of policy/orchestration drift across future platform clients +- no reusable mobile-facing core API layer + +## Goals +1. Share policy, skill composition, and session orchestration logic across platforms. +2. Preserve native UX and high-performance platform integrations. +3. Keep macOS shipping continuously while migration proceeds. +4. Provide a clear path to iOS/Android SDK reuse from the same core. + +## Non-goals +1. Rewrite current macOS UI in a cross-platform UI framework. +2. Achieve full pixel-perfect parity across all platforms in v1. +3. Replace platform capture/hotkey/overlay APIs with one abstraction immediately. + +## Stakeholders +- Product owner: determines rollout readiness and priority across platforms. +- Core platform team: builds Rust core modules and FFI contracts. +- Shell teams: implement and maintain platform adapters and UX. +- QA/release: validates parity and release gates before ship decisions. + +## Users +- Current Skilly desktop users on macOS. +- Future desktop users on Windows/Linux. +- Future mobile app teams integrating shared logic via SDKs. + +## Requirements + +### Functional Requirements +1. Entitlement decisions are computed from one shared policy engine. +2. Admin allowlist behavior is keyed by stable WorkOS user IDs. +3. Skill prompt composition is generated by shared logic. +4. Realtime session lifecycle behavior is deterministic and replayable. +5. Platform shells can execute no-op and normal turns through shared core contracts. + +### Platform Requirements +1. macOS shell remains primary release channel during migration. +2. Windows/Linux shells support auth + entitlement + turn-start baseline. +3. Shell adapters explicitly report capability availability to UX. + +### Developer Experience Requirements +1. Rust workspace builds/tests in CI and local environments. +2. FFI contract changes are versioned and documented. +3. Parity fixtures are executable and version-controlled. + +## Constraints +1. Do not break macOS release flow while migrating internals. +2. Keep key secrets/API keys on backend worker, not client binaries. +3. Avoid large-bang rewrites; use incremental phase gates. + +## Success Metrics +1. Policy parity: 100% pass rate on agreed fixture set between Swift baseline and Rust engine. +2. Orchestration reliability: replay tests pass for at least 20 representative traces. +3. Desktop bootstrap: Windows/Linux shells can complete auth + entitlement + turn-start smoke tests. +4. Release continuity: no regression in macOS release pipeline. + +## Rollout Strategy +1. Migrate policy first, then skills, then realtime orchestration. +2. Keep Swift fallback paths until Rust paths are proven stable. +3. Add Windows/Linux shells only after core interfaces stabilize. +4. Enable mobile SDK extraction after desktop contracts settle. + +## Open Questions +1. Which Linux desktop environments are in v1 support scope? +2. Minimum Windows version baseline for shell + capture integrations? +3. Which realtime transport path should be canonical long-term in core contracts? + diff --git a/docs/architecture/rust-core-native-shells-roadmap.md b/docs/architecture/rust-core-native-shells-roadmap.md new file mode 100644 index 00000000..b723ad56 --- /dev/null +++ b/docs/architecture/rust-core-native-shells-roadmap.md @@ -0,0 +1,124 @@ +# Roadmap: Rust Core + Native Shells + +## Phase Tracker + +| Phase | Name | Status | Exit Criteria | +| --- | --- | --- | --- | +| 0 | Baseline + Safety Rails | Complete | Boundaries/capability docs + core scaffold + initial fixtures | +| 1 | Policy Core Extraction | Complete | Entitlement gate uses Rust bridge with safe fallback | +| 2 | Skill Prompt Core Extraction | Complete | Skill composition parity fixtures pass on Rust path | +| 3 | Realtime Orchestration Extraction | Complete | Session lifecycle replay suite passes on Rust path | +| 4 | Windows/Linux Shell Bootstrap | Complete | Both shells run auth + entitlement + turn-start smoke flow | +| 5 | Real Platform Adapters | Complete | Capture/hotkey/overlay baseline adapter contracts wired with capability-aware gating and validated by automated checks | +| 6 | Mobile SDK Surface | Complete | Swift/Kotlin SDK bindings and sample integrations available | + +## Detailed Phases + +### Phase 0: Baseline + Safety Rails +Completed: +- core boundaries documented +- capability matrix documented +- adapter contracts documented +- Rust workspace scaffolded (`core/domain`, `core/policy`, `core/skills`, `core/realtime`, `core/ffi`) +- policy fixtures and baseline tests added + +Remaining: +- Execute parity harness runs in Xcode and publish evidence snapshots + +### Phase 1: Policy Core Extraction +Completed: +- Rust policy engine implements can-start-turn decisions +- C ABI entrypoint exposed from `core/ffi` +- Swift `EntitlementManager.canStartTurn()` calls Rust first with Swift fallback +- `TrialTracker` and `UsageTracker` checks route through Rust policy bridge + +Remaining: +- Add Xcode-run integration tests around fallback behavior and bridge availability +- Validate packaged dylib artifact consumption in a full Xcode host-app lane + +### Phase 2: Skill Prompt Core Extraction +Completed: +1. Mirrored skill metadata and prompt composer contracts in `core/skills`. +2. Added fixture-driven parity tests for prompt composition. +3. Routed Swift `SkillPromptComposer` generation through Rust bridge with fallback. +4. Added FFI-level compose prompt parity test in `core/ffi`. + +Exit Criteria: +- Rust prompt outputs match current expected outputs for agreed fixture corpus. + +### Phase 3: Realtime Orchestration Extraction +Completed: +1. Added canonical turn/session state machine in `core/realtime`. +2. Added replay harness with fixture-driven traces. +3. Added `RustRealtimeBridge` and routed Swift `CompanionManager` turn lifecycle events through Rust replay transitions. + +Remaining: +1. Continue appending replay traces from ongoing production telemetry samples. + +Tasks: +1. Define canonical turn/session state machine in `core/realtime`. +2. Add replay harness for event sequence validation. +3. Move Swift orchestration decisions into Rust. + +Exit Criteria: +- Replay suite passes and no known regression in core turn lifecycle behavior. + +### Phase 4: Windows/Linux Shell Bootstrap +Completed: +1. Created shell bootstrap crates (`apps/windows-shell`, `apps/linux-shell`). +2. Implemented mocked auth + entitlement + turn-start smoke flows using shared core. +3. Added CI shell smoke runs on Ubuntu and Windows runners. +4. Wired shell bootstrap binaries to explicit platform adapter contracts for auth, entitlement, capture, hotkey, overlay, audio, and permissions. + +Remaining: +1. Native UI-level runtime verification in full platform host apps (outside Rust shell binaries). + +Tasks: +1. Create shell skeletons and bridge wiring. +2. Implement auth + entitlement + turn-start baseline. +3. Add per-platform smoke tests. + +Exit Criteria: +- Shell apps can complete baseline flow with shared core logic. + +### Phase 5: Real Platform Adapters +Completed: +1. Added Windows capture/hotkey/overlay/audio/permission adapter modules with capability contract outputs. +2. Added Linux capture/hotkey/overlay/audio/permission adapter modules with session-aware capability contract outputs. +3. Added capability-aware turn-start gating with explicit blocker reporting for unavailable critical adapters. +4. Removed mocked-only shell flow by routing both shells through adapter-backed execution path. + +Remaining: +1. Native host-app runtime verification on real Windows/Linux desktop environments (beyond CLI shell binaries). + +Exit Criteria: +- Baseline interactive behavior available on supported platform scope. + +### Phase 6: Mobile SDK Surface +Completed: +1. Added `core/mobile-sdk` UniFFI-exported crate for selected policy + realtime replay APIs. +2. Added reproducible binding generation script: `scripts/generate-mobile-sdk-bindings.sh`. +3. Generated Swift bindings in `sdk/ios/generated` and Kotlin bindings in `sdk/android/generated`. +4. Added integration samples: + - `sdk/ios/sample/PolicyAndRealtimeExample.swift` + - `sdk/android/sample/src/main/kotlin/app/tryskilly/sdk/PolicyAndRealtimeExample.kt` + +Remaining: +1. Wire generated SDK artifacts into full iOS/Android host apps with simulator/device runtime validation. + +Exit Criteria: +- SDK consumers can run policy + selected orchestration flows using shared core. + +## Dependency Graph +1. Phase 0 and 1 are prerequisites for all later phases. +2. Phase 2 and 3 can run in parallel after policy contracts stabilize. +3. Phase 4 depends on stable FFI contracts from phases 1-3. +4. Phase 5 depends on platform shell bootstrap readiness. +5. Phase 6 depends on stable domain/policy/realtime public contracts. + +## Delivery Cadence +1. Phase branches merged independently behind safe fallback paths. +2. Every phase requires: +- explicit acceptance criteria check +- documented risks + mitigation updates +- release impact assessment for macOS pipeline diff --git a/docs/architecture/rust-core-native-shells-test-spec.md b/docs/architecture/rust-core-native-shells-test-spec.md new file mode 100644 index 00000000..eeaa6a86 --- /dev/null +++ b/docs/architecture/rust-core-native-shells-test-spec.md @@ -0,0 +1,123 @@ +# Test Specification: Rust Core + Native Shells Migration + +## Purpose +Define verification coverage and release gates for migration phases so behavior remains stable while logic moves from Swift-only implementation to shared Rust core modules. + +Related references: +- `docs/architecture/swift-rust-fallback-parity-harness.md` +- `docs/architecture/rust-dylib-packaging-strategy.md` + +## Scope +1. Rust core unit and contract tests. +2. Swift bridge integration behavior (Rust-available and fallback modes). +3. Cross-platform shell smoke tests for baseline workflows. +4. Release pipeline safeguards for macOS continuity. +5. Mobile SDK binding generation and sample-usage coverage. + +## Test Layers + +### Layer 1: Rust Unit Tests +Applies to: +- `core/domain` +- `core/policy` +- `core/skills` +- `core/realtime` + +Coverage: +1. Policy decisions across entitlement states and caps. +2. Admin allowlist behavior keyed by WorkOS user IDs. +3. Boundary values (`= max`, `max - 1`, no user id). + +### Layer 2: Rust Contract/Fixture Tests +Input fixtures: +- `core/policy/fixtures/can_start_turn_cases.json` + +Coverage: +1. Every fixture case returns exact expected decision + reason. +2. Fixture schema validation and deterministic ordering. + +### Layer 3: Swift Bridge Integration +Targets: +- `RustPolicyBridge` +- `RustSkillsBridge` +- `RustRealtimeBridge` +- `EntitlementManager.canStartTurn()` +- `CompanionManager` turn lifecycle event tracking + +Scenarios: +1. Rust dylib available -> Rust result used. +2. Rust dylib unavailable -> Swift fallback path used. +3. Reason code mapping remains correct across ABI boundaries. + +### Layer 4: Host Behavior Smoke Tests +macOS: +1. Turn-start behavior remains correct for trial, active, capped, admin users. +2. Existing release pipeline still produces notarized DMG and appcast update. + +Windows/Linux (future phases): +1. Auth session established. +2. Entitlement fetched. +3. Turn-start baseline path executes. + +Windows/Linux (current shell binaries): +1. Capability-aware adapter gating executes before turn start. +2. Turn lifecycle replay reaches completed phase for baseline flow. +3. Explicit blocker reasons are emitted for unavailable critical adapters. + +### Layer 5: Mobile SDK Surface +Targets: +- `core/mobile-sdk` +- `scripts/generate-mobile-sdk-bindings.sh` +- `sdk/ios/generated` +- `sdk/android/generated` + +Coverage: +1. UniFFI-exported policy + realtime replay APIs compile and pass crate tests. +2. Swift and Kotlin bindings generate from a built mobile-sdk library. +3. Sample integration code remains aligned with generated API names and passes runtime consumer validation. + +## Acceptance Gates by Phase + +### Gate A (Phase 1 complete) +1. `cargo check` passes. +2. `cargo test` passes. +3. Rust bridge can produce expected decisions via C ABI smoke call. +4. `EntitlementManager.canStartTurn()` uses Rust first with safe fallback. + +### Gate B (Phase 2 complete) +1. Skill prompt fixture parity passes for selected corpus. +2. No known regression in skill activation and stage progression behavior. +3. FFI compose-prompt parity test passes against shared skills fixture. + +### Gate C (Phase 3 complete) +1. Realtime replay suite passes across representative traces. +2. No known regression in turn lifecycle boundaries. + +### Gate D (Phase 4 complete) +1. Windows and Linux shells pass baseline smoke tests. +2. Shared core decisions are observable in shell logs/traces. + +### Gate E (Phase 5 complete) +1. Platform adapter capabilities function in defined support matrix. +2. Capability fallback behavior is explicit and user-safe. + +### Gate F (Phase 6 complete) +1. `cargo test -p skilly-core-mobile-sdk` passes. +2. `./scripts/generate-mobile-sdk-bindings.sh` generates Swift + Kotlin bindings without errors. +3. `./scripts/validate-mobile-sdk-consumers.sh` passes (Kotlin/JVM runtime + macOS Swift runtime lanes). + +## Required CI Jobs +1. `rust-core-check`: `cargo check` +2. `rust-core-test`: `cargo test` +3. `ffi-smoke`: C ABI smoke test against built dylib +4. `shell-smoke`: windows/linux bootstrap smoke flow +5. `mac-release-guard`: verify release script preconditions and static checks +6. `mobile-sdk-bindings`: verify UniFFI binding generation command and artifacts +7. `mobile-sdk-consumer-validation`: run generated binding runtime validation against sample consumers +8. `mobile-sdk-artifacts`: build/package/publish mobile SDK + Rust FFI release artifacts + +## Known Gaps +1. No terminal `xcodebuild` execution in this repo by policy. +2. Swift bridge compile/runtime validation must be completed inside Xcode workflow. +3. End-to-end native host-app runtime validation across Windows/Linux compositor/audio environments remains pending beyond CLI shell binaries. +4. iOS/Android full simulator/device host-app runtime validation remains required beyond CLI/JVM sample lanes. diff --git a/scripts/README.md b/scripts/README.md index 4c4549d0..61c5abc5 100644 --- a/scripts/README.md +++ b/scripts/README.md @@ -1,8 +1,8 @@ -# Release Scripts +# Scripts ## `release.sh` — Ship a new version of Skilly -Automates the full release pipeline: build → sign → DMG → notarize → Sparkle appcast → GitHub Release. +Automates the full macOS app release pipeline: build -> sign -> DMG -> notarize -> Sparkle appcast -> GitHub Release. ### Quick start @@ -11,8 +11,6 @@ Automates the full release pipeline: build → sign → DMG → notarize → Spa ./scripts/release.sh ``` -The script checks GitHub for the latest release (e.g. `v1.5`, build 6) and automatically bumps to `v1.6`, build 7. You'll see a confirmation prompt before anything runs. - ### Override version or build ```bash @@ -23,40 +21,66 @@ The script checks GitHub for the latest release (e.g. `v1.5`, build 6) and autom ./scripts/release.sh 2.0 10 ``` -### Safety - -- **Duplicate detection**: If the tag already exists on GitHub, the script exits with an error and suggests what to do. -- **Confirmation prompt**: Shows the version, build, and previous release before proceeding. Press `y` to continue. - -### What it does - -1. Fetches the latest release from GitHub to determine version + build -2. Archives the app via `xcodebuild` -3. Exports a signed `.app` with Developer ID -4. Creates a DMG with the drag-to-Applications background -5. Notarizes the DMG with Apple (Gatekeeper compliance) -6. Signs the DMG with the Sparkle EdDSA key -7. Generates `appcast.xml` for Sparkle auto-updates -8. Creates a GitHub Release with the DMG attached -9. Pushes the updated `appcast.xml` to the releases repo - -### One-time setup (prerequisites) - -1. **Xcode** with your Developer ID signing certificate -2. **Homebrew tools**: - ```bash - brew install create-dmg gh - ``` -3. **GitHub CLI auth**: - ```bash - gh auth login - ``` -4. **Apple notarization credentials** (stored in Keychain): - ```bash - xcrun notarytool store-credentials "AC_PASSWORD" \ - --apple-id YOUR_APPLE_ID \ - --team-id YOUR_TEAM_ID - ``` - You'll be prompted for an app-specific password (generate one at [appleid.apple.com](https://appleid.apple.com)). -5. **Sparkle EdDSA key** — already generated and stored in Keychain (done during initial Sparkle setup) -6. **Build the project in Xcode at least once** so SPM downloads Sparkle and the Sparkle CLI tools are available +## `generate-mobile-sdk-bindings.sh` — Regenerate Swift/Kotlin SDK bindings + +Builds `skilly-core-mobile-sdk` and regenerates UniFFI bindings: + +- `sdk/ios/generated/` +- `sdk/android/generated/` + +```bash +./scripts/generate-mobile-sdk-bindings.sh +``` + +## `validate-mobile-sdk-consumers.sh` — Runtime-check mobile SDK consumers + +Runs end-to-end consumer validation against generated bindings: + +1. Regenerates bindings. +2. Builds mobile SDK Rust library. +3. Compiles and runs Kotlin/JVM sample against generated Kotlin bindings. +4. On macOS, compiles and runs Swift sample against generated Swift bindings. + +```bash +./scripts/validate-mobile-sdk-consumers.sh +``` + +## `package-mobile-sdk.sh` — Create distributable SDK artifact + +Packages generated bindings, samples, and the host release dynamic library into `dist/mobile-sdk/`. + +```bash +./scripts/package-mobile-sdk.sh +``` + +Output example: + +- `dist/mobile-sdk/skilly-mobile-sdk-v0.1.0-darwin.tar.gz` +- `dist/mobile-sdk/skilly-mobile-sdk-v0.1.0-darwin.tar.gz.sha256` + +## `package-rust-ffi-dylib.sh` — Create distributable core FFI artifact + +Builds `skilly-core-ffi --release` and packages the release library into `dist/rust-ffi/`. + +```bash +./scripts/package-rust-ffi-dylib.sh +``` + +Output example: + +- `dist/rust-ffi/skilly-core-ffi-v0.1.0-darwin.tar.gz` +- `dist/rust-ffi/skilly-core-ffi-v0.1.0-darwin.tar.gz.sha256` + +## `create-runtime-validation-report.sh` — Scaffold strict sign-off report + +Generates a dated manual-runtime validation report scaffold in `docs/architecture/` using the strict template: + +```bash +./scripts/create-runtime-validation-report.sh +``` + +Custom output path: + +```bash +./scripts/create-runtime-validation-report.sh docs/architecture/runtime-validation-report-custom.md +``` diff --git a/scripts/create-runtime-validation-report.sh b/scripts/create-runtime-validation-report.sh new file mode 100755 index 00000000..0147cb3e --- /dev/null +++ b/scripts/create-runtime-validation-report.sh @@ -0,0 +1,36 @@ +#!/usr/bin/env bash +set -euo pipefail + +REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +cd "$REPO_ROOT" + +TEMPLATE_PATH="$REPO_ROOT/docs/architecture/runtime-validation-signoff-template.md" +if [[ ! -f "$TEMPLATE_PATH" ]]; then + echo "Template not found at $TEMPLATE_PATH" >&2 + exit 1 +fi + +DATE_UTC="$(date -u +%Y-%m-%d)" +BRANCH_NAME="$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)" +COMMIT_SHA="$(git rev-parse --short HEAD 2>/dev/null || echo unknown)" + +DEFAULT_OUTPUT_PATH="$REPO_ROOT/docs/architecture/runtime-validation-report-${DATE_UTC}.md" +OUTPUT_PATH="${1:-$DEFAULT_OUTPUT_PATH}" + +if [[ -f "$OUTPUT_PATH" ]]; then + echo "Output file already exists: $OUTPUT_PATH" >&2 + echo "Provide a custom path as the first argument to avoid overwriting existing evidence." >&2 + exit 1 +fi + +{ + printf '# Runtime Validation Report\n\n' + printf 'Generated (UTC): %s\n' "$DATE_UTC" + printf 'Branch: `%s`\n' "$BRANCH_NAME" + printf 'Commit: `%s`\n\n' "$COMMIT_SHA" + printf 'Source template: `docs/architecture/runtime-validation-signoff-template.md`\n\n' +} > "$OUTPUT_PATH" + +cat "$TEMPLATE_PATH" >> "$OUTPUT_PATH" + +echo "Created runtime validation report scaffold: $OUTPUT_PATH"