Skip to content

Latest commit

 

History

History
676 lines (506 loc) · 48.2 KB

File metadata and controls

676 lines (506 loc) · 48.2 KB

REFACTOR.md — Pi Desktop

A first-principles refactor plan. Pristine code, strong boundaries, explicit contracts, and one honest way to do each thing.

This document is written after a full pass over the repository. It is opinionated on purpose. Every item is justified from first principles — not "we should do X because X is popular", but "the code has property P, P causes harm H, doing X removes H without adding cost C."

The plan is phased so each phase is independently mergeable, independently verifiable, and never leaves main broken.

Progress Snapshot

Status as of 2026-04-25, based on the work landed on this branch:

  • Repo-level verification is green: bun run lint, bun run lint:imports, bun run typecheck, bun run test, bun run build, node scripts/perf-bundle.mjs, and node scripts/perf-startup.mjs all pass.
  • Test suite reach: 272 test files (counted via spec-file count (fallback); run bun run test for the authoritative pass/skip breakdown).
  • These numbers are regenerated by bun run refactor:snapshot -- --update-refactor-md; do not hand-edit.

Phase Status

Phase Status Landed So Far Still Open
1. Repo hygiene Partial Single cn; duplicate desktop tooltip / textarea / prompt-input / chat-container removed; README truth pass; lru-map.spec.ts moved next to source icon surface policy, full test-location cleanup, Biome rule audit
2. Contracts + schema IPC Partial packages/contracts/ added; minimal contract-driven snapshot channels; preload/router contract seed; import-boundary test coverage payload-bearing channels, generated renderer IPC client, payload-parsers.ts deletion, compile-time full handler coverage
3. Catalog base Partial DocumentCatalog added under packages/shared/src/catalog; app-preferences, repository-preferences, selection-state, repository, and thread catalogs migrated internally; workspace-session-window-sanitizer.ts extracted from workspace-session-catalog.ts schema/versioned persistence, corruption backup flow in shared base, full workspace-session migration onto shared substrate, relocation under main/catalogs/
4. Bootstrap decomposition Partial application-menu.ts, main-window-lifecycle.ts, main-window.ts, agent-host-connection.ts, oauth-prompt-bridge.ts, active-thread-deletion.ts, workspace-selection-actions.ts, workspace-activation-router.ts, thread-workspace-actions.ts, thread-context-actions.ts, workspace-removal-actions.ts, shell-state-ipc.ts, initial-workspace-activation.ts, and app-lifecycle.ts extracted from main/index.ts SessionStore, DesktopBootstrap, proper dialog-based OAuth flow, IPC registration extraction, final index.ts reduction to target size
5. Git service split Partial git/path-utils.ts, status-parsers.ts, diff-parsers.ts, inspection.ts, inspection-outcomes.ts, repository-meta.ts, git-command-runner.ts, git-command-status.ts, repository-status.ts, repository-status-loader.ts, worktree-git-summary.ts, worktree-inspection.ts, cache-invalidation.ts, file-diff.ts, and worktree-creation.ts extracted; upstream/default-branch/worktree-summary/status assembly, repository-status loading, cache invalidation, diff assembly, worktree-creation planning, checked-command status refresh, and worktree summary inspection now delegated out; GitWorktreeService API preserved major module split into status / worktrees / diff / staging / remote, explicit cache module cleanup, remaining worktree helpers/effect wrapper reduction
6. Renderer decomposition Partial left-sidebar-workspaces-panel.tsx, left-sidebar-workspace-tree.tsx, left-sidebar-menus.tsx, left-sidebar-chrome.tsx, git-panel-commit-composer.tsx, git-panel-change-list.tsx, git-panel-header.tsx, chat-thread-turns.ts, chat-thread-transcript.tsx, chat-thread-transcript-turn.tsx, workspace-shell-fullscreen.ts, workspace-shell-events.ts, workspace-shell-main-pane.tsx, workspace-shell-terminal-aside.tsx, workspace-shell-layout.tsx, workspace-shell-derived-state.ts, workspace-shell-layout-props.tsx, workspace-session-store-migrations.ts, and workspace-session-store-content.ts; targeted render-path getState() cleanup and runtime session-content mutations extracted from the main store workspace-shell still above target size; feature-root error boundaries; broader store topology cleanup; large renderer stores still open
7. UI package design system Partial shared tokens.css and fonts.css; shared Button; duplicate renderer primitives removed for migrated components more primitive promotion, icon wrapper policy, skeleton standardization, font loading moved fully into @pi-desktop/ui
8. Agent host tidy Partial agent-feed split into facade + live/transcript/types; CLI framing/protocol/snapshot/line-handler/pending-request/process-lifecycle/snapshot-refresh/spawn helpers extracted; SDK snapshot/thinking/provider/session-bootstrap/prompt-lifecycle/runtime-snapshot/runtime-settings helpers extracted; live state helper extracted session-server and utility-process unification, final runtime cleanup, remaining file-size reduction
9. Enforcement Partial lint:imports; import-boundary checker; shared electron leak removed; real bundle and startup budget scripts; tracked .githooks/pre-commit plus scripts/pre-commit.mjs stricter Biome unsafe-any rules, tighter budgets once bundle work lands, optional hook installation ergonomics

Current Largest Remaining Files

Measured after the latest landed slices:

File LOC
apps/desktop/src/main/index.ts 722
apps/desktop/src/main/git-worktree-service.ts 715
apps/desktop/src/renderer/src/app.css 702
apps/desktop/src/renderer/src/components/ui/code-block.tsx 566
apps/desktop/src/renderer/src/features/workspace/components/search-replace/search-replace-panel.tsx 525
apps/desktop/src/main/effect/layers.ts 517
apps/desktop/src/renderer/src/components/ui/terminal.tsx 499
apps/desktop/src/renderer/src/features/command-palette/command-palette.tsx 484
apps/desktop/src/main/auto-updater.ts 473
apps/desktop/src/renderer/src/features/workspace/use-app-shell-controller.ts 462
apps/desktop/src/renderer/src/features/workspace/components/git-panel-model.ts 456
apps/desktop/src/renderer/src/stores/workspace-session-store-migrations.ts 454
apps/desktop/src/renderer/src/features/workspace/use-file-tree.ts 443
apps/desktop/src/renderer/src/stores/window-store.ts 433
apps/desktop/src/main/packages/packages-catalog-client.ts 423
apps/desktop/src/renderer/src/components/ui/markdown.tsx 415
apps/desktop/src/renderer/src/features/workspace/components/file-tree-panel.tsx 413
apps/desktop/src/renderer/src/features/workspace/components/media-preview/media-preview.tsx 413
packages/shell-model/src/shell-model.ts 409
apps/desktop/src/renderer/src/stores/workspace-session-store.ts 405

0. Goals & Non-Goals

Goals

  1. Single source of truth per concept. Each domain concept (selection, session, runtime, catalog, context) lives in exactly one module with an unambiguous owner.
  2. Pure core, impure edges. Domain logic is synchronous, deterministic, and free of electron, fs, child_process, and node-pty. Side-effects live on the boundary and are injected.
  3. Typed, validated contracts at every boundary. IPC payloads, persisted JSON, and subprocess messages are parsed — not cast, not trusted.
  4. Small files, small functions, small modules. Break anything above the thresholds in §2.
  5. One Effect style, used consistently. Effect is already a dep; pick one pattern (Services + Layers or free functions), delete the other.
  6. One state story in the renderer. Zustand stores composed deliberately, not ad-hoc getState() calls from React.
  7. Testable without Electron. The bulk of main-process code should run under Vitest in Node with zero electron shims.
  8. Pristine workspace hygiene. No dist/ or out/ in git, no dead branches of code, no "legacy" alongside "new" forever.

Non-Goals

  • Rewriting the Pi agent SDK or CLI protocols.
  • Swapping frameworks (Electron, React, Zustand, Vite, Biome all stay).
  • Feature work. This refactor changes shape, not behavior. Behavioral changes are called out explicitly and gated.

1. Current State — Honest Assessment

1.1 Shape

apps/desktop/src/main         ~45 files, 2 layers deep
apps/desktop/src/preload      3 files (api.ts = 509 LOC)
apps/desktop/src/renderer     200+ files across features/, components/, lib/, stores/, hooks/, app-shell/
packages/shared               flat models/, ipc/, observability/ — clean
packages/agent-host           runtime/, mock/, pi/, session-server/, utility-process/, events/, state/ — clean
packages/shell-model          3 files
packages/ui                   6 components + lib + styles

Largest files (evidence):

File LOC Problem
apps/desktop/src/main/git-worktree-service.ts 1788 God service — parsing, process spawn, caching, status, diff, staging, commit, push/pull all in one class
apps/desktop/src/main/index.ts 1555 Bootstrap + controller + IPC wiring + lifecycle + app menu + uninstall + OAuth + switching + removal + shutdown
apps/desktop/src/renderer/.../left-sidebar.tsx 1075 UI, layout, menus, nav, drag/drop, virtualization all in one component
apps/desktop/src/renderer/.../git-panel.tsx 849 View + business logic + IPC orchestration
packages/agent-host/src/pi/pi-cli-rpc-agent-runtime.ts 848 RPC framing + process lifecycle + event normalization + state
apps/desktop/src/renderer/.../chat-thread-panel.tsx 696 Virtualized list + message rendering + scroll anchor + drag
apps/desktop/src/renderer/.../workspace-shell.tsx 607 Top-level shell composes everything directly
packages/shell-model/src/agent-feed.ts 501 Mixed reducer + derivation + event pump
apps/desktop/src/preload/api.ts 509 Hand-written channel bindings; easy to drift from IPC_CHANNELS

1.2 Concrete Smells

  1. Mutable captured closures as architecture. index.ts holds currentContext, currentTransport, currentHost, unsubscribe as module-scoped lets, then passes them to createContextSwitchController via getter/setter property descriptors. This is a reactive primitive, hand-rolled. It belongs in a typed state machine.
  2. Bootstrap is not a module. bootstrapDesktop() is 1000+ lines of inline closures. It violates every single-responsibility heuristic.
  3. IPC types are nominal strings. IPC_CHANNELS is const-typed, but the payloads are parsed in one direction (payload-parsers.ts) and hand-serialized in the other (preload/api.ts). No single source of truth for request/response schema.
  4. createSanitizingHandle + ipcMain.handle.bind: error sanitization is a wrapper that must be remembered at every registration. Easy to forget. Not enforced by the type system.
  5. as casts and any-shaped guards everywhere in IPC parsers. payload-parsers.ts is the symptom; the cure is parse-don't-validate with schemas.
  6. Effect usage is partial and inconsistent. Some handlers use Effect.tryPromise + Effect.catchAll; others use raw try/catch; others use runEffectVoid for what could be await. No Layers, no Context, no Effect.Service — so Effect is a fancy try/catch here. Either commit or remove.
  7. Renderer reaches into stores via getState() from React. See app.tsxuseSearchReplaceFiles closes over sessionStore.getState() inside a memo that doesn't subscribe. This is a stale-read bug waiting to happen. Every non-subscribed getState() in a component is a bug or latent bug.
  8. Catalog pattern duplicates effort. RepositoryCatalog, ThreadCatalog, WorkspaceSessionCatalog, AppPreferencesCatalog, RepositoryPreferencesCatalog, SelectionState all reimplement "load JSON, validate, update, persist, notify" with tiny variations. One abstraction + schemas can replace ~1500 LOC.
  9. index.ts imports from ../thread-title-defaults — a sibling to main/, not inside it. Why is that file not in main/threads/? Because no one drew the line.
  10. dist/ and out/ are present in working tree. They should never be committed; verify .gitignore and clean.
  11. Mixed concerns in packages/ui: only 6 components live there but duplicates exist in apps/desktop/src/renderer/src/components/ui/ (e.g. tooltip.tsx, textarea.tsx, prompt-input.tsx, chat-container.tsx). Two copies, eventual drift.
  12. @pi-desktop/ui has no tokens or theme. Styling lives in apps/desktop/src/renderer/src/app.css and ad-hoc className concatenation. No design tokens module.
  13. Tests spread across three locations (apps/desktop/**/*.spec.*, packages/**/*.spec.*, tests/integration/). The split between "unit" and "integration" is not enforced — tests/integration/apps-desktop/ contains files that look like unit tests (lru-map.spec.ts).
  14. Icon policy not followed. The project mandates Phosphor Icons, and apps/desktop depends on @phosphor-icons/react. But components/ui/icons.tsx exists and may re-wrap. Verify one icon surface.
  15. class-variance-authority, clsx, tailwind-merge live in apps/desktop deps but cn/variants live in packages/ui/src/lib/utils.ts. Desktop re-implements its own cn in lib/utils.ts. Pick one.
  16. node_modules per workspace is committed into the working tree (visible in ls). Not in git, but worth a clean bun install pass to verify workspace hoisting.
  17. Preload is manually split (index.tsruntime.tsapi.ts) to satisfy Electron's requirement, fine — but api.ts is 509 LOC of hand-written channel-to-method mapping. Should be generated from the IPC contract.
  18. No runtime validation of persisted JSON. Files loaded via PersistentJsonFile are trusted to match a TS type. Corruption, old versions, and hand-edits silently pass.
  19. IPC_CHANNELS and handlers don't cross-check. Nothing fails at compile time if a channel is declared and no handler is registered (or vice versa). ipc-handlers-coverage.spec.ts implies someone felt the pain and wrote a runtime test to patch it.
  20. No error taxonomy at the renderer boundary. PiError codes exist in main, but renderer receives sanitized Error objects and parses message strings.

2. Principles & Hard Limits

These are the rules the refactored code must obey. Deviations require a written justification in the PR.

2.1 File size & complexity

  • No file over 300 LOC (excluding imports/types-only files). The four >500 LOC files are the worst offenders and get split first.
  • No function over 40 LOC, no function over 3 levels of nesting. Extract helpers.
  • No React component over 200 LOC. Split into a presentational component + a hook.
  • No index.ts (barrel) over 40 re-exports and no deep barrel chains. Barrels are for package entry points only.

2.2 Type safety

  • Zero as casts outside schema parsers. Use type predicates or parsed schemas.
  • Zero any. Use unknown and narrow.
  • noUncheckedIndexedAccess already on in tsconfig.base.json — keep and honor it.
  • Every IPC payload has a schema declared once in @pi-desktop/shared and used on both sides.

2.3 Boundaries

  • Renderer must never import from electron, node:*, or any main-process module. Enforced by an ESLint/Biome rule (see §9).
  • Main must never import from react, zustand, or any renderer module.
  • @pi-desktop/shared must depend on nothing runtime-ish. No electron, no effect (or only effect/Schema), no fs.
  • @pi-desktop/agent-host depends on shared + effect + @mariozechner/pi-coding-agent only.
  • @pi-desktop/ui depends on react, @phosphor-icons/react, @radix-ui/*, clsx, tailwind-merge. Nothing else.

2.4 Effect

  • Commit to "Effect at the edge": every async operation that crosses a boundary (IPC handler, process spawn, fs read) is an Effect with typed errors. Business logic under it is plain functions.
  • Use Effect.Service for injectable capabilities: GitService, FsService, ClockService, TerminalService, CatalogService.
  • Use Layer composition in main/index.ts to build the runtime once.
  • No more runEffectVoid(Effect.try(...).pipe(Effect.catchAll(...))) on synchronous code. Either use Effect fully or use a plain function.

2.5 Testing

  • Pure logic: fast Vitest unit tests colocated next to the file.
  • Process/IPC wiring: integration tests under tests/integration/.
  • User flows: Playwright under tests/e2e/ (if/when added).
  • Every catalog, every schema, every reducer has tests.
  • Snapshot/golden tests only for stable output (e.g. markdown renderer), never for component trees.

2.6 Naming & layout

  • kebab-case filenames, matches existing convention. Keep.
  • One exported symbol per file when practical; export-shape: export function foo, export type Foo, no default exports except React components at page roots.
  • Feature folders over layer folders in the renderer. Already partially done under features/ — extend to the rest.
  • Colocation: tests, styles, and hooks sit next to the thing they belong to.

3. Target Architecture

pi-desktop/
├── apps/
│   └── desktop/
│       ├── src/
│       │   ├── main/
│       │   │   ├── bootstrap/         # top-level assembly; <150 LOC entry
│       │   │   ├── ipc/               # contract-driven handler registration
│       │   │   ├── services/          # Effect.Service impls: Git, Fs, Terminal, AgentRuntime
│       │   │   ├── catalogs/          # one module per catalog, one shared base
│       │   │   ├── session/           # ContextSwitchController as a typed state machine
│       │   │   ├── windows/           # window creation, menus, tray, updater
│       │   │   └── index.ts           # <80 LOC — wires layers, starts app
│       │   ├── preload/
│       │   │   └── index.ts           # generated from shared IPC contract
│       │   └── renderer/
│       │       └── src/
│       │           ├── app/           # App.tsx, providers, error boundary
│       │           ├── features/      # feature-sliced: workspace, chat, git, terminal, settings, palette
│       │           ├── state/         # zustand stores + selectors; one barrel per feature
│       │           ├── ipc/           # typed renderer-side IPC client
│       │           └── shared/        # cross-feature primitives (a11y, keyboard, theme, perf, undo)
├── packages/
│   ├── contracts/     # NEW: IPC schemas (Effect Schema or Zod), types, channels; zero deps
│   ├── shared/        # TRIMMED: domain models only (no IPC)
│   ├── agent-host/    # as today, cleaner runtime split
│   ├── shell-model/   # as today, split agent-feed
│   └── ui/            # real design system + tokens, single source
└── tests/
    ├── unit/          # (if not colocated)
    ├── integration/
    └── e2e/           # Playwright (future)

New package: @pi-desktop/contracts. This is the bit the codebase is missing: a single, dependency-free package that declares channels and their request/response schemas. Preload, main handlers, and renderer client all generate from it.


4. Phased Plan

Each phase ends with: bun run lint && bun run typecheck && bun run test && bun run build green, app boots, DMG still builds.

Phase 1 — Repo hygiene (low risk, high signal)

  1. Remove build artifacts from working tree. Verify .gitignore covers dist/, out/, node_modules/, coverage/, playwright-report/, test-results/, .todos/, .pi/, .opencode/. Delete stray directories.
  2. Single cn utility. Keep packages/ui/src/lib/utils.ts. Delete apps/desktop/src/renderer/src/lib/utils.ts. Update imports.
  3. Icon policy enforcement. Audit components/ui/icons.tsx. If it only re-exports Phosphor, delete it and import directly. If it adds value, rename to phosphor-icons.tsx and document.
  4. Delete UI duplicates. components/ui/{tooltip,textarea,prompt-input,chat-container}.tsx in the desktop app re-implement @pi-desktop/ui exports. Delete the desktop copies, import from the package.
  5. Consolidate test locations. Move unit tests to live next to their source (the pattern already exists for most files). Reserve tests/integration/ for tests that require process spawning, electron stubs, or cross-package wiring. Move misplaced files (lru-map.spec.ts, etc.) to their proper homes.
  6. README.md truth pass. The current README advertises a Zustand + Tailwind + Radix stack accurately; fix the monorepo tree to match reality (the README's tree is already close). Replace the handwaved "Catalog Pattern" paragraph with a link to this document.
  7. Biome config audit. noStaticElementInteractions is off — keep only if there's a real reason; otherwise enable. noArrayIndexKey off — enable per-file with biome-ignore where genuinely needed (rare).

Exit criteria: green quality suite; working tree has no build outputs; no duplicate UI components.


Phase 2 — Contracts package and schema-driven IPC

This is the structural keystone. Do this before anything else structural.

  1. Create packages/contracts/. Zero runtime deps. Uses effect/Schema (already transitively available via effect) to declare:
    • Every channel name.
    • Every request schema.
    • Every response schema.
    • Every event (push) schema.
  2. Shape:
    // packages/contracts/src/agent.ts
    import { Schema } from "effect";
    
    export const PromptRequest = Schema.Struct({ text: Schema.String });
    export type PromptRequest = Schema.Schema.Type<typeof PromptRequest>;
    
    export const agentChannels = {
      prompt: {
        channel: "agent:prompt",
        request: PromptRequest,
        response: Schema.Void,
      },
    } as const;
  3. Generate preload API from contracts. Replace the 509-LOC hand-written preload/api.ts with a single factory that walks the contract table and emits typed invoke(channel, payload) methods. Event channels emit typed subscribe(listener).
  4. Main-side router generator. registerIpcHandlers({ agent, git, ... }) becomes a typed dispatcher that, given a contract table and a handler map, registers every channel and validates payloads via Schema.decodeUnknown. A missing handler = type error. An extra handler = type error.
  5. Renderer IPC client. Replace scattered window.piDesktop.xxx call sites with a single typed client (also generated) under renderer/src/ipc/. This enforces the same schemas on the way in.
  6. Delete ipc/payload-parsers.ts. Parsers are redundant once schemas exist.
  7. Migrate in slices. Start with agent:*, then repositories:*, then state:*, then git:*, then terminal:*, then packages:*, then the rest. One PR per domain.

Exit criteria: ipc-handlers-coverage.spec.ts becomes a compile-time guarantee and is deleted.


Phase 3 — Catalog base and schema persistence

  1. Create packages/shared/src/catalog/ with:
    • PersistentCatalog<T>: one class, not many. Takes a schema, a file path, a reducer set, and exposes list(), get(), upsert(), remove(), subscribe().
    • Uses Schema.decodeUnknown on load. On decode failure: log, back up corrupted file, reinitialize with default.
    • Versioned: every schema carries a version: N field; migrations are declared as { from: 1, to: 2, migrate(x) }.
  2. Rewrite each catalog as a thin wrapper:
    • RepositoryCatalog, ThreadCatalog, WorkspaceSessionCatalog, AppPreferencesCatalog, RepositoryPreferencesCatalog, SelectionState all reduce to ~50 LOC each.
  3. Rename & relocate: move all catalogs under apps/desktop/src/main/catalogs/ (one file each). Move persistent-json-file.ts under the same folder as the internal impl.
  4. Concurrency: the new base uses a single in-memory mutator queue per file. Atomic replace via temp file + rename (already the pattern; verify).

Exit criteria: every catalog has a schema, round-trips under unit tests, handles corruption gracefully.


Phase 4 — Bootstrap decomposition

Target: apps/desktop/src/main/index.ts goes from 1555 LOC to under 80.

  1. Extract a SessionStore — a typed state machine replacing the currentContext/currentHost/currentTransport/unsubscribe closure cluster. States: NoWorkspace, WorkspaceWithoutRepo, RepoNoThread, ActiveThread. Transitions return new states; no getters/setters.
  2. Extract ContextSwitchCoordinator — today's createContextSwitchController is already here; upgrade it to consume and produce SessionStore states rather than mutating properties.
  3. Extract AppMenuBuilder (macOS menu + Uninstall action) into windows/app-menu.ts.
  4. Extract OAuthPromptBridge from the executeJavaScript("window.prompt") block — that pattern is ugly and blocks the renderer. Replace with a proper in-app dialog via IPC_CHANNELS.dialog.
  5. Extract DesktopBootstrap.run() that:
    • builds services (Effect Layers),
    • builds catalogs,
    • builds SessionStore,
    • restores prior workspace (resolveInitialWorkspaceTarget),
    • creates main window,
    • registers IPC handlers,
    • wires shutdown.
  6. index.ts becomes:
    import { DesktopBootstrap } from "./bootstrap/desktop-bootstrap";
    import { app } from "electron";
    
    DesktopBootstrap.run().catch((err) => {
      console.error("Fatal:", err);
      app.quit();
    });

Exit criteria: index.ts ≤ 80 LOC; every former module-level let is inside a typed state object; all integration tests pass unmodified.


Phase 5 — Git service split

Target: git-worktree-service.ts (1788 LOC) broken into 5–7 focused modules.

  1. git/porcelain.ts — pure wrappers over git child processes with typed parse functions. No caching, no state.
  2. git/status.ts — repo status, worktree inspection, inspect() / inspectAsync().
  3. git/worktrees.tscreateWorktree, removeWorktree, listWorktrees.
  4. git/diff.ts — diffs (file, staged, range).
  5. git/staging.ts — stage/unstage/discard/commit.
  6. git/remote.ts — fetch/pull/push.
  7. git/index.ts — composes into a GitService Effect.Service consumed by handlers.

Each module < 300 LOC. Each has its own tests. No shared mutable cache — if caching is needed, it goes behind an explicit GitStatusCache with a documented invalidation policy.


Phase 6 — Renderer decomposition

  1. Split left-sidebar.tsx (1075) into:
    • left-sidebar/ folder
    • left-sidebar.tsx (layout shell, ≤150 LOC)
    • repository-list.tsx, worktree-list.tsx, thread-list.tsx, sidebar-header.tsx, sidebar-footer.tsx
    • One use-left-sidebar.ts hook composing use-left-sidebar-layout + use-left-sidebar-menus + data.
  2. Split git-panel.tsx (849) into git-panel.tsx (view), use-git-panel.ts (state), git-panel-model.ts (pure reducers — already exists, absorb more logic into it).
  3. Split chat-thread-panel.tsx (696) into chat-thread-panel.tsx (container), chat-thread-list.tsx (virtualized list), use-chat-thread.ts (scroll + data).
  4. Split workspace-shell.tsx (607) into a layout primitive + a controller hook. The useAppShellController pattern already exists; extend it.
  5. Kill getState()-in-render. Every such call (app.tsx, hooks, feature code) either:
    • uses useStore(selector) with a proper selector, or
    • moves to an event handler / effect where imperative reads are legitimate.
  6. Store topology. One store per bounded context: sessionStore, workspaceStore, uiStore, notificationsStore, snapshotsStore. Cross-store reads go through selectors, not direct imports.
  7. Error boundaries at feature roots: Chat, Git, Terminal, FileTree, Settings — each wrapped with a named boundary so a crash in one panel doesn't nuke the workspace.

Phase 7 — UI package becomes the design system

  1. Move tokens (--color-bg-primary, animation easings from app.tsx's inline style, font sizes, spacing) into packages/ui/src/styles/tokens.css.
  2. Promote primitives to @pi-desktop/ui: Button, Dialog, Popover, Tooltip, Select, Separator, ScrollArea, Collapsible, Avatar, Tabs, Textarea, Card (only if actually used and not nested — see §7.3).
  3. Font integration via Fonttrio. Per house rules, select a distinctive trio for the UI surface; wire via @fontsource-* imports in @pi-desktop/ui/styles/fonts.css.
  4. Icons: single Icon component in @pi-desktop/ui that wraps Phosphor with default size/weight; everything else imports from @phosphor-icons/react directly.
  5. Skeletons: adopt boneyard-js (already a root dep) for every loading state. No ad-hoc pulsing divs.

7.1 Cards

Per house rules: avoid cards in designs; never nest them. Audit components/ui/card.tsx usages. Replace most with plain section + spacing.


Phase 8 — Agent host tidy

  1. Split pi-cli-rpc-agent-runtime.ts (848) into:
    • rpc-framing.ts (newline-delimited JSON framing + parsers)
    • rpc-transport.ts (process spawn + stdin/stdout wiring)
    • runtime.ts (lifecycle + subscribe)
    • event-normalizer.ts (re-use existing events/normalize-agent-session-event.ts)
  2. Split pi-sdk-agent-runtime.ts (484) into runtime.ts + provider-registry.ts (a test already exists for the split).
  3. Unify session-server + utility-process: both wrap the same concept (a runner for a Pi agent); document which is used when, extract the common bits.
  4. Split shell-model/agent-feed.ts (501) into reducer.ts (pure) + stream.ts (subscription).

Phase 9 — Enforcement

Fence the walls we just built so they don't erode.

  1. Import boundaries via a Biome linter rule or a small custom script:
    • apps/desktop/src/renderer/** cannot import from electron, node:*, @pi-desktop/agent-host, or apps/desktop/src/main/**.
    • apps/desktop/src/main/** cannot import from react, zustand, or apps/desktop/src/renderer/**.
    • packages/shared/** cannot import from electron or node:*.
  2. Bundle size budget in scripts/perf-bundle.mjs: fail CI if the renderer bundle grows > 5% without justification.
  3. Startup budget via scripts/perf-startup.mjs: tracked regression test.
  4. noExplicitAny and noUnsafeAssignment in Biome strict mode; CI blocks on violations.
  5. Pre-commit hook runs biome check --fix and bun run typecheck on staged workspaces.

5. Data & Type Contracts

5.1 Persisted data versioning

Every persisted JSON file embeds { schemaVersion: number, data: T }. Load path:

read → decode envelope → if version < current: run migrations → decode T → return

On fail: write *.corrupt-<timestamp>.json beside it, return default, log warning.

5.2 IPC error taxonomy

PiError codes (main) map to a public renderer error shape:

type IpcErrorResponse = {
  code: PiErrorCode;
  message: string;   // safe to show
  details?: Record<string, string>;
};

sanitize-ipc-error.ts becomes toIpcError(e: unknown): IpcErrorResponse, automatically invoked by the router. Renderer wraps every invoke in a Result<T, IpcErrorResponse> helper.


6. Tests

6.1 Organization

  • apps/desktop/src/**/*.spec.ts(x) — unit, colocated.
  • packages/*/src/**/*.spec.ts — unit, colocated.
  • tests/integration/** — cross-package, process, or electron-in-node tests.
  • tests/e2e/** — reserved for Playwright (future phase).

6.2 Coverage targets

  • Catalogs: 100% branch. Corruption + migration round-trips must be tested.
  • IPC schemas: every request/response schema has a round-trip test using Schema.decode(Schema.encode(x)).
  • SessionStore state machine: every transition has a test.
  • Reducers (git-panel-model, agent-feed, shell-model): 100% branch.
  • React components: test observable behavior only (props → rendered output + events). No store-internals assertions.

6.3 Speed

Vitest workspace config is fine. Add --passWithNoTests=false (default). Keep jsdom only for files that need it — main-process tests run in node.


7. Style & Design System

7.1 Typography

  • Use Fonttrio to choose a distinctive pair/trio. Candidate direction: a warm sans for UI, a geometric mono for terminal/code, an editorial serif reserved for empty states and onboarding only (sparingly).
  • Wire via @pi-desktop/ui/styles/fonts.css, imported exactly once from the renderer entry.

7.2 Tokens

packages/ui/src/styles/tokens.css holds CSS custom properties for color, spacing (4/8/12/16/24/32/48), radii, shadows (use sparingly), easings, durations. Renderer consumes these; no component hardcodes a hex or a px.

7.3 Layout rules

  • No cards. No nested cards.
  • Hierarchy via typography and spacing, not via boxes.
  • Shadows used only for overlays (dialogs, popovers).
  • Motion: respect prefers-reduced-motion; keep motion-safe: wrappers (already in use).

7.4 Icons

Phosphor only. Default: weight="light", size="1.25rem". Override per context.

7.5 Skeletons

boneyard-js for every async region.


8. Observability

  1. Structured logging in main. Replace console.error("[ipc] handler error", ...) with a tiny log.ts that emits { level, scope, message, fields } and is redacted via packages/shared/src/observability/redaction.ts (already present).
  2. Perf timers (perf-timer.ts exists) wrap every IPC handler and expose p50/p95 via a dev overlay (lib/perf/perf-overlay.tsx exists — surface it via Ctrl+Shift+P).
  3. Crash reporting hook (noop in dev, real in prod) sits at the Electron app level.

9. Workstream order (concrete)

The phases above are ordered to minimize merge conflicts and keep the app bootable at every step.

# Phase PR count (est.) Risk Unlocks
1 Repo hygiene 3–5 Low Everything else
2 Contracts & schema IPC 6–8 Medium Safe bulk refactors later
3 Catalog base 3–4 Low Bootstrap decomposition
4 Bootstrap decomposition 4–6 Medium Session state machine stable
5 Git service split 3 Low Faster test iteration
6 Renderer decomposition 8–12 Medium All feature work gets easier
7 UI package as design system 4–5 Low Visual consistency
8 Agent host tidy 2–3 Low Dependable runtime
9 Enforcement 2 Low Prevents regression

Each PR: green lint, typecheck, test, build. App boots. DMG builds unchanged.


10. Explicit behavioral changes (gated)

Most of this plan preserves behavior. The exceptions:

  1. OAuth prompt stops using webContents.executeJavaScript("window.prompt(...)"). It becomes a proper in-app dialog. Gated behind a feature flag PI_DESKTOP_NEW_OAUTH_DIALOG during rollout.
  2. Persisted JSON auto-backup on corrupt load: new behavior, user-visible only if a file was actually corrupt. Safe by construction.
  3. IPC errors: renderer may now receive structured {code, message} where it previously got a string. Update error display accordingly; old strings remain the user-visible message.

11. What we are explicitly not doing

  • Not introducing a second state library in the renderer.
  • Not adding a router (no URLs).
  • Not moving off Electron Vite or Bun.
  • Not changing the Pi agent wire protocol.
  • Not introducing a monorepo tool beyond bun workspaces. No turbo, no nx.
  • Not adding a fourth process boundary.
  • Not adopting CSS-in-JS. Tailwind v4 + tokens stays.
  • Not rewriting tests we'll delete; we'll delete them and write new, smaller ones.

12. Done-definition

The refactor is "done" when:

  1. No file in the repo is over 300 LOC of code (type-only files excluded).
  2. apps/desktop/src/main/index.ts is under 80 LOC.
  3. There is exactly one place that declares each IPC channel, schema, and payload.
  4. packages/ui is the only source of shared UI primitives.
  5. bun run lint && bun run typecheck && bun run test && bun run build are all green on a fresh clone with no warnings.
  6. A new contributor can locate the implementation of any user-visible action by reading file names alone.
  7. Removing a feature touches exactly one directory.

This plan is intentionally concrete. Anything vague in it is a bug in the plan; raise it before implementing.


13. Review of Staged Changes (2026-04-24)

This section is an independent review of the work currently staged on styling/fix-contrast-and-typography. It audits the prior agent's claims in §0–§12 against what is actually on disk.

13.1 Verification — what passed

All quality gates were re-run on the working tree:

Command Result
bun run lint green — 752 files checked, no fixes needed
bun run lint:imports green — import boundary check passed
bun run typecheck green — all 6 workspaces pass
bun run test green — 244 test files / 1767 passing / 2 skipped
bun run build green — main, preload, renderer all build
node scripts/perf-bundle.mjs green — main 21.74/24 MiB, renderer 35.52/40 MiB
node scripts/perf-startup.mjs green on the latest re-run — main 735.3/1500 KiB, preload 12.4/192 KiB

The verification claims in §"Progress Snapshot" hold up after the latest re-run. The test count in that table is stale — actuals were higher in this audit — but the current snapshot above now reflects the regenerated branch state.

13.2 LOC table is stale

The "Current Largest Remaining Files" table (lines 33–57) does not match the working tree. Spot check:

File Doc claim Actual Delta
apps/desktop/src/main/index.ts 928 722 -206
apps/desktop/src/main/git-worktree-service.ts 831 715 -116
apps/desktop/src/renderer/src/stores/window-store.ts 702 433 -269
apps/desktop/src/renderer/src/stores/workspace-session-store.ts 664 405 -259
apps/desktop/src/preload/api.ts 450 396 -54
apps/desktop/src/main/shell-catalog-builder.ts 463 not in top 30

The renderer stores have been decomposed substantially since the table was written (see new window-store-create.ts, window-store-reducer.ts, workspace-session-store-{cleanup,content,migrations,persistence,session-operations,window-actions}.ts). The doc should be regenerated — it currently understates how much progress has shipped.

13.3 Phase-by-phase audit

Phase 1 — Repo hygiene

  • apps/desktop/src/renderer/src/lib/utils.ts deleted; single cn lives in @pi-desktop/ui.
  • ✅ Duplicate UI primitives removed: tooltip.tsx, textarea.tsx, prompt-input.tsx, chat-container.tsx all gone from desktop.
  • lru-map.spec.ts moved next to its source.
  • ⚠️ Icon policy NOT enforced. apps/desktop/src/renderer/src/components/ui/icons.tsx still exists. House rules and §2.3/§7.4 require Phosphor as the single icon surface; this file should be audited and either deleted or renamed to phosphor-icons.tsx with documentation.
  • ⚠️ Biome rule audit not done.

Phase 2 — Contracts package and schema-driven IPC

  • packages/contracts/ exists and is wired into the workspace.
  • 🔴 The keystone of Phase 2 is missing: there are no Effect Schemas. packages/contracts/src/index.ts only exports NoPayloadIpcContract<TResponse> markers — bare { channel: string } objects with a phantom type parameter. There is zero runtime validation. grep -rn "Schema\." packages/contracts returns nothing.
  • 🔴 apps/desktop/src/main/ipc/payload-parsers.ts is still present. §4.2 step 6 says "Delete ipc/payload-parsers.ts. Parsers are redundant once schemas exist." Schemas don't exist, so parsers stayed. The package as it stands is decoration, not the keystone described in §3 ("the bit the codebase is missing").
  • ⚠️ Coverage is also tiny: only shell.getSnapshot, agent.getProviders/getSettings/getSnapshot are seeded. Five no-payload reads out of dozens of channels.
  • Recommendation: before declaring Phase 2 partial, decide whether to commit (use effect/Schema for real, migrate one domain end-to-end) or revert. The current state is the worst of both worlds: a new package with no value plus the original parsers.

Phase 3 — Catalog base

  • packages/shared/src/catalog/document-catalog.ts added, tests in tests/integration/shared/document-catalog.spec.ts.
  • 🔴 No Schema-based versioning yet. §3 requires { schemaVersion, data } envelopes, decode-on-load, corrupt-file backup. DocumentCatalog does not yet implement any of this — it is structural plumbing only.
  • ⚠️ Catalogs are not yet relocated under apps/desktop/src/main/catalogs/.
  • workspace-session-window-sanitizer.ts extracted as claimed.

Phase 4 — Bootstrap decomposition

  • ✅ Substantial extraction: 14 new bootstrap/*.ts modules with colocated specs, all real and well-tested.
  • ⚠️ index.ts is at 722 LOC, not the §4 target of ≤80. Real progress (down from 1555), but the doc's "exit criteria" is far from met. Section 9 still calls this PR count 4–6, which is plausible — but the file has not yet hit the target.
  • ⚠️ currentContext / currentTransport / currentHost mutable closures: I sampled index.ts:79 and a let mainWindow: BrowserWindow | null = null; is still present at module scope. The typed SessionStore state machine (§4 step 1) has not landed.
  • ⚠️ OAuth still uses the executeJavaScript("window.prompt") bridge (bootstrap/oauth-prompt-bridge.ts exists as the extraction point, but the dialog-based flow is the gated behavior change in §10 and is not in this branch).

Phase 5 — Git service split

  • ✅ Genuine, large progress. 16 new git/*.ts modules with colocated specs. The split is the cleanest piece of work in the branch.
  • ⚠️ git-worktree-service.ts is still 715 LOC. The God service is shrinking but not yet broken into the 5–7 modules of §4.5 (status, worktrees, diff, staging, remote). It currently delegates rather than disappears.

Phase 6 — Renderer decomposition

  • ✅ Many new modules: left-sidebar-*, git-panel-*, chat-thread-*, workspace-shell-*, workspace-session-store-*. All have colocated tests.
  • ⚠️ left-sidebar-workspace-tree.tsx is 542 LOC, exceeding the 200-LOC component limit in §2.1. One of the new files immediately violates the new rule.
  • ⚠️ workspace-shell.tsx reduction not yet at target (search-replace-panel.tsx is 525 LOC, command-palette 484 LOC, terminal.tsx 499 LOC, code-block.tsx 566 LOC — all unchanged, all over the limit).
  • ⚠️ §4.6.5 ("Kill getState()-in-render"): app.tsx:50, 60, 63, 84 still call sessionStore.getState() from inside React.useMemo callbacks. They are invoked at event time via getActiveSession/applyRestoredSession getters, which is the legitimate pattern. So this is closer to "OK in practice but ugly", not a stale-read bug. Worth a follow-up to lift these into the store with selectors.

Phase 7 — UI package design system

  • packages/ui/src/styles/{tokens.css,fonts.css} added; shared Button promoted.
  • ⚠️ Tokens file exists but renderer-side usage and ad-hoc hex/px elimination is not audited.
  • ⚠️ Skeletons (boneyard-js) standardization not visible in diff.

Phase 8 — Agent host tidy

  • ✅ Massive split done well: agent-feed decomposed into agent-feed-{live-state,live,transcript,types}.ts; pi-cli-rpc-* and pi-sdk-* each split into 6–8 focused modules with specs. This is clean work.
  • ⚠️ Session-server / utility-process unification (§4.8 step 3) not addressed.

Phase 9 — Enforcement

  • scripts/check-import-boundaries.mjs exists, runs clean, is wired as lint:imports.

  • .githooks/pre-commit + scripts/pre-commit.mjs tracked.

  • ✅ Real bundle and startup budget scripts (verified above).

  • ⚠️ "noExplicitAny / noUnsafeAssignment in Biome strict mode" (§4.9 step 4) not enforced. 38 as TypeName casts remain in apps/desktop/src/main, including legitimate-but-still-cast spots like:

    • terminal-manager.ts:138this.loadModule("node-pty") as PtyModule
    • agent-host-socket-transport.ts:33JSON.parse(message) as AgentHostEnvelope
    • packages/packages-catalog-client.ts:289,312await response.json() as ...
    • agent-host-client.ts:57,95,114

    Per AGENTS.md ("Never typecast. Never use as.") and §2.2 ("Zero as casts outside schema parsers"), these are policy violations the contracts package was supposed to obviate. Until Phase 2's schemas land for real, these casts cannot be removed without losing safety.

13.4 Cross-cutting findings

  1. The doc is being used as a live execution log, but the LOC table and test counts are not regenerated when slices land. Either commit to refreshing them at the end of every slice, or move the table to a generated artifact (scripts/refactor-snapshot.mjs).
  2. Phase 2 is the highest-leverage incomplete work. Without Schema-validated contracts, every later phase is doing the right structural moves but cannot drop the as/parser layer it was meant to replace.
  3. The new code introduces violations of the rules it is supposed to enforce. left-sidebar-workspace-tree.tsx (542) is a fresh file that lands above the 200-LOC component cap from §2.1. The cap should be enforced by a script in scripts/check-import-boundaries.mjs or a sibling.
  4. Test coverage growth is real and proportional to the extractions. Almost every new module has a colocated .spec.ts(x). This is the strongest signal in the branch.
  5. No catalogs have been migrated to versioned envelopes yet. That is the riskiest behavioral change in the plan; it is appropriate that it has not been rushed, but it should be flagged as the next-up gate before more catalog refactors land.
  6. renamed: tests/integration/apps-desktop/lru-map.spec.ts → apps/desktop/src/main/lru-map.spec.ts — good. There are still other integration-named files that look like unit tests (e.g. tests/integration/apps-desktop/bootstrap-helpers.spec.ts). §4.1 step 5 not finished.
  7. Behavior preservation: the staged changes do not appear to change user-visible behavior. The styling-branch name suggests UX polish, but the diff is overwhelmingly structural. If contrast/typography fixes were intended for this branch, they are buried under the refactor and should be called out separately, or split into a sibling PR for reviewability.

13.5 Recommended next slices (in order)

  1. Refresh §"Progress Snapshot" and "Current Largest Remaining Files" with current numbers, then automate it.
  2. Decide Phase 2. Either:
    • (a) port a single domain (agent:*) end-to-end onto effect/Schema-validated contracts and delete its parser, proving the keystone works; or
    • (b) revert packages/contracts/ and acknowledge the phase as not-yet-started.
  3. Add a LOC ceiling check to scripts/check-import-boundaries.mjs (or a new scripts/check-file-sizes.mjs) so the rules in §2.1 cannot regress silently.
  4. Cut left-sidebar-workspace-tree.tsx (542) — the freshest violation — into the component + hook split prescribed in §4.6.1.
  5. Migrate one catalog to versioned envelopes (suggest AppPreferencesCatalog — smallest blast radius) to validate the §3 design before touching WorkspaceSessionCatalog.

13.6 Verdict

The branch is honest, additive, and reversible. The structural extractions in Phases 4, 5, 6, and 8 are the strongest work — large files becoming many tested smaller ones. The two real gaps are:

  • Phase 2's keystone (Schemas) is missing, so the type-safety promises of the plan are not yet delivered.
  • The plan document itself drifts — its tables understate progress and its rules already have one new-file violator.

Neither is a blocker for landing. Both should be addressed before the next refactor PR claims more "Partial" status against the same phase numbers.


Review by reviewing-agent, 2026-04-24. No code was changed; this is a read-only audit.