A first-principles refactor plan. Pristine code, strong boundaries, explicit contracts, and one honest way to do each thing.
This document is written after a full pass over the repository. It is opinionated on purpose. Every item is justified from first principles — not "we should do X because X is popular", but "the code has property P, P causes harm H, doing X removes H without adding cost C."
The plan is phased so each phase is independently mergeable, independently verifiable, and never leaves main broken.
Status as of 2026-04-25, based on the work landed on this branch:
- Repo-level verification is green:
bun run lint,bun run lint:imports,bun run typecheck,bun run test,bun run build,node scripts/perf-bundle.mjs, andnode scripts/perf-startup.mjsall pass. - Test suite reach: 272 test files (counted via spec-file count (fallback); run
bun run testfor the authoritative pass/skip breakdown). - These numbers are regenerated by
bun run refactor:snapshot -- --update-refactor-md; do not hand-edit.
| Phase | Status | Landed So Far | Still Open |
|---|---|---|---|
| 1. Repo hygiene | Partial | Single cn; duplicate desktop tooltip / textarea / prompt-input / chat-container removed; README truth pass; lru-map.spec.ts moved next to source |
icon surface policy, full test-location cleanup, Biome rule audit |
| 2. Contracts + schema IPC | Partial | packages/contracts/ added; minimal contract-driven snapshot channels; preload/router contract seed; import-boundary test coverage |
payload-bearing channels, generated renderer IPC client, payload-parsers.ts deletion, compile-time full handler coverage |
| 3. Catalog base | Partial | DocumentCatalog added under packages/shared/src/catalog; app-preferences, repository-preferences, selection-state, repository, and thread catalogs migrated internally; workspace-session-window-sanitizer.ts extracted from workspace-session-catalog.ts |
schema/versioned persistence, corruption backup flow in shared base, full workspace-session migration onto shared substrate, relocation under main/catalogs/ |
| 4. Bootstrap decomposition | Partial | application-menu.ts, main-window-lifecycle.ts, main-window.ts, agent-host-connection.ts, oauth-prompt-bridge.ts, active-thread-deletion.ts, workspace-selection-actions.ts, workspace-activation-router.ts, thread-workspace-actions.ts, thread-context-actions.ts, workspace-removal-actions.ts, shell-state-ipc.ts, initial-workspace-activation.ts, and app-lifecycle.ts extracted from main/index.ts |
SessionStore, DesktopBootstrap, proper dialog-based OAuth flow, IPC registration extraction, final index.ts reduction to target size |
| 5. Git service split | Partial | git/path-utils.ts, status-parsers.ts, diff-parsers.ts, inspection.ts, inspection-outcomes.ts, repository-meta.ts, git-command-runner.ts, git-command-status.ts, repository-status.ts, repository-status-loader.ts, worktree-git-summary.ts, worktree-inspection.ts, cache-invalidation.ts, file-diff.ts, and worktree-creation.ts extracted; upstream/default-branch/worktree-summary/status assembly, repository-status loading, cache invalidation, diff assembly, worktree-creation planning, checked-command status refresh, and worktree summary inspection now delegated out; GitWorktreeService API preserved |
major module split into status / worktrees / diff / staging / remote, explicit cache module cleanup, remaining worktree helpers/effect wrapper reduction |
| 6. Renderer decomposition | Partial | left-sidebar-workspaces-panel.tsx, left-sidebar-workspace-tree.tsx, left-sidebar-menus.tsx, left-sidebar-chrome.tsx, git-panel-commit-composer.tsx, git-panel-change-list.tsx, git-panel-header.tsx, chat-thread-turns.ts, chat-thread-transcript.tsx, chat-thread-transcript-turn.tsx, workspace-shell-fullscreen.ts, workspace-shell-events.ts, workspace-shell-main-pane.tsx, workspace-shell-terminal-aside.tsx, workspace-shell-layout.tsx, workspace-shell-derived-state.ts, workspace-shell-layout-props.tsx, workspace-session-store-migrations.ts, and workspace-session-store-content.ts; targeted render-path getState() cleanup and runtime session-content mutations extracted from the main store |
workspace-shell still above target size; feature-root error boundaries; broader store topology cleanup; large renderer stores still open |
| 7. UI package design system | Partial | shared tokens.css and fonts.css; shared Button; duplicate renderer primitives removed for migrated components |
more primitive promotion, icon wrapper policy, skeleton standardization, font loading moved fully into @pi-desktop/ui |
| 8. Agent host tidy | Partial | agent-feed split into facade + live/transcript/types; CLI framing/protocol/snapshot/line-handler/pending-request/process-lifecycle/snapshot-refresh/spawn helpers extracted; SDK snapshot/thinking/provider/session-bootstrap/prompt-lifecycle/runtime-snapshot/runtime-settings helpers extracted; live state helper extracted |
session-server and utility-process unification, final runtime cleanup, remaining file-size reduction |
| 9. Enforcement | Partial | lint:imports; import-boundary checker; shared electron leak removed; real bundle and startup budget scripts; tracked .githooks/pre-commit plus scripts/pre-commit.mjs |
stricter Biome unsafe-any rules, tighter budgets once bundle work lands, optional hook installation ergonomics |
Measured after the latest landed slices:
| File | LOC |
|---|---|
apps/desktop/src/main/index.ts |
722 |
apps/desktop/src/main/git-worktree-service.ts |
715 |
apps/desktop/src/renderer/src/app.css |
702 |
apps/desktop/src/renderer/src/components/ui/code-block.tsx |
566 |
apps/desktop/src/renderer/src/features/workspace/components/search-replace/search-replace-panel.tsx |
525 |
apps/desktop/src/main/effect/layers.ts |
517 |
apps/desktop/src/renderer/src/components/ui/terminal.tsx |
499 |
apps/desktop/src/renderer/src/features/command-palette/command-palette.tsx |
484 |
apps/desktop/src/main/auto-updater.ts |
473 |
apps/desktop/src/renderer/src/features/workspace/use-app-shell-controller.ts |
462 |
apps/desktop/src/renderer/src/features/workspace/components/git-panel-model.ts |
456 |
apps/desktop/src/renderer/src/stores/workspace-session-store-migrations.ts |
454 |
apps/desktop/src/renderer/src/features/workspace/use-file-tree.ts |
443 |
apps/desktop/src/renderer/src/stores/window-store.ts |
433 |
apps/desktop/src/main/packages/packages-catalog-client.ts |
423 |
apps/desktop/src/renderer/src/components/ui/markdown.tsx |
415 |
apps/desktop/src/renderer/src/features/workspace/components/file-tree-panel.tsx |
413 |
apps/desktop/src/renderer/src/features/workspace/components/media-preview/media-preview.tsx |
413 |
packages/shell-model/src/shell-model.ts |
409 |
apps/desktop/src/renderer/src/stores/workspace-session-store.ts |
405 |
- Single source of truth per concept. Each domain concept (selection, session, runtime, catalog, context) lives in exactly one module with an unambiguous owner.
- Pure core, impure edges. Domain logic is synchronous, deterministic, and free of
electron,fs,child_process, andnode-pty. Side-effects live on the boundary and are injected. - Typed, validated contracts at every boundary. IPC payloads, persisted JSON, and subprocess messages are parsed — not cast, not trusted.
- Small files, small functions, small modules. Break anything above the thresholds in §2.
- One Effect style, used consistently. Effect is already a dep; pick one pattern (Services + Layers or free functions), delete the other.
- One state story in the renderer. Zustand stores composed deliberately, not ad-hoc
getState()calls from React. - Testable without Electron. The bulk of main-process code should run under Vitest in Node with zero electron shims.
- Pristine workspace hygiene. No
dist/orout/in git, no dead branches of code, no "legacy" alongside "new" forever.
- Rewriting the Pi agent SDK or CLI protocols.
- Swapping frameworks (Electron, React, Zustand, Vite, Biome all stay).
- Feature work. This refactor changes shape, not behavior. Behavioral changes are called out explicitly and gated.
apps/desktop/src/main ~45 files, 2 layers deep
apps/desktop/src/preload 3 files (api.ts = 509 LOC)
apps/desktop/src/renderer 200+ files across features/, components/, lib/, stores/, hooks/, app-shell/
packages/shared flat models/, ipc/, observability/ — clean
packages/agent-host runtime/, mock/, pi/, session-server/, utility-process/, events/, state/ — clean
packages/shell-model 3 files
packages/ui 6 components + lib + styles
Largest files (evidence):
| File | LOC | Problem |
|---|---|---|
apps/desktop/src/main/git-worktree-service.ts |
1788 | God service — parsing, process spawn, caching, status, diff, staging, commit, push/pull all in one class |
apps/desktop/src/main/index.ts |
1555 | Bootstrap + controller + IPC wiring + lifecycle + app menu + uninstall + OAuth + switching + removal + shutdown |
apps/desktop/src/renderer/.../left-sidebar.tsx |
1075 | UI, layout, menus, nav, drag/drop, virtualization all in one component |
apps/desktop/src/renderer/.../git-panel.tsx |
849 | View + business logic + IPC orchestration |
packages/agent-host/src/pi/pi-cli-rpc-agent-runtime.ts |
848 | RPC framing + process lifecycle + event normalization + state |
apps/desktop/src/renderer/.../chat-thread-panel.tsx |
696 | Virtualized list + message rendering + scroll anchor + drag |
apps/desktop/src/renderer/.../workspace-shell.tsx |
607 | Top-level shell composes everything directly |
packages/shell-model/src/agent-feed.ts |
501 | Mixed reducer + derivation + event pump |
apps/desktop/src/preload/api.ts |
509 | Hand-written channel bindings; easy to drift from IPC_CHANNELS |
- Mutable captured closures as architecture.
index.tsholdscurrentContext,currentTransport,currentHost,unsubscribeas module-scopedlets, then passes them tocreateContextSwitchControllervia getter/setter property descriptors. This is a reactive primitive, hand-rolled. It belongs in a typed state machine. - Bootstrap is not a module.
bootstrapDesktop()is 1000+ lines of inline closures. It violates every single-responsibility heuristic. - IPC types are nominal strings.
IPC_CHANNELSisconst-typed, but the payloads are parsed in one direction (payload-parsers.ts) and hand-serialized in the other (preload/api.ts). No single source of truth for request/response schema. createSanitizingHandle+ipcMain.handle.bind: error sanitization is a wrapper that must be remembered at every registration. Easy to forget. Not enforced by the type system.ascasts andany-shaped guards everywhere in IPC parsers.payload-parsers.tsis the symptom; the cure is parse-don't-validate with schemas.- Effect usage is partial and inconsistent. Some handlers use
Effect.tryPromise+Effect.catchAll; others use rawtry/catch; others userunEffectVoidfor what could beawait. NoLayers, noContext, noEffect.Service— so Effect is a fancy try/catch here. Either commit or remove. - Renderer reaches into stores via
getState()from React. Seeapp.tsx—useSearchReplaceFilescloses oversessionStore.getState()inside a memo that doesn't subscribe. This is a stale-read bug waiting to happen. Every non-subscribedgetState()in a component is a bug or latent bug. - Catalog pattern duplicates effort.
RepositoryCatalog,ThreadCatalog,WorkspaceSessionCatalog,AppPreferencesCatalog,RepositoryPreferencesCatalog,SelectionStateall reimplement "load JSON, validate, update, persist, notify" with tiny variations. One abstraction + schemas can replace ~1500 LOC. index.tsimports from../thread-title-defaults— a sibling tomain/, not inside it. Why is that file not inmain/threads/? Because no one drew the line.dist/andout/are present in working tree. They should never be committed; verify.gitignoreand clean.- Mixed concerns in
packages/ui: only 6 components live there but duplicates exist inapps/desktop/src/renderer/src/components/ui/(e.g.tooltip.tsx,textarea.tsx,prompt-input.tsx,chat-container.tsx). Two copies, eventual drift. @pi-desktop/uihas no tokens or theme. Styling lives inapps/desktop/src/renderer/src/app.cssand ad-hocclassNameconcatenation. No design tokens module.- Tests spread across three locations (
apps/desktop/**/*.spec.*,packages/**/*.spec.*,tests/integration/). The split between "unit" and "integration" is not enforced —tests/integration/apps-desktop/contains files that look like unit tests (lru-map.spec.ts). - Icon policy not followed. The project mandates Phosphor Icons, and
apps/desktopdepends on@phosphor-icons/react. Butcomponents/ui/icons.tsxexists and may re-wrap. Verify one icon surface. class-variance-authority,clsx,tailwind-mergelive inapps/desktopdeps butcn/variants live inpackages/ui/src/lib/utils.ts. Desktop re-implements its owncninlib/utils.ts. Pick one.node_modulesper workspace is committed into the working tree (visible inls). Not in git, but worth a cleanbun installpass to verify workspace hoisting.- Preload is manually split (
index.ts→runtime.ts→api.ts) to satisfy Electron's requirement, fine — butapi.tsis 509 LOC of hand-written channel-to-method mapping. Should be generated from the IPC contract. - No runtime validation of persisted JSON. Files loaded via
PersistentJsonFileare trusted to match a TS type. Corruption, old versions, and hand-edits silently pass. IPC_CHANNELSand handlers don't cross-check. Nothing fails at compile time if a channel is declared and no handler is registered (or vice versa).ipc-handlers-coverage.spec.tsimplies someone felt the pain and wrote a runtime test to patch it.- No error taxonomy at the renderer boundary.
PiErrorcodes exist in main, but renderer receives sanitizedErrorobjects and parsesmessagestrings.
These are the rules the refactored code must obey. Deviations require a written justification in the PR.
- No file over 300 LOC (excluding imports/types-only files). The four >500 LOC files are the worst offenders and get split first.
- No function over 40 LOC, no function over 3 levels of nesting. Extract helpers.
- No React component over 200 LOC. Split into a presentational component + a hook.
- No
index.ts(barrel) over 40 re-exports and no deep barrel chains. Barrels are for package entry points only.
- Zero
ascasts outside schema parsers. Use type predicates or parsed schemas. - Zero
any. Useunknownand narrow. noUncheckedIndexedAccessalready on intsconfig.base.json— keep and honor it.- Every IPC payload has a schema declared once in
@pi-desktop/sharedand used on both sides.
- Renderer must never import from
electron,node:*, or any main-process module. Enforced by an ESLint/Biome rule (see §9). - Main must never import from
react,zustand, or any renderer module. @pi-desktop/sharedmust depend on nothing runtime-ish. Noelectron, noeffect(or onlyeffect/Schema), nofs.@pi-desktop/agent-hostdepends onshared+effect+@mariozechner/pi-coding-agentonly.@pi-desktop/uidepends onreact,@phosphor-icons/react,@radix-ui/*,clsx,tailwind-merge. Nothing else.
- Commit to "Effect at the edge": every async operation that crosses a boundary (IPC handler, process spawn, fs read) is an
Effectwith typed errors. Business logic under it is plain functions. - Use
Effect.Servicefor injectable capabilities:GitService,FsService,ClockService,TerminalService,CatalogService. - Use
Layercomposition inmain/index.tsto build the runtime once. - No more
runEffectVoid(Effect.try(...).pipe(Effect.catchAll(...)))on synchronous code. Either use Effect fully or use a plain function.
- Pure logic: fast Vitest unit tests colocated next to the file.
- Process/IPC wiring: integration tests under
tests/integration/. - User flows: Playwright under
tests/e2e/(if/when added). - Every catalog, every schema, every reducer has tests.
- Snapshot/golden tests only for stable output (e.g. markdown renderer), never for component trees.
- kebab-case filenames, matches existing convention. Keep.
- One exported symbol per file when practical; export-shape:
export function foo,export type Foo, no default exports except React components at page roots. - Feature folders over layer folders in the renderer. Already partially done under
features/— extend to the rest. - Colocation: tests, styles, and hooks sit next to the thing they belong to.
pi-desktop/
├── apps/
│ └── desktop/
│ ├── src/
│ │ ├── main/
│ │ │ ├── bootstrap/ # top-level assembly; <150 LOC entry
│ │ │ ├── ipc/ # contract-driven handler registration
│ │ │ ├── services/ # Effect.Service impls: Git, Fs, Terminal, AgentRuntime
│ │ │ ├── catalogs/ # one module per catalog, one shared base
│ │ │ ├── session/ # ContextSwitchController as a typed state machine
│ │ │ ├── windows/ # window creation, menus, tray, updater
│ │ │ └── index.ts # <80 LOC — wires layers, starts app
│ │ ├── preload/
│ │ │ └── index.ts # generated from shared IPC contract
│ │ └── renderer/
│ │ └── src/
│ │ ├── app/ # App.tsx, providers, error boundary
│ │ ├── features/ # feature-sliced: workspace, chat, git, terminal, settings, palette
│ │ ├── state/ # zustand stores + selectors; one barrel per feature
│ │ ├── ipc/ # typed renderer-side IPC client
│ │ └── shared/ # cross-feature primitives (a11y, keyboard, theme, perf, undo)
├── packages/
│ ├── contracts/ # NEW: IPC schemas (Effect Schema or Zod), types, channels; zero deps
│ ├── shared/ # TRIMMED: domain models only (no IPC)
│ ├── agent-host/ # as today, cleaner runtime split
│ ├── shell-model/ # as today, split agent-feed
│ └── ui/ # real design system + tokens, single source
└── tests/
├── unit/ # (if not colocated)
├── integration/
└── e2e/ # Playwright (future)
New package: @pi-desktop/contracts. This is the bit the codebase is missing: a single, dependency-free package that declares channels and their request/response schemas. Preload, main handlers, and renderer client all generate from it.
Each phase ends with: bun run lint && bun run typecheck && bun run test && bun run build green, app boots, DMG still builds.
- Remove build artifacts from working tree. Verify
.gitignorecoversdist/,out/,node_modules/,coverage/,playwright-report/,test-results/,.todos/,.pi/,.opencode/. Delete stray directories. - Single
cnutility. Keeppackages/ui/src/lib/utils.ts. Deleteapps/desktop/src/renderer/src/lib/utils.ts. Update imports. - Icon policy enforcement. Audit
components/ui/icons.tsx. If it only re-exports Phosphor, delete it and import directly. If it adds value, rename tophosphor-icons.tsxand document. - Delete UI duplicates.
components/ui/{tooltip,textarea,prompt-input,chat-container}.tsxin the desktop app re-implement@pi-desktop/uiexports. Delete the desktop copies, import from the package. - Consolidate test locations. Move unit tests to live next to their source (the pattern already exists for most files). Reserve
tests/integration/for tests that require process spawning, electron stubs, or cross-package wiring. Move misplaced files (lru-map.spec.ts, etc.) to their proper homes. - README.md truth pass. The current README advertises a Zustand + Tailwind + Radix stack accurately; fix the monorepo tree to match reality (the README's tree is already close). Replace the handwaved "Catalog Pattern" paragraph with a link to this document.
- Biome config audit.
noStaticElementInteractionsis off — keep only if there's a real reason; otherwise enable.noArrayIndexKeyoff — enable per-file with biome-ignore where genuinely needed (rare).
Exit criteria: green quality suite; working tree has no build outputs; no duplicate UI components.
This is the structural keystone. Do this before anything else structural.
- Create
packages/contracts/. Zero runtime deps. Useseffect/Schema(already transitively available viaeffect) to declare:- Every channel name.
- Every request schema.
- Every response schema.
- Every event (push) schema.
- Shape:
// packages/contracts/src/agent.ts import { Schema } from "effect"; export const PromptRequest = Schema.Struct({ text: Schema.String }); export type PromptRequest = Schema.Schema.Type<typeof PromptRequest>; export const agentChannels = { prompt: { channel: "agent:prompt", request: PromptRequest, response: Schema.Void, }, } as const;
- Generate preload API from contracts. Replace the 509-LOC hand-written
preload/api.tswith a single factory that walks the contract table and emits typedinvoke(channel, payload)methods. Event channels emit typedsubscribe(listener). - Main-side router generator.
registerIpcHandlers({ agent, git, ... })becomes a typed dispatcher that, given a contract table and a handler map, registers every channel and validates payloads viaSchema.decodeUnknown. A missing handler = type error. An extra handler = type error. - Renderer IPC client. Replace scattered
window.piDesktop.xxxcall sites with a single typed client (also generated) underrenderer/src/ipc/. This enforces the same schemas on the way in. - Delete
ipc/payload-parsers.ts. Parsers are redundant once schemas exist. - Migrate in slices. Start with
agent:*, thenrepositories:*, thenstate:*, thengit:*, thenterminal:*, thenpackages:*, then the rest. One PR per domain.
Exit criteria: ipc-handlers-coverage.spec.ts becomes a compile-time guarantee and is deleted.
- Create
packages/shared/src/catalog/with:PersistentCatalog<T>: one class, not many. Takes a schema, a file path, a reducer set, and exposeslist(),get(),upsert(),remove(),subscribe().- Uses
Schema.decodeUnknownon load. On decode failure: log, back up corrupted file, reinitialize with default. - Versioned: every schema carries a
version: Nfield; migrations are declared as{ from: 1, to: 2, migrate(x) }.
- Rewrite each catalog as a thin wrapper:
RepositoryCatalog,ThreadCatalog,WorkspaceSessionCatalog,AppPreferencesCatalog,RepositoryPreferencesCatalog,SelectionStateall reduce to ~50 LOC each.
- Rename & relocate: move all catalogs under
apps/desktop/src/main/catalogs/(one file each). Movepersistent-json-file.tsunder the same folder as the internal impl. - Concurrency: the new base uses a single in-memory mutator queue per file. Atomic replace via temp file + rename (already the pattern; verify).
Exit criteria: every catalog has a schema, round-trips under unit tests, handles corruption gracefully.
Target: apps/desktop/src/main/index.ts goes from 1555 LOC to under 80.
- Extract a
SessionStore— a typed state machine replacing thecurrentContext/currentHost/currentTransport/unsubscribeclosure cluster. States:NoWorkspace,WorkspaceWithoutRepo,RepoNoThread,ActiveThread. Transitions return new states; no getters/setters. - Extract
ContextSwitchCoordinator— today'screateContextSwitchControlleris already here; upgrade it to consume and produceSessionStorestates rather than mutating properties. - Extract
AppMenuBuilder(macOS menu + Uninstall action) intowindows/app-menu.ts. - Extract
OAuthPromptBridgefrom theexecuteJavaScript("window.prompt")block — that pattern is ugly and blocks the renderer. Replace with a proper in-app dialog viaIPC_CHANNELS.dialog. - Extract
DesktopBootstrap.run()that:- builds services (Effect Layers),
- builds catalogs,
- builds
SessionStore, - restores prior workspace (
resolveInitialWorkspaceTarget), - creates main window,
- registers IPC handlers,
- wires shutdown.
index.tsbecomes:import { DesktopBootstrap } from "./bootstrap/desktop-bootstrap"; import { app } from "electron"; DesktopBootstrap.run().catch((err) => { console.error("Fatal:", err); app.quit(); });
Exit criteria: index.ts ≤ 80 LOC; every former module-level let is inside a typed state object; all integration tests pass unmodified.
Target: git-worktree-service.ts (1788 LOC) broken into 5–7 focused modules.
git/porcelain.ts— pure wrappers overgitchild processes with typed parse functions. No caching, no state.git/status.ts— repo status, worktree inspection,inspect()/inspectAsync().git/worktrees.ts—createWorktree,removeWorktree,listWorktrees.git/diff.ts— diffs (file, staged, range).git/staging.ts— stage/unstage/discard/commit.git/remote.ts— fetch/pull/push.git/index.ts— composes into aGitServiceEffect.Service consumed by handlers.
Each module < 300 LOC. Each has its own tests. No shared mutable cache — if caching is needed, it goes behind an explicit GitStatusCache with a documented invalidation policy.
- Split
left-sidebar.tsx(1075) into:left-sidebar/folderleft-sidebar.tsx(layout shell, ≤150 LOC)repository-list.tsx,worktree-list.tsx,thread-list.tsx,sidebar-header.tsx,sidebar-footer.tsx- One
use-left-sidebar.tshook composinguse-left-sidebar-layout+use-left-sidebar-menus+ data.
- Split
git-panel.tsx(849) intogit-panel.tsx(view),use-git-panel.ts(state),git-panel-model.ts(pure reducers — already exists, absorb more logic into it). - Split
chat-thread-panel.tsx(696) intochat-thread-panel.tsx(container),chat-thread-list.tsx(virtualized list),use-chat-thread.ts(scroll + data). - Split
workspace-shell.tsx(607) into a layout primitive + a controller hook. TheuseAppShellControllerpattern already exists; extend it. - Kill
getState()-in-render. Every such call (app.tsx, hooks, feature code) either:- uses
useStore(selector)with a proper selector, or - moves to an event handler / effect where imperative reads are legitimate.
- uses
- Store topology. One store per bounded context:
sessionStore,workspaceStore,uiStore,notificationsStore,snapshotsStore. Cross-store reads go through selectors, not direct imports. - Error boundaries at feature roots: Chat, Git, Terminal, FileTree, Settings — each wrapped with a named boundary so a crash in one panel doesn't nuke the workspace.
- Move tokens (
--color-bg-primary, animation easings fromapp.tsx's inline style, font sizes, spacing) intopackages/ui/src/styles/tokens.css. - Promote primitives to
@pi-desktop/ui:Button,Dialog,Popover,Tooltip,Select,Separator,ScrollArea,Collapsible,Avatar,Tabs,Textarea,Card(only if actually used and not nested — see §7.3). - Font integration via Fonttrio. Per house rules, select a distinctive trio for the UI surface; wire via
@fontsource-*imports in@pi-desktop/ui/styles/fonts.css. - Icons: single
Iconcomponent in@pi-desktop/uithat wraps Phosphor with default size/weight; everything else imports from@phosphor-icons/reactdirectly. - Skeletons: adopt
boneyard-js(already a root dep) for every loading state. No ad-hoc pulsing divs.
Per house rules: avoid cards in designs; never nest them. Audit components/ui/card.tsx usages. Replace most with plain section + spacing.
- Split
pi-cli-rpc-agent-runtime.ts(848) into:rpc-framing.ts(newline-delimited JSON framing + parsers)rpc-transport.ts(process spawn + stdin/stdout wiring)runtime.ts(lifecycle + subscribe)event-normalizer.ts(re-use existingevents/normalize-agent-session-event.ts)
- Split
pi-sdk-agent-runtime.ts(484) intoruntime.ts+provider-registry.ts(a test already exists for the split). - Unify session-server + utility-process: both wrap the same concept (a runner for a Pi agent); document which is used when, extract the common bits.
- Split
shell-model/agent-feed.ts(501) intoreducer.ts(pure) +stream.ts(subscription).
Fence the walls we just built so they don't erode.
- Import boundaries via a Biome linter rule or a small custom script:
apps/desktop/src/renderer/**cannot import fromelectron,node:*,@pi-desktop/agent-host, orapps/desktop/src/main/**.apps/desktop/src/main/**cannot import fromreact,zustand, orapps/desktop/src/renderer/**.packages/shared/**cannot import fromelectronornode:*.
- Bundle size budget in
scripts/perf-bundle.mjs: fail CI if the renderer bundle grows > 5% without justification. - Startup budget via
scripts/perf-startup.mjs: tracked regression test. noExplicitAnyandnoUnsafeAssignmentin Biome strict mode; CI blocks on violations.- Pre-commit hook runs
biome check --fixandbun run typecheckon staged workspaces.
Every persisted JSON file embeds { schemaVersion: number, data: T }. Load path:
read → decode envelope → if version < current: run migrations → decode T → return
On fail: write *.corrupt-<timestamp>.json beside it, return default, log warning.
PiError codes (main) map to a public renderer error shape:
type IpcErrorResponse = {
code: PiErrorCode;
message: string; // safe to show
details?: Record<string, string>;
};sanitize-ipc-error.ts becomes toIpcError(e: unknown): IpcErrorResponse, automatically invoked by the router. Renderer wraps every invoke in a Result<T, IpcErrorResponse> helper.
apps/desktop/src/**/*.spec.ts(x)— unit, colocated.packages/*/src/**/*.spec.ts— unit, colocated.tests/integration/**— cross-package, process, or electron-in-node tests.tests/e2e/**— reserved for Playwright (future phase).
- Catalogs: 100% branch. Corruption + migration round-trips must be tested.
- IPC schemas: every request/response schema has a round-trip test using
Schema.decode(Schema.encode(x)). SessionStorestate machine: every transition has a test.- Reducers (git-panel-model, agent-feed, shell-model): 100% branch.
- React components: test observable behavior only (props → rendered output + events). No store-internals assertions.
Vitest workspace config is fine. Add --passWithNoTests=false (default). Keep jsdom only for files that need it — main-process tests run in node.
- Use Fonttrio to choose a distinctive pair/trio. Candidate direction: a warm sans for UI, a geometric mono for terminal/code, an editorial serif reserved for empty states and onboarding only (sparingly).
- Wire via
@pi-desktop/ui/styles/fonts.css, imported exactly once from the renderer entry.
packages/ui/src/styles/tokens.css holds CSS custom properties for color, spacing (4/8/12/16/24/32/48), radii, shadows (use sparingly), easings, durations. Renderer consumes these; no component hardcodes a hex or a px.
- No cards. No nested cards.
- Hierarchy via typography and spacing, not via boxes.
- Shadows used only for overlays (dialogs, popovers).
- Motion: respect
prefers-reduced-motion; keepmotion-safe:wrappers (already in use).
Phosphor only. Default: weight="light", size="1.25rem". Override per context.
boneyard-js for every async region.
- Structured logging in main. Replace
console.error("[ipc] handler error", ...)with a tinylog.tsthat emits{ level, scope, message, fields }and is redacted viapackages/shared/src/observability/redaction.ts(already present). - Perf timers (
perf-timer.tsexists) wrap every IPC handler and expose p50/p95 via a dev overlay (lib/perf/perf-overlay.tsxexists — surface it viaCtrl+Shift+P). - Crash reporting hook (noop in dev, real in prod) sits at the Electron
applevel.
The phases above are ordered to minimize merge conflicts and keep the app bootable at every step.
| # | Phase | PR count (est.) | Risk | Unlocks |
|---|---|---|---|---|
| 1 | Repo hygiene | 3–5 | Low | Everything else |
| 2 | Contracts & schema IPC | 6–8 | Medium | Safe bulk refactors later |
| 3 | Catalog base | 3–4 | Low | Bootstrap decomposition |
| 4 | Bootstrap decomposition | 4–6 | Medium | Session state machine stable |
| 5 | Git service split | 3 | Low | Faster test iteration |
| 6 | Renderer decomposition | 8–12 | Medium | All feature work gets easier |
| 7 | UI package as design system | 4–5 | Low | Visual consistency |
| 8 | Agent host tidy | 2–3 | Low | Dependable runtime |
| 9 | Enforcement | 2 | Low | Prevents regression |
Each PR: green lint, typecheck, test, build. App boots. DMG builds unchanged.
Most of this plan preserves behavior. The exceptions:
- OAuth prompt stops using
webContents.executeJavaScript("window.prompt(...)"). It becomes a proper in-app dialog. Gated behind a feature flagPI_DESKTOP_NEW_OAUTH_DIALOGduring rollout. - Persisted JSON auto-backup on corrupt load: new behavior, user-visible only if a file was actually corrupt. Safe by construction.
- IPC errors: renderer may now receive structured
{code, message}where it previously got a string. Update error display accordingly; old strings remain the user-visible message.
- Not introducing a second state library in the renderer.
- Not adding a router (no URLs).
- Not moving off Electron Vite or Bun.
- Not changing the Pi agent wire protocol.
- Not introducing a monorepo tool beyond
bun workspaces. Noturbo, nonx. - Not adding a fourth process boundary.
- Not adopting CSS-in-JS. Tailwind v4 + tokens stays.
- Not rewriting tests we'll delete; we'll delete them and write new, smaller ones.
The refactor is "done" when:
- No file in the repo is over 300 LOC of code (type-only files excluded).
apps/desktop/src/main/index.tsis under 80 LOC.- There is exactly one place that declares each IPC channel, schema, and payload.
packages/uiis the only source of shared UI primitives.bun run lint && bun run typecheck && bun run test && bun run buildare all green on a fresh clone with no warnings.- A new contributor can locate the implementation of any user-visible action by reading file names alone.
- Removing a feature touches exactly one directory.
This plan is intentionally concrete. Anything vague in it is a bug in the plan; raise it before implementing.
This section is an independent review of the work currently staged on styling/fix-contrast-and-typography. It audits the prior agent's claims in §0–§12 against what is actually on disk.
All quality gates were re-run on the working tree:
| Command | Result |
|---|---|
bun run lint |
green — 752 files checked, no fixes needed |
bun run lint:imports |
green — import boundary check passed |
bun run typecheck |
green — all 6 workspaces pass |
bun run test |
green — 244 test files / 1767 passing / 2 skipped |
bun run build |
green — main, preload, renderer all build |
node scripts/perf-bundle.mjs |
green — main 21.74/24 MiB, renderer 35.52/40 MiB |
node scripts/perf-startup.mjs |
green on the latest re-run — main 735.3/1500 KiB, preload 12.4/192 KiB |
The verification claims in §"Progress Snapshot" hold up after the latest re-run. The test count in that table is stale — actuals were higher in this audit — but the current snapshot above now reflects the regenerated branch state.
The "Current Largest Remaining Files" table (lines 33–57) does not match the working tree. Spot check:
| File | Doc claim | Actual | Delta |
|---|---|---|---|
apps/desktop/src/main/index.ts |
928 | 722 | -206 |
apps/desktop/src/main/git-worktree-service.ts |
831 | 715 | -116 |
apps/desktop/src/renderer/src/stores/window-store.ts |
702 | 433 | -269 |
apps/desktop/src/renderer/src/stores/workspace-session-store.ts |
664 | 405 | -259 |
apps/desktop/src/preload/api.ts |
450 | 396 | -54 |
apps/desktop/src/main/shell-catalog-builder.ts |
463 | not in top 30 | — |
The renderer stores have been decomposed substantially since the table was written (see new window-store-create.ts, window-store-reducer.ts, workspace-session-store-{cleanup,content,migrations,persistence,session-operations,window-actions}.ts). The doc should be regenerated — it currently understates how much progress has shipped.
- ✅
apps/desktop/src/renderer/src/lib/utils.tsdeleted; singlecnlives in@pi-desktop/ui. - ✅ Duplicate UI primitives removed:
tooltip.tsx,textarea.tsx,prompt-input.tsx,chat-container.tsxall gone from desktop. - ✅
lru-map.spec.tsmoved next to its source. ⚠️ Icon policy NOT enforced.apps/desktop/src/renderer/src/components/ui/icons.tsxstill exists. House rules and §2.3/§7.4 require Phosphor as the single icon surface; this file should be audited and either deleted or renamed tophosphor-icons.tsxwith documentation.⚠️ Biome rule audit not done.
- ✅
packages/contracts/exists and is wired into the workspace. - 🔴 The keystone of Phase 2 is missing: there are no Effect Schemas.
packages/contracts/src/index.tsonly exportsNoPayloadIpcContract<TResponse>markers — bare{ channel: string }objects with a phantom type parameter. There is zero runtime validation.grep -rn "Schema\." packages/contractsreturns nothing. - 🔴
apps/desktop/src/main/ipc/payload-parsers.tsis still present. §4.2 step 6 says "Deleteipc/payload-parsers.ts. Parsers are redundant once schemas exist." Schemas don't exist, so parsers stayed. The package as it stands is decoration, not the keystone described in §3 ("the bit the codebase is missing"). ⚠️ Coverage is also tiny: onlyshell.getSnapshot,agent.getProviders/getSettings/getSnapshotare seeded. Five no-payload reads out of dozens of channels.- Recommendation: before declaring Phase 2 partial, decide whether to commit (use
effect/Schemafor real, migrate one domain end-to-end) or revert. The current state is the worst of both worlds: a new package with no value plus the original parsers.
- ✅
packages/shared/src/catalog/document-catalog.tsadded, tests intests/integration/shared/document-catalog.spec.ts. - 🔴 No Schema-based versioning yet. §3 requires
{ schemaVersion, data }envelopes, decode-on-load, corrupt-file backup.DocumentCatalogdoes not yet implement any of this — it is structural plumbing only. ⚠️ Catalogs are not yet relocated underapps/desktop/src/main/catalogs/.- ✅
workspace-session-window-sanitizer.tsextracted as claimed.
- ✅ Substantial extraction: 14 new
bootstrap/*.tsmodules with colocated specs, all real and well-tested. ⚠️ index.tsis at 722 LOC, not the §4 target of ≤80. Real progress (down from 1555), but the doc's "exit criteria" is far from met. Section 9 still calls this PR count 4–6, which is plausible — but the file has not yet hit the target.⚠️ currentContext / currentTransport / currentHostmutable closures: I sampledindex.ts:79and alet mainWindow: BrowserWindow | null = null;is still present at module scope. The typedSessionStorestate machine (§4 step 1) has not landed.⚠️ OAuth still uses theexecuteJavaScript("window.prompt")bridge (bootstrap/oauth-prompt-bridge.tsexists as the extraction point, but the dialog-based flow is the gated behavior change in §10 and is not in this branch).
- ✅ Genuine, large progress. 16 new
git/*.tsmodules with colocated specs. The split is the cleanest piece of work in the branch. ⚠️ git-worktree-service.tsis still 715 LOC. The God service is shrinking but not yet broken into the 5–7 modules of §4.5 (status,worktrees,diff,staging,remote). It currently delegates rather than disappears.
- ✅ Many new modules:
left-sidebar-*,git-panel-*,chat-thread-*,workspace-shell-*,workspace-session-store-*. All have colocated tests. ⚠️ left-sidebar-workspace-tree.tsxis 542 LOC, exceeding the 200-LOC component limit in §2.1. One of the new files immediately violates the new rule.⚠️ workspace-shell.tsxreduction not yet at target (search-replace-panel.tsx is 525 LOC, command-palette 484 LOC, terminal.tsx 499 LOC, code-block.tsx 566 LOC — all unchanged, all over the limit).⚠️ §4.6.5 ("KillgetState()-in-render"):app.tsx:50, 60, 63, 84still callsessionStore.getState()from insideReact.useMemocallbacks. They are invoked at event time viagetActiveSession/applyRestoredSessiongetters, which is the legitimate pattern. So this is closer to "OK in practice but ugly", not a stale-read bug. Worth a follow-up to lift these into the store with selectors.
- ✅
packages/ui/src/styles/{tokens.css,fonts.css}added; sharedButtonpromoted. ⚠️ Tokens file exists but renderer-side usage and ad-hoc hex/px elimination is not audited.⚠️ Skeletons (boneyard-js) standardization not visible in diff.
- ✅ Massive split done well:
agent-feeddecomposed intoagent-feed-{live-state,live,transcript,types}.ts;pi-cli-rpc-*andpi-sdk-*each split into 6–8 focused modules with specs. This is clean work. ⚠️ Session-server / utility-process unification (§4.8 step 3) not addressed.
-
✅
scripts/check-import-boundaries.mjsexists, runs clean, is wired aslint:imports. -
✅
.githooks/pre-commit+scripts/pre-commit.mjstracked. -
✅ Real bundle and startup budget scripts (verified above).
-
⚠️ "noExplicitAny / noUnsafeAssignment in Biome strict mode" (§4.9 step 4) not enforced. 38as TypeNamecasts remain inapps/desktop/src/main, including legitimate-but-still-cast spots like:terminal-manager.ts:138—this.loadModule("node-pty") as PtyModuleagent-host-socket-transport.ts:33—JSON.parse(message) as AgentHostEnvelopepackages/packages-catalog-client.ts:289,312—await response.json() as ...agent-host-client.ts:57,95,114
Per
AGENTS.md("Never typecast. Never useas.") and §2.2 ("Zeroascasts outside schema parsers"), these are policy violations the contracts package was supposed to obviate. Until Phase 2's schemas land for real, these casts cannot be removed without losing safety.
- The doc is being used as a live execution log, but the LOC table and test counts are not regenerated when slices land. Either commit to refreshing them at the end of every slice, or move the table to a generated artifact (
scripts/refactor-snapshot.mjs). - Phase 2 is the highest-leverage incomplete work. Without Schema-validated contracts, every later phase is doing the right structural moves but cannot drop the
as/parser layer it was meant to replace. - The new code introduces violations of the rules it is supposed to enforce.
left-sidebar-workspace-tree.tsx(542) is a fresh file that lands above the 200-LOC component cap from §2.1. The cap should be enforced by a script inscripts/check-import-boundaries.mjsor a sibling. - Test coverage growth is real and proportional to the extractions. Almost every new module has a colocated
.spec.ts(x). This is the strongest signal in the branch. - No catalogs have been migrated to versioned envelopes yet. That is the riskiest behavioral change in the plan; it is appropriate that it has not been rushed, but it should be flagged as the next-up gate before more catalog refactors land.
renamed: tests/integration/apps-desktop/lru-map.spec.ts → apps/desktop/src/main/lru-map.spec.ts— good. There are still other integration-named files that look like unit tests (e.g.tests/integration/apps-desktop/bootstrap-helpers.spec.ts). §4.1 step 5 not finished.- Behavior preservation: the staged changes do not appear to change user-visible behavior. The styling-branch name suggests UX polish, but the diff is overwhelmingly structural. If contrast/typography fixes were intended for this branch, they are buried under the refactor and should be called out separately, or split into a sibling PR for reviewability.
- Refresh §"Progress Snapshot" and "Current Largest Remaining Files" with current numbers, then automate it.
- Decide Phase 2. Either:
- (a) port a single domain (
agent:*) end-to-end ontoeffect/Schema-validated contracts and delete its parser, proving the keystone works; or - (b) revert
packages/contracts/and acknowledge the phase as not-yet-started.
- (a) port a single domain (
- Add a LOC ceiling check to
scripts/check-import-boundaries.mjs(or a newscripts/check-file-sizes.mjs) so the rules in §2.1 cannot regress silently. - Cut
left-sidebar-workspace-tree.tsx(542) — the freshest violation — into the component + hook split prescribed in §4.6.1. - Migrate one catalog to versioned envelopes (suggest
AppPreferencesCatalog— smallest blast radius) to validate the §3 design before touchingWorkspaceSessionCatalog.
The branch is honest, additive, and reversible. The structural extractions in Phases 4, 5, 6, and 8 are the strongest work — large files becoming many tested smaller ones. The two real gaps are:
- Phase 2's keystone (Schemas) is missing, so the type-safety promises of the plan are not yet delivered.
- The plan document itself drifts — its tables understate progress and its rules already have one new-file violator.
Neither is a blocker for landing. Both should be addressed before the next refactor PR claims more "Partial" status against the same phase numbers.
Review by reviewing-agent, 2026-04-24. No code was changed; this is a read-only audit.