North Implementation Details

This document describes the current implementation state and module architecture.

Milestone Status

Milestone	Status
1: Chat UI + streaming	✅ Complete
2: Read/search tools	✅ Complete
3: Deterministic edits + diff review	✅ Complete
4: Persistent PTY shell + approvals	✅ Complete
4.5: Slash commands + model switching	✅ Complete
4.6: Conversation save + resume	✅ Complete
5: Memory + project card cache	Not started
6: UX polish	Not started

Last verified: 2025-12-10

Project Structure

src/
├── index.ts              # CLI entry point, arg parsing, app bootstrap
├── commands/
│   ├── index.ts          # Command exports and registry factory
│   ├── types.ts          # Command type definitions
│   ├── models.ts         # Shared model list (alias, pinned, display)
│   ├── registry.ts       # Command registry implementation
│   ├── parse.ts          # Span-based command tokenizer
│   └── commands/
│       ├── quit.ts       # /quit - exit application
│       ├── new.ts        # /new - reset chat
│       ├── help.ts       # /help - list commands
│       ├── model.ts      # /model - switch Claude model
│       ├── mode.ts       # /mode - switch conversation mode (ask/agent)
│       ├── summarize.ts  # /summarize - summarize and trim transcript
│       ├── learn.ts      # /learn - learn or relearn project codebase
│       ├── conversations.ts # /conversations - picker to switch conversations
│       └── resume.ts     # /resume <id> - switch to conversation by ID
├── logging/
│   └── index.ts          # Append-only JSON-lines logger
├── orchestrator/
│   └── index.ts          # Conversation state, message flow, tool loop, commands, reviews
├── provider/
│   ├── index.ts          # Provider factory, selects provider by model
│   ├── anthropic.ts      # Claude streaming client (Anthropic Messages API)
│   └── openai.ts         # GPT streaming client (OpenAI Responses API)
├── rules/
│   ├── index.ts          # Rules module exports
│   └── cursor.ts         # Cursor rules loader (.cursor/rules/*.mdc)
├── shell/
│   └── index.ts          # Shell service: fresh bash process per command via Bun.spawn()
├── storage/
│   ├── allowlist.ts      # Per-project shell command allowlist (.north/allowlist.json)
│   ├── autoaccept.ts     # Per-project edit auto-accept settings
│   ├── config.ts         # Global config (~/.config/north/config.json)
│   ├── conversations.ts  # Conversation persistence (event log + index)
│   ├── costs.ts          # Global cost tracking (~/.north/costs.json)
│   └── profile.ts        # Per-project learning profile storage
├── profile/
│   └── learn.ts          # Project learning orchestration and discovery topics
├── tools/
│   ├── index.ts          # Tool exports and registry factory
│   ├── types.ts          # Tool type definitions (including edit and shell types)
│   ├── registry.ts       # Tool registry implementation with approval policy
│   ├── list_root.ts      # List repo root entries
│   ├── find_files.ts     # Glob pattern file search
│   ├── search_text.ts    # Text/regex search (ripgrep or fallback, supports file+range)
│   ├── read_file.ts      # File content reader with ranges and head/tail inclusion
│   ├── get_line_count.ts # Quick file size checker
│   ├── get_file_symbols.ts # Symbol extraction (functions, classes, types)
│   ├── get_file_outline.ts # Hierarchical file structure outline
│   ├── read_readme.ts    # README finder and reader
│   ├── detect_languages.ts # Language composition detector
│   ├── hotfiles.ts       # Frequently modified files (git or fallback)
│   ├── edit_replace_exact.ts  # Exact text replacement
│   ├── edit_insert_at_line.ts # Insert at line number
│   ├── edit_after_anchor.ts   # Insert after anchor text
│   ├── edit_before_anchor.ts  # Insert before anchor text
│   ├── edit_replace_block.ts  # Replace content between anchors
│   ├── edit_create_file.ts    # Create or overwrite file
│   ├── edit_apply_batch.ts    # Atomic batch edits
│   ├── expand_output.ts       # Retrieve cached digested outputs
│   ├── find_code_block.ts     # Find code blocks containing text
│   ├── read_around.ts         # Context window around anchor
│   ├── find_blocks.ts         # Structural map without content
│   ├── edit_by_anchor.ts      # Unified anchor-based editing
│   └── shell_run.ts      # Shell command execution (requires approval)
├── ui/
│   ├── App.tsx            # Root Ink component, SIGINT handling, review wiring
│   ├── Composer.tsx       # Multiline input with slash command and @ file autocomplete
│   ├── CommandReview.tsx  # Interactive picker for commands (e.g., model selection)
│   ├── DiffReview.tsx     # Inline diff viewer with accept/reject
│   ├── ShellReview.tsx    # Shell command approval with run/always/deny
│   ├── LearningPrompt.tsx # Project learning Y/N prompt
│   ├── LearningProgress.tsx # Learning progress indicator
│   ├── StatusLine.tsx     # Model name, mode indicator, project path display
│   ├── Transcript.tsx     # User/assistant/tool/review/command entry rendering
│   ├── ConversationList.tsx # Conversation list for north conversations
│   └── ConversationPicker.tsx # Conversation picker for north resume
└── utils/
    ├── repo.ts           # Repo root detection
    ├── ignore.ts         # Gitignore parsing and file walking
    ├── editing.ts        # Diff computation and atomic file writes
    ├── tokens.ts         # Token estimation for context tracking
    ├── retry.ts          # Transient error retry with exponential backoff
    ├── fileindex.ts      # File index for @ mention autocomplete
    ├── filepreview.ts    # File preview + outline generation for context
    ├── fileblock.ts      # NORTH_FILE streaming parser with events
    ├── filesession.ts    # Streaming file writer with auto-resume
    ├── digest.ts         # Tool output digesting for context efficiency
    └── pricing.ts        # Model pricing data and cost calculation

tests/
└── openai-provider.test.ts  # OpenAI provider unit tests

Module Responsibilities

index.ts (Entry Point)

Parses CLI args and subcommands
Supported subcommands:
- north - start new conversation
- north resume <id> - resume conversation by ID
- north resume - open conversation picker
- north conversations or north list - list recent conversations
Flags: --path, --log-level
Detects repo root from start directory
Initializes logger
Renders Ink app (or list/picker components for subcommands)
Handles clean exit via waitUntilExit()
Wires tool logging callbacks
Generates conversation ID on new conversations
Loads conversation state on resume

commands/ (Slash Command System)

Registry-driven command system with span-based parsing, cursor-aware autocomplete, and interactive pickers.

commands/types.ts

Defines core types:

Mode: "ask" | "agent" - conversation mode type
CommandDefinition: name, description, usage, execute function
CommandContext: orchestrator methods available to commands
ParsedArgs: positional args and flags from parsing
PickerOption: id, label, hint for interactive selection
CommandReviewStatus: "pending" | "selected" | "cancelled"
StructuredSummary: goal, decisions, constraints, openTasks, importantFiles

commands/models.ts

Centralized model list shared by /model command and Composer autocomplete:

ProviderType: "anthropic" | "openai"
MODELS: array of { alias, pinned, display, contextLimitTokens, provider, supportsThinking?, thinkingBudget? }
resolveModelId(input): maps alias or pinned ID to pinned ID (supports both Claude and GPT prefixes)
getModelDisplay(modelId): returns human-readable name
getModelContextLimit(modelId): returns context limit in tokens
getModelProvider(modelId): returns provider type for model
getModelThinkingConfig(modelId): returns thinking config if model supports extended thinking
isThinkingModel(modelId): returns true if model ID ends with -thinking suffix
DEFAULT_MODEL: default pinned model ID (Claude Sonnet 4)

Extended Thinking Model Selection:

Thinking is now selected at model-choice time, not toggled separately
Each Anthropic model exists twice: once without thinking, once with -thinking suffix
Example: claude-sonnet-4-20250514 (fast) vs claude-sonnet-4-20250514-thinking (with extended thinking)
When you select a model with -thinking suffix, extended thinking is enabled automatically
OpenAI models don't have thinking variants (reasoning built into those models)

Supported Model Variants:

Anthropic:
- sonnet-4, sonnet-4-thinking
- opus-4, opus-4-thinking
- opus-4-1, opus-4-1-thinking
- sonnet-4-5, sonnet-4-5-thinking
- haiku-4-5, haiku-4-5-thinking
- opus-4-5, opus-4-5-thinking
OpenAI: gpt-5.1, gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.1-codex-max, gpt-5, gpt-5-mini, gpt-5-nano
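
The alias resolution and -thinking convention described above can be sketched as follows. This is a minimal illustration, not the contents of commands/models.ts: the three-entry table, the display strings, the trimmed `ModelEntry` shape, and the helper name `toApiModelId` are assumptions that mirror only the fields and IDs named in this document.

```typescript
type ProviderType = "anthropic" | "openai";

interface ModelEntry {
  alias: string;
  pinned: string;
  display: string;
  provider: ProviderType;
}

// Illustrative subset only; the real MODELS list has more entries and fields.
const MODELS: ModelEntry[] = [
  { alias: "sonnet-4", pinned: "claude-sonnet-4-20250514", display: "Claude Sonnet 4", provider: "anthropic" },
  { alias: "sonnet-4-thinking", pinned: "claude-sonnet-4-20250514-thinking", display: "Claude Sonnet 4 (thinking)", provider: "anthropic" },
  { alias: "gpt-5.1", pinned: "gpt-5.1", display: "GPT-5.1", provider: "openai" },
];

const THINKING_SUFFIX = "-thinking";

// Accepts either an alias or a pinned ID and returns the pinned ID, or null if unknown.
function resolveModelId(input: string): string | null {
  const needle = input.trim().toLowerCase();
  const hit = MODELS.find((m) => m.alias === needle || m.pinned === needle);
  return hit ? hit.pinned : null;
}

function isThinkingModel(modelId: string): boolean {
  return modelId.endsWith(THINKING_SUFFIX);
}

// Hypothetical helper: the suffix is a North-side convention, so it is removed
// before the ID is sent to the provider API.
function toApiModelId(modelId: string): string {
  return isThinkingModel(modelId) ? modelId.slice(0, -THINKING_SUFFIX.length) : modelId;
}
```

Keeping the suffix in the pinned ID means a single string carries both the model choice and the thinking choice through persistence and the status line.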

commands/registry.ts

In-process registry mapping command name -> definition
register(command): add command to registry
has(name): check if command exists (used by parser)
list(): get all commands (used by /help and autocomplete)
execute(name, ctx, args): run command with error handling

commands/parse.ts

Span-based tokenizer for reliable command extraction:

parseCommandInvocations(input, registry): returns { invocations, remainingText }
Each invocation has name, args, span (start/end indices)
Parsing rules:
- /name must be preceded by start-of-line or whitespace
- Args stop at next /name token (unless inside quotes)
- Supports --flag value and -f short flags
- Quoted strings preserve whitespace
remainingText computed by slicing out spans in reverse order
getTokenAtCursor(value, cursorPos): for autocomplete
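
A simplified sketch of the span-based approach is shown below. It is not the real parser: it takes a `has` callback instead of a registry object, leaves flags as plain tokens, and assumes that a command's arguments run until the next registered command or the end of the input.

```typescript
interface Span { start: number; end: number }
interface Invocation { name: string; args: string[]; span: Span }

// Split on whitespace; a double-quoted run becomes one token with the quotes removed.
function tokenizeArgs(text: string): string[] {
  const tokens: string[] = [];
  const re = /"([^"]*)"|(\S+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) tokens.push(m[1] ?? m[2] ?? "");
  return tokens;
}

function parseCommandInvocations(input: string, has: (name: string) => boolean) {
  const starts: { name: string; start: number; nameEnd: number }[] = [];
  let inQuote = false;
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === '"') inQuote = !inQuote;
    if (ch !== "/" || inQuote) continue;
    // "/name" only counts at the start of input or after whitespace,
    // so paths such as src/model are not mistaken for commands.
    if (i > 0 && !/\s/.test(input.charAt(i - 1))) continue;
    const m = /^[A-Za-z][\w-]*/.exec(input.slice(i + 1));
    if (m && has(m[0])) starts.push({ name: m[0], start: i, nameEnd: i + 1 + m[0].length });
  }
  const invocations: Invocation[] = starts.map((s, idx) => {
    const next = starts[idx + 1];
    const end = next ? next.start : input.length;
    return { name: s.name, args: tokenizeArgs(input.slice(s.nameEnd, end)), span: { start: s.start, end } };
  });
  // Slice spans out back to front so the earlier indices stay valid.
  let remainingText = input;
  for (const inv of [...invocations].reverse()) {
    remainingText = remainingText.slice(0, inv.span.start) + remainingText.slice(inv.span.end);
  }
  return { invocations, remainingText: remainingText.trim() };
}
```

Recording spans rather than splitting the string up front is what lets the remaining text be recovered exactly, including any text that precedes the first command.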

commands/commands/*.ts

Command	Usage	Purpose
`/quit`	`/quit`	Exit North cleanly
`/new`	`/new`	Reset chat (clears transcript + summary, keeps PTY)
`/help`	`/help`	List available commands
`/model`	`/model [alias]`	Switch model (with picker if no arg)
`/mode`	`/mode [ask|agent]`	Switch conversation mode (with picker if no arg)
`/summarize`	`/summarize [--keep-last N]`	Summarize conversation, trim transcript
`/costs`	`/costs`	Show cost breakdown dialog by model/provider
`/learn`	`/learn`	Learn or relearn project codebase
`/conversations`	`/conversations`	Picker to switch conversations
`/resume`	`/resume <id>`	Switch to conversation by ID

logging/index.ts

Writes to ~/.local/state/north/north.log
JSON-lines format (one JSON object per line)
Events: app_start, user_prompt, model_request_start, model_request_complete, tool_call_start, tool_call_complete, write_review_shown, write_review_decision, write_apply_start, write_apply_complete, shell_review_shown, shell_review_decision, shell_run_start, shell_run_complete, app_exit
Silent fail on write errors (logging must not crash the app)
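
The append-only, fail-silent behavior can be sketched with an injected sink. The `createLogger` factory, the `sink` parameter, and the `ts` field are assumptions for illustration; the real logger appends to the log file directly.

```typescript
type LogSink = (line: string) => void;

// `sink` stands in for the file append so the sketch stays free of I/O.
function createLogger(sink: LogSink, now: () => number = Date.now) {
  return {
    log(event: string, data: Record<string, unknown> = {}): void {
      try {
        // One JSON object per line (JSON-lines).
        sink(JSON.stringify({ ts: now(), event, ...data }) + "\n");
      } catch {
        // Swallow serialization and write errors: logging must not crash the app.
      }
    },
  };
}
```

For example, `createLogger(write).log("tool_call_start", { tool: "read_file" })` emits a single line, and a sink that throws is ignored.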

orchestrator/index.ts

Owns transcript (array of TranscriptEntry)
Owns isProcessing, pendingReviewId, currentModel, rollingSummary
Owns contextUsedTokens, contextLimitTokens, contextUsage for context tracking
Receives cursorRulesText in context (loaded once at startup)
Owns command registry via createCommandRegistryWithAllCommands()
Preprocesses user input for slash commands before sending to the model
Accepts mode parameter in sendMessage(content, mode) to filter available tools
In Plan mode only: enforces plan requirement (write tools blocked until plan is accepted)
Implements tool call loop:
1. Parse and execute any slash commands in input
2. Add command_executed entry for each command
3. If remainingText non-empty, append user entry to transcript
4. Create assistant entry with isStreaming: true
5. Build messages and estimate token usage
6. If context usage >= 92%, auto-summarize conversation
7. Send messages to the active provider with tool schemas and current model
8. Stream response text (throttled at ~32ms)
9. If stopReason === "tool_use":
  - Execute each tool via registry
  - For approvalPolicy: "write": create diff_review entry, block for user decision
  - For approvalPolicy: "shell": check allowlist, create shell_review if not allowed
  - On accept/run: apply edits or execute command, send result to the model
  - On reject/deny: send rejection/denial to the model
10. Continue until the model stops requesting tools
Streaming throttle: buffer chunks, flush every 32ms or on complete
Emits state changes via onStateChange callback (includes currentModel)
buildMessagesForClaude(): excludes command_review and command_executed entries
Prepends cursorRulesText as first context block if present
Prepends rollingSummary as second context block if present
Exposes resolveWriteReview(reviewId, decision) for UI to signal accept/reject
Exposes resolveShellReview(reviewId, decision) for UI to signal run/always/deny
Exposes resolveCommandReview(reviewId, decision) for UI to signal selection/cancel
Exposes getCommandRegistry() for Composer autocomplete
Exposes cancel() for interrupting ongoing operations (CTRL+C during processing)
Exposes stop() for clean exit (CTRL+C when idle)
Exposes isProcessing() for checking if an operation is in progress
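
The streaming throttle (buffer chunks, flush every 32ms or on complete) can be sketched as below. This is an assumption-laden simplification: timestamps are passed in explicitly so the behavior is deterministic, and a flush only happens when a chunk arrives or the stream completes, whereas a real implementation would likely also use a timer. `createChunkThrottle` is a hypothetical name.

```typescript
function createChunkThrottle(intervalMs: number, emit: (text: string) => void) {
  let buffer = "";
  let lastFlushMs = Number.NEGATIVE_INFINITY; // guarantees the first chunk is emitted immediately

  const flush = (nowMs: number): void => {
    if (buffer === "") return;
    emit(buffer);
    buffer = "";
    lastFlushMs = nowMs;
  };

  return {
    // Accumulate the chunk; emit only if the interval has elapsed since the last flush.
    push(chunk: string, nowMs: number): void {
      buffer += chunk;
      if (nowMs - lastFlushMs >= intervalMs) flush(nowMs);
    },
    // Called when the stream ends so that no buffered text is lost.
    complete(nowMs: number): void {
      flush(nowMs);
    },
  };
}
```

Batching this way bounds the number of UI re-renders per second regardless of how finely the provider chunks its output.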

rules/cursor.ts

Loads Cursor project rules from .cursor/rules/ directory
Walks directory recursively, collecting all *.mdc files
Parses optional YAML frontmatter, extracts body content
Returns stable order (sorted by relativePath)
Hard cap at 30KB total size, truncates with [truncated] marker
API: loadCursorRules(repoRoot) returns LoadedCursorRules | null
LoadedCursorRules: { rules, text, truncated }
CursorRule: { name, relativePath, body }
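
The frontmatter stripping, stable ordering, and size cap can be sketched as a pure function over already-read files. Several details are assumptions: `combineRules` is a hypothetical helper, the cap is counted in characters rather than bytes, and the `## name` heading used to join rule bodies is illustrative only.

```typescript
interface CursorRule { name: string; relativePath: string; body: string }
interface LoadedCursorRules { rules: CursorRule[]; text: string; truncated: boolean }

const MAX_CHARS = 30 * 1024; // the 30KB cap, approximated here as a character count

// Drop an optional leading YAML frontmatter block delimited by --- lines.
function stripFrontmatter(raw: string): string {
  const m = /^---\r?\n[\s\S]*?\r?\n---\r?\n?/.exec(raw);
  return (m ? raw.slice(m[0].length) : raw).trim();
}

function combineRules(
  files: { relativePath: string; raw: string }[],
  maxChars: number = MAX_CHARS,
): LoadedCursorRules | null {
  if (files.length === 0) return null;
  const rules: CursorRule[] = files
    .map((f) => ({
      name: f.relativePath.replace(/^.*\//, "").replace(/\.mdc$/, ""),
      relativePath: f.relativePath,
      body: stripFrontmatter(f.raw),
    }))
    // Sorting by relativePath keeps the prompt text stable across runs.
    .sort((a, b) => (a.relativePath < b.relativePath ? -1 : a.relativePath > b.relativePath ? 1 : 0));
  let text = rules.map((r) => `## ${r.name}\n${r.body}`).join("\n\n");
  const truncated = text.length > maxChars;
  if (truncated) text = text.slice(0, maxChars) + "\n[truncated]";
  return { rules, text, truncated };
}
```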

shell/index.ts

Simple shell command execution using Bun's built-in Bun.spawn() API
No external dependencies - works in standalone compiled binaries
Each command spawns a fresh bash process (no persistent session)
Uses bash -c for command execution
Timeout handling: kills process after timeout (default 60s)
Cancellation support: accepts AbortSignal option to kill running commands on CTRL+C
Properly separates stdout and stderr streams
Per-project service caching for consistent interface
API: getShellService(repoRoot, logger) returns service with run(command, options) and dispose()
Run options: cwd, timeoutMs, signal (AbortSignal for cancellation)
disposeAllShellServices() cleans up all cached services on exit

storage/allowlist.ts

Per-project shell command allowlist at .north/allowlist.json
Simple JSON format: { "allowedCommands": ["pnpm test", "bun test"] }
API: isCommandAllowed(repoRoot, command), allowCommand(repoRoot, command), getAllowedCommands(repoRoot)
Exact string matching only (no patterns)
Creates .north/ directory on first write

storage/config.ts

Global configuration at ~/.config/north/config.json
Stores user preferences that persist across sessions
Currently stores: selectedModel (persisted model selection)
API: getSavedModel() returns saved model ID or null, saveSelectedModel(modelId) persists selection
Test isolation: respects the NORTH_CONFIG_DIR environment variable, which overrides the config directory so tests never modify the user's actual config

storage/autoaccept.ts

Per-project auto-accept settings at .north/autoaccept.json
JSON format: { "editsAutoAccept": boolean, "shellAutoApprove": boolean }
Edit API: isEditsAutoAcceptEnabled(repoRoot), enableEditsAutoAccept(repoRoot), disableEditsAutoAccept(repoRoot)
Shell API: isShellAutoApproveEnabled(repoRoot), enableShellAutoApprove(repoRoot), disableShellAutoApprove(repoRoot)
When edits auto-accept enabled, all edit tool results are automatically applied without user confirmation
When shell auto-approve enabled, all shell commands run automatically without individual approval
Creates .north/ directory on first write

storage/costs.ts

Global API cost tracking at ~/.north/costs.json
JSON format: { "allTimeCostUsd": number, "byModel": Record<string, ModelCost>, "lastUpdated": number }
ModelCost: { inputTokens: number, outputTokens: number, costUsd: number }
API: getAllTimeCost(), getCostBreakdown(), addCostByModel(), resetAllTimeCost()
addCostByModel(modelId, inputTokens, outputTokens, costUsd) accumulates per-model and updates total
getCostBreakdown() returns full breakdown for /costs dialog
Creates ~/.north/ directory on first write
Test isolation: Respects NORTH_DATA_DIR environment variable to override data directory
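
The accumulation performed by addCostByModel can be sketched as a pure update over the JSON shape above. `accumulateCost` is a hypothetical name; unlike the real function it takes the current data as an argument and returns a new object instead of reading and writing costs.json.

```typescript
interface ModelCost { inputTokens: number; outputTokens: number; costUsd: number }
interface CostData { allTimeCostUsd: number; byModel: Record<string, ModelCost>; lastUpdated: number }

function accumulateCost(
  data: CostData,
  modelId: string,
  inputTokens: number,
  outputTokens: number,
  costUsd: number,
  now: number = Date.now(),
): CostData {
  const prev: ModelCost = data.byModel[modelId] ?? { inputTokens: 0, outputTokens: 0, costUsd: 0 };
  return {
    // The all-time total and the per-model bucket are updated together.
    allTimeCostUsd: data.allTimeCostUsd + costUsd,
    byModel: {
      ...data.byModel,
      [modelId]: {
        inputTokens: prev.inputTokens + inputTokens,
        outputTokens: prev.outputTokens + outputTokens,
        costUsd: prev.costUsd + costUsd,
      },
    },
    lastUpdated: now,
  };
}
```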

storage/profile.ts

Per-project learning profile storage at ~/.north/projects/<hash>/profile.md
Hash-based project identification using SHA-256 of repo root path (16 chars)
Profile stored in markdown format with H2 sections for each discovery topic
Declined state tracked via declined.json marker file
API: hasProfile(repoRoot), loadProfile(repoRoot), saveProfile(repoRoot, content)
API: hasDeclined(repoRoot), markDeclined(repoRoot), clearDeclined(repoRoot)
getProjectHash(repoRoot) generates stable hash for directory identification
Storage location keeps repos clean (no commits of generated content)

storage/conversations.ts

Conversation persistence at ~/.north/conversations/
Each conversation identified by 6-char hex ID (e.g., abc123)
Event log format: <id>.jsonl (append-only JSONL for crash safety)
Optional snapshot: <id>.snapshot.json (full state for fast resume)
Index file: index.json (conversation metadata for listing)
Event types: conversation_started, entry_added, entry_updated, model_changed, rolling_summary_set, conversation_ended
API: generateConversationId(), startConversation(), loadConversation(), listConversations()
API: logEntryAdded(), logEntryUpdated(), logModelChanged(), logRollingSummarySet(), logConversationEnded()
Stores both repoRoot (path) and repoHash (stable ID) for portability
Resume validates repoRoot exists, warns if missing
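
Resuming from an append-only event log amounts to replaying events into a fresh state. The sketch below uses the event type names listed above, but the payload fields (`model`, `entry`, `id`, `text`, `summary`) and the `replayEvents` name are assumptions, as is the choice to skip unparseable lines.

```typescript
interface Entry { id: string; text: string }

type ConversationEvent =
  | { type: "conversation_started"; model: string }
  | { type: "entry_added"; entry: Entry }
  | { type: "entry_updated"; id: string; text: string }
  | { type: "model_changed"; model: string }
  | { type: "rolling_summary_set"; summary: string }
  | { type: "conversation_ended" };

interface ConversationState {
  model: string | null;
  entries: Entry[];
  rollingSummary: string | null;
  ended: boolean;
}

function parseEvent(line: string): ConversationEvent | null {
  try {
    return JSON.parse(line) as ConversationEvent;
  } catch {
    return null; // e.g. a half-written final line after a crash
  }
}

function replayEvents(jsonl: string): ConversationState {
  const state: ConversationState = { model: null, entries: [], rollingSummary: null, ended: false };
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const event = parseEvent(line);
    if (event === null) continue;
    switch (event.type) {
      case "conversation_started":
      case "model_changed":
        state.model = event.model;
        break;
      case "entry_added":
        state.entries.push(event.entry);
        break;
      case "entry_updated": {
        const { id, text } = event;
        const target = state.entries.find((e) => e.id === id);
        if (target) target.text = text;
        break;
      }
      case "rolling_summary_set":
        state.rollingSummary = event.summary;
        break;
      case "conversation_ended":
        state.ended = true;
        break;
    }
  }
  return state;
}
```

Because every line is a complete event, a crash can damage at most the last line, which is why an append-only log is a reasonable fit for crash safety.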

profile/learn.ts

Project learning orchestration with 10 discovery topics
Topics: summary, architecture, conventions, vocabulary, data flow, dependencies, workflow, hotspots, playbook, safety
Runs sequential LLM sessions with read-only tools for each topic
Uses custom system prompt focused on concise exploration
Progress callback for UI updates (percent + topic name)
Tool filtering: only read-only tools available during learning
Returns complete markdown profile with H2 sections
Maximum 5 tool use iterations per topic to prevent infinite loops
Error handling: continues to next topic on failure

provider/index.ts (Provider Factory)

Exports createProviderForModel(modelId): creates correct provider based on model prefix
getModelProvider(modelId): returns "anthropic" or "openai" based on model
Re-exports common types: Provider, Message, StreamCallbacks, ToolCall, etc.
Orchestrator uses this to dynamically switch providers when /model changes

provider/anthropic.ts (Anthropic Provider)

Wraps @anthropic-ai/sdk
Default model: claude-sonnet-4-20250514
Handles -thinking suffix in model IDs:
- Model IDs may end with -thinking suffix (e.g., claude-sonnet-4-20250514-thinking)
- Provider automatically strips suffix before API call
- Passes thinking config separately via thinking option
Streaming via client.messages.stream() (Messages API)
Supports tool definitions and tool_use blocks
Per-request options: model, tools, systemOverride, signal (AbortSignal), thinking (ThinkingConfig)
systemOverride replaces default system prompt (used for summarization)
Callbacks: onChunk, onToolCall, onThinking, onComplete, onError
Abort support: checks signal during stream loop, returns stopReason: "cancelled"
Helpers for building tool result and assistant messages

Extended Thinking Support:

ThinkingConfig: { type: "enabled", budget_tokens: number } enables Claude's thinking mode
Handles thinking_delta and signature_delta events during streaming
ThinkingBlock: contains summarized thinking text and signature (for API continuity)
Thinking blocks must be preserved and passed back unmodified during tool loops
buildAssistantMessage() includes thinking blocks when provided
StreamResult includes thinkingBlocks array

provider/openai.ts (OpenAI Provider)

Uses native fetch with SSE streaming (no SDK dependency)
Endpoint: https://api.openai.com/v1/responses (Responses API)
Default model: gpt-5.1
Note: OpenAI models do not have -thinking variants; reasoning is intrinsic to their models
Streaming via SSE events: response.output_text.delta, response.function_call_arguments.delta
Tool format converted to OpenAI function format: { type: "function", function: { name, description, parameters } }
Tool results sent as function_call_output items with matching call_id
Per-request options: same interface as Anthropic provider
Abort support: passes AbortSignal to fetch, returns stopReason: "cancelled"
Env var: OPENAI_API_KEY required
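
Handling the two SSE event types named above can be sketched as a line-oriented parser. It is a simplification: it assumes each chunk contains whole lines, whereas a real stream can split an event across network reads and needs a carry-over buffer. `parseSseChunk` and `StreamDelta` are hypothetical names.

```typescript
interface StreamDelta { kind: "text" | "tool_args"; delta: string }

function safeParse(payload: string): { type?: unknown; delta?: unknown } | null {
  try {
    const value: unknown = JSON.parse(payload);
    return typeof value === "object" && value !== null
      ? (value as { type?: unknown; delta?: unknown })
      : null;
  } catch {
    return null;
  }
}

function parseSseChunk(chunk: string): StreamDelta[] {
  const deltas: StreamDelta[] = [];
  for (const line of chunk.split(/\r?\n/)) {
    // Only "data:" lines carry payloads; "event:" lines and blank separators are skipped.
    if (!line.startsWith("data:")) continue;
    const event = safeParse(line.slice("data:".length).trim());
    if (event === null) continue;
    const delta = event.delta;
    if (typeof delta !== "string") continue;
    if (event.type === "response.output_text.delta") {
      deltas.push({ kind: "text", delta });
    } else if (event.type === "response.function_call_arguments.delta") {
      deltas.push({ kind: "tool_args", delta });
    }
  }
  return deltas;
}
```

Text deltas would feed onChunk, while argument deltas are accumulated per call until the arguments JSON is complete.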

Supported OpenAI Models:

Alias	Model ID	Description
gpt-5.1	gpt-5.1	GPT-5.1 flagship
gpt-5.1-codex	gpt-5.1-codex	Optimized for coding
gpt-5.1-codex-mini	gpt-5.1-codex-mini	Faster coding variant
gpt-5.1-codex-max	gpt-5.1-codex-max	Maximum capability coding
gpt-5	gpt-5	GPT-5 flagship
gpt-5-mini	gpt-5-mini	Faster GPT-5 variant
gpt-5-nano	gpt-5-nano	Fastest/cheapest GPT-5

tools/registry.ts

In-process registry mapping tool name -> definition
Each tool has: name, description, inputSchema, approvalPolicy, execute()
getSchemas() returns tool definitions for Claude API
getApprovalPolicy() returns "none", "write", or "shell" for a tool
execute() runs tool and returns structured result
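
A minimal sketch of such a registry follows. The method names come from the list above; everything else is assumed for illustration, including the synchronous `execute` (kept synchronous for brevity), the `ToolResult` shape, and the choice to convert thrown errors into `ok: false` results.

```typescript
type ApprovalPolicy = "none" | "write" | "shell";

interface ToolResult { ok: boolean; data?: unknown; error?: string }

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  approvalPolicy: ApprovalPolicy;
  execute(input: unknown): ToolResult;
}

function createToolRegistry() {
  const tools = new Map<string, ToolDefinition>();
  return {
    register(tool: ToolDefinition): void {
      tools.set(tool.name, tool);
    },
    // Shape follows the Anthropic tool definition format (name, description, input_schema).
    getSchemas() {
      return Array.from(tools.values()).map((t) => ({
        name: t.name,
        description: t.description,
        input_schema: t.inputSchema,
      }));
    },
    getApprovalPolicy(name: string): ApprovalPolicy | undefined {
      return tools.get(name)?.approvalPolicy;
    },
    // Always returns a structured result, even for unknown tools or thrown errors.
    execute(name: string, input: unknown): ToolResult {
      const tool = tools.get(name);
      if (!tool) return { ok: false, error: `Unknown tool: ${name}` };
      try {
        return tool.execute(input);
      } catch (err) {
        return { ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    },
  };
}
```

The orchestrator can then branch on `getApprovalPolicy(name)` before calling `execute`, which keeps approval logic out of the individual tools.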

tools/*.ts (Tool Implementations)

All tools follow the pattern:

Input validation
Operation scoped to repoRoot
Structured result with ok, data, error

Tool	Purpose	Key Features
`list_root`	List repo root entries	Respects .gitignore
`find_files`	Glob pattern search	Case-insensitive, limit
`search_text`	Text/regex search	Uses ripgrep if available, supports file+line range scope, optional contextLines (1-5)
`read_file`	Read file content	Line ranges, head/tail inclusion; use read_around for text search
`get_line_count`	Check file size	Quick stats before reading large files
`get_file_symbols`	Extract symbols	Functions, classes, types, interfaces (TS/JS/Py/Rust/Go/Java); redirects to find_blocks for HTML/CSS
`get_file_outline`	File structure outline	Hierarchical view with line numbers (TS/JS/Py/HTML/CSS)
`read_readme`	Read README	Auto-detect README.*
`detect_languages`	Language composition	By extension and size
`hotfiles`	Important files	Git history or fallback
`find_code_block`	Find code blocks	Locate functions/classes containing text, deduplicates nested HTML blocks
`expand_output`	Retrieve full output	Access cached digested tool outputs
`edit_replace_exact`	Replace exact text	Requires approval, enhanced failure diagnostics (whitespace, near-miss)
`edit_insert_at_line`	Insert at line	1-based, requires approval
`edit_after_anchor`	Insert after anchor	Anchor-based insertion, handles multiple matches
`edit_before_anchor`	Insert before anchor	Anchor-based insertion, handles multiple matches
`edit_replace_block`	Replace between anchors	Replace content between two text markers
`edit_create_file`	Create/overwrite file	Requires approval
`edit_apply_batch`	Atomic batch edits	All-or-nothing, requires approval
`shell_run`	Execute shell command	Fresh bash process per command via the shell service, requires approval or allowlist, stdout and stderr captured separately
`read_around`	Context window	Asymmetric before/after lines around anchor, occurrence handling
`find_blocks`	Structural map	Block coordinates without content (html_section, css_rule, js_ts_symbol, csharp_symbol, php_symbol, java_symbol)
`edit_by_anchor`	Unified anchor edit	Four modes: insert_before, insert_after, replace_line, replace_between

Tool Output Digesting

North implements a context-efficient digesting layer that stores full tool outputs locally but forwards only condensed summaries to the model:

Digest Strategies by Tool:

Tool	Digest Format
`read_file`	First 50 lines + "... N more lines" + last 10 lines
`search_text`	First 10 matches with context, total count
`find_files`	First 20 files + "... N more"
`shell_run`	First 20 lines + last 10 lines of stdout
Others	Pass through (already compact)

Cache Behavior:

Full outputs are cached per conversation turn
Cache is cleared at the start of each sendMessage()
Use expand_output tool to retrieve full cached output
Digested outputs include outputId and digestNote for retrieval

Implementation:

src/utils/digest.ts: digestToolOutput() function with per-tool strategies
src/tools/expand_output.ts: Tool to retrieve cached full outputs
Orchestrator integrates digest layer in executeToolCall()
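
The read_file strategy from the table (first 50 lines, an elision marker, last 10 lines) can be sketched as below. `digestLines` is a hypothetical helper rather than the actual digestToolOutput(), and the decision to pass short content through untouched is an assumption.

```typescript
function digestLines(
  content: string,
  head: number = 50,
  tail: number = 10,
): { digest: string; omitted: number } {
  const lines = content.split("\n");
  // Short outputs are forwarded unchanged; there is nothing to save.
  if (lines.length <= head + tail) return { digest: content, omitted: 0 };
  const omitted = lines.length - head - tail;
  const digest = [
    ...lines.slice(0, head),
    `... ${omitted} more lines`,
    ...lines.slice(lines.length - tail),
  ].join("\n");
  return { digest, omitted };
}
```

In the full design, the undigested output would be stored under an outputId so that expand_output can return it later in the same turn.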

Anchor-Based Editing

North provides anchor-based edit tools that address content by text patterns instead of brittle line numbers:

Tools:

edit_after_anchor: Insert content after a line containing anchor text
edit_before_anchor: Insert content before a line containing anchor text
edit_replace_block: Replace content between two anchor markers

Behavior:

If anchor appears once: operation proceeds
If anchor appears multiple times without occurrence specified: returns candidates list
Candidates include line number and preview for disambiguation
Anchor-based edits are more reliable than line numbers across file changes

Example:

// Instead of: edit_insert_at_line({ path, line: 42, content })
// Use: edit_after_anchor({ path, anchor: "function setupApp() {", content })
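
The disambiguation behavior can be sketched as a pure function over file text. `insertAfterAnchor`, its result shape, and the 80-character preview are assumptions; the real tool also goes through diff review before anything is written.

```typescript
type AnchorResult =
  | { ok: true; content: string; line: number }
  | { ok: false; error: string; candidates: { line: number; preview: string }[] };

function insertAfterAnchor(
  fileText: string,
  anchor: string,
  content: string,
  occurrence?: number,
): AnchorResult {
  const lines = fileText.split("\n");
  const hits: number[] = [];
  lines.forEach((line, i) => {
    if (line.includes(anchor)) hits.push(i);
  });
  if (hits.length === 0) return { ok: false, error: "Anchor not found", candidates: [] };
  if (hits.length > 1 && occurrence === undefined) {
    // Ambiguous: return candidates so the caller can retry with an occurrence.
    return {
      ok: false,
      error: `Anchor matches ${hits.length} lines; specify occurrence`,
      candidates: hits.map((i) => ({ line: i + 1, preview: (lines[i] ?? "").trim().slice(0, 80) })),
    };
  }
  const index = hits[(occurrence ?? 1) - 1]; // occurrence is 1-based
  if (index === undefined) return { ok: false, error: "occurrence out of range", candidates: [] };
  lines.splice(index + 1, 0, ...content.split("\n"));
  return { ok: true, content: lines.join("\n"), line: index + 1 };
}
```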

Edit Failure Diagnostics

edit_replace_exact provides enhanced failure diagnostics when text is not found:

Whitespace Detection:

Tab vs space indentation mismatches
CRLF vs LF line ending differences
Trailing whitespace mismatches

Near-Miss Candidates:

Uses Levenshtein distance to find lines similar to the search text
Reports character-level differences (e.g., "differs at position 12: 'a' vs 'e'")
Shows line numbers for near matches

Actionable Hints:

Suggests read_around for verification
Recommends anchor-based editing as alternative

Example error output:

Text not found in file.

Possible whitespace issues:
  - Your search uses tabs but file uses spaces for indentation

Near matches found:
  - Line 42: "const myVariable = 1;"
    (differs at position 12: 'a' vs 'e')

Hint: Use read_around to see exact content, or use anchor-based editing (edit_by_anchor).
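
The near-miss search can be sketched with a standard Levenshtein distance. The sketch handles single-line search text only and compares trimmed lines; `findNearMisses`, `firstDifference`, and the default threshold of 3 are assumptions for illustration.

```typescript
// Classic dynamic-programming edit distance, keeping only two rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min((prev[j] ?? 0) + 1, (curr[j - 1] ?? 0) + 1, (prev[j - 1] ?? 0) + cost));
    }
    prev = curr;
  }
  return prev[b.length] ?? 0;
}

// 0-based index of the first differing character, or -1 if the strings are equal.
function firstDifference(a: string, b: string): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a.charAt(i) !== b.charAt(i)) return i;
  }
  return a.length === b.length ? -1 : n;
}

interface NearMiss { line: number; text: string; distance: number; diffAt: number }

function findNearMisses(fileText: string, search: string, maxDistance: number = 3): NearMiss[] {
  const target = search.trim();
  return fileText
    .split("\n")
    .map((raw, i) => {
      const text = raw.trim();
      return { line: i + 1, text, distance: levenshtein(text, target), diffAt: firstDifference(text, target) };
    })
    .filter((c) => c.distance > 0 && c.distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance);
}
```

Searching for `const myVarieble = 1;` in a file containing `const myVariable = 1;` yields one candidate at distance 1 that differs at position 12, which matches the style of the error output shown above.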

Find Code Block Tool

find_code_block enables "jump to place" navigation without multiple search/read cycles:

Input:

path: File to search
query: Text to find within blocks
kind: Optional filter - "function", "class", "method", "block", "any"

Output:

matches: Array of blocks containing the query
Each match includes: startLine, endLine, snippet (first 5 lines), kind, name
hint: Helpful tip when no blocks match but text exists (HTML/CSS files suggest find_blocks)

Supported Languages:

TypeScript/JavaScript: functions, classes, methods
Python: functions, classes (indentation-based)
CSS/SCSS: selectors, @media queries, @keyframes animations
HTML: semantic sections, embedded <style> blocks with CSS rules, embedded <script> blocks with JS symbols
Generic: brace-delimited blocks

Helpful Hints: When searching HTML/CSS files and no code blocks contain the query (but the text exists in the file), the tool returns a hint suggesting find_blocks for better structural navigation of CSS selectors, @media queries, and embedded blocks.

Large File Navigation Strategy

The tool system includes specialized tools to efficiently navigate and understand large files without reading entire contents:

Tool Chain for Large Files:

Check size first: Use get_line_count to determine file size before reading
Understand structure: Use get_file_symbols or get_file_outline to see what's in the file
Find targets: Use search_text with file+lineRange to locate specific content
Read strategically: Use read_file with specific line ranges and optional context

Symbol Extraction (get_file_symbols):

Regex-based parsing (fast, no dependencies)
Supported languages: TypeScript, JavaScript, Python, Rust, Go, Java
Extracts: functions, classes, interfaces, types, enums, methods
Returns: symbol name, type, line number, signature preview
Use case: "Where is function X defined?" or "What classes are in this file?"

File Outline (get_file_outline):

Hierarchical structure with line ranges
TypeScript/JavaScript: imports, symbols, exports
Python: imports, classes (with methods), functions
HTML: major sections (head, body, main, section), elements with IDs, embedded content parsing
CSS/SCSS/Less: selectors with line ranges, media queries, keyframes
Generic fallback: 50-line chunks
Use case: "Show me the overall structure of this 1000-line file"

HTML Embedded Block Parsing: For HTML files, get_file_outline now parses embedded <style> and <script> blocks:

<style> blocks: Shows CSS rules inside with nested indicator (└─ .selector)
<script> blocks: Shows JS symbols (functions, classes) with nested indicator
Example output includes: <style>, └─ .card, └─ @keyframes fadeIn, <script>, └─ function init

Enhanced Search (search_text):

file parameter: search within a specific file only
lineRange parameter: search within a specific line range
contextLines parameter: include 1-5 lines of context before/after each match
Language hints in description: "For TypeScript: search for 'export function'"
Use case: "Find all uses of X within lines 100-200 of file.ts"
Context use case: search_text({ query: "target", contextLines: 2 }) reduces follow-up read_around calls

Read File (read_file):

range: read specific line range (1-indexed)
includeHeadTail: always include first 10 and last 10 lines for orientation
Use read_around tool instead for text-based searching with context

System Prompt Guidance: The provider system prompts now explicitly instruct the LLM to:

Check file size before reading files >200 lines
Use symbols/outline tools to understand structure first
Never read entire files when only one section is needed
Chain tools strategically: outline → search → targeted read

Expected Impact:

60-80% token reduction when working with large files
Faster symbol lookups without full reads
Better targeting: LLM reads only what's needed
Clearer guidance through concrete strategies

read_around Tool

read_around provides a focused context window around an anchor string:

Input:

path: File to read
anchor: Text to find
before: Lines before match (default: 12)
after: Lines after match (default: 20)
occurrence: Which occurrence (1-based, required if multiple matches)

Output:

totalLines: File length
matchCount: How many times anchor appears
occurrenceUsed: Which occurrence was returned
matchLine: Line number of the match
content: Lines with line numbers, match line marked with >

Behavior:

0 matches: error suggesting search_text
Multiple matches without occurrence: error listing candidates with previews
Single call replaces "search → read range" pattern
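
The windowing logic can be sketched as a pure function using the defaults listed above. `readAround` here is an illustration, not the tool's implementation: the exact line-number layout of `content` and the wording of the errors are assumptions.

```typescript
type ReadAroundResult =
  | { ok: true; totalLines: number; matchCount: number; occurrenceUsed: number; matchLine: number; content: string }
  | { ok: false; error: string };

function readAround(
  fileText: string,
  anchor: string,
  before: number = 12,
  after: number = 20,
  occurrence?: number,
): ReadAroundResult {
  const lines = fileText.split("\n");
  const hits: number[] = [];
  lines.forEach((line, i) => {
    if (line.includes(anchor)) hits.push(i);
  });
  if (hits.length === 0) return { ok: false, error: "Anchor not found; try search_text" };
  if (hits.length > 1 && occurrence === undefined) {
    const list = hits.map((i) => `line ${i + 1}: ${(lines[i] ?? "").trim()}`).join("; ");
    return { ok: false, error: `Anchor matches ${hits.length} times; pass occurrence. Candidates: ${list}` };
  }
  const occurrenceUsed = occurrence ?? 1;
  const hit = hits[occurrenceUsed - 1];
  if (hit === undefined) return { ok: false, error: `occurrence ${occurrenceUsed} is out of range` };
  // Clamp the asymmetric window to the file bounds.
  const start = Math.max(0, hit - before);
  const end = Math.min(lines.length, hit + after + 1);
  const content = lines
    .slice(start, end)
    .map((line, k) => {
      const n = start + k + 1;
      return `${n === hit + 1 ? ">" : " "} ${n}: ${line}`; // ">" marks the match line
    })
    .join("\n");
  return { ok: true, totalLines: lines.length, matchCount: hits.length, occurrenceUsed, matchLine: hit + 1, content };
}
```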

find_blocks Tool

find_blocks returns a structural map with coordinates but no content:

Input:

path: File to map
kind: Filter - html_section, css_rule, js_ts_symbol, csharp_symbol, php_symbol, java_symbol, or all (default: auto-detect)

Output:

totalLines: File length
blocks: Array of { id, label, startLine, endLine }

Supported Kinds:

html_section: <section>, <article>, <nav>, elements with IDs
css_rule: selectors, @media, @keyframes
js_ts_symbol: functions, classes, interfaces, types, React components
csharp_symbol: namespaces, classes, structs, interfaces, methods, properties, enums
php_symbol: namespaces, classes, interfaces, traits, functions, methods
java_symbol: packages, classes, interfaces, enums, methods

Mixed HTML Support:

For HTML files with embedded <style> and <script> blocks, find_blocks automatically detects and parses both:

Returns the <style> block itself with line range
Parses CSS rules inside the style block (selectors, @media, @keyframes)
Returns the <script> block itself with line range
Parses JS/TS symbols inside the script block (functions, classes)

Example output for mixed HTML:

blocks: [
  { id: "html-0", label: "<header>", startLine: 5, endLine: 20 },
  { id: "style-0", label: "<style> (lines 22-45)", startLine: 22, endLine: 45 },
  { id: "style0-css-0", label: ".site-footer", startLine: 24, endLine: 28 },
  { id: "script-0", label: "<script> (lines 50-80)", startLine: 50, endLine: 80 },
  { id: "script0-js-0", label: "function initApp", startLine: 52, endLine: 65 }
]

Use case: Get coordinates in one call, then use read_around for targeted reading. For mixed HTML files, use this to locate specific CSS rules or JS functions before editing.

edit_by_anchor Tool

edit_by_anchor provides unified anchor-based editing with four modes:

Input:

path: File to edit
mode: insert_before, insert_after, replace_line, or replace_between
anchor: Primary anchor text
anchorEnd: End anchor (required for replace_between)
content: Content to insert/replace
occurrence: Which occurrence (1-based, required if multiple matches)
inclusive: For replace_between - replace anchor lines too (default: false)

Mode Behaviors:

Mode	Effect
`insert_before`	Insert content before anchor line
`insert_after`	Insert content after anchor line
`replace_line`	Replace the anchor line with content
`replace_between`	Replace content between two anchors

Safety:

0 matches: error
Multiple matches without occurrence: error listing candidates
replace_line is a new capability: it replaces the anchor line itself rather than inserting next to it
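The four modes and the safety rules can be sketched on an in-memory string as follows. This is a simplified illustration, not North's implementation: `applyAnchorEdit` and `findAnchorLine` are hypothetical names, anchors are matched by substring, and the end anchor is assumed to be the first match after the start anchor.

```typescript
type AnchorMode = "insert_before" | "insert_after" | "replace_line" | "replace_between";

interface AnchorEdit {
    mode: AnchorMode;
    anchor: string;
    anchorEnd?: string;
    content: string;
    occurrence?: number; // 1-based
    inclusive?: boolean;
}

// Returns the 0-based index of the anchor line, enforcing the safety rules:
// zero matches is an error, and so is an ambiguous match without `occurrence`.
function findAnchorLine(lines: string[], anchor: string, occurrence?: number): number {
    const matches: number[] = [];
    lines.forEach((line, index) => {
        if (line.indexOf(anchor) !== -1) matches.push(index);
    });
    if (matches.length === 0) throw new Error(`Anchor not found: ${anchor}`);
    if (matches.length > 1 && occurrence === undefined) {
        throw new Error(`Anchor matches ${matches.length} lines; pass occurrence`);
    }
    const picked: number | undefined = matches[(occurrence ?? 1) - 1];
    if (picked === undefined) throw new Error(`Occurrence ${occurrence} is out of range`);
    return picked;
}

function applyAnchorEdit(text: string, edit: AnchorEdit): string {
    const lines = text.split("\n");
    const newLines = edit.content.split("\n");
    const start = findAnchorLine(lines, edit.anchor, edit.occurrence);
    switch (edit.mode) {
        case "insert_before":
            lines.splice(start, 0, ...newLines);
            break;
        case "insert_after":
            lines.splice(start + 1, 0, ...newLines);
            break;
        case "replace_line":
            lines.splice(start, 1, ...newLines);
            break;
        case "replace_between": {
            if (edit.anchorEnd === undefined) throw new Error("anchorEnd is required");
            const anchorEnd = edit.anchorEnd;
            // Assumption: the end anchor is the first match below the start anchor.
            const end = lines.findIndex((line, i) => i > start && line.indexOf(anchorEnd) !== -1);
            if (end === -1) throw new Error(`End anchor not found: ${anchorEnd}`);
            const from = edit.inclusive ? start : start + 1;
            const to = edit.inclusive ? end : end - 1;
            lines.splice(from, to - from + 1, ...newLines);
            break;
        }
    }
    return lines.join("\n");
}
```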

utils/ignore.ts

Parses .gitignore patterns
Always ignores common directories (node_modules, .git, etc.)
createIgnoreChecker() returns checker with isIgnored(path, isDir)
walkDirectory() recursively walks repo respecting ignores
listRootEntries() lists root level entries

ui/App.tsx

Root Ink component, wires orchestrator to UI state
Uses alternate screen buffer via useAlternateScreen() hook (like htop/less)
Tracks terminal dimensions via useTerminalSize() hook for viewport calculations
CTRL+C handling via useInput: cancel if processing, exit if idle
Requires exitOnCtrlC: false in render options to prevent Ink's default exit behavior
Delegates review decisions to orchestrator methods (write, shell, command, plan)
Tracks isProcessing, pendingReviewId, nextMode, and scrollOffset for UI state
Passes mode to orchestrator on message submission
Auto-resets scroll to bottom when transcript changes
Layout: ScrollableTranscript (viewport-height top), Composer (fixed bottom), StatusLine (fixed bottom)

ui/useAlternateScreen.ts

Custom hook that switches terminal to alternate screen buffer on mount
Uses ANSI escape codes: \x1b[?1049h (enter) and \x1b[?1049l (exit)
Hides cursor during render, shows on exit
Alternate screen means transcript is not in terminal scrollback after exit
Similar behavior to less, htop, vim
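A minimal sketch of the escape sequences involved, assuming a generic `write` callback in place of the real stdout stream. The 1049 sequences are the ones listed above; the cursor sequences (`?25l` / `?25h`) are the standard DEC show/hide codes and are an assumption about what the hook emits. `enterAlternateScreen` is a hypothetical name.

```typescript
const ENTER_ALT_SCREEN = "\x1b[?1049h";
const EXIT_ALT_SCREEN = "\x1b[?1049l";
const HIDE_CURSOR = "\x1b[?25l"; // assumed; standard DECTCEM hide
const SHOW_CURSOR = "\x1b[?25h"; // assumed; standard DECTCEM show

// Switches to the alternate screen and returns a cleanup function that
// restores the cursor and the primary screen, mirroring mount/unmount.
function enterAlternateScreen(write: (data: string) => void): () => void {
    write(ENTER_ALT_SCREEN + HIDE_CURSOR);
    return () => write(SHOW_CURSOR + EXIT_ALT_SCREEN);
}
```

In a React hook the returned function would typically be the effect cleanup, so the terminal is restored even when the component unmounts on exit.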

ui/useTerminalSize.ts

Custom hook that tracks terminal dimensions (rows and columns)
Listens to stdout "resize" events for dynamic updates
Returns { rows, columns } object
Used by App to calculate viewport height for ScrollableTranscript

ui/StatusLine.tsx

Full-width status bar using width="100%" and justifyContent="space-between"
Left side: project name with truncation for long names (wrap="truncate")
Right side: scroll indicator, current model name, context usage, and cost display
Scroll indicator: yellow [SCROLL] badge when scrollOffset > 0 (not at bottom)
Model display: shows the selected model (including -thinking suffix if thinking is enabled)
Context display: color-coded circle (green < 60%, yellow 60-85%, red > 85%) + token count + percentage
Token count formatted as K/M for readability (e.g., "42.5K (21%)")
Cost display: session cost (green) / all-time cost (blue) in USD
Updates in real-time as context fills and costs accumulate
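The context display rules above can be sketched as three small functions. The names `usageColor`, `formatTokenCount`, and `formatContextUsage` are hypothetical; only the thresholds and the K/M format come from the description.

```typescript
type UsageColor = "green" | "yellow" | "red";

// usage is a fraction between 0 and 1.
function usageColor(usage: number): UsageColor {
    if (usage < 0.6) return "green";
    if (usage <= 0.85) return "yellow";
    return "red";
}

// Formats a token count as K/M with one decimal place.
function formatTokenCount(tokens: number): string {
    if (tokens >= 1000000) return `${(tokens / 1000000).toFixed(1)}M`;
    if (tokens >= 1000) return `${(tokens / 1000).toFixed(1)}K`;
    return String(tokens);
}

function formatContextUsage(usedTokens: number, limitTokens: number): string {
    const percent = Math.round((usedTokens / limitTokens) * 100);
    return `${formatTokenCount(usedTokens)} (${percent}%)`;
}
```

For example, `formatContextUsage(42500, 200000)` yields the "42.5K (21%)" string used as the example above.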

ui/CostsDialog.tsx

Centered modal dialog showing cost breakdown
Triggered by /costs command via showCostsDialog() context method
Displays two sections: Session Costs and All-Time Costs
Groups costs by provider (Anthropic, OpenAI) then by model
Shows input/output token counts and USD cost per model
Provider subtotals and section totals displayed
Press Esc or Q to close dialog
Reads all-time breakdown from ~/.north/costs.json via getCostBreakdown()
Session costs passed from orchestrator state

ui/Composer.tsx

Multiline input with Ctrl+J or Shift+Enter for newlines
Paste support: multi-character input and newlines are detected and inserted directly
Dynamic height: grows as content is added, reports line count to parent
Shows "Ctrl+C to cancel" hint when disabled/waiting
Mode cycling with Tab key (when no autocomplete suggestions):
- Cycles: ask → agent → ask
- Mode applies to next message only (per-request mode)
- Visual indicator shows current mode in top-right
Cursor-aware slash command autocomplete:
- Detects / tokens at cursor position
- Queries command registry for suggestions
- Shows dropdown with command name + description
- Tab to insert (when suggestions present), Up/Down to navigate, Esc to close
Model argument autocomplete for /model command:
- Detects when cursor follows /model
- Shows model aliases with display names
File mention autocomplete with @:
- Detects @ tokens at cursor position
- Fuzzy matches against project files (respecting .gitignore)
- Shows dropdown with filename + full path hint
- Tab/Enter to attach file, Space/Esc to cancel (treat @ as literal)
- Attached files tracked in component state
- Visual indicator shows count of attached files
- On submit, attached files passed to orchestrator for context injection
Smart space insertion: only adds space after completion if needed
Clamps selection index when suggestions change
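Cursor-aware token detection for @ mentions might look like the sketch below. It is an illustration under assumptions: `findMentionAtCursor` is a hypothetical helper, and tokens are assumed to be delimited by whitespace, so an @ in the middle of a word (as in an email address) is not treated as a mention.

```typescript
interface MentionToken {
    start: number; // index of the "@" character
    query: string; // text typed between "@" and the cursor
}

// Finds the "@" token the cursor is currently inside, if any.
function findMentionAtCursor(text: string, cursor: number): MentionToken | null {
    // Walk left from the cursor until whitespace or the start of input.
    let start = cursor;
    while (start > 0 && !/\s/.test(text.charAt(start - 1))) start--;
    // The token must begin with "@" and the cursor must sit after it.
    if (text.charAt(start) !== "@" || start >= cursor) return null;
    return { start, query: text.slice(start + 1, cursor) };
}
```

The `query` value is what would be handed to the fuzzy matcher; slash command detection can follow the same shape with "/" as the trigger character.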

ui/ScrollableTranscript.tsx

Renders conversation history with in-app scrolling (no terminal scrollback dependency)
Pre-computes wrapped lines with ANSI color codes using wrap-ansi
Renders only visible lines based on viewport height and scroll offset
User messages: cyan label
Assistant messages: magenta label
Tool messages: yellow ⚡ icon, gray text
Command executed messages: blue ⚙ icon with result
Interactive entries (diff_review, shell_review, command_review) rendered at bottom only when pending
Resolved interactive entries convert to compact text lines that flow with transcript
Keyboard navigation for scrolling (when composer not active):
- Up/Down: scroll one line
- PageUp/PageDown: scroll viewport height
- G: jump to bottom (follow mode)
Auto-scrolls to bottom when new content arrives
Animation hooks disabled when transcript exceeds 100 entries
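The viewport calculation can be sketched as a pure function over the pre-wrapped lines. This assumes `scrollOffset` counts lines scrolled up from the bottom (0 means following the bottom), which matches the [SCROLL] badge rule in StatusLine; `visibleLines` is a hypothetical name.

```typescript
// Returns the slice of wrapped lines that fits the viewport.
// scrollOffset = number of lines scrolled up from the bottom.
function visibleLines(lines: string[], viewportHeight: number, scrollOffset: number): string[] {
    const maxOffset = Math.max(0, lines.length - viewportHeight);
    // Clamp so the viewport never scrolls past the first line.
    const offset = Math.min(Math.max(0, scrollOffset), maxOffset);
    const end = lines.length - offset;
    const start = Math.max(0, end - viewportHeight);
    return lines.slice(start, end);
}
```

Because only this slice is rendered, the cost of a render depends on the viewport height rather than on the transcript length.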

ui/Transcript.tsx (legacy)

Legacy transcript renderer using Ink's <Static> component pattern
Kept for reference but replaced by ScrollableTranscript

ui/CommandReview.tsx

Renders interactive picker for commands needing selection
Used by /model when no argument provided
Shows list of options with labels and hints
Keyboard shortcuts: Up/Down navigate, Enter select, Esc cancel
Status badges: pending (yellow), selected (green), cancelled (red)

ui/DiffReview.tsx

Renders inline diff with syntax highlighting
Green for additions, red for deletions, cyan for hunk headers
Truncates diffs over 100 lines with indicator
Shows file stats (+lines/-lines)
Keyboard shortcuts: a accept, y always (auto-accept all future edits), r reject
Status badges: pending (pulsing yellow border), accepted (green), always/auto-applied (cyan), rejected (red)
Animation: border color pulses when status is pending to draw attention
"Always" option: enables auto-accept for all future edit operations in this project

ui/ShellReview.tsx

Renders shell command approval prompt
Shows command and optional cwd
Keyboard shortcuts: r run, a always (adds to allowlist), y auto all (approves all future commands), d deny
Status badges: pending (pulsing yellow border), ran/always/auto (green), denied (red)
Animation: border color pulses when status is pending to draw attention
"Auto All" option: enables global auto-approve for all future shell commands in this project

utils/editing.ts

resolveSafePath(): validates paths stay within repo root with symlink resolution
- First checks normalized path is within repo
- Resolves symlinks using realpathSync() to prevent path traversal attacks
- Verifies resolved real path is still within repo boundary
- For non-existent files (during creation), validates parent directory instead
readFileContent(): safe file reading with error handling
preserveTrailingNewline(): ensures trailing newline consistency after edits
computeUnifiedDiff(): generates unified diff format
computeCreateFileDiff(): generates diff for new files
applyEditsAtomically(): writes to temp files then renames for safety; handles cross-filesystem scenarios (EXDEV) via copy+unlink fallback

utils/tokens.ts

Token estimation for context tracking:

estimatePromptTokens(systemPrompt, messages): estimates total tokens in request
Uses character-based heuristic (3.5 chars per token)
Applies 10% safety margin to reduce overflow risk
Returns structured breakdown: system, messages, overhead
Handles both string and structured message content (tool results, etc.)
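A minimal sketch of the heuristic, assuming structured content is measured by its JSON-serialized length and leaving out the per-message overhead term that the real breakdown includes. `estimateTokens` and its types are hypothetical stand-ins for estimatePromptTokens.

```typescript
const CHARS_PER_TOKEN = 3.5;
const SAFETY_MARGIN = 1.1; // 10% safety margin

interface SketchMessage {
    content: string | unknown[];
}

interface TokenEstimate {
    system: number;
    messages: number;
    total: number;
}

// Assumption: structured content is measured via its JSON serialization.
function contentLength(content: string | unknown[]): number {
    return typeof content === "string" ? content.length : JSON.stringify(content).length;
}

function charsToTokens(chars: number): number {
    return Math.ceil((chars / CHARS_PER_TOKEN) * SAFETY_MARGIN);
}

function estimateTokens(systemPrompt: string, messages: SketchMessage[]): TokenEstimate {
    const system = charsToTokens(systemPrompt.length);
    const messageTokens = messages.reduce(
        (sum, message) => sum + charsToTokens(contentLength(message.content)),
        0
    );
    return { system, messages: messageTokens, total: system + messageTokens };
}
```

Rounding up per message biases the estimate high, which is the safer direction when the goal is to avoid context overflow.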

utils/pricing.ts

Model pricing data and cost calculation:

ModelPricing: interface for per-model pricing (input, output, cached input, cache read/write)
TokenUsage: interface for token counts (input, output, cached, cache read/write)
getModelPricing(modelId): returns pricing data for a model (falls back to defaults for unknown models)
calculateCost(modelId, usage): computes USD cost from token usage
formatCost(cost): formats cost as string (e.g., "$0.123", "$1.50")

Anthropic Pricing (per 1M tokens):

Model	Input	Output	Cache Write	Cache Hit
claude-sonnet-4-*	$3.00	$15.00	$3.75	$0.30
claude-opus-4-*	$15.00	$75.00	$18.75	$1.50
claude-opus-4-1-*	$15.00	$75.00	$18.75	$1.50
claude-sonnet-4-5-*	$3.00	$15.00	$3.75	$0.30
claude-haiku-4-5-*	$1.00	$5.00	$1.25	$0.10
claude-opus-4-5-*	$5.00	$25.00	$6.25	$0.50

OpenAI Pricing (per 1M tokens):

Model	Input	Output	Cached Input
gpt-5.1	$1.25	$10.00	$0.125
gpt-5.1-codex	$1.25	$10.00	$0.125
gpt-5.1-codex-mini	$0.25	$2.00	$0.025
gpt-5.1-codex-max	$1.25	$10.00	$0.125
gpt-5	$1.25	$10.00	$0.125
gpt-5-mini	$0.25	$2.00	$0.025
gpt-5-nano	$0.05	$0.40	$0.005

utils/fileindex.ts

File index for @ mention autocomplete in Composer:

getFileIndex(repoRoot): returns cached list of all non-ignored files
Uses walkDirectory() from ignore.ts with 5000 file cap
fuzzyMatchFiles(query, files, limit): fuzzy match files against query
Scoring: exact filename > prefix match > contains > subsequence
Cache per repoRoot for performance
clearFileIndexCache(repoRoot?): clear cache when needed
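The scoring order can be illustrated with a small ranking function. The numeric scores are hypothetical; only their ordering (exact filename > prefix > contains > subsequence) comes from the description, and `scoreFile` / `rankFiles` are illustrative names rather than the real fuzzyMatchFiles.

```typescript
// True when every character of `query` appears in `target` in order.
function isSubsequence(query: string, target: string): boolean {
    let matched = 0;
    for (let i = 0; i < target.length && matched < query.length; i++) {
        if (target.charAt(i) === query.charAt(matched)) matched++;
    }
    return matched === query.length;
}

// Hypothetical score values; higher is better, 0 means no match.
function scoreFile(query: string, filePath: string): number {
    const q = query.toLowerCase();
    const path = filePath.toLowerCase();
    const name = path.slice(path.lastIndexOf("/") + 1);
    if (name === q) return 4; // exact filename
    if (name.indexOf(q) === 0) return 3; // filename prefix
    if (path.indexOf(q) !== -1) return 2; // contained anywhere in the path
    if (isSubsequence(q, path)) return 1; // characters appear in order
    return 0;
}

function rankFiles(query: string, files: string[], limit: number = 10): string[] {
    return files
        .map((file) => ({ file, score: scoreFile(query, file) }))
        .filter((entry) => entry.score > 0)
        .sort((a, b) => b.score - a.score || a.file.localeCompare(b.file))
        .slice(0, limit)
        .map((entry) => entry.file);
}
```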

utils/filepreview.ts

File preview generation for attached file context:

generateFilePreview(repoRoot, filePath): returns preview + outline
Preview: first 30 lines or 2KB (whichever is smaller)
Outline: extracted symbols (functions, classes, types) with line numbers
Supports TypeScript/JavaScript and Python symbol extraction
formatAttachedFilesContext(repoRoot, filePaths): formats multiple files for injection
Output format: markdown with code blocks and symbol outlines
Limited to 15 symbols per file with "more" indicator

Data Flow

Startup Flow

main()
    │
    ├──► parseArgs()
    ├──► detectRepoRoot()
    ├──► initLogger()
    │
    ▼
loadCursorRules(projectPath)
    │
    ├──► Walk .cursor/rules/ for *.mdc files
    ├──► Parse frontmatter, extract body
    ├──► Sort by relativePath
    ├──► Concatenate into single text block
    │
    ▼
render(App, { cursorRulesText, ... })
    │
    ▼
Orchestrator created with cursorRulesText in context

User Input with Commands

User Input (may contain /commands)
    │
    ▼
Composer.onSubmit(content)
    │
    ▼
App.handleSubmit(content)
    │
    ├──► logger.info("user_prompt", { length })
    │
    ▼
orchestrator.sendMessage(content)
    │
    ▼
parseCommandInvocations(content, commandRegistry)
    │
    ├──► For each command invocation:
    │       │
    │       ├──► Execute command via registry
    │       ├──► If picker needed: create command_review entry, block for selection
    │       └──► Add command_executed entry with result
    │
    ▼
If remainingText.trim() non-empty:
    │
    ├──► Push user entry to transcript
    │
    ▼
┌─► provider.stream(messages, { tools, model })
│       │
│       ├──► onChunk: buffer chunks, throttled emit
│       ├──► onToolCall: add tool intent to transcript
│       │
│       ▼
│   Stream completes
│       │
│       ├──► If tool calls requested:
│       │       │
│       │       ├──► Execute each tool
│       │       ├──► Log tool_call_start/complete
│       │       ├──► Update tool entry with result
│       │       ├──► Build tool result message
│       │       └──► Continue loop ─────────────────┐
│       │                                           │
│       └──► No tools: exit loop                    │
│                                                   │
└───────────────────────────────────────────────────┘
    │
    ▼
Transcript re-renders with all content

CTRL+C Signal Flow

SIGINT (Ctrl+C)
    │
    ▼
App.handleSigint()
    │
    ├──► If orchestrator.isProcessing():
    │       │
    │       ▼
    │   orchestrator.cancel()
    │       │
    │       ├──► currentAbortController.abort()
    │       ├──► Resolve pending reviews (reject/deny/cancel)
    │       ├──► Set cancelled = true
    │       └──► Return to input
    │
    └──► If not processing:
            │
            ▼
        orchestrator.stop()
            │
            ├──► disposeAllShellServices()
            └──► exit()

Key Implementation Details

Path Security

All file operations use symlink-aware path validation to prevent path traversal attacks:

Security layers:

Normalization: Resolve .. and . segments in paths
Boundary check: Verify normalized path is within repo root
Symlink resolution: Use realpathSync() to resolve symlinks
Final verification: Ensure resolved real path is still within repo boundary

Implementation sites:

resolveSafePath() in utils/editing.ts - Used by all write operations
resolvePath() in tools/read_file.ts - Used by read operations

Non-existent file handling:

When file doesn't exist (e.g., during creation), recursively validates parent directories
Walks up the directory tree until finding an existing directory, then validates it
Ensures at least one ancestor directory exists and resolves within repo
Prevents creating files via symlink directory chains that escape repo
Supports creating files in deeply nested directories that don't exist yet (e.g., deep/nested/dir/file.txt)

Attack prevented: A symlink inside the repo pointing to /etc/passwd or other sensitive files would fail validation because the real path would resolve outside the repo boundary.
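The four layers can be sketched as below. This is a simplified model, not the code in utils/editing.ts: it handles POSIX-style paths only, takes `realpath` as an injected function so the example stays self-contained (in Node this role is played by realpathSync()), and omits the parent-directory walk for non-existent files. `normalizePosix`, `isInside`, and `resolveSafe` are hypothetical names.

```typescript
// Layer 1: collapse "." and ".." segments of an absolute POSIX path.
function normalizePosix(path: string): string {
    const out: string[] = [];
    for (const segment of path.split("/")) {
        if (segment === "" || segment === ".") continue;
        if (segment === "..") out.pop();
        else out.push(segment);
    }
    return "/" + out.join("/");
}

// Boundary test: the candidate is the root itself or lies beneath it.
function isInside(root: string, candidate: string): boolean {
    if (root === "/") return true;
    return candidate === root || candidate.indexOf(root + "/") === 0;
}

function resolveSafe(
    repoRoot: string,
    relPath: string,
    realpath: (path: string) => string
): string {
    const root = normalizePosix(repoRoot);
    const normalized = normalizePosix(root + "/" + relPath);
    // Layer 2: the normalized path must stay within the repo.
    if (!isInside(root, normalized)) throw new Error("Path escapes repository root");
    // Layers 3 and 4: resolve symlinks, then re-check the boundary.
    const realRoot = normalizePosix(realpath(root));
    const real = normalizePosix(realpath(normalized));
    if (!isInside(realRoot, real)) throw new Error("Symlink escapes repository root");
    return real;
}
```

Checking the boundary twice matters: the first check rejects `../` traversal cheaply, while the second catches paths that look local but resolve elsewhere through a symlink.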

Slash Command Execution

When user input contains slash commands:

parseCommandInvocations() tokenizes input, finds registered commands
Each command has span (start/end indices) for clean removal
Commands execute sequentially via registry
If command needs picker (e.g., /model without arg):
- Creates command_review transcript entry
- Blocks until user selects or cancels
- Updates entry with selection status
After execution, command_executed entry added with result message
remainingText (input with commands removed) sent to Claude if non-empty
command_review and command_executed entries are excluded from buildMessagesForClaude()
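Span-based parsing can be sketched as follows. This is a simplified stand-in for parseCommandInvocations(): it recognizes only bare command tokens (no arguments), takes the registry as a plain list of names, and uses hypothetical names throughout.

```typescript
interface CommandInvocation {
    name: string;
    start: number; // index of "/"
    end: number; // index just past the command name
}

interface ParseResult {
    invocations: CommandInvocation[];
    remainingText: string;
}

function parseCommands(input: string, registeredNames: string[]): ParseResult {
    const invocations: CommandInvocation[] = [];
    // A command token starts at the beginning of input or after whitespace
    // and must be followed by whitespace or the end of input.
    const pattern = /(^|\s)\/([a-z]+)(?=\s|$)/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(input)) !== null) {
        const name = match[2];
        if (registeredNames.indexOf(name) === -1) continue; // not a known command
        const start = match.index + match[1].length;
        invocations.push({ name, start, end: start + name.length + 1 });
    }
    // Remove spans from right to left so earlier indices remain valid.
    let remaining = input;
    for (let i = invocations.length - 1; i >= 0; i--) {
        const span = invocations[i];
        remaining = remaining.slice(0, span.start) + remaining.slice(span.end);
    }
    return { invocations, remainingText: remaining.trim() };
}
```

Recording spans rather than splitting on whitespace lets the parser leave unregistered tokens such as file paths untouched in the text that goes on to the model.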

Rolling Summary

The /summarize command:

Calls generateSummary() which prompts Claude for structured JSON
Uses systemOverride with minimal prompt (no tool guidance noise)
Returns StructuredSummary or null on failure
On success: sets rolling summary, trims transcript
trimTranscript(keepLast) preserves chronological order:
- Keeps last N user/assistant entries
- Preserves non-pending diff_review and shell_review outcomes
- Filters original array (no reordering)
Rolling summary prepended to Claude context as structured block
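The order-preserving trim can be sketched as a two-pass filter. The entry shape is a hypothetical simplification, and dropping tool entries and pending reviews is an assumption; the text above only states which entries are kept.

```typescript
type EntryKind = "user" | "assistant" | "tool" | "diff_review" | "shell_review";

interface TranscriptEntry {
    id: number;
    kind: EntryKind;
    status?: string;
}

function trimTranscript(entries: TranscriptEntry[], keepLast: number): TranscriptEntry[] {
    // Pass 1: pick the last N user/assistant entries.
    const conversational = entries.filter(
        (entry) => entry.kind === "user" || entry.kind === "assistant"
    );
    // slice(-0) would return everything, so guard the zero case.
    const kept = keepLast > 0 ? conversational.slice(-keepLast) : [];
    const keepIds: { [id: number]: boolean } = {};
    for (const entry of kept) keepIds[entry.id] = true;

    // Pass 2: filter the original array so chronological order is untouched.
    return entries.filter((entry) => {
        if (entry.kind === "diff_review" || entry.kind === "shell_review") {
            return entry.status !== "pending";
        }
        return keepIds[entry.id] === true;
    });
}
```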

Model Switching

Model selection via /model:

With argument: resolveModelId() maps alias → pinned ID
Without argument: shows picker with all models (both Anthropic and OpenAI)
getModelProvider(modelId) determines which provider to use
createProviderForModel() creates appropriate provider instance
Switching between providers (e.g., Claude → GPT) recreates provider
Extended thinking is selected together with the model, not via a separate toggle:
- Select a model with -thinking suffix to enable extended thinking
- Example: claude-sonnet-4-5-thinking enables thinking for Sonnet 4.5
- OpenAI models do not have thinking variants (reasoning is intrinsic to those models)
Anthropic provider automatically strips -thinking suffix before API call and passes thinking config
isThinkingModel(modelId) checks if thinking is enabled for current model
currentModel stored in orchestrator state (includes -thinking suffix if enabled)
thinkingEnabled computed from whether currentModel ends with -thinking
Context limit updates automatically on model change
Selection persisted globally to ~/.config/north/config.json
On startup, loads saved model or defaults to Claude Sonnet 4
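Suffix handling for thinking models might be sketched like this. `isThinkingModel` is named above, but this body is only an illustration; `toApiModel` is a hypothetical helper showing how the suffix could be stripped before an API call.

```typescript
const THINKING_SUFFIX = "-thinking";

// Sketch: thinking is enabled when the model ID ends with the suffix.
function isThinkingModel(modelId: string): boolean {
    const at = modelId.length - THINKING_SUFFIX.length;
    return at >= 0 && modelId.lastIndexOf(THINKING_SUFFIX) === at;
}

interface ApiModelSelection {
    model: string; // ID with the suffix removed
    thinking: boolean; // whether to pass a thinking config
}

// Hypothetical helper: splits the stored model ID into API model + flag.
function toApiModel(modelId: string): ApiModelSelection {
    const thinking = isThinkingModel(modelId);
    const model = thinking ? modelId.slice(0, -THINKING_SUFFIX.length) : modelId;
    return { model, thinking };
}
```

Keeping the suffix in the stored model ID means a single string is enough to persist both the model choice and the thinking setting.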

Assistant Name Display:

getAssistantName(modelId) returns "Assistant"
Transcript displays correct name for current model

Environment Variables:

ANTHROPIC_API_KEY: required for Claude models
OPENAI_API_KEY: required for GPT models

System Prompt Structure

Both providers (anthropic.ts and openai.ts) use identical system prompts with a Cursor-inspired structured format using XML-like sections:

Sections:

<communication> - Tone, formatting, honesty rules (no lying, no guessing paths). Includes operational workflow: "If you need a file, find it first."
<tool_calling> - Schema adherence, batch-level narration (not per-call), batching etiquette (1-2 info rounds before edits, no re-reading same ranges)
<planning> - Micro-planning for 2+ file tasks (2-5 bullet plan, then execute immediately)
<search_and_reading> - Question-first search methodology, formulation checklist (broad → narrow → minimal reads), bias toward self-discovery, tool selection by file type (HTML/CSS → find_blocks, JS/TS/Python → get_file_outline), optimal tool chain for HTML/CSS (find_blocks → search_text → read_around → edit)
<making_code_changes> - Default workflow (locate → confirm → atomic write → verify), read before edit, one edit per turn or atomic batch, no large pastes
<verification> - Mandatory verification after edits, fix duplication/malformed structure immediately
<mixed_files> - Strategy for HTML with embedded style/script: use find_blocks first, target by coordinates, pre-check selectors
<tool_churn_limits> - After 2 reads + 1 write without success, switch to structure-first and atomic edits
<debugging> - Edit only if confident, retry logic (re-read once on mismatch, max 3 lint loops)
<calling_external_apis> - Only when explicitly requested
<long_running_commands> - Never start dev servers or processes needing CTRL+C to stop
<conversation> - Session UX rules (end with "Next I would: ...", acknowledge session resumption)

Key behaviors enforced:

"If you did not read it, do not claim it exists"
Never guess file paths or symbol names; find files/symbols before describing behavior
Describe actions in natural language ("I'll search the repo") not tool names
Before any batch of tool calls, write one sentence explaining the batch goal (not per-call)
Prefer 1-2 rounds of info gathering before any edits; edit in the same turn when ready
Plan briefly (2-5 bullets) for 2+ file tasks, then execute immediately
Phrase search needs as questions first, then translate to exact patterns
Retry once on edit mismatch, then ask for clarification
Prefer surgical edits over large rewrites; break large content into chunks
For new files >200 lines: create skeleton first, then add content in subsequent edits
Avoid generating >300 lines in a single tool call
End longer responses with "Next I would: ..." to signal continuation
Default workflow: LOCATE → CONFIRM → ATOMIC WRITE → VERIFY
Verification mandatory: After every edit, read the edited region to confirm
Mixed files: For HTML with embedded CSS/JS, use find_blocks to get structural map first
Tool churn limits: After 2 reads + 1 write on same file, switch to structure-first atomic edits

Mode System

North supports two conversation modes that control tool availability:

Mode Types:

Ask Mode: Read-only - only read tools available (read_file, search_text, find_files, list_root, read_readme, detect_languages, hotfiles, get_line_count, get_file_symbols, get_file_outline, expand_output, find_code_block, read_around, find_blocks)
Agent Mode: Full access - all tools available including write and shell tools

Mode Selection:

Mode is per-request, not global state
Set via /mode command (with optional argument or interactive picker)
Cycle with Tab key in Composer: ask → agent → ask
Tab cycles mode only when no autocomplete suggestions are present
Current mode shown in Composer badge and StatusLine

Tool Filtering:

Orchestrator's sendMessage(content, mode) accepts mode parameter
Tools filtered via filterToolsForMode(mode, allSchemas) before sending to Claude
Only tools allowed by current mode are included in API request
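A minimal sketch of the filter, assuming tool schemas carry a `name` field. The read-only list is copied from the Ask Mode description above; the function body is an illustration rather than the actual filterToolsForMode().

```typescript
type ConversationMode = "ask" | "agent";

interface ToolSchema {
    name: string;
}

// Read-only tools available in ask mode, as listed under Mode Types.
const READ_ONLY_TOOLS: string[] = [
    "read_file", "search_text", "find_files", "list_root", "read_readme",
    "detect_languages", "hotfiles", "get_line_count", "get_file_symbols",
    "get_file_outline", "expand_output", "find_code_block", "read_around",
    "find_blocks",
];

function filterToolsForMode<T extends ToolSchema>(mode: ConversationMode, allSchemas: T[]): T[] {
    if (mode === "agent") return allSchemas;
    return allSchemas.filter((schema) => READ_ONLY_TOOLS.indexOf(schema.name) !== -1);
}
```

Filtering the schemas before the request means a write or shell tool is never offered to the model in ask mode, rather than being rejected after the fact.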

Cost Tracking

North tracks API costs in real-time, displaying both session and all-time totals.

How it works:

Providers capture actual token usage from API responses (usage field in StreamResult)
After each successful API request, orchestrator calculates cost using calculateCost()
Session cost accumulated in memory, all-time cost persisted to ~/.north/costs.json
StatusLine displays both costs: $session / $all-time

Token usage sources:

Anthropic:

message_start and message_delta events contain usage object
input_tokens: non-cached, non-cache-write input tokens (already excludes cached)
output_tokens: output tokens (includes extended thinking)
cache_read_input_tokens: tokens served from cache (charged at reduced cache hit rate)
cache_creation_input_tokens: tokens written to cache (charged at cache write rate)

OpenAI:

response.completed event contains response.usage object
input_tokens: total input tokens
output_tokens: output tokens (includes reasoning tokens)
input_tokens_details.cached_tokens: tokens served from prompt cache (charged at reduced rate)

Cost calculation (additive model):

Anthropic (fields are additive, not subtractive):

Base input cost = inputTokens × inputRate
Cache hit cost = cacheReadTokens × cacheHitRate
Cache write cost = cacheWriteTokens × cacheWriteRate
Output cost = outputTokens × outputRate

OpenAI (cachedInputTokens subtracted from total):

Non-cached input cost = (inputTokens - cachedInputTokens) × inputRate
Cached input cost = cachedInputTokens × cachedInputRate
Output cost = outputTokens × outputRate

Note: Only the 5-minute cache duration pricing is currently modeled for Anthropic. The 1-hour cache has higher write rates that are not yet modeled.
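The two formulas can be written out as a short sketch. Rates are in USD per 1M tokens as in the pricing tables; the interface and function names are hypothetical and do not mirror the real ModelPricing, TokenUsage, or calculateCost() signatures.

```typescript
const TOKENS_PER_MILLION = 1000000;

interface AnthropicUsage { inputTokens: number; outputTokens: number; cacheReadTokens: number; cacheWriteTokens: number; }
interface AnthropicRates { input: number; output: number; cacheWrite: number; cacheHit: number; }

// Anthropic: every usage field is billed separately and summed.
function anthropicCost(usage: AnthropicUsage, rates: AnthropicRates): number {
    return (
        usage.inputTokens * rates.input +
        usage.cacheReadTokens * rates.cacheHit +
        usage.cacheWriteTokens * rates.cacheWrite +
        usage.outputTokens * rates.output
    ) / TOKENS_PER_MILLION;
}

interface OpenAIUsage { inputTokens: number; outputTokens: number; cachedInputTokens: number; }
interface OpenAIRates { input: number; output: number; cachedInput: number; }

// OpenAI: inputTokens is a total, so the cached share is subtracted first.
function openaiCost(usage: OpenAIUsage, rates: OpenAIRates): number {
    const nonCached = Math.max(0, usage.inputTokens - usage.cachedInputTokens);
    return (
        nonCached * rates.input +
        usage.cachedInputTokens * rates.cachedInput +
        usage.outputTokens * rates.output
    ) / TOKENS_PER_MILLION;
}
```

As a worked example with the claude-sonnet-4-* rates: 1M input tokens, 2M cache-read tokens, and 100K output tokens cost 3.00 + 0.60 + 1.50 = $5.10. With the gpt-5 rates, 1M input tokens of which 400K are cached cost 0.75 + 0.05 = $0.80.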

Display format:

Session cost: green color, shows cost since app started
All-time cost: blue color, shows cumulative cost across all sessions
Format: three decimals for small amounts (e.g., $0.001), two decimals for larger amounts (e.g., $0.12, $99.99)

Context Tracking & Auto-Summarization

North tracks context usage in real-time to prevent overflow:

Token Estimation (before each request):
- Builds outgoing messages payload (system + transcript + injected rules)
- Estimates tokens using character-based heuristic (3.5 chars/token)
- Applies 10% safety margin
- Updates contextUsedTokens, contextLimitTokens, contextUsage
Visual Indicator (StatusLine):
- Green circle: < 60% usage
- Yellow circle: 60-85% usage
- Red circle: > 85% usage
- Shows numeric percentage
Auto-Summarization (at 92% threshold):
- Automatically calls generateSummary() before sending request
- Replaces older transcript with structured summary
- Keeps last 10 messages verbatim
- Preserves injected rules and context
- Recomputes usage after compaction
- Proceeds with request normally
Per-Model Limits:
- All current Claude models: 200K tokens
- Limit updates automatically on model switch
- Usage recalculated with new limit

Tool Call Loop

When Claude requests tools:

Stream completes with stopReason: "tool_use"
Orchestrator executes each tool via registry
Results are JSON-stringified and sent back as tool_result blocks
Claude processes results and may request more tools or respond

Tool Result Consistency

The API requires every tool_use block to have a corresponding tool_result. To ensure this:

Write tool ID tracking: Tool IDs are only added to writeToolCallIds after the tool succeeds and a diff_review entry is created. Failed write tools have their results sent via the normal tool entry path.
Recovery mechanism: If the API returns an "orphaned tool_use" error (tool_use without tool_result), the orchestrator:
- Extracts the orphaned tool ID from the error message
- Removes it from writeToolCallIds and shellToolCallIds sets
- Removes the incomplete assistant entry from transcript
- Retries the request once
- Logs the recovery event for debugging

Transient Error Retry

The orchestrator automatically retries API requests that fail due to transient errors:

Retryable errors:

Network errors: ECONNREFUSED, ECONNRESET, ETIMEDOUT, ENETUNREACH, socket hang up, fetch failed
Rate limits: HTTP 429, "rate limit", "too many requests"
Server errors: HTTP 5xx, "overloaded", "service unavailable", "internal server error"
Incomplete streams: "incomplete tool call", "possible timeout" (stream ended mid-tool-generation)

Non-retryable errors (fail immediately):

Authentication errors (401, 403)
Bad request errors (400)
Cancellation/abort

API Request Timeout:

Both providers configured with 10-minute timeout for large responses
Anthropic: timeout option passed to SDK client constructor
OpenAI: AbortSignal.timeout() combined with user abort signal via AbortSignal.any()

Incomplete Stream Detection:

Anthropic: If currentToolId is set when stream ends, throws error (tool was mid-generation)
OpenAI: If stopReason === null and toolCallsInProgress has entries, throws error
These errors are detected by isRetryableError() and trigger automatic retry

Retry behavior:

Maximum 3 retry attempts per conversation turn
Exponential backoff with jitter: ~1s, ~2s, ~4s (capped at 30s)
Counter resets after successful request
Silent retry (no UI change, request resumes after delay)
Logs api_retry_attempt event with attempt count, delay, and error message
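Error classification and backoff could be sketched as follows. `isRetryableError` is named above, but this body is only an illustration based on the lists in this section; the error shape, the 25% jitter range, and `retryDelayMs` are assumptions.

```typescript
interface ApiErrorLike {
    status?: number;
    message: string;
}

const RETRYABLE_MARKERS: string[] = [
    "econnrefused", "econnreset", "etimedout", "enetunreach", "socket hang up",
    "fetch failed", "rate limit", "too many requests", "overloaded",
    "service unavailable", "internal server error", "incomplete tool call",
    "possible timeout",
];

function isRetryableError(error: ApiErrorLike): boolean {
    const status = error.status;
    // Auth and bad-request errors fail immediately.
    if (status === 400 || status === 401 || status === 403) return false;
    if (status === 429) return true;
    if (status !== undefined && status >= 500 && status < 600) return true;
    const message = error.message.toLowerCase();
    return RETRYABLE_MARKERS.some((marker) => message.indexOf(marker) !== -1);
}

const MAX_DELAY_MS = 30000;

// attempt is 1-based; base delays are 1s, 2s, 4s before jitter.
// Assumption: jitter adds up to 25% of the base delay.
function retryDelayMs(attempt: number, random: () => number = Math.random): number {
    const base = 1000 * Math.pow(2, attempt - 1);
    const jitter = base * 0.25 * random();
    return Math.min(MAX_DELAY_MS, base + jitter);
}
```

Jitter spreads retries from concurrent sessions so they do not hit a rate-limited endpoint at the same instant.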

Write Approval Flow

When Claude requests an edit tool (approvalPolicy: "write"):

In Plan mode only: Orchestrator checks if acceptedPlan exists - if not, returns PLAN_REQUIRED error
Orchestrator checks if auto-accept is enabled (.north/autoaccept.json)
If auto-accept enabled: edits applied immediately, status set to "always", Claude continues
If auto-accept disabled:
- Tool executes in "prepare" mode - computes diff but doesn't write
- Orchestrator creates a diff_review transcript entry with status "pending"
- Tool loop blocks, waiting for user decision
- DiffReview component renders inline diff with Accept/Always/Reject options
- User presses a (accept), y (always), or r (reject)
- On Accept: edits applied atomically (temp files then rename)
- On Always: enables auto-accept for future edits, applies current edits
- On Reject: nothing written, status set to "rejected"
Tool result sent to Claude with outcome (applied: true/false)
Claude continues processing

NORTH_FILE Protocol

File creation uses a streaming-to-disk protocol where the model outputs file contents as plain assistant text. Content is written directly to disk as it streams, with automatic continuation on provider timeouts.

Format:

<NORTH_FILE path="relative/path/to/file.ts">
...file contents...
</NORTH_FILE>

Continuation format (auto-generated on timeout):

<NORTH_FILE path="relative/path/to/file.ts" mode="append">
...continuation content...
</NORTH_FILE>

Why streaming-to-disk:

Provider timeouts (~90s) can interrupt large file generation
Tool calls buffer in memory and lose all content on timeout
Direct-to-disk streaming preserves partial content
Auto-continuation resumes from last written line

Flow:

Model outputs <NORTH_FILE path="..."> tag in response
StreamingFileBlockParser detects open tag, emits session_start event
FileWriteSession created, opens file at final path (creates parent dirs if needed)
Content chunks written directly to disk as they stream
Session tracks linesWritten and maintains 30-line trailing window for context
When </NORTH_FILE> closes: session finalized, diff review triggered
On accept: file already written, nothing more to do
On reject: file deleted from disk

Auto-continuation on timeout:

Stream ends without close tag (provider timeout ~90s)
Orchestrator detects incomplete session
Sends continuation prompt with trailing window context
Model responds with <NORTH_FILE mode="append"> block
Content appended to existing file
Repeats until complete or max retries (3) exceeded
On max retries: partial file preserved, error surfaced to user

Implementation (src/utils/):

fileblock.ts:
- StreamingFileBlockParser - event-based streaming parser
- Events: session_start, session_content, session_complete, display_text
- Parses mode="append" attribute for continuation blocks
filesession.ts:
- FileWriteSession - streaming file writer with line tracking
- startSession(repoRoot, path) - creates new file
- appendToSession(...) - continues from existing state
- getResumeInfo() - returns lines written + trailing window

Tool Input Size Guard:

All tool inputs checked against 50KB limit before execution
Prevents large payloads from being sent via tools
Error message directs model to use NORTH_FILE protocol instead

Shell Approval Flow

When Claude requests shell_run (approvalPolicy: "shell"):

Orchestrator checks if global auto-approve is enabled (.north/autoaccept.json)
If auto-approve enabled: execute immediately, status set to "auto", return result
If not auto-approved, check if command is in .north/allowlist.json
If allowed: execute immediately, status set to "always", return result
If not allowed: create shell_review transcript entry with status "pending"
Tool loop blocks, waiting for user decision
ShellReview component renders command with Run/Always/Auto All/Deny options
User presses r (run), a (always), y (auto all), or d (deny)
On Run: execute command, status set to "ran"
On Always: add to allowlist, execute command, status set to "always"
On Auto All: enable global auto-approve, execute command, status set to "auto"
On Deny: return { denied: true } to Claude, status set to "denied"
Tool result sent to Claude with outcome
Claude continues processing

Approval Priority: Global auto-approve (step 1) takes precedence over command allowlist (step 3). Once auto-approve is enabled, all commands run automatically without checking the allowlist.
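The priority order reduces to a small decision function, sketched here under the assumption that the allowlist is matched by exact command string. `decideShellApproval` is a hypothetical name; the returned values reuse the status names from the steps above.

```typescript
type ShellApproval = "auto" | "always" | "pending";

function decideShellApproval(
    command: string,
    autoApproveEnabled: boolean,
    allowlist: string[]
): ShellApproval {
    // Global auto-approve wins; the allowlist is not consulted at all.
    if (autoApproveEnabled) return "auto";
    // Assumption: allowlist entries are compared by exact string match.
    if (allowlist.indexOf(command) !== -1) return "always";
    // Otherwise a shell_review entry is created and the loop blocks.
    return "pending";
}
```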

File Mentions (@ Autocomplete)

North supports @ file mentions similar to Cursor and Claude Code. Users can attach files to their messages for automatic context injection.

User Flow:

User types @ in the Composer
Autocomplete shows fuzzy-matched project files (respecting .gitignore)
User can:
- Tab/Enter: Accept suggestion, file becomes attached
- Space/Escape: Dismiss autocomplete, @ treated as literal text
Attached files shown as badge in Composer (e.g., "📎 2 files attached")
On message submit, attached files passed to orchestrator

Context Injection:

Orchestrator receives attachedFiles: string[] in sendMessage()
In buildMessagesForClaude(), attached files injected as context block
Position: after cursor rules, after project profile, before rolling summary
Format per file:
- Markdown header with file path
- Code block with first 30 lines (or 2KB)
- Symbol outline (functions, classes, types with line numbers)

Example Injected Context:

````markdown
# Attached Files

## src/ui/Composer.tsx

```typescript
import React, { useState, useMemo, useEffect } from "react";
import { Box, Text, useInput } from "ink";
... [27 more lines]
```

Outline (Composer.tsx):

- interface Suggestion (line 7)
- interface ComposerProps (line 14)
- function Composer (line 197)
````

File Index:

Built lazily on first @ autocomplete
Cached per repoRoot for performance
Respects .gitignore via walkDirectory()
Capped at 5000 files

Fuzzy Matching:

Exact filename matches score highest
Prefix matches score next
Subsequence matching for partial queries
Results sorted by score, limited to 10 suggestions

Cursor Rules Loading

North automatically loads Cursor project rules at startup:

Loading (in index.ts):
- Calls loadCursorRules(projectPath) once before rendering
- Walks .cursor/rules/ recursively for *.mdc files
- Parses YAML frontmatter (if present), keeps body content
- Sorts by relativePath for deterministic order
- Enforces 30KB hard cap, truncates if exceeded
Storage (in orchestrator context):
- cursorRulesText passed through App to orchestrator context
- Stored as plain string or null
Injection (in buildMessagesForClaude()):
- If cursorRulesText is non-empty, prepends to every request
- Format: # Cursor Project Rules (.cursor/rules) header
- Each rule: ## relativePath followed by rule body
- Injected before rolling summary, ensuring rules always apply
Format of injected rules:

```markdown
# Cursor Project Rules (.cursor/rules)

## path/to/rule.mdc

## another-rule.mdc
```
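
The Cursor rules steps (strip frontmatter, sort by relativePath, cap the size, emit one header per rule) could be sketched roughly as follows. The `CursorRule` shape and both function names are assumptions for illustration, and the 30KB cap is approximated in characters rather than bytes.

```typescript
// Hypothetical shapes; the 30KB cap and the header format come from the text above.
interface CursorRule {
    relativePath: string;
    body: string;
}

const MAX_RULES_CHARS = 30 * 1024; // cap approximated in characters, not bytes

// Drop a leading YAML frontmatter block ("---" ... "---") and keep the body.
function stripFrontmatter(raw: string): string {
    const match = /^---\r?\n[\s\S]*?\r?\n---\r?\n?/.exec(raw);
    return (match ? raw.slice(match[0].length) : raw).trim();
}

// Sort for deterministic order, render one "## path" section per rule, then cap.
function formatCursorRules(rules: CursorRule[]): string | null {
    if (rules.length === 0) return null;
    const sorted = [...rules].sort((a, b) =>
        a.relativePath < b.relativePath ? -1 : a.relativePath > b.relativePath ? 1 : 0
    );
    const sections = sorted.map((r) => `## ${r.relativePath}\n\n${stripFrontmatter(r.body)}`);
    const text = ["# Cursor Project Rules (.cursor/rules)", ...sections].join("\n\n");
    return text.length > MAX_RULES_CHARS ? text.slice(0, MAX_RULES_CHARS) : text;
}
```

Sorting with a plain code-unit comparison instead of a locale-aware one keeps the output identical across machines, which is the point of a deterministic order.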

Project Learning

North can learn a project on first run and store a persistent profile for context in future conversations.

Startup Flow:

Profile Detection (in index.ts):
- Checks if profile exists via hasProfile(repoRoot)
- If profile exists: loads it with loadProfile(repoRoot)
- If no profile and not declined: sets needsLearningPrompt = true
Learning Prompt (first-time projects):
- App renders LearningPrompt component when learningPromptId is set
- User presses y (accept) or n (decline)
- On decline: marks project as declined via markDeclined(repoRoot)
- On accept: triggers orchestrator.startLearningSession()
Learning Session:
- Runs 10 sequential discovery topics via runLearningSession()
- Each topic: focused LLM query with read-only tools
- Progress updates via callback: onProgress(percent, topicTitle)
- UI shows LearningProgress component with percent and current topic
- Profile saved to ~/.north/projects/<hash>/profile.md
Profile Injection (in buildMessagesForClaude()):
- If projectProfileText exists, inject after cursor rules
- Format: markdown with H2 sections for each topic
- Position: Cursor rules → Project profile → Rolling summary → Transcript
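
The injection order can be illustrated with a small sketch. This shows only the ordering of context blocks, with attached files placed where the File Mentions section puts them (after the project profile, before the rolling summary); `ContextParts` and `orderContextBlocks` are hypothetical names, not North's actual buildMessagesForClaude().

```typescript
// Hypothetical sketch of the ordering only.
interface ContextParts {
    cursorRulesText: string | null;
    projectProfileText: string | null;
    attachedFilesText: string | null;
    rollingSummaryText: string | null;
}

// Returns the non-empty context blocks in the documented order;
// the transcript itself would follow these blocks.
function orderContextBlocks(parts: ContextParts): string[] {
    const ordered = [
        parts.cursorRulesText,
        parts.projectProfileText,
        parts.attachedFilesText,
        parts.rollingSummaryText,
    ];
    return ordered.filter((block): block is string => block !== null && block.trim() !== "");
}
```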

Discovery Topics:

Project Summary - What it is, who it's for, workflows, what it doesn't do
Architecture Map - Major modules, entry points, structure
Code Style and Conventions - Naming, layout, formatting, lint rules
Domain Model Vocabulary - Key concepts, terms, canonical locations
Data Flow and State - Persistence, caches, data paths
External Dependencies - Frameworks, libraries, services, config
Build, Run, and Test Workflow - Commands and workflows
Hot Spots and Change Patterns - Frequently changed areas
Common Tasks Playbook - Where to implement common changes
Safety Rails and Footguns - Known pitfalls and constraints

Storage:

Profile: ~/.north/projects/<hash>/profile.md
Declined marker: ~/.north/projects/<hash>/declined.json
Hash: SHA-256 of repo root path (first 16 chars)
Format: Markdown with # Project Profile header + H2 sections
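
A sketch of how the storage paths above could be derived. It assumes the hash is the hex digest of the raw repo root string with no path normalization, which the text does not confirm; `repoHash` and `profilePaths` are hypothetical helper names.

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join } from "node:path";

// Assumption: hex-encoded SHA-256 of the path string, truncated to 16 characters.
function repoHash(repoRoot: string): string {
    return createHash("sha256").update(repoRoot).digest("hex").slice(0, 16);
}

// Layout follows the Storage list: ~/.north/projects/<hash>/{profile.md,declined.json}
function profilePaths(repoRoot: string, home: string = homedir()) {
    const dir = join(home, ".north", "projects", repoHash(repoRoot));
    return { profile: join(dir, "profile.md"), declined: join(dir, "declined.json") };
}
```

Because the hash depends only on the path, moving a repository to a new location would produce a different profile directory under this scheme.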

/learn Command:

Clears declined marker via clearDeclined(repoRoot)
Triggers learning session via ctx.triggerLearning()
Overwrites existing profile if present
Use case: manually update profile after major project changes

UI Components:

LearningPrompt: Y/N prompt with border pulse animation (pending)
LearningProgress: Percent + topic name display during learning

State Management:

Orchestrator tracks: learningPromptId, learningInProgress, learningPercent, learningTopic
Transcript entries: learning_prompt (with status), learning_progress (with percent/topic)
Learning entries excluded from buildMessagesForClaude() (UI-only)

Conversation Persistence

North persists conversations for later resumption using an append-only event log.

Storage Location:

~/.north/conversations/<id>.jsonl - append-only event log
~/.north/conversations/<id>.snapshot.json - optional full snapshot
~/.north/conversations/index.json - conversation metadata index

Event Types:

conversation_started: ID, repoRoot, repoHash, model, timestamp
entry_added: full TranscriptEntry payload
entry_updated: entry ID + partial updates (streaming completion, review decisions)
model_changed: new model ID
rolling_summary_set: StructuredSummary or null
conversation_ended: clean exit marker

Resume Flow:

north resume <id> loads conversation from event log
Validates repoRoot exists (warns if missing)
Orchestrator initialized with initialState (transcript, rollingSummary, model)
Conversation continues normally with logging enabled
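
Resuming from an append-only log amounts to folding the events back into state. The sketch below assumes simplified, hypothetical payload field names (`entry`, `entryId`, `updates`, `summary`); only the event type names come from the list above.

```typescript
// Hypothetical, simplified shapes; real entries carry more fields.
interface TranscriptEntry {
    id: string;
    role: string;
    content: string;
    reviewStatus?: string;
}

type ConversationEvent =
    | { type: "conversation_started"; id: string; repoRoot: string; repoHash: string; model: string; timestamp: string }
    | { type: "entry_added"; entry: TranscriptEntry }
    | { type: "entry_updated"; entryId: string; updates: Partial<TranscriptEntry> }
    | { type: "model_changed"; model: string }
    | { type: "rolling_summary_set"; summary: unknown }
    | { type: "conversation_ended" };

interface ReplayedState {
    transcript: TranscriptEntry[];
    rollingSummary: unknown;
    model: string | null;
}

// Fold the JSON-lines log into the initial state handed to the orchestrator.
function replayEvents(jsonl: string): ReplayedState {
    const state: ReplayedState = { transcript: [], rollingSummary: null, model: null };
    for (const line of jsonl.split("\n")) {
        if (line.trim() === "") continue;
        const event = JSON.parse(line) as ConversationEvent;
        switch (event.type) {
            case "conversation_started":
            case "model_changed":
                state.model = event.model;
                break;
            case "entry_added":
                state.transcript.push(event.entry);
                break;
            case "entry_updated": {
                const { entryId, updates } = event;
                state.transcript = state.transcript.map((e) =>
                    e.id === entryId ? { ...e, ...updates } : e
                );
                break;
            }
            case "rolling_summary_set":
                state.rollingSummary = event.summary;
                break;
            case "conversation_ended":
                break; // clean exit marker, no state change
        }
    }
    return state;
}
```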

Persistence Triggers:

addEntry() → logEntryAdded()
updateEntry() → logEntryUpdated()
setModel() → logModelChanged()
setRollingSummary() → logRollingSummarySet()
stop() → logConversationEnded() + resolve pending reviews

Pending Review Handling:

On exit, pending reviews are resolved as cancelled/rejected/denied
Review status updates are persisted before exit
Resume never has pending interactive states (deterministic)

CLI Commands:

north - new conversation (generates 6-char hex ID)
north resume <id> - resume by ID
north resume - interactive picker of recent conversations
north conversations or north list - list conversations with metadata

Slash Commands:

/conversations - picker to switch to another conversation
/resume <id> - switch to conversation by ID directly

Portability:

Both repoRoot (path) and repoHash (SHA-256 prefix) stored
If repoRoot missing on resume, warns user and continues
User can provide --path to specify new location

Cancellation Flow (CTRL+C)

The app handles CTRL+C through Ink's useInput hook rather than a process.on("SIGINT") handler. The response depends on whether a request is in flight:

During processing (isProcessing() === true):
- Calls orchestrator.cancel()
- Aborts the current AbortController (stops API streaming)
- Aborts the shell AbortController (kills any running shell command)
- Resolves any pending reviews as rejected/denied/cancelled
- Appends [Cancelled] to the assistant's message
- Returns control to the input field
- App remains running
When idle (isProcessing() === false):
- Calls orchestrator.stop()
- Disposes all shell services
- Exits the application

Implementation details:

currentAbortController tracks the active API request
shellAbortController tracks any running shell command (created per-command)
cancelled flag checked in conversation loop
Provider stream loop checks signal.aborted and exits gracefully
Shell process killed via proc.kill() when abort signal fires
Pending write/shell/command reviews auto-resolve on cancel
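
The two-controller arrangement might look roughly like this. `CancellableTurn` and `consumeStream` are hypothetical stand-ins for the orchestrator and a provider stream loop; only the field names mirror the list above.

```typescript
// Hypothetical minimal orchestrator slice.
class CancellableTurn {
    currentAbortController: AbortController | null = null;
    shellAbortController: AbortController | null = null;
    cancelled = false;

    // Start a request: fresh controller, cancelled flag reset.
    begin(): AbortSignal {
        this.cancelled = false;
        this.currentAbortController = new AbortController();
        return this.currentAbortController.signal;
    }

    // CTRL+C while processing: abort the API stream and any running shell command.
    cancel(): void {
        this.cancelled = true;
        this.currentAbortController?.abort();
        this.shellAbortController?.abort();
    }
}

// Stand-in for a provider stream loop: stops consuming once the signal is aborted.
async function consumeStream(chunks: AsyncIterable<string>, signal: AbortSignal): Promise<string> {
    let text = "";
    for await (const chunk of chunks) {
        if (signal.aborted) break;
        text += chunk;
    }
    return signal.aborted ? text + " [Cancelled]" : text;
}
```

Keeping a separate controller per shell command means cancelling a turn can kill a long-running command without the API request having to know about it.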

Tool Display Formatting

The orchestrator formats tool names for better readability in the TUI:

list_root → "Listing project files - N entries"
find_files → "Finding pattern - N files" (with + suffix if truncated)
read_file → "Reading filename.ext"
get_line_count → "Checking size of filename.ext"
get_file_symbols → "Extracting symbols from filename.ext"
get_file_outline → "Outlining filename.ext"
edit_replace_exact → "Editing filename.ext (+X/-Y)" after approval
edit_insert_at_line → "Editing filename.ext (+X/-Y)" after approval
edit_create_file → "Creating filename.ext (+X/-Y)" after approval
edit_apply_batch → "Editing N files (+X/-Y)" after approval
Other tools: shown as-is

Edit Stats Display:

After an edit is approved (accept/always) or auto-applied, the tool entry is updated to show line statistics
Format: +X/-Y where X is lines added and Y is lines removed
Stats computed from the diff content using linesAdded and linesRemoved from FileDiff

Implementation split between:

formatToolNameForDisplay() in orchestrator: extracts display name from tool arguments
computeDiffStats() in orchestrator: calculates total added/removed lines from diffs
getToolResultSuffix() in Transcript.tsx: appends result counts for file listing tools
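
A reduced sketch of the display mapping and stats calculation. The function names match those listed above, but the signatures and the simplified `FileDiff` shape are assumptions; the listing tools (list_root, find_files) are left to the default branch here.

```typescript
// Hypothetical simplified signatures; North's real functions may differ.
interface FileDiff {
    path: string;
    linesAdded: number;
    linesRemoved: number;
}

// Sum added/removed lines across all diffs of one edit.
function computeDiffStats(diffs: FileDiff[]): { added: number; removed: number } {
    return diffs.reduce(
        (acc, d) => ({ added: acc.added + d.linesAdded, removed: acc.removed + d.linesRemoved }),
        { added: 0, removed: 0 }
    );
}

function formatToolNameForDisplay(tool: string, args: { path?: string }, diffs: FileDiff[] = []): string {
    const file = args.path ? (args.path.split("/").pop() ?? args.path) : "file";
    const { added, removed } = computeDiffStats(diffs);
    // Stats are only known once the edit has been approved and diffs exist.
    const stats = diffs.length > 0 ? ` (+${added}/-${removed})` : "";
    switch (tool) {
        case "read_file": return `Reading ${file}`;
        case "get_line_count": return `Checking size of ${file}`;
        case "get_file_symbols": return `Extracting symbols from ${file}`;
        case "get_file_outline": return `Outlining ${file}`;
        case "edit_replace_exact":
        case "edit_insert_at_line": return `Editing ${file}${stats}`;
        case "edit_create_file": return `Creating ${file}${stats}`;
        case "edit_apply_batch": return `Editing ${diffs.length} files${stats}`;
        default: return tool; // other tools are shown as-is
    }
}
```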

UI Animations

North uses subtle, frame-based animations to enhance feedback without overwhelming the terminal:

Streaming Indicator Pulse (Assistant & Tool messages):
- Pulses through magenta shades (magenta → #ff6ec7 → #ff8fd5 → #ffa0dc → back)
- 500ms interval per color transition
- Indicates active streaming or processing
Tool Execution Spinner:
- Animated spinner frames: ⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏
- 80ms frame interval for smooth rotation
- Yellow color to match tool theme
- Shown when tool is executing (isStreaming: true)
Pending Review Border Pulse:
- Pulses through yellow shades (yellow → #ffff87 → #ffffaf → back)
- 600ms interval per color transition
- Applied to DiffReview, ShellReview, and PlanReview when status is "pending"
- Draws attention to items requiring user action

Implementation Details:

Custom React hooks (useSpinner, usePulse, useBorderPulse)
Uses setInterval with cleanup on unmount
Frame rates kept low (12-15 fps) to avoid terminal flicker
Colors cycle smoothly for breathing effect
All animations respect terminal color support
Conditional timers: Animation hooks accept an active boolean parameter; timers only run when active
Auto-disable threshold: Animations auto-disable when transcript exceeds 100 entries
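
The conditional-timer idea can be shown without React. The sketch below is a framework-free analogue of the hooks, not their actual implementation: `startFrameTimer` is a hypothetical helper that follows the same contract as an effect (start only when active, return a cleanup function).

```typescript
const SPINNER_FRAMES = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"];

// Starts an interval only when `active` is true and returns a cleanup function.
function startFrameTimer(
    active: boolean,
    intervalMs: number,
    onFrame: (frame: string) => void
): () => void {
    if (!active) return () => {}; // no timer is created for inactive entries
    let index = 0;
    const timer = setInterval(() => {
        index = (index + 1) % SPINNER_FRAMES.length;
        onFrame(SPINNER_FRAMES[index]);
    }, intervalMs);
    return () => clearInterval(timer);
}

// Mirrors the auto-disable rule: no animation once the transcript exceeds 100 entries.
const ANIMATION_DISABLE_THRESHOLD = 100;
function animationsEnabled(entryCount: number): boolean {
    return entryCount <= ANIMATION_DISABLE_THRESHOLD;
}
```

In a hook, the returned cleanup would be what the effect returns, so the interval is cleared on unmount or when `active` flips to false.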

Alternate Screen Buffer & In-App Scrolling

North uses an alternate screen buffer (like htop, less, vim) instead of terminal scrollback:

Why alternate screen?

Ink's differential rendering (cursor moves + line clears) conflicts with terminal scrollback
When Ink redraws while user scrolls, scrollback can become corrupted
Different terminals (iTerm2, Terminal.app) handle this inconsistently
Alternate screen provides a stable, controlled viewport

Architecture:

Terminal (alternate screen)
┌────────────────────────────────────┐
│ ScrollableTranscript               │ ← viewport-height, renders line slice
│   - Pre-wrapped lines with ANSI    │
│   - Only visible lines rendered    │
│   - Scroll offset from bottom      │
├────────────────────────────────────┤
│ Interactive entries (reviews)      │ ← Always visible at bottom
├────────────────────────────────────┤
│ Composer                           │ ← Fixed height
├────────────────────────────────────┤
│ StatusLine                         │ ← Fixed height, shows [SCROLL]
└────────────────────────────────────┘

State:

scrollOffset: lines from bottom (0 = follow mode)
viewportHeight: terminal rows - composer - status - padding (dynamic based on composer line count)
viewportWidth: terminal columns - padding
composerLineCount: tracked via callback from Composer for dynamic height calculation

Keyboard:

Up/Down: scroll ±1 line (when composer disabled)
PageUp/PageDown: scroll ±viewportHeight
G: jump to bottom
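
A sketch of the scroll state transitions, assuming that Up and PageUp move toward older content (increasing the offset) and that the offset is clamped so the viewport never scrolls past the first line. `nextScrollOffset` and `visibleSlice` are hypothetical names.

```typescript
type ScrollKey = "up" | "down" | "pageUp" | "pageDown" | "G";

// scrollOffset counts lines up from the bottom; 0 means follow mode.
function nextScrollOffset(
    offset: number,
    key: ScrollKey,
    totalLines: number,
    viewportHeight: number
): number {
    const maxOffset = Math.max(0, totalLines - viewportHeight);
    const delta: Record<ScrollKey, number> = {
        up: 1,
        down: -1,
        pageUp: viewportHeight,
        pageDown: -viewportHeight,
        G: -Infinity, // clamps to 0, i.e. jump to bottom
    };
    return Math.min(maxOffset, Math.max(0, offset + delta[key]));
}

// Only the lines inside the viewport are rendered.
function visibleSlice<T>(lines: T[], offset: number, viewportHeight: number): T[] {
    const end = Math.max(0, lines.length - offset);
    return lines.slice(Math.max(0, end - viewportHeight), end);
}
```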

Tradeoff:

Transcript is not in terminal scrollback after exit
Future: add /export command to save transcript to file

Transcript Performance Optimizations

To prevent flickering in large conversations, North implements several Ink-specific optimizations:

Static Rendering with <Static>:
- Ink's <Static> component renders items once and never re-renders them
- Completed transcript entries (not streaming, not pending review) are rendered inside <Static>
- Only dynamic entries (streaming messages, pending reviews) re-render on state changes
- This transforms "redraw 2000-line screen 12x/sec" into "redraw small dynamic section"
- Entry uniqueness: Deduplication check prevents same entry ID from appearing in both sections
- Review status priority: For review entries, reviewStatus determines static vs dynamic, preventing race conditions during state transitions
Conditional Animation Timers:
- All animation hooks (useSpinner, usePulse, useBorderPulse) accept an active parameter
- Timers only start when active === true
- Prevents "zombie timers" from completed entries causing unnecessary state updates
- Example: useSpinner(entry.isStreaming, 80) only animates while streaming
Animation Kill Switch:
- When transcript exceeds ANIMATION_DISABLE_THRESHOLD (100 entries), animations auto-disable
- animationsEnabled boolean passed through component tree
- Pending reviews still show correct state, just without pulsing animations
Memoized Components:
- All message components wrapped in React.memo: UserMessage, AssistantMessage, ToolMessage, CommandExecutedMessage, MessageBlock, StaticEntry
- Review components also memoized: DiffReview, ShellReview, CommandReview
- Primitive props preferred over object props where possible
Precomputed Render Data:
- DiffContent precomputes colored line data in useMemo
- Line styling decisions made once per diff, not on every render
- Reduces CPU work during animation frames
Entry Classification:
- isEntryStatic() helper determines if an entry can be rendered statically
- Criteria: not streaming, not the active review, no pending review status
- Entries graduate from dynamic to static as their state settles

Architecture:

<Transcript>
  <Static items={staticEntries}>     // Completed entries - render once
    {(entry) => <StaticEntry />}
  </Static>
  {dynamicEntries.map((entry) =>     // Active entries - re-render on changes
    <MessageBlock />
  )}
</Transcript>

This architecture ensures that only the actively changing portion of the transcript triggers redraws, keeping the terminal responsive even in very long conversations.
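
The static/dynamic split could be computed along these lines. The `Entry` shape and the `partitionEntries` helper are simplified assumptions; `isEntryStatic` is named in the text, but its real signature is not shown there.

```typescript
// Hypothetical simplified entry shape.
interface Entry {
    id: string;
    isStreaming?: boolean;
    reviewStatus?: "pending" | "accepted" | "rejected";
}

function isEntryStatic(entry: Entry, activeReviewId: string | null): boolean {
    // For review entries the review status decides first.
    if (entry.reviewStatus === "pending") return false;
    if (entry.id === activeReviewId) return false;
    return !entry.isStreaming;
}

function partitionEntries(entries: Entry[], activeReviewId: string | null) {
    const seen = new Set<string>();
    const staticEntries: Entry[] = [];
    const dynamicEntries: Entry[] = [];
    for (const entry of entries) {
        if (seen.has(entry.id)) continue; // an ID may appear in only one section
        seen.add(entry.id);
        (isEntryStatic(entry, activeReviewId) ? staticEntries : dynamicEntries).push(entry);
    }
    return { staticEntries, dynamicEntries };
}
```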

Gitignore Handling

The ignore checker:

Always ignores common directories (node_modules, .git, etc.)
Parses .gitignore if present at repo root
Supports glob patterns: *, **, ?
Supports negation patterns: !important.log
Supports directory-only patterns: logs/

Search Implementation

search_text uses ripgrep when it is available (faster, structured output) and falls back to a pure JS implementation otherwise:

Ripgrep: spawns rg --json for structured output
Fallback: walks files and searches line by line
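
As an illustration of the ripgrep path, the sketch below parses the JSON-lines stream that `rg --json` prints and keeps only "match" records. `parseRipgrepJson` and `SearchMatch` are hypothetical names, and the sketch assumes paths and lines arrive as UTF-8 `text` fields (ripgrep emits a `bytes` field instead for non-UTF-8 data, which is not handled here).

```typescript
interface SearchMatch {
    path: string;
    line: number;
    text: string;
}

// Parse `rg --json` output, stopping once maxMatches is reached.
function parseRipgrepJson(
    stdout: string,
    maxMatches = 50
): { matches: SearchMatch[]; truncated: boolean } {
    const matches: SearchMatch[] = [];
    let truncated = false;
    for (const line of stdout.split("\n")) {
        if (line.trim() === "") continue;
        const record = JSON.parse(line);
        if (record.type !== "match") continue; // skip begin/end/summary records
        if (matches.length >= maxMatches) {
            truncated = true;
            break;
        }
        matches.push({
            path: record.data.path.text,
            line: record.data.line_number,
            text: String(record.data.lines.text).replace(/\r?\n$/, ""),
        });
    }
    return { matches, truncated };
}
```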

Output Truncation

All tools enforce limits to prevent context overflow:

read_file: 500 lines or 100KB max, optional head/tail inclusion
search_text: 50 matches default, 200 max, supports file-specific and line range searches
find_files: 50 files default, 500 max
get_line_count: No limits, quick stat check
get_file_symbols: Returns all detected symbols (functions, classes, types, etc.)
get_file_outline: Returns hierarchical structure with line ranges
read_readme: 8KB max
hotfiles: 10 files default, 50 max

Truncation is always explicit with truncated: true in results.
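
One way to express default and maximum limits with an explicit truncation flag is sketched below. The numbers are copied from the list above; `LIMITS`, `resolveLimit`, and `applyLimit` are hypothetical names.

```typescript
// Defaults and ceilings copied from the list above.
const LIMITS = {
    search_text: { default: 50, max: 200 },
    find_files: { default: 50, max: 500 },
    hotfiles: { default: 10, max: 50 },
} as const;

// Fall back to the default for missing or invalid requests; never exceed the ceiling.
function resolveLimit(tool: keyof typeof LIMITS, requested?: number): number {
    const { default: fallback, max } = LIMITS[tool];
    if (requested === undefined || !Number.isFinite(requested) || requested < 1) return fallback;
    return Math.min(Math.floor(requested), max);
}

// The caller always learns whether results were cut off.
function applyLimit<T>(items: T[], limit: number): { items: T[]; truncated: boolean } {
    return { items: items.slice(0, limit), truncated: items.length > limit };
}
```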

Dependencies

Package	Version	Purpose
`@anthropic-ai/sdk`	^0.39.0	Claude API client
`ink`	^5.1.0	Terminal UI framework
`react`	^18.3.1	UI component model
`wrap-ansi`	^9.0.2	ANSI-aware text wrapping for scroll viewport
`string-width`	^8.1.0	Unicode-aware string width calculation

Dev Dependencies

Package	Version	Purpose
`eslint`	^9.17.0	Code linting
`typescript-eslint`	^8.18.1	TypeScript ESLint support
`eslint-plugin-react`	^7.37.2	React-specific linting
`eslint-plugin-react-hooks`	^5.1.0	React Hooks linting
`prettier`	^3.4.2	Code formatting
`typescript`	^5.7.2	Type checking

Code Quality

The project uses ESLint and Prettier for code quality enforcement.

Scripts

bun run lint          # Run ESLint
bun run lint:fix      # Run ESLint with auto-fix
bun run format        # Format code with Prettier
bun run format:check  # Check Prettier formatting
bun run typecheck     # Run TypeScript type checking
bun run check         # Run all checks (typecheck + lint + format:check)

ESLint Configuration

Uses flat config format (eslint.config.js)
TypeScript support via typescript-eslint
React and React Hooks plugins
Key rules:
- @typescript-eslint/consistent-type-imports: Enforces type imports
- @typescript-eslint/no-unused-vars: Errors on unused variables (allows _ prefix)
- @typescript-eslint/no-explicit-any: Warns on any type usage
- react-hooks/exhaustive-deps: Enforces correct hook dependencies

Prettier Configuration

4-space indentation
Double quotes
Semicolons required
100 character line width
ES5 trailing commas

Git Hooks

Pre-commit hooks are configured in .githooks/pre-commit. The hook runs:

TypeScript type checking
ESLint linting
Prettier format verification

To enable hooks after cloning:

bun run prepare  # or: git config core.hooksPath .githooks

The prepare script runs automatically on bun install.

Running

# Development
bun run dev

# With options
bun run dev --path /some/repo --log-level debug

# Build for distribution
bun run build         # builds to dist/
bun run link          # symlinks for global 'north' command

# Build standalone binaries
bun run build:binary       # current platform
bun run build:binary:mac-arm
bun run build:binary:linux

Build Process

The build has two outputs and no native dependencies:

JavaScript bundling: bun build compiles TypeScript to dist/index.js
Binary compilation: bun build --compile creates a standalone executable with Bun runtime embedded
No native dependencies: Uses only Bun's built-in APIs (Bun.spawn()) for shell commands

The compiled binary is completely self-contained and can be distributed as a single file with no external dependencies.

Testing

Uses Bun's built-in test runner:

bun test                    # run all tests
bun test --watch            # watch mode
bun test tests/openai*.ts   # run specific tests

Test coverage:

tests/openai-provider.test.ts: OpenAI provider tests
- Tool schema conversion (verifies Responses API format)
- Provider factory and message builders
- SSE streaming event parsing
- Error handling
tests/storage.test.ts: Storage layer tests
- Allowlist storage (per-project command allowlist)
- AutoAccept storage (per-project auto-accept settings)
- Global config storage (user preferences)
tests/tools-read.test.ts: Read tool tests
- get_file_outline HTML embedded block tests (style/script parsing)
- get_file_symbols HTML/CSS redirect hint tests
- search_text contextLines tests
tests/tools-edit.test.ts: Edit tool tests
- Prepare contract tests
- Trailing newline preservation
- Failure diagnostic tests (whitespace, near-miss, hints)
tests/tools-find-code-block.test.ts: Find code block tool tests
- CSS selector and @media/@keyframes detection
- HTML embedded style/script block parsing
- Helpful hints for HTML/CSS files
- Nested block deduplication
tests/tools-find-blocks.test.ts: Find blocks tool tests
- Mixed HTML parsing (embedded style/script)
- CSS rules inside style blocks
- JS symbols inside script blocks
- Kind filtering
- C#/PHP/Java symbol detection (namespaces, classes, methods, traits)
tests/tools-workflow.test.ts: Integration-style workflow tests
- Mixed HTML navigation patterns
- Edit failure diagnostics workflow
- Structure-first editing patterns
- CSS selector pre-checking
tests/tools-security.test.ts: Path traversal and symlink security tests
tests/tools-shell.test.ts: Shell service tests
tests/rules-cursor.test.ts: Cursor rules loader tests

Test Isolation:

Tests that interact with user storage use environment variable overrides to prevent modifying actual user data:

Config tests: Set NORTH_CONFIG_DIR to a temporary directory instead of manipulating HOME
Repo-scoped tests: Use createTempRepo() helper to create isolated temporary repositories
afterEach hooks ensure cleanup of temporary directories and restoration of environment variables
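
The environment-override pattern might be wrapped in a helper like the one below. `withTempConfigDir` is a hypothetical name (it is not the createTempRepo() helper mentioned above); the sketch only shows setting NORTH_CONFIG_DIR, restoring it, and removing the directory.

```typescript
import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Points NORTH_CONFIG_DIR at a fresh temp directory for the duration of `fn`,
// then restores the previous value and removes the directory.
function withTempConfigDir<T>(fn: (dir: string) => T): T {
    const previous = process.env.NORTH_CONFIG_DIR;
    const dir = mkdtempSync(join(tmpdir(), "north-test-"));
    process.env.NORTH_CONFIG_DIR = dir;
    try {
        return fn(dir);
    } finally {
        if (previous === undefined) delete process.env.NORTH_CONFIG_DIR;
        else process.env.NORTH_CONFIG_DIR = previous;
        rmSync(dir, { recursive: true, force: true });
    }
}
```

The same setup and teardown could equally live in beforeEach and afterEach hooks; the try/finally form guarantees restoration even when the test body throws.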

Environment

Required:

ANTHROPIC_API_KEY: For Claude models
OPENAI_API_KEY: For GPT models
At least one of the two keys must be set.

Logs

Location: ~/.local/state/north/north.log

Example entries:

{"timestamp":"2025-12-08T10:00:00.000Z","level":"info","event":"app_start","data":{"version":"0.1.0","projectPath":"/path/to/repo","cwd":"/path/to/repo"}}
{"timestamp":"2025-12-08T10:00:05.000Z","level":"info","event":"user_prompt","data":{"length":42}}
{"timestamp":"2025-12-08T10:00:05.001Z","level":"info","event":"model_request_start","data":{"requestId":"req-123-abc","model":"claude-sonnet-4-20250514"}}
{"timestamp":"2025-12-08T10:00:06.000Z","level":"info","event":"tool_call_start","data":{"toolName":"search_text","argsSummary":{"query":"useState","path":"src"}}}
{"timestamp":"2025-12-08T10:00:06.150Z","level":"info","event":"tool_call_complete","data":{"toolName":"search_text","durationMs":150,"ok":true}}
{"timestamp":"2025-12-08T10:00:08.500Z","level":"info","event":"model_request_complete","data":{"requestId":"req-123-abc","durationMs":3499}}

Todo Backlog

Technical debt and improvements are tracked in the todo/ folder. Each file represents a single actionable item with:

Severity level (Major/Minor/Trivial)
Affected location(s)
Problem description
Solution approach
Implementation notes

Files are numbered by priority (01-xx = Major, 05-xx = Minor, 10-xx+ = Trivial). Delete each file after completing the task: rm todo/XX-filename.md
