This document describes the current implementation state and module architecture.
| Milestone | Status |
|---|---|
| 1: Chat UI + streaming | ✅ Complete |
| 2: Read/search tools | ✅ Complete |
| 3: Deterministic edits + diff review | ✅ Complete |
| 4: Persistent PTY shell + approvals | ✅ Complete |
| 4.5: Slash commands + model switching | ✅ Complete |
| 4.6: Conversation save + resume | ✅ Complete |
| 5: Memory + project card cache | Not started |
| 6: UX polish | Not started |
Last verified: 2025-12-10
src/
├── index.ts # CLI entry point, arg parsing, app bootstrap
├── commands/
│ ├── index.ts # Command exports and registry factory
│ ├── types.ts # Command type definitions
│ ├── models.ts # Shared model list (alias, pinned, display)
│ ├── registry.ts # Command registry implementation
│ ├── parse.ts # Span-based command tokenizer
│ └── commands/
│ ├── quit.ts # /quit - exit application
│ ├── new.ts # /new - reset chat
│ ├── help.ts # /help - list commands
│ ├── model.ts # /model - switch Claude model
│ ├── mode.ts # /mode - switch conversation mode (ask/agent)
│ ├── summarize.ts # /summarize - summarize and trim transcript
│ ├── learn.ts # /learn - learn or relearn project codebase
│ ├── conversations.ts # /conversations - picker to switch conversations
│ └── resume.ts # /resume <id> - switch to conversation by ID
├── logging/
│ └── index.ts # Append-only JSON-lines logger
├── orchestrator/
│ └── index.ts # Conversation state, message flow, tool loop, commands, reviews
├── provider/
│ ├── index.ts # Provider factory, selects provider by model
│ ├── anthropic.ts # Claude streaming client (Anthropic Messages API)
│ └── openai.ts # GPT streaming client (OpenAI Responses API)
├── rules/
│ ├── index.ts # Rules module exports
│ └── cursor.ts # Cursor rules loader (.cursor/rules/*.mdc)
├── shell/
│ └── index.ts # Persistent PTY service with sentinel-based output parsing
├── storage/
│ ├── allowlist.ts # Per-project shell command allowlist (.north/allowlist.json)
│ ├── autoaccept.ts # Per-project edit auto-accept settings
│ ├── config.ts # Global config (~/.config/north/config.json)
│ ├── conversations.ts # Conversation persistence (event log + index)
│ ├── costs.ts # Global cost tracking (~/.north/costs.json)
│ └── profile.ts # Per-project learning profile storage
├── profile/
│ └── learn.ts # Project learning orchestration and discovery topics
├── tools/
│ ├── index.ts # Tool exports and registry factory
│ ├── types.ts # Tool type definitions (including edit and shell types)
│ ├── registry.ts # Tool registry implementation with approval policy
│ ├── list_root.ts # List repo root entries
│ ├── find_files.ts # Glob pattern file search
│ ├── search_text.ts # Text/regex search (ripgrep or fallback, supports file+range)
│ ├── read_file.ts # File content reader with ranges and head/tail inclusion
│ ├── get_line_count.ts # Quick file size checker
│ ├── get_file_symbols.ts # Symbol extraction (functions, classes, types)
│ ├── get_file_outline.ts # Hierarchical file structure outline
│ ├── read_readme.ts # README finder and reader
│ ├── detect_languages.ts # Language composition detector
│ ├── hotfiles.ts # Frequently modified files (git or fallback)
│ ├── edit_replace_exact.ts # Exact text replacement
│ ├── edit_insert_at_line.ts # Insert at line number
│ ├── edit_after_anchor.ts # Insert after anchor text
│ ├── edit_before_anchor.ts # Insert before anchor text
│ ├── edit_replace_block.ts # Replace content between anchors
│ ├── edit_create_file.ts # Create or overwrite file
│ ├── edit_apply_batch.ts # Atomic batch edits
│ ├── expand_output.ts # Retrieve cached digested outputs
│ ├── find_code_block.ts # Find code blocks containing text
│ ├── read_around.ts # Context window around anchor
│ ├── find_blocks.ts # Structural map without content
│ ├── edit_by_anchor.ts # Unified anchor-based editing
│ └── shell_run.ts # Shell command execution (requires approval)
├── ui/
│ ├── App.tsx # Root Ink component, SIGINT handling, review wiring
│ ├── Composer.tsx # Multiline input with slash command and @ file autocomplete
│ ├── CommandReview.tsx # Interactive picker for commands (e.g., model selection)
│ ├── DiffReview.tsx # Inline diff viewer with accept/reject
│ ├── ShellReview.tsx # Shell command approval with run/always/deny
│ ├── LearningPrompt.tsx # Project learning Y/N prompt
│ ├── LearningProgress.tsx # Learning progress indicator
│ ├── StatusLine.tsx # Model name, mode indicator, project path display
│ ├── Transcript.tsx # User/assistant/tool/review/command entry rendering
│ ├── ConversationList.tsx # Conversation list for north conversations
│ └── ConversationPicker.tsx # Conversation picker for north resume
└── utils/
├── repo.ts # Repo root detection
├── ignore.ts # Gitignore parsing and file walking
├── editing.ts # Diff computation and atomic file writes
├── tokens.ts # Token estimation for context tracking
├── retry.ts # Transient error retry with exponential backoff
├── fileindex.ts # File index for @ mention autocomplete
├── filepreview.ts # File preview + outline generation for context
├── fileblock.ts # NORTH_FILE streaming parser with events
├── filesession.ts # Streaming file writer with auto-resume
├── digest.ts # Tool output digesting for context efficiency
└── pricing.ts # Model pricing data and cost calculation
tests/
└── openai-provider.test.ts # OpenAI provider unit tests
- Parses CLI args and subcommands
- Supported subcommands:
  - `north` - start new conversation
  - `north resume <id>` - resume conversation by ID
  - `north resume` - open conversation picker
  - `north conversations` or `north list` - list recent conversations
- Flags: `--path`, `--log-level`
- Detects repo root from start directory
- Initializes logger
- Renders Ink app (or list/picker components for subcommands)
- Handles clean exit via `waitUntilExit()`
- Wires tool logging callbacks
- Generates conversation ID on new conversations
- Loads conversation state on resume
Registry-driven command system with span-based parsing, cursor-aware autocomplete, and interactive pickers.
Defines core types:
- `Mode`: `"ask" | "agent"` - conversation mode type
- `CommandDefinition`: name, description, usage, execute function
- `CommandContext`: orchestrator methods available to commands
- `ParsedArgs`: positional args and flags from parsing
- `PickerOption`: id, label, hint for interactive selection
- `CommandReviewStatus`: `"pending" | "selected" | "cancelled"`
- `StructuredSummary`: goal, decisions, constraints, openTasks, importantFiles
Centralized model list shared by /model command and Composer autocomplete:
- `ProviderType`: `"anthropic" | "openai"`
- `MODELS`: array of `{ alias, pinned, display, contextLimitTokens, provider, supportsThinking?, thinkingBudget? }`
- `resolveModelId(input)`: maps alias or pinned ID to pinned ID (supports both Claude and GPT prefixes)
- `getModelDisplay(modelId)`: returns human-readable name
- `getModelContextLimit(modelId)`: returns context limit in tokens
- `getModelProvider(modelId)`: returns provider type for model
- `getModelThinkingConfig(modelId)`: returns thinking config if model supports extended thinking
- `isThinkingModel(modelId)`: returns true if model ID ends with the `-thinking` suffix
- `DEFAULT_MODEL`: default pinned model ID (Claude Sonnet 4)
Extended Thinking Model Selection:
- Thinking is now selected at model-choice time, not toggled separately
- Each Anthropic model exists twice: once without thinking, once with a `-thinking` suffix
- Example: `claude-sonnet-4-20250514` (fast) vs `claude-sonnet-4-20250514-thinking` (with extended thinking)
- When you select a model with the `-thinking` suffix, extended thinking is enabled automatically
- OpenAI models don't have thinking variants (reasoning is built into those models)
Supported Model Variants:
- Anthropic:
- sonnet-4, sonnet-4-thinking
- opus-4, opus-4-thinking
- opus-4-1, opus-4-1-thinking
- sonnet-4-5, sonnet-4-5-thinking
- haiku-4-5, haiku-4-5-thinking
- opus-4-5, opus-4-5-thinking
- OpenAI: gpt-5.1, gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.1-codex-max, gpt-5, gpt-5-mini, gpt-5-nano
- In-process registry mapping command name -> definition
- `register(command)`: add command to registry
- `has(name)`: check if command exists (used by parser)
- `list()`: get all commands (used by /help and autocomplete)
- `execute(name, ctx, args)`: run command with error handling
Span-based tokenizer for reliable command extraction:
- `parseCommandInvocations(input, registry)`: returns `{ invocations, remainingText }`
- Each invocation has `name`, `args`, `span` (start/end indices)
- Parsing rules:
  - `/name` must be preceded by start-of-line or whitespace
  - Args stop at the next `/name` token (unless inside quotes)
  - Supports `--flag value` and `-f` short flags
  - Quoted strings preserve whitespace
- `remainingText` computed by slicing out spans in reverse order
- `getTokenAtCursor(value, cursorPos)`: for autocomplete
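The `remainingText` computation can be sketched as follows; `removeSpans` and the `Span` shape are hypothetical stand-ins for the parser's span output, not the project's actual code:

```typescript
// Hypothetical sketch: slice command spans out of the input in reverse
// order so that earlier start indices remain valid after each removal.
interface Span {
  start: number;
  end: number;
}

function removeSpans(input: string, spans: Span[]): string {
  const sorted = [...spans].sort((a, b) => b.start - a.start);
  let result = input;
  for (const { start, end } of sorted) {
    result = result.slice(0, start) + result.slice(end);
  }
  // Collapse leftover whitespace where commands were removed.
  return result.replace(/\s+/g, " ").trim();
}
```

Removing spans back-to-front is what makes the slicing safe: a removal never shifts the indices of the spans that precede it.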
| Command | Usage | Purpose |
|---|---|---|
| `/quit` | `/quit` | Exit North cleanly |
| `/new` | `/new` | Reset chat (clears transcript + summary, keeps PTY) |
| `/help` | `/help` | List available commands |
| `/model` | `/model [alias]` | Switch model (with picker if no arg) |
| `/mode` | `/mode [ask\|agent]` | Switch conversation mode (with picker if no arg) |
| `/summarize` | `/summarize [--keep-last N]` | Summarize conversation, trim transcript |
| `/costs` | `/costs` | Show cost breakdown dialog by model/provider |
| `/learn` | `/learn` | Learn or relearn project codebase |
| `/conversations` | `/conversations` | Picker to switch conversations |
| `/resume` | `/resume <id>` | Switch to conversation by ID |
- Writes to `~/.local/state/north/north.log`
- JSON-lines format (one JSON object per line)
- Events: `app_start`, `user_prompt`, `model_request_start`, `model_request_complete`, `tool_call_start`, `tool_call_complete`, `write_review_shown`, `write_review_decision`, `write_apply_start`, `write_apply_complete`, `shell_review_shown`, `shell_review_decision`, `shell_run_start`, `shell_run_complete`, `app_exit`
- Silent fail on write errors (logging must not crash the app)
- Owns `transcript` (array of `TranscriptEntry`)
- Owns `isProcessing`, `pendingReviewId`, `currentModel`, `rollingSummary`
- Owns `contextUsedTokens`, `contextLimitTokens`, `contextUsage` for context tracking
- Receives `cursorRulesText` in context (loaded once at startup)
- Owns command registry via `createCommandRegistryWithAllCommands()`
- Preprocesses user input for slash commands before sending to Claude
- Accepts mode parameter in `sendMessage(content, mode)` to filter available tools
- In Plan mode only: enforces plan requirement (write tools blocked until plan is accepted)
- Implements the tool call loop:
  - Parse and execute any slash commands in input
  - Add a `command_executed` entry for each command
  - If `remainingText` is non-empty, append user entry to transcript
  - Create assistant entry with `isStreaming: true`
  - Build messages and estimate token usage
  - If context usage >= 92%, auto-summarize conversation
  - Send messages to Claude with tool schemas and current model
  - Stream response text (throttled at ~32ms)
  - If `stopReason === "tool_use"`:
    - Execute each tool via registry
    - For `approvalPolicy: "write"`: create `diff_review` entry, block for user decision
    - For `approvalPolicy: "shell"`: check allowlist, create `shell_review` if not allowed
    - On accept/run: apply edits or execute command, send result to Claude
    - On reject/deny: send rejection/denial to Claude
  - Continue until Claude stops requesting tools
- Streaming throttle: buffer chunks, flush every 32ms or on complete
- Emits state changes via `onStateChange` callback (includes `currentModel`)
- `buildMessagesForClaude()`: excludes `command_review` and `command_executed` entries
- Prepends `cursorRulesText` as first context block if present
- Prepends `rollingSummary` as second context block if present
- Exposes `resolveWriteReview(reviewId, decision)` for UI to signal accept/reject
- Exposes `resolveShellReview(reviewId, decision)` for UI to signal run/always/deny
- Exposes `resolveCommandReview(reviewId, decision)` for UI to signal selection/cancel
- Exposes `getCommandRegistry()` for Composer autocomplete
- Exposes `cancel()` for interrupting ongoing operations (CTRL+C during processing)
- Exposes `stop()` for clean exit (CTRL+C when idle)
- Exposes `isProcessing()` for checking if an operation is in progress
- Loads Cursor project rules from the `.cursor/rules/` directory
- Walks the directory recursively, collecting all `*.mdc` files
- Parses optional YAML frontmatter, extracts body content
- Returns stable order (sorted by relativePath)
- Hard cap at 30KB total size, truncates with a `[truncated]` marker
- API: `loadCursorRules(repoRoot)` returns `LoadedCursorRules | null`
- `LoadedCursorRules`: `{ rules, text, truncated }`
- `CursorRule`: `{ name, relativePath, body }`
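A minimal sketch of the 30KB cap and `[truncated]` marker described above (the real loader's internals may differ; this version slices by characters as an approximation of the byte limit):

```typescript
// Sketch of the rules-size cap: return the text unchanged when under the
// limit, otherwise cut it off and append the [truncated] marker.
const MAX_RULES_BYTES = 30 * 1024;

function capRulesText(text: string): { text: string; truncated: boolean } {
  if (Buffer.byteLength(text, "utf8") <= MAX_RULES_BYTES) {
    return { text, truncated: false };
  }
  // Character slice approximates the byte cap for mostly-ASCII rule files.
  return { text: text.slice(0, MAX_RULES_BYTES) + "\n[truncated]", truncated: true };
}
```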
- Simple shell command execution using Bun's built-in `Bun.spawn()` API
- No external dependencies - works in standalone compiled binaries
- Each command spawns a fresh bash process (no persistent session)
- Uses `bash -c` for command execution
- Timeout handling: kills the process after timeout (default 60s)
- Cancellation support: accepts an `AbortSignal` option to kill running commands on CTRL+C
- Properly separates stdout and stderr streams
- Per-project service caching for a consistent interface
- API: `getShellService(repoRoot, logger)` returns a service with `run(command, options)` and `dispose()`
- Run options: `cwd`, `timeoutMs`, `signal` (AbortSignal for cancellation)
- `disposeAllShellServices()` cleans up all cached services on exit
- Per-project shell command allowlist at `.north/allowlist.json`
- Simple JSON format: `{ "allowedCommands": ["pnpm test", "bun test"] }`
- API: `isCommandAllowed(repoRoot, command)`, `allowCommand(repoRoot, command)`, `getAllowedCommands(repoRoot)`
- Exact string matching only (no patterns)
- Creates the `.north/` directory on first write
- Global configuration at `~/.config/north/config.json`
- Stores user preferences that persist across sessions
- Currently stores: `selectedModel` (persisted model selection)
- API: `getSavedModel()` returns the saved model ID or null; `saveSelectedModel(modelId)` persists the selection
- Test isolation: respects the `NORTH_CONFIG_DIR` environment variable to override the config directory for testing (prevents tests from modifying the user's actual config)
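The exact-match allowlist check can be sketched like this (a hypothetical reimplementation using `node:fs`; the project itself runs on Bun, where the same module is available):

```typescript
// Sketch of isCommandAllowed: exact string membership in
// .north/allowlist.json, with a missing/unreadable file meaning "deny".
import { readFileSync } from "node:fs";
import { join } from "node:path";

function isCommandAllowed(repoRoot: string, command: string): boolean {
  try {
    const raw = readFileSync(join(repoRoot, ".north", "allowlist.json"), "utf8");
    const data = JSON.parse(raw) as { allowedCommands?: string[] };
    return (data.allowedCommands ?? []).includes(command);
  } catch {
    // No allowlist file (or unparseable) means nothing is pre-approved.
    return false;
  }
}
```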
- Per-project auto-accept settings at `.north/autoaccept.json`
- JSON format: `{ "editsAutoAccept": boolean, "shellAutoApprove": boolean }`
- Edit API: `isEditsAutoAcceptEnabled(repoRoot)`, `enableEditsAutoAccept(repoRoot)`, `disableEditsAutoAccept(repoRoot)`
- Shell API: `isShellAutoApproveEnabled(repoRoot)`, `enableShellAutoApprove(repoRoot)`, `disableShellAutoApprove(repoRoot)`
- When edits auto-accept is enabled, all edit tool results are applied automatically without user confirmation
- When shell auto-approve is enabled, all shell commands run automatically without individual approval
- Creates the `.north/` directory on first write
- Global API cost tracking at `~/.north/costs.json`
- JSON format: `{ "allTimeCostUsd": number, "byModel": Record<string, ModelCost>, "lastUpdated": number }`
- `ModelCost`: `{ inputTokens: number, outputTokens: number, costUsd: number }`
- API: `getAllTimeCost()`, `getCostBreakdown()`, `addCostByModel()`, `resetAllTimeCost()`
- `addCostByModel(modelId, inputTokens, outputTokens, costUsd)` accumulates per-model and updates the total
- `getCostBreakdown()` returns the full breakdown for the `/costs` dialog
- Creates the `~/.north/` directory on first write
- Test isolation: respects the `NORTH_DATA_DIR` environment variable to override the data directory
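The accumulation step can be sketched as a pure function over the JSON shape above (persistence omitted; the field names are taken from the documented format):

```typescript
// Sketch of per-model cost accumulation: update the matching ModelCost
// entry and the all-time total in one pass.
interface ModelCost { inputTokens: number; outputTokens: number; costUsd: number }
interface CostsFile {
  allTimeCostUsd: number;
  byModel: Record<string, ModelCost>;
  lastUpdated: number;
}

function addCostByModel(
  costs: CostsFile,
  modelId: string,
  inputTokens: number,
  outputTokens: number,
  costUsd: number,
): CostsFile {
  const prev = costs.byModel[modelId] ?? { inputTokens: 0, outputTokens: 0, costUsd: 0 };
  return {
    allTimeCostUsd: costs.allTimeCostUsd + costUsd,
    byModel: {
      ...costs.byModel,
      [modelId]: {
        inputTokens: prev.inputTokens + inputTokens,
        outputTokens: prev.outputTokens + outputTokens,
        costUsd: prev.costUsd + costUsd,
      },
    },
    lastUpdated: Date.now(),
  };
}
```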
- Per-project learning profile storage at `~/.north/projects/<hash>/profile.md`
- Hash-based project identification using SHA-256 of the repo root path (16 chars)
- Profile stored in markdown format with H2 sections for each discovery topic
- Declined state tracked via a `declined.json` marker file
- API: `hasProfile(repoRoot)`, `loadProfile(repoRoot)`, `saveProfile(repoRoot, content)`
- API: `hasDeclined(repoRoot)`, `markDeclined(repoRoot)`, `clearDeclined(repoRoot)`
- `getProjectHash(repoRoot)` generates a stable hash for directory identification
- Storage location keeps repos clean (no commits of generated content)
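The stable 16-character hash described above is straightforward with Node's (or Bun's) `crypto` module:

```typescript
// SHA-256 of the repo root path, truncated to 16 hex characters, matching
// the directory-identification scheme described above.
import { createHash } from "node:crypto";

function getProjectHash(repoRoot: string): string {
  return createHash("sha256").update(repoRoot).digest("hex").slice(0, 16);
}
```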
- Conversation persistence at `~/.north/conversations/`
- Each conversation identified by a 6-char hex ID (e.g., `abc123`)
- Event log format: `<id>.jsonl` (append-only JSONL for crash safety)
- Optional snapshot: `<id>.snapshot.json` (full state for fast resume)
- Index file: `index.json` (conversation metadata for listing)
- Event types: `conversation_started`, `entry_added`, `entry_updated`, `model_changed`, `rolling_summary_set`, `conversation_ended`
- API: `generateConversationId()`, `startConversation()`, `loadConversation()`, `listConversations()`
- API: `logEntryAdded()`, `logEntryUpdated()`, `logModelChanged()`, `logRollingSummarySet()`, `logConversationEnded()`
- Stores both `repoRoot` (path) and `repoHash` (stable ID) for portability
- Resume validates that repoRoot exists, warns if missing
- Project learning orchestration with 10 discovery topics
- Topics: summary, architecture, conventions, vocabulary, data flow, dependencies, workflow, hotspots, playbook, safety
- Runs sequential LLM sessions with read-only tools for each topic
- Uses custom system prompt focused on concise exploration
- Progress callback for UI updates (percent + topic name)
- Tool filtering: only read-only tools available during learning
- Returns complete markdown profile with H2 sections
- Maximum 5 tool use iterations per topic to prevent infinite loops
- Error handling: continues to next topic on failure
- Exports `createProviderForModel(modelId)`: creates the correct provider based on model prefix
- `getModelProvider(modelId)`: returns "anthropic" or "openai" based on model
- Re-exports common types: `Provider`, `Message`, `StreamCallbacks`, `ToolCall`, etc.
- Orchestrator uses this to dynamically switch providers when `/model` changes
- Wraps `@anthropic-ai/sdk`
- Default model: `claude-sonnet-4-20250514`
- Handles the `-thinking` suffix in model IDs:
  - Model IDs may end with a `-thinking` suffix (e.g., `claude-sonnet-4-20250514-thinking`)
  - Provider automatically strips the suffix before the API call
  - Passes thinking config separately via the `thinking` option
- Streaming via `client.messages.stream()` (Messages API)
- Supports tool definitions and tool_use blocks
- Per-request options: `model`, `tools`, `systemOverride`, `signal` (AbortSignal), `thinking` (ThinkingConfig)
- `systemOverride` replaces the default system prompt (used for summarization)
- Callbacks: `onChunk`, `onToolCall`, `onThinking`, `onComplete`, `onError`
- Abort support: checks the signal during the stream loop, returns `stopReason: "cancelled"`
- Helpers for building tool result and assistant messages

Extended Thinking Support:
- `ThinkingConfig`: `{ type: "enabled", budget_tokens: number }` enables Claude's thinking mode
- Handles `thinking_delta` and `signature_delta` events during streaming
- `ThinkingBlock`: contains summarized thinking text and signature (for API continuity)
- Thinking blocks must be preserved and passed back unmodified during tool loops
- `buildAssistantMessage()` includes thinking blocks when provided
- `StreamResult` includes a `thinkingBlocks` array
- Uses native fetch with SSE streaming (no SDK dependency)
- Endpoint: `https://api.openai.com/v1/responses` (Responses API)
- Default model: `gpt-5.1`
- Note: OpenAI models do not have `-thinking` variants; reasoning is intrinsic to those models
- Streaming via SSE events: `response.output_text.delta`, `response.function_call_arguments.delta`
- Tool format converted to OpenAI function format: `{ type: "function", function: { name, description, parameters } }`
- Tool results sent as `function_call_output` items with matching `call_id`
- Per-request options: same interface as the Anthropic provider
- Abort support: passes the AbortSignal to fetch, returns `stopReason: "cancelled"`
- Env var: `OPENAI_API_KEY` required
Supported OpenAI Models:
| Alias | Model ID | Description |
|---|---|---|
| gpt-5.1 | gpt-5.1 | GPT-5.1 flagship |
| gpt-5.1-codex | gpt-5.1-codex | Optimized for coding |
| gpt-5.1-codex-mini | gpt-5.1-codex-mini | Faster coding variant |
| gpt-5.1-codex-max | gpt-5.1-codex-max | Maximum capability coding |
| gpt-5 | gpt-5 | GPT-5 flagship |
| gpt-5-mini | gpt-5-mini | Faster GPT-5 variant |
| gpt-5-nano | gpt-5-nano | Fastest/cheapest GPT-5 |
- In-process registry mapping tool name -> definition
- Each tool has: name, description, inputSchema, approvalPolicy, execute()
- `getSchemas()` returns tool definitions for the Claude API
- `getApprovalPolicy()` returns "none", "write", or "shell" for a tool
- `execute()` runs the tool and returns a structured result
All tools follow the pattern:
- Input validation
- Operation scoped to repoRoot
- Structured result with `ok`, `data`, `error`
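A minimal sketch of the registry and result shape described above (type names are illustrative; the project's real definitions live in `src/tools/types.ts`, and `execute()` is shown synchronously for brevity even though real tools are likely async):

```typescript
// Sketch of the tool registry: name -> definition, with approval policy
// lookup, schema export, and guarded execution.
type ApprovalPolicy = "none" | "write" | "shell";

interface ToolResult { ok: boolean; data?: unknown; error?: string }

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: object;
  approvalPolicy: ApprovalPolicy;
  execute(input: unknown): ToolResult;
}

function createToolRegistry() {
  const tools = new Map<string, ToolDefinition>();
  return {
    register(tool: ToolDefinition) { tools.set(tool.name, tool); },
    getApprovalPolicy(name: string): ApprovalPolicy | undefined {
      return tools.get(name)?.approvalPolicy;
    },
    getSchemas() {
      return [...tools.values()].map(({ name, description, inputSchema }) => ({
        name, description, inputSchema,
      }));
    },
    execute(name: string, input: unknown): ToolResult {
      const tool = tools.get(name);
      if (!tool) return { ok: false, error: `unknown tool: ${name}` };
      return tool.execute(input);
    },
  };
}
```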
| Tool | Purpose | Key Features |
|---|---|---|
| `list_root` | List repo root entries | Respects .gitignore |
| `find_files` | Glob pattern search | Case-insensitive, limit |
| `search_text` | Text/regex search | Uses ripgrep if available, supports file+line range scope, optional contextLines (1-5) |
| `read_file` | Read file content | Line ranges, head/tail inclusion; use read_around for text search |
| `get_line_count` | Check file size | Quick stats before reading large files |
| `get_file_symbols` | Extract symbols | Functions, classes, types, interfaces (TS/JS/Py/Rust/Go/Java); redirects to find_blocks for HTML/CSS |
| `get_file_outline` | File structure outline | Hierarchical view with line numbers (TS/JS/Py/HTML/CSS) |
| `read_readme` | Read README | Auto-detect README.* |
| `detect_languages` | Language composition | By extension and size |
| `hotfiles` | Important files | Git history or fallback |
| `find_code_block` | Find code blocks | Locate functions/classes containing text, deduplicates nested HTML blocks |
| `expand_output` | Retrieve full output | Access cached digested tool outputs |
| `edit_replace_exact` | Replace exact text | Requires approval, enhanced failure diagnostics (whitespace, near-miss) |
| `edit_insert_at_line` | Insert at line | 1-based, requires approval |
| `edit_after_anchor` | Insert after anchor | Anchor-based insertion, handles multiple matches |
| `edit_before_anchor` | Insert before anchor | Anchor-based insertion, handles multiple matches |
| `edit_replace_block` | Replace between anchors | Replace content between two text markers |
| `edit_create_file` | Create/overwrite file | Requires approval |
| `edit_apply_batch` | Atomic batch edits | All-or-nothing, requires approval |
| `shell_run` | Execute shell command | Persistent PTY, requires approval or allowlist, stderr merged into stdout |
| `read_around` | Context window | Asymmetric before/after lines around anchor, occurrence handling |
| `find_blocks` | Structural map | Block coordinates without content (html_section, css_rule, js_ts_symbol, csharp_symbol, php_symbol, java_symbol) |
| `edit_by_anchor` | Unified anchor edit | Four modes: insert_before, insert_after, replace_line, replace_between |
North implements a context-efficient digesting layer that stores full tool outputs locally but forwards only condensed summaries to the model:
Digest Strategies by Tool:
| Tool | Digest Format |
|---|---|
| `read_file` | First 50 lines + "... N more lines" + last 10 lines |
| `search_text` | First 10 matches with context, total count |
| `find_files` | First 20 files + "... N more" |
| `shell_run` | First 20 lines + last 10 lines of stdout |
| Others | Pass through (already compact) |
Cache Behavior:
- Full outputs are cached per conversation turn
- Cache is cleared at the start of each `sendMessage()`
- Use the `expand_output` tool to retrieve full cached output
- Digested outputs include `outputId` and `digestNote` for retrieval
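The `read_file` digest strategy from the table above can be sketched as:

```typescript
// First `head` lines, an elision marker with the omitted count, then the
// last `tail` lines; short files pass through untouched.
function digestReadFile(content: string, head = 50, tail = 10): string {
  const lines = content.split("\n");
  if (lines.length <= head + tail) return content;
  const omitted = lines.length - head - tail;
  return [
    ...lines.slice(0, head),
    `... ${omitted} more lines`,
    ...lines.slice(-tail),
  ].join("\n");
}
```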
Implementation:
- `src/utils/digest.ts`: `digestToolOutput()` function with per-tool strategies
- `src/tools/expand_output.ts`: tool to retrieve cached full outputs
- Orchestrator integrates the digest layer in `executeToolCall()`
North provides anchor-based edit tools that address content by text patterns instead of brittle line numbers:
Tools:
- `edit_after_anchor`: insert content after a line containing anchor text
- `edit_before_anchor`: insert content before a line containing anchor text
- `edit_replace_block`: replace content between two anchor markers
Behavior:
- If the anchor appears once: operation proceeds
- If the anchor appears multiple times without `occurrence` specified: returns a candidates list
- Candidates include line number and preview for disambiguation
- Anchor-based edits are more reliable than line numbers across file changes
Example:

```typescript
// Instead of: edit_insert_at_line({ path, line: 42, content })
// Use: edit_after_anchor({ path, anchor: "function setupApp() {", content })
```

edit_replace_exact provides enhanced failure diagnostics when text is not found:
Whitespace Detection:
- Tab vs space indentation mismatches
- CRLF vs LF line ending differences
- Trailing whitespace mismatches
Near-Miss Candidates:
- Uses Levenshtein distance to find lines similar to the search text
- Reports character-level differences (e.g., "differs at position 12: 'a' vs 'e'")
- Shows line numbers for near matches
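The distance metric named above is the classic dynamic-programming Levenshtein distance; a reference implementation (the tool's actual ranking heuristics around it may differ):

```typescript
// Standard Levenshtein edit distance: minimum number of single-character
// insertions, deletions, and substitutions to turn `a` into `b`.
function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0)),
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      );
    }
  }
  return dp[a.length][b.length];
}
```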
Actionable Hints:
- Suggests `read_around` for verification
- Recommends anchor-based editing as an alternative
Example error output:
```
Text not found in file.
Possible whitespace issues:
- Your search uses tabs but file uses spaces for indentation
Near matches found:
- Line 42: "const myVariable = 1;"
  (differs at position 12: 'a' vs 'e')
Hint: Use read_around to see exact content, or use anchor-based editing (edit_by_anchor).
```
find_code_block enables "jump to place" navigation without multiple search/read cycles:
Input:
- `path`: file to search
- `query`: text to find within blocks
- `kind`: optional filter - "function", "class", "method", "block", "any"
Output:
- `matches`: array of blocks containing the query
- Each match includes: `startLine`, `endLine`, `snippet` (first 5 lines), `kind`, `name`
- `hint`: helpful tip when no blocks match but text exists (HTML/CSS files suggest find_blocks)
Supported Languages:
- TypeScript/JavaScript: functions, classes, methods
- Python: functions, classes (indentation-based)
- CSS/SCSS: selectors, `@media` queries, `@keyframes` animations
- HTML: semantic sections, embedded `<style>` blocks with CSS rules, embedded `<script>` blocks with JS symbols
- Generic: brace-delimited blocks
Helpful Hints:
When searching HTML/CSS files and no code blocks contain the query (but the text exists in the file), the tool returns a hint suggesting find_blocks for better structural navigation of CSS selectors, @media queries, and embedded blocks.
The tool system includes specialized tools to efficiently navigate and understand large files without reading entire contents:
Tool Chain for Large Files:
- Check size first: Use `get_line_count` to determine file size before reading
- Understand structure: Use `get_file_symbols` or `get_file_outline` to see what's in the file
- Find targets: Use `search_text` with file+lineRange to locate specific content
- Read strategically: Use `read_file` with specific line ranges and optional context
Symbol Extraction (get_file_symbols):
- Regex-based parsing (fast, no dependencies)
- Supported languages: TypeScript, JavaScript, Python, Rust, Go, Java
- Extracts: functions, classes, interfaces, types, enums, methods
- Returns: symbol name, type, line number, signature preview
- Use case: "Where is function X defined?" or "What classes are in this file?"
File Outline (get_file_outline):
- Hierarchical structure with line ranges
- TypeScript/JavaScript: imports, symbols, exports
- Python: imports, classes (with methods), functions
- HTML: major sections (head, body, main, section), elements with IDs, embedded content parsing
- CSS/SCSS/Less: selectors with line ranges, media queries, keyframes
- Generic fallback: 50-line chunks
- Use case: "Show me the overall structure of this 1000-line file"
HTML Embedded Block Parsing:
For HTML files, get_file_outline now parses embedded <style> and <script> blocks:
- `<style>` blocks: shows CSS rules inside with a nested indicator (└─ .selector)
- `<script>` blocks: shows JS symbols (functions, classes) with a nested indicator
- Example output includes: `<style>`, `└─ .card`, `└─ @keyframes fadeIn`, `<script>`, `└─ function init`
Enhanced Search (search_text):
- New `file` parameter: search within a specific file only
- New `lineRange` parameter: search within a specific line range
- New `contextLines` parameter: include 1-5 lines of context before/after each match
- Language hints in description: "For TypeScript: search for 'export function'"
- Use case: "Find all uses of X within lines 100-200 of file.ts"
- Context use case: `search_text({ query: "target", contextLines: 2 })` reduces follow-up read_around calls
Read File (read_file):
- `range`: read a specific line range (1-indexed)
- `includeHeadTail`: always include the first 10 and last 10 lines for orientation
- Use the `read_around` tool instead for text-based searching with context
System Prompt Guidance: The provider system prompts now explicitly instruct the LLM to:
- Check file size before reading files >200 lines
- Use symbols/outline tools to understand structure first
- Never read entire files when only one section is needed
- Chain tools strategically: outline → search → targeted read
Expected Impact:
- 60-80% token reduction when working with large files
- Faster symbol lookups without full reads
- Better targeting: LLM reads only what's needed
- Clearer guidance through concrete strategies
read_around provides a focused context window around an anchor string:
Input:
- `path`: file to read
- `anchor`: text to find
- `before`: lines before match (default: 12)
- `after`: lines after match (default: 20)
- `occurrence`: which occurrence (1-based, required if multiple matches)
Output:
- `totalLines`: file length
- `matchCount`: how many times the anchor appears
- `occurrenceUsed`: which occurrence was returned
- `matchLine`: line number of the match
- `content`: lines with line numbers, match line marked with `>`
Behavior:
- 0 matches: error suggesting `search_text`
- Multiple matches without occurrence: error listing candidates with previews
- A single call replaces the "search → read range" pattern
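The window extraction can be sketched using the defaults and error cases listed above (output formatting here is illustrative, not the tool's exact output):

```typescript
// Sketch of read_around: find the anchor, resolve ambiguity via
// `occurrence`, and return a numbered window with the match marked ">".
function readAround(
  text: string,
  anchor: string,
  before = 12,
  after = 20,
  occurrence?: number,
): { matchLine: number; content: string } | { error: string } {
  const lines = text.split("\n");
  const hits = lines.flatMap((line, i) => (line.includes(anchor) ? [i] : []));
  if (hits.length === 0) return { error: "anchor not found; try search_text" };
  if (hits.length > 1 && occurrence === undefined) {
    return { error: `${hits.length} matches; pass occurrence to disambiguate` };
  }
  const idx = hits[(occurrence ?? 1) - 1];
  const start = Math.max(0, idx - before);
  const end = Math.min(lines.length, idx + after + 1);
  const content = lines
    .slice(start, end)
    .map((line, i) => `${start + i === idx ? ">" : " "} ${start + i + 1}: ${line}`)
    .join("\n");
  return { matchLine: idx + 1, content };
}
```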
find_blocks returns a structural map with coordinates but no content:
Input:
- `path`: file to map
- `kind`: filter - `html_section`, `css_rule`, `js_ts_symbol`, or `all` (default: auto-detect)
Output:
- `totalLines`: file length
- `blocks`: array of `{ id, label, startLine, endLine }`
Supported Kinds:
- `html_section`: `<section>`, `<article>`, `<nav>`, elements with IDs
- `css_rule`: selectors, `@media`, `@keyframes`
- `js_ts_symbol`: functions, classes, interfaces, types, React components
- `csharp_symbol`: namespaces, classes, structs, interfaces, methods, properties, enums
- `php_symbol`: namespaces, classes, interfaces, traits, functions, methods
- `java_symbol`: packages, classes, interfaces, enums, methods
Mixed HTML Support:
For HTML files with embedded <style> and <script> blocks, find_blocks automatically detects and parses both:
- Returns the `<style>` block itself with its line range
- Parses CSS rules inside the style block (selectors, @media, @keyframes)
- Returns the `<script>` block itself with its line range
- Parses JS/TS symbols inside the script block (functions, classes)
Example output for mixed HTML:
```
blocks: [
  { id: "html-0", label: "<header>", startLine: 5, endLine: 20 },
  { id: "style-0", label: "<style> (lines 22-45)", startLine: 22, endLine: 45 },
  { id: "style0-css-0", label: ".site-footer", startLine: 24, endLine: 28 },
  { id: "script-0", label: "<script> (lines 50-80)", startLine: 50, endLine: 80 },
  { id: "script0-js-0", label: "function initApp", startLine: 52, endLine: 65 }
]
```
Use case: Get coordinates in one call, then use read_around for targeted reading. For mixed HTML files, use this to locate specific CSS rules or JS functions before editing.
edit_by_anchor provides unified anchor-based editing with four modes:
Input:
- `path`: file to edit
- `mode`: `insert_before`, `insert_after`, `replace_line`, or `replace_between`
- `anchor`: primary anchor text
- `anchorEnd`: end anchor (required for `replace_between`)
- `content`: content to insert/replace
- `occurrence`: which occurrence (1-based, required if multiple matches)
- `inclusive`: for `replace_between` - replace the anchor lines too (default: false)
Mode Behaviors:
| Mode | Effect |
|---|---|
| `insert_before` | Insert content before the anchor line |
| `insert_after` | Insert content after the anchor line |
| `replace_line` | Replace the anchor line with content |
| `replace_between` | Replace content between two anchors |
Safety:
- 0 matches: error
- Multiple matches without occurrence: error listing candidates
- `replace_line` mode is a new capability (replaces the anchor line itself)
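The four modes can be sketched as list-splice operations on an already-resolved anchor index (ambiguity and zero-match handling happen before this step, per the safety rules above; names and parameter order are illustrative):

```typescript
// Sketch of the four edit modes applied to resolved line indices.
type EditMode = "insert_before" | "insert_after" | "replace_line" | "replace_between";

function applyAnchorEdit(
  lines: string[],
  mode: EditMode,
  anchorIdx: number,   // 0-based index of the resolved anchor line
  content: string[],
  endIdx?: number,     // resolved anchorEnd index, for replace_between
  inclusive = false,   // replace_between: also replace the anchor lines
): string[] {
  const out = [...lines];
  switch (mode) {
    case "insert_before": out.splice(anchorIdx, 0, ...content); break;
    case "insert_after": out.splice(anchorIdx + 1, 0, ...content); break;
    case "replace_line": out.splice(anchorIdx, 1, ...content); break;
    case "replace_between": {
      const end = endIdx ?? anchorIdx;
      const from = inclusive ? anchorIdx : anchorIdx + 1;
      const count = (inclusive ? end + 1 : end) - from;
      out.splice(from, count, ...content);
      break;
    }
  }
  return out;
}
```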
- Parses .gitignore patterns
- Always ignores common directories (node_modules, .git, etc.)
- `createIgnoreChecker()` returns a checker with `isIgnored(path, isDir)`
- `walkDirectory()` recursively walks the repo respecting ignores
- `listRootEntries()` lists root-level entries
- Root Ink component, wires orchestrator to UI state
- Uses alternate screen buffer via `useAlternateScreen()` hook (like htop/less)
- Tracks terminal dimensions via `useTerminalSize()` hook for viewport calculations
- CTRL+C handling via `useInput`: cancel if processing, exit if idle
- Requires `exitOnCtrlC: false` in render options to prevent Ink's default exit behavior
- Delegates review decisions to orchestrator methods (write, shell, command, plan)
- Tracks `isProcessing`, `pendingReviewId`, `nextMode`, and `scrollOffset` for UI state
- Passes mode to orchestrator on message submission
- Auto-resets scroll to bottom when transcript changes
- Layout: ScrollableTranscript (viewport-height top), Composer (fixed bottom), StatusLine (fixed bottom)
- Custom hook that switches terminal to alternate screen buffer on mount
- Uses ANSI escape codes: `\x1b[?1049h` (enter) and `\x1b[?1049l` (exit)
- Hides cursor during render, shows on exit
- Alternate screen means transcript is not in terminal scrollback after exit
- Similar behavior to `less`, `htop`, `vim`
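The escape-code mechanics can be sketched without React (the real hook wraps this enter/exit pair in a `useEffect` mount/cleanup; the function and constant names here are illustrative):

```typescript
// Alternate-screen and cursor escape sequences, as used by terminal UIs.
const ENTER_ALT_SCREEN = "\x1b[?1049h";
const EXIT_ALT_SCREEN = "\x1b[?1049l";
const HIDE_CURSOR = "\x1b[?25l";
const SHOW_CURSOR = "\x1b[?25h";

// On "mount" switch to the alternate buffer and hide the cursor;
// the returned cleanup restores the normal buffer (like the hook's unmount).
function enterAlternateScreen(write: (s: string) => void): () => void {
  write(ENTER_ALT_SCREEN + HIDE_CURSOR);
  return () => write(EXIT_ALT_SCREEN + SHOW_CURSOR);
}
```

In the app, `write` would be `process.stdout.write.bind(process.stdout)`; because the alternate buffer is discarded on exit, the transcript never lands in terminal scrollback.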
- Custom hook that tracks terminal dimensions (rows and columns)
- Listens to stdout "resize" events for dynamic updates
- Returns a `{ rows, columns }` object
- Used by App to calculate viewport height for ScrollableTranscript
- Full-width status bar using `width="100%"` and `justifyContent="space-between"`
- Left side: project name with truncation for long names (`wrap="truncate"`)
- Right side: scroll indicator, current model name, context usage, and cost display
- Scroll indicator: yellow [SCROLL] badge when scrollOffset > 0 (not at bottom)
- Model display: shows the selected model (including `-thinking` suffix if thinking is enabled)
- Context display: color-coded circle (green < 60%, yellow 60-85%, red > 85%) + token count + percentage
- Token count formatted as K/M for readability (e.g., "42.5K (21%)")
- Cost display: session cost (green) / all-time cost (blue) in USD
- Updates in real-time as context fills and costs accumulate
- Centered modal dialog showing cost breakdown
- Triggered by `/costs` command via `showCostsDialog()` context method
- Displays two sections: Session Costs and All-Time Costs
- Groups costs by provider (Anthropic, OpenAI) then by model
- Shows input/output token counts and USD cost per model
- Provider subtotals and section totals displayed
- Press Esc or Q to close dialog
- Reads all-time breakdown from `~/.north/costs.json` via `getCostBreakdown()`
- Session costs passed from orchestrator state
- Multiline input with Ctrl+J or Shift+Enter for newlines
- Paste support: multi-character input and newlines are detected and inserted directly
- Dynamic height: grows as content is added, reports line count to parent
- Shows "Ctrl+C to cancel" hint when disabled/waiting
- Mode cycling with Tab key (when no autocomplete suggestions):
  - Cycles: ask → agent → ask
  - Mode applies to next message only (per-request mode)
  - Visual indicator shows current mode in top-right
- Cursor-aware slash command autocomplete:
  - Detects `/` tokens at cursor position
  - Queries command registry for suggestions
  - Shows dropdown with command name + description
  - Tab to insert (when suggestions present), Up/Down to navigate, Esc to close
- Model argument autocomplete for `/model` command:
  - Detects when cursor follows `/model`
  - Shows model aliases with display names
- File mention autocomplete with `@`:
  - Detects `@` tokens at cursor position
  - Fuzzy matches against project files (respecting .gitignore)
  - Shows dropdown with filename + full path hint
  - Tab/Enter to attach file, Space/Esc to cancel (treat @ as literal)
  - Attached files tracked in component state
  - Visual indicator shows count of attached files
  - On submit, attached files passed to orchestrator for context injection
- Smart space insertion: only adds space after completion if needed
- Clamps selection index when suggestions change
- Renders conversation history with in-app scrolling (no terminal scrollback dependency)
- Pre-computes wrapped lines with ANSI color codes using `wrap-ansi`
- Renders only visible lines based on viewport height and scroll offset
- User messages: cyan label
- Assistant messages: magenta label
- Tool messages: yellow ⚡ icon, gray text
- Command executed messages: blue ⚙ icon with result
- Interactive entries (diff_review, shell_review, command_review) rendered at bottom only when pending
- Resolved interactive entries convert to compact text lines that flow with transcript
- Keyboard navigation for scrolling (when composer not active):
  - Up/Down: scroll one line
  - PageUp/PageDown: scroll viewport height
  - G: jump to bottom (follow mode)
- Auto-scrolls to bottom when new content arrives
- Animation hooks disabled when transcript exceeds 100 entries
- Legacy transcript renderer using Ink's `<Static>` component pattern
- Kept for reference but replaced by ScrollableTranscript
- Renders interactive picker for commands needing selection
- Used by `/model` when no argument provided
- Shows list of options with labels and hints
- Keyboard shortcuts: Up/Down navigate, Enter select, Esc cancel
- Status badges: pending (yellow), selected (green), cancelled (red)
- Renders inline diff with syntax highlighting
- Green for additions, red for deletions, cyan for hunk headers
- Truncates diffs over 100 lines with indicator
- Shows file stats (+lines/-lines)
- Keyboard shortcuts: `a` accept, `y` always (auto-accept all future edits), `r` reject
- Status badges: pending (pulsing yellow border), accepted (green), always/auto-applied (cyan), rejected (red)
- Animation: border color pulses when status is pending to draw attention
- "Always" option: enables auto-accept for all future edit operations in this project
- Renders shell command approval prompt
- Shows command and optional cwd
- Keyboard shortcuts: `r` run, `a` always (adds to allowlist), `y` auto all (approves all future commands), `d` deny
- Status badges: pending (pulsing yellow border), ran/always/auto (green), denied (red)
- Animation: border color pulses when status is pending to draw attention
- "Auto All" option: enables global auto-approve for all future shell commands in this project
- `resolveSafePath()`: validates paths stay within repo root with symlink resolution
  - First checks normalized path is within repo
  - Resolves symlinks using `realpathSync()` to prevent path traversal attacks
  - Verifies resolved real path is still within repo boundary
  - For non-existent files (during creation), validates parent directory instead
- `readFileContent()`: safe file reading with error handling
- `preserveTrailingNewline()`: ensures trailing newline consistency after edits
- `computeUnifiedDiff()`: generates unified diff format
- `computeCreateFileDiff()`: generates diff for new files
- `applyEditsAtomically()`: writes to temp files then renames for safety; handles cross-filesystem scenarios (EXDEV) via copy+unlink fallback
Token estimation for context tracking:
- `estimatePromptTokens(systemPrompt, messages)`: estimates total tokens in request
  - Uses character-based heuristic (3.5 chars per token)
  - Applies 10% safety margin to reduce overflow risk
  - Returns structured breakdown: system, messages, overhead
  - Handles both string and structured message content (tool results, etc.)
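The heuristic above can be sketched as follows (a minimal sketch, assuming the 3.5 chars/token ratio and 10% margin from the text; the per-message overhead constant and function names are illustrative, not the real signatures):

```typescript
// Character-based token estimate: length / 3.5, inflated by a 10% margin.
const CHARS_PER_TOKEN = 3.5;
const SAFETY_MARGIN = 1.1;

function estimateTokens(text: string): number {
  return Math.ceil((text.length / CHARS_PER_TOKEN) * SAFETY_MARGIN);
}

// Structured breakdown: system, messages, and a rough per-message overhead
// (the overhead constant here is an assumption for illustration).
function estimatePromptTokensSketch(systemPrompt: string, messages: string[]) {
  const system = estimateTokens(systemPrompt);
  const msgs = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  const overhead = messages.length * 4;
  return { system, messages: msgs, overhead, total: system + msgs + overhead };
}
```

Overestimating slightly is the point: it is cheaper to summarize early than to overflow the context window mid-request.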
Model pricing data and cost calculation:
- `ModelPricing`: interface for per-model pricing (input, output, cached input, cache read/write)
- `TokenUsage`: interface for token counts (input, output, cached, cache read/write)
- `getModelPricing(modelId)`: returns pricing data for a model (falls back to defaults for unknown models)
- `calculateCost(modelId, usage)`: computes USD cost from token usage
- `formatCost(cost)`: formats cost as string (e.g., "$0.123", "$1.50")
Anthropic Pricing (per 1M tokens):
| Model | Input | Output | Cache Write | Cache Hit |
|---|---|---|---|---|
| claude-sonnet-4-* | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-opus-4-* | $15.00 | $75.00 | $18.75 | $1.50 |
| claude-opus-4-1-* | $15.00 | $75.00 | $18.75 | $1.50 |
| claude-sonnet-4-5-* | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-haiku-4-5-* | $1.00 | $5.00 | $1.25 | $0.10 |
| claude-opus-4-5-* | $5.00 | $25.00 | $6.25 | $0.50 |
OpenAI Pricing (per 1M tokens):
| Model | Input | Output | Cached Input |
|---|---|---|---|
| gpt-5.1 | $1.25 | $10.00 | $0.125 |
| gpt-5.1-codex | $1.25 | $10.00 | $0.125 |
| gpt-5.1-codex-mini | $0.25 | $2.00 | $0.025 |
| gpt-5.1-codex-max | $1.25 | $10.00 | $0.125 |
| gpt-5 | $1.25 | $10.00 | $0.125 |
| gpt-5-mini | $0.25 | $2.00 | $0.025 |
| gpt-5-nano | $0.05 | $0.40 | $0.005 |
File index for @ mention autocomplete in Composer:
- `getFileIndex(repoRoot)`: returns cached list of all non-ignored files
  - Uses `walkDirectory()` from `ignore.ts` with 5000 file cap
  - Cache per repoRoot for performance
- `fuzzyMatchFiles(query, files, limit)`: fuzzy match files against query
  - Scoring: exact filename > prefix match > contains > subsequence
- `clearFileIndexCache(repoRoot?)`: clear cache when needed
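The scoring tiers can be sketched as a small ranking function (an illustrative sketch of the tier order only, not the real `fuzzyMatchFiles` implementation or its exact weights):

```typescript
// Rank a file against a query: exact filename > prefix > contains > subsequence.
// Returns 0 when the query is not even a subsequence of the filename.
function scoreMatch(query: string, filePath: string): number {
  const name = filePath.split("/").pop() ?? filePath;
  const q = query.toLowerCase();
  const n = name.toLowerCase();
  if (n === q) return 4; // exact filename
  if (n.startsWith(q)) return 3; // prefix match
  if (n.includes(q)) return 2; // contains
  // Subsequence: every query char appears in the filename, in order.
  let i = 0;
  for (const ch of n) if (ch === q[i]) i++;
  return i === q.length ? 1 : 0;
}
```

Sorting candidates by this score (descending) and truncating to the suggestion limit reproduces the dropdown ordering described above.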
File preview generation for attached file context:
- `generateFilePreview(repoRoot, filePath)`: returns preview + outline
  - Preview: first 30 lines or 2KB (whichever is smaller)
  - Outline: extracted symbols (functions, classes, types) with line numbers
  - Supports TypeScript/JavaScript and Python symbol extraction
- `formatAttachedFilesContext(repoRoot, filePaths)`: formats multiple files for injection
  - Output format: markdown with code blocks and symbol outlines
  - Limited to 15 symbols per file with "more" indicator
main()
│
├──► parseArgs()
├──► detectRepoRoot()
├──► initLogger()
│
▼
loadCursorRules(projectPath)
│
├──► Walk .cursor/rules/ for *.mdc files
├──► Parse frontmatter, extract body
├──► Sort by relativePath
├──► Concatenate into single text block
│
▼
render(App, { cursorRulesText, ... })
│
▼
Orchestrator created with cursorRulesText in context
User Input (may contain /commands)
│
▼
Composer.onSubmit(content)
│
▼
App.handleSubmit(content)
│
├──► logger.info("user_prompt", { length })
│
▼
orchestrator.sendMessage(content)
│
▼
parseCommandInvocations(content, commandRegistry)
│
├──► For each command invocation:
│ │
│ ├──► Execute command via registry
│ ├──► If picker needed: create command_review entry, block for selection
│ └──► Add command_executed entry with result
│
▼
If remainingText.trim() non-empty:
│
├──► Push user entry to transcript
│
▼
┌─► provider.stream(messages, { tools, model })
│ │
│ ├──► onChunk: buffer chunks, throttled emit
│ ├──► onToolCall: add tool intent to transcript
│ │
│ ▼
│ Stream completes
│ │
│ ├──► If tool calls requested:
│ │ │
│ │ ├──► Execute each tool
│ │ ├──► Log tool_call_start/complete
│ │ ├──► Update tool entry with result
│ │ ├──► Build tool result message
│ │ └──► Continue loop ─────────────────┐
│ │ │
│ └──► No tools: exit loop │
│ │
└───────────────────────────────────────────────────┘
│
▼
Transcript re-renders with all content
SIGINT (Ctrl+C)
│
▼
App.handleSigint()
│
├──► If orchestrator.isProcessing():
│ │
│ ▼
│ orchestrator.cancel()
│ │
│ ├──► currentAbortController.abort()
│ ├──► Resolve pending reviews (reject/deny/cancel)
│ ├──► Set cancelled = true
│ └──► Return to input
│
└──► If not processing:
│
▼
orchestrator.stop()
│
├──► disposeAllShellServices()
└──► exit()
All file operations use symlink-aware path validation to prevent path traversal attacks:
Security layers:
- Normalization: Resolve `..` and `.` segments in paths
- Boundary check: Verify normalized path is within repo root
- Symlink resolution: Use `realpathSync()` to resolve symlinks
- Final verification: Ensure resolved real path is still within repo boundary
Implementation sites:
- `resolveSafePath()` in `utils/editing.ts` - used by all write operations
- `resolvePath()` in `tools/read_file.ts` - used by read operations
Non-existent file handling:
- When file doesn't exist (e.g., during creation), recursively validates parent directories
- Walks up the directory tree until finding an existing directory, then validates it
- Ensures at least one ancestor directory exists and resolves within repo
- Prevents creating files via symlink directory chains that escape repo
- Supports creating files in deeply nested directories that don't exist yet (e.g., `deep/nested/dir/file.txt`)
Attack prevented:
A symlink inside the repo pointing to /etc/passwd or other sensitive files would fail validation because the real path would resolve outside the repo boundary.
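The normalization and boundary-check layers can be sketched as a pure helper (this covers only the first two layers; the real `resolveSafePath()` additionally runs `realpathSync()` and re-checks the boundary, which cannot be shown without touching the filesystem; the helper name is illustrative):

```typescript
import * as path from "node:path";

// Layers 1-2: collapse "." / ".." segments, then verify the result
// still sits under the repo root. The symlink layers repeat this check
// against the realpath'd location.
function isWithinRepo(repoRoot: string, candidate: string): boolean {
  const root = path.resolve(repoRoot);
  const resolved = path.resolve(root, candidate); // normalizes ".." and "."
  const rel = path.relative(root, resolved);
  // Inside the repo iff the relative path does not climb out of it.
  return rel === "" || (!rel.startsWith("..") && !path.isAbsolute(rel));
}
```

Using `path.relative` rather than a string-prefix check avoids the classic bug where `/repo-evil` passes a `startsWith("/repo")` test.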
When user input contains slash commands:
- `parseCommandInvocations()` tokenizes input, finds registered commands
- Each command has a `span` (start/end indices) for clean removal
- Commands execute sequentially via registry
- If a command needs a picker (e.g., `/model` without arg):
  - Creates `command_review` transcript entry
  - Blocks until user selects or cancels
  - Updates entry with selection status
- After execution, `command_executed` entry added with result message
- `remainingText` (input with commands removed) sent to Claude if non-empty
- `command_review` and `command_executed` entries are excluded from `buildMessagesForClaude()`
The /summarize command:
- Calls `generateSummary()` which prompts Claude for structured JSON
- Uses `systemOverride` with a minimal prompt (no tool guidance noise)
- Returns `StructuredSummary` or null on failure
- On success: sets rolling summary, trims transcript
- `trimTranscript(keepLast)` preserves chronological order:
  - Keeps last N user/assistant entries
  - Preserves non-pending diff_review and shell_review outcomes
  - Filters original array (no reordering)
- Rolling summary prepended to Claude context as structured block
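The trimming rule can be sketched as a single filter pass (a sketch with hypothetical entry shapes; the real `TranscriptEntry` type has more fields and roles):

```typescript
type Entry = { role: string; status?: string };

// Keep the last N user/assistant entries plus resolved review outcomes,
// filtering the original array so chronological order is preserved.
function trimTranscript(transcript: Entry[], keepLast: number): Entry[] {
  const chat = transcript.filter((e) => e.role === "user" || e.role === "assistant");
  const keep = new Set(chat.slice(-keepLast));
  return transcript.filter(
    (e) =>
      keep.has(e) ||
      ((e.role === "diff_review" || e.role === "shell_review") && e.status !== "pending"),
  );
}
```

Because the result is a filter over the original array rather than a rebuild, interleaved review outcomes keep their position relative to the surviving messages.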
Model selection via /model:
- With argument: `resolveModelId()` maps alias → pinned ID
- Without argument: shows picker with all models (both Anthropic and OpenAI)
- `getModelProvider(modelId)` determines which provider to use
- `createProviderForModel()` creates the appropriate provider instance
- Switching between providers (e.g., Claude → GPT) recreates the provider
- Extended thinking is now selected at model time, not via a separate toggle:
  - Select a model with `-thinking` suffix to enable extended thinking
  - Example: `claude-sonnet-4-5-thinking` enables thinking for Sonnet 4.5
  - OpenAI models do not have thinking variants (reasoning is intrinsic to those models)
- Anthropic provider automatically strips the `-thinking` suffix before the API call and passes thinking config
- `isThinkingModel(modelId)` checks if thinking is enabled for the current model
- `currentModel` stored in orchestrator state (includes `-thinking` suffix if enabled)
- `thinkingEnabled` computed from whether `currentModel` ends with `-thinking`
- Context limit updates automatically on model change
- Selection persisted globally to `~/.config/north/config.json`
- On startup, loads saved model or defaults to Claude Sonnet 4
Assistant Name Display:
- `getAssistantName(modelId)` returns "Assistant"
- Transcript displays correct name for current model
Environment Variables:
- `ANTHROPIC_API_KEY`: required for Claude models
- `OPENAI_API_KEY`: required for GPT models
Both providers (anthropic.ts and openai.ts) use identical system prompts with a Cursor-inspired structured format using XML-like sections:
Sections:
- `<communication>` - Tone, formatting, honesty rules (no lying, no guessing paths). Includes operational workflow: "If you need a file, find it first."
- `<tool_calling>` - Schema adherence, batch-level narration (not per-call), batching etiquette (1-2 info rounds before edits, no re-reading same ranges)
- `<planning>` - Micro-planning for 2+ file tasks (2-5 bullet plan, then execute immediately)
- `<search_and_reading>` - Question-first search methodology, formulation checklist (broad → narrow → minimal reads), bias toward self-discovery, tool selection by file type (HTML/CSS → `find_blocks`, JS/TS/Python → `get_file_outline`), optimal tool chain for HTML/CSS (`find_blocks` → `search_text` → `read_around` → edit)
- `<making_code_changes>` - Default workflow (locate → confirm → atomic write → verify), read before edit, one edit per turn or atomic batch, no large pastes
- `<verification>` - Mandatory verification after edits, fix duplication/malformed structure immediately
- `<mixed_files>` - Strategy for HTML with embedded style/script: use find_blocks first, target by coordinates, pre-check selectors
- `<tool_churn_limits>` - After 2 reads + 1 write without success, switch to structure-first and atomic edits
- `<debugging>` - Edit only if confident, retry logic (re-read once on mismatch, max 3 lint loops)
- `<calling_external_apis>` - Only when explicitly requested
- `<long_running_commands>` - Never start dev servers or processes needing CTRL+C to stop
- `<conversation>` - Session UX rules (end with "Next I would: ...", acknowledge session resumption)
Key behaviors enforced:
- "If you did not read it, do not claim it exists"
- Never guess file paths or symbol names; find files/symbols before describing behavior
- Describe actions in natural language ("I'll search the repo") not tool names
- Before any batch of tool calls, write one sentence explaining the batch goal (not per-call)
- Prefer 1-2 rounds of info gathering before any edits; edit in the same turn when ready
- Plan briefly (2-5 bullets) for 2+ file tasks, then execute immediately
- Phrase search needs as questions first, then translate to exact patterns
- Retry once on edit mismatch, then ask for clarification
- Prefer surgical edits over large rewrites; break large content into chunks
- For new files >200 lines: create skeleton first, then add content in subsequent edits
- Avoid generating >300 lines in a single tool call
- End longer responses with "Next I would: ..." to signal continuation
- Default workflow: LOCATE → CONFIRM → ATOMIC WRITE → VERIFY
- Verification mandatory: After every edit, read the edited region to confirm
- Mixed files: For HTML with embedded CSS/JS, use find_blocks to get structural map first
- Tool churn limits: After 2 reads + 1 write on same file, switch to structure-first atomic edits
North supports two conversation modes that control tool availability:
Mode Types:
- Ask Mode: Read-only - only read tools available (read_file, search_text, find_files, list_root, read_readme, detect_languages, hotfiles, get_line_count, get_file_symbols, get_file_outline, expand_output, find_code_block, read_around, find_blocks)
- Agent Mode: Full access - all tools available including write and shell tools
Mode Selection:
- Mode is per-request, not global state
- Set via `/mode` command (with optional argument or interactive picker)
- Cycle with Tab key in Composer: ask → agent → ask
- Tab cycles mode only when no autocomplete suggestions are present
- Current mode shown in Composer badge and StatusLine
Tool Filtering:
- Orchestrator's `sendMessage(content, mode)` accepts a mode parameter
- Tools filtered via `filterToolsForMode(mode, allSchemas)` before sending to Claude
- Only tools allowed by the current mode are included in the API request
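The filter itself can be sketched against the approval-policy metadata described elsewhere in this document (a sketch with an assumed schema shape; the real `ToolSchema` type and field names may differ):

```typescript
type Mode = "ask" | "agent";
// Assumed shape: tools that mutate state carry an approvalPolicy.
type ToolSchema = { name: string; approvalPolicy?: "write" | "shell" };

// Ask mode keeps only read tools (no approval policy);
// agent mode passes everything through.
function filterToolsForMode(mode: Mode, allSchemas: ToolSchema[]): ToolSchema[] {
  if (mode === "agent") return allSchemas;
  return allSchemas.filter((t) => t.approvalPolicy === undefined);
}
```

Filtering at request-build time (rather than rejecting calls after the fact) means the model in ask mode never sees write or shell tools, so it cannot attempt them.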
North tracks API costs in real-time, displaying both session and all-time totals.
How it works:
- Providers capture actual token usage from API responses (`usage` field in `StreamResult`)
- After each successful API request, orchestrator calculates cost using `calculateCost()`
- Session cost accumulated in memory, all-time cost persisted to `~/.north/costs.json`
- StatusLine displays both costs: `$session / $all-time`
Token usage sources:
Anthropic:
- `message_start` and `message_delta` events contain a `usage` object
- `input_tokens`: non-cached, non-cache-write input tokens (already excludes cached)
- `output_tokens`: output tokens (includes extended thinking)
- `cache_read_input_tokens`: tokens served from cache (charged at reduced cache hit rate)
- `cache_creation_input_tokens`: tokens written to cache (charged at cache write rate)
OpenAI:
- `response.completed` event contains a `response.usage` object
- `input_tokens`: total input tokens
- `output_tokens`: output tokens (includes reasoning tokens)
- `input_tokens_details.cached_tokens`: tokens served from prompt cache (charged at reduced rate)
Cost calculation (additive model):
Anthropic (fields are additive, not subtractive):
- Base input cost = `inputTokens` × inputRate
- Cache hit cost = `cacheReadTokens` × cacheHitRate
- Cache write cost = `cacheWriteTokens` × cacheWriteRate
- Output cost = `outputTokens` × outputRate
OpenAI (cachedInputTokens subtracted from total):
- Non-cached input cost = (`inputTokens` - `cachedInputTokens`) × inputRate
- Cached input cost = `cachedInputTokens` × cachedInputRate
- Output cost = `outputTokens` × outputRate
Note: Currently only supports 5-minute cache duration pricing for Anthropic. 1-hour cache has higher write rates not yet modeled.
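Both formulas can be sketched directly from the rules above (rates are per 1M tokens as in the pricing tables; the function and field names here are illustrative, not the real `calculateCost()` signature):

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;   // Anthropic: additive cache-hit field
  cacheWriteTokens?: number;  // Anthropic: additive cache-write field
  cachedInputTokens?: number; // OpenAI: subset of inputTokens
}

// Anthropic: every field is billed independently (additive model).
function anthropicCost(u: Usage, rate: { in: number; out: number; write: number; hit: number }) {
  return (
    (u.inputTokens * rate.in +
      (u.cacheReadTokens ?? 0) * rate.hit +
      (u.cacheWriteTokens ?? 0) * rate.write +
      u.outputTokens * rate.out) /
    1_000_000
  );
}

// OpenAI: cached tokens are part of inputTokens, so subtract before pricing.
function openaiCost(u: Usage, rate: { in: number; out: number; cached: number }) {
  const cached = u.cachedInputTokens ?? 0;
  return (
    ((u.inputTokens - cached) * rate.in + cached * rate.cached + u.outputTokens * rate.out) /
    1_000_000
  );
}
```

The additive-vs-subtractive distinction is the easy bug to introduce here: applying the OpenAI subtraction to Anthropic usage would undercount, since Anthropic's `input_tokens` already excludes cached tokens.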
Display format:
- Session cost: green color, shows cost since app started
- All-time cost: blue color, shows cumulative cost across all sessions
- Format: `$0.00` to `$0.001` (3 decimals for small amounts), `$0.12` to `$99.99` (2 decimals)
North tracks context usage in real-time to prevent overflow:
1. **Token Estimation** (before each request):
   - Builds outgoing messages payload (system + transcript + injected rules)
   - Estimates tokens using character-based heuristic (3.5 chars/token)
   - Applies 10% safety margin
   - Updates `contextUsedTokens`, `contextLimitTokens`, `contextUsage`
2. **Visual Indicator** (StatusLine):
   - Green circle: < 60% usage
   - Yellow circle: 60-85% usage
   - Red circle: > 85% usage
   - Shows numeric percentage
3. **Auto-Summarization** (at 92% threshold):
   - Automatically calls `generateSummary()` before sending request
   - Replaces older transcript with structured summary
   - Keeps last 10 messages verbatim
   - Preserves injected rules and context
   - Recomputes usage after compaction
   - Proceeds with request normally
4. **Per-Model Limits**:
   - All current Claude models: 200K tokens
   - Limit updates automatically on model switch
   - Usage recalculated with new limit
When Claude requests tools:
- Stream completes with `stopReason: "tool_use"`
- Orchestrator executes each tool via registry
- Results are JSON-stringified and sent back as `tool_result` blocks
- Claude processes results and may request more tools or respond
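The loop structure can be sketched with a provider stub standing in for the real streaming client (shapes and names here are illustrative, not the actual orchestrator API):

```typescript
type ToolCall = { id: string; name: string; input: unknown };
type Turn = { stopReason: "tool_use" | "end_turn"; toolCalls: ToolCall[] };

// Stream a turn, execute any requested tools, feed results back,
// and repeat until the model stops asking for tools.
async function runToolLoop(
  stream: (messages: unknown[]) => Promise<Turn>,
  execute: (call: ToolCall) => Promise<unknown>,
): Promise<number> {
  const messages: unknown[] = [];
  let rounds = 0;
  for (;;) {
    const turn = await stream(messages);
    if (turn.stopReason !== "tool_use") return rounds;
    rounds++;
    for (const call of turn.toolCalls) {
      const result = await execute(call);
      // Results go back as JSON-stringified tool_result blocks.
      messages.push({
        type: "tool_result",
        tool_use_id: call.id,
        content: JSON.stringify(result),
      });
    }
  }
}
```

The invariant the next section depends on is visible here: every `tool_use` id that enters the loop must leave it paired with a `tool_result` block.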
The API requires every tool_use block to have a corresponding tool_result. To ensure this:
1. **Write tool ID tracking**: Tool IDs are only added to `writeToolCallIds` after the tool succeeds and a `diff_review` entry is created. Failed write tools have their results sent via the normal `tool` entry path.
2. **Recovery mechanism**: If the API returns an "orphaned tool_use" error (tool_use without tool_result), the orchestrator:
   - Extracts the orphaned tool ID from the error message
   - Removes it from `writeToolCallIds` and `shellToolCallIds` sets
   - Removes the incomplete assistant entry from transcript
   - Retries the request once
   - Logs the recovery event for debugging
The orchestrator automatically retries API requests that fail due to transient errors:
Retryable errors:
- Network errors: ECONNREFUSED, ECONNRESET, ETIMEDOUT, ENETUNREACH, socket hang up, fetch failed
- Rate limits: HTTP 429, "rate limit", "too many requests"
- Server errors: HTTP 5xx, "overloaded", "service unavailable", "internal server error"
- Incomplete streams: "incomplete tool call", "possible timeout" (stream ended mid-tool-generation)
Non-retryable errors (fail immediately):
- Authentication errors (401, 403)
- Bad request errors (400)
- Cancellation/abort
API Request Timeout:
- Both providers configured with 10-minute timeout for large responses
- Anthropic: `timeout` option passed to SDK client constructor
- OpenAI: `AbortSignal.timeout()` combined with user abort signal via `AbortSignal.any()`
Incomplete Stream Detection:
- Anthropic: If `currentToolId` is set when the stream ends, throws an error (tool was mid-generation)
- OpenAI: If `stopReason === null` and `toolCallsInProgress` has entries, throws an error
- These errors are detected by `isRetryableError()` and trigger automatic retry
Retry behavior:
- Maximum 3 retry attempts per conversation turn
- Exponential backoff with jitter: ~1s, ~2s, ~4s (capped at 30s)
- Counter resets after successful request
- Silent retry (no UI change, request resumes after delay)
- Logs `api_retry_attempt` event with attempt count, delay, and error message
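The schedule above can be sketched as a delay function (the exact jitter fraction is an assumption for illustration; the source specifies only "exponential backoff with jitter: ~1s, ~2s, ~4s, capped at 30s"):

```typescript
// Exponential backoff: 1s, 2s, 4s, ... capped at 30s, plus up to
// +25% random jitter (the jitter fraction is an assumed constant).
function retryDelayMs(attempt: number): number {
  const base = 1000 * 2 ** (attempt - 1); // attempt is 1-based
  const capped = Math.min(base, 30_000);
  const jitter = Math.random() * 0.25 * capped;
  return capped + jitter;
}
```

Jitter matters when many clients hit the same rate limit: without it, every client retries at the same instant and collides again.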
When Claude requests an edit tool (approvalPolicy: "write"):
- In Plan mode only: Orchestrator checks if `acceptedPlan` exists - if not, returns a `PLAN_REQUIRED` error
- Orchestrator checks if auto-accept is enabled (`.north/autoaccept.json`)
- If auto-accept enabled: edits applied immediately, status set to "always", Claude continues
- If auto-accept disabled:
  - Tool executes in "prepare" mode - computes diff but doesn't write
  - Orchestrator creates a `diff_review` transcript entry with status "pending"
  - Tool loop blocks, waiting for user decision
  - DiffReview component renders inline diff with Accept/Always/Reject options
  - User presses `a` (accept), `y` (always), or `r` (reject)
  - On Accept: edits applied atomically (temp files then rename)
  - On Always: enables auto-accept for future edits, applies current edits
  - On Reject: nothing written, status set to "rejected"
- Tool result sent to Claude with outcome (applied: true/false)
- Claude continues processing
File creation uses a streaming-to-disk protocol where the model outputs file contents as plain assistant text. Content is written directly to disk as it streams, with automatic continuation on provider timeouts.
Format:
<NORTH_FILE path="relative/path/to/file.ts">
...file contents...
</NORTH_FILE>
Continuation format (auto-generated on timeout):
<NORTH_FILE path="relative/path/to/file.ts" mode="append">
...continuation content...
</NORTH_FILE>
Why streaming-to-disk:
- Provider timeouts (~90s) can interrupt large file generation
- Tool calls buffer in memory and lose all content on timeout
- Direct-to-disk streaming preserves partial content
- Auto-continuation resumes from last written line
Flow:
- Model outputs `<NORTH_FILE path="...">` tag in response
- `StreamingFileBlockParser` detects the open tag, emits a `session_start` event
- `FileWriteSession` created, opens file at final path (creates parent dirs if needed)
- Content chunks written directly to disk as they stream
- Session tracks `linesWritten` and maintains a 30-line trailing window for context
- When `</NORTH_FILE>` closes: session finalized, diff review triggered
- On accept: file already written, nothing more to do
- On reject: file deleted from disk
Auto-continuation on timeout:
- Stream ends without close tag (provider timeout ~90s)
- Orchestrator detects incomplete session
- Sends continuation prompt with trailing window context
- Model responds with a `<NORTH_FILE mode="append">` block
- Content appended to existing file
- Repeats until complete or max retries (3) exceeded
- On max retries: partial file preserved, error surfaced to user
Implementation (src/utils/):
- `fileblock.ts`: `StreamingFileBlockParser` - event-based streaming parser
  - Events: `session_start`, `session_content`, `session_complete`, `display_text`
  - Parses `mode="append"` attribute for continuation blocks
- `filesession.ts`: `FileWriteSession` - streaming file writer with line tracking
  - `startSession(repoRoot, path)` - creates new file
  - `appendToSession(...)` - continues from existing state
  - `getResumeInfo()` - returns lines written + trailing window
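The open-tag detection can be sketched as a toy regex pass over an accumulated buffer (a simplification: the real `StreamingFileBlockParser` is event-based and must handle a tag split across stream chunks):

```typescript
// Match the NORTH_FILE open tag, capturing the path and optional append mode.
const OPEN_TAG = /<NORTH_FILE path="([^"]+)"( mode="append")?>/;

function detectOpenTag(buffer: string): { path: string; append: boolean } | null {
  const m = OPEN_TAG.exec(buffer);
  return m ? { path: m[1], append: m[2] !== undefined } : null;
}
```

Everything before the match is ordinary display text; everything after it streams into the `FileWriteSession` until the close tag (or a timeout) arrives.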
Tool Input Size Guard:
- All tool inputs checked against 50KB limit before execution
- Prevents large payloads from being sent via tools
- Error message directs model to use NORTH_FILE protocol instead
When Claude requests shell_run (approvalPolicy: "shell"):
- Orchestrator checks if global auto-approve is enabled (`.north/autoaccept.json`)
- If auto-approve enabled: execute immediately, status set to "auto", return result
- If not auto-approved, check if command is in `.north/allowlist.json`
- If allowed: execute immediately, status set to "always", return result
- If not allowed: create `shell_review` transcript entry with status "pending"
- Tool loop blocks, waiting for user decision
- ShellReview component renders command with Run/Always/Auto All/Deny options
- User presses `r` (run), `a` (always), `y` (auto all), or `d` (deny)
- On Run: execute command, status set to "ran"
- On Always: add to allowlist, execute command, status set to "always"
- On Auto All: enable global auto-approve, execute command, status set to "auto"
- On Deny: return `{ denied: true }` to Claude, status set to "denied"
- Tool result sent to Claude with outcome
- Claude continues processing
Approval Priority: Global auto-approve (step 1) takes precedence over command allowlist (step 3). Once auto-approve is enabled, all commands run automatically without checking the allowlist.
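This priority order can be sketched as a small decision function (the config shape is hypothetical; the real orchestrator reads `.north/autoaccept.json` and `.north/allowlist.json` separately):

```typescript
type Decision = "auto" | "always" | "review";

// Global auto-approve wins before the allowlist is ever consulted;
// anything else falls through to a pending shell_review.
function decideShellApproval(
  command: string,
  cfg: { autoApproveAll: boolean; allowlist: string[] },
): Decision {
  if (cfg.autoApproveAll) return "auto";            // .north/autoaccept.json
  if (cfg.allowlist.includes(command)) return "always"; // .north/allowlist.json
  return "review"; // create a pending shell_review entry and block
}
```

Keeping the decision in one pure function makes the precedence auditable: the allowlist branch is unreachable once `autoApproveAll` is set.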
North supports @ file mentions similar to Cursor and Claude Code. Users can attach files to their messages for automatic context injection.
User Flow:
- User types `@` in the Composer
- Autocomplete shows fuzzy-matched project files (respecting .gitignore)
- User can:
  - Tab/Enter: Accept suggestion, file becomes attached
  - Space/Escape: Dismiss autocomplete, `@` treated as literal text
- Attached files shown as badge in Composer (e.g., "📎 2 files attached")
- On message submit, attached files passed to orchestrator
Context Injection:
- Orchestrator receives `attachedFiles: string[]` in `sendMessage()`
- In `buildMessagesForClaude()`, attached files injected as context block
- Position: after cursor rules, after project profile, before rolling summary
- Format per file:
  - Markdown header with file path
  - Code block with first 30 lines (or 2KB)
  - Symbol outline (functions, classes, types with line numbers)
Example Injected Context:
# Attached Files
## src/ui/Composer.tsx
```typescript
import React, { useState, useMemo, useEffect } from "react";
import { Box, Text, useInput } from "ink";
... [27 more lines]
```
Outline (Composer.tsx):
- interface Suggestion (line 7)
- interface ComposerProps (line 14)
- function Composer (line 197)
**File Index:**
- Built lazily on first `@` autocomplete
- Cached per repoRoot for performance
- Respects .gitignore via `walkDirectory()`
- Capped at 5000 files
**Fuzzy Matching:**
- Exact filename matches score highest
- Prefix matches score next
- Subsequence matching for partial queries
- Results sorted by score, limited to 10 suggestions
### Cursor Rules Loading
North automatically loads Cursor project rules at startup:
1. **Loading** (in `index.ts`):
- Calls `loadCursorRules(projectPath)` once before rendering
- Walks `.cursor/rules/` recursively for `*.mdc` files
- Parses YAML frontmatter (if present), keeps body content
- Sorts by relativePath for deterministic order
- Enforces 30KB hard cap, truncates if exceeded
2. **Storage** (in orchestrator context):
- `cursorRulesText` passed through App to orchestrator context
- Stored as plain string or null
3. **Injection** (in `buildMessagesForClaude()`):
- If `cursorRulesText` is non-empty, prepends to every request
- Format: `# Cursor Project Rules (.cursor/rules)` header
- Each rule: `## relativePath` followed by rule body
- Injected before rolling summary, ensuring rules always apply
4. **Format of injected rules**:
North can learn a project on first run and store a persistent profile for context in future conversations.
Startup Flow:
1. **Profile Detection** (in `index.ts`):
   - Checks if profile exists via `hasProfile(repoRoot)`
   - If profile exists: loads it with `loadProfile(repoRoot)`
   - If no profile and not declined: sets `needsLearningPrompt = true`
2. **Learning Prompt** (first-time projects):
   - App renders `LearningPrompt` component when `learningPromptId` is set
   - User presses `y` (accept) or `n` (decline)
   - On decline: marks project as declined via `markDeclined(repoRoot)`
   - On accept: triggers `orchestrator.startLearningSession()`
3. **Learning Session**:
   - Runs 10 sequential discovery topics via `runLearningSession()`
   - Each topic: focused LLM query with read-only tools
   - Progress updates via callback: `onProgress(percent, topicTitle)`
   - UI shows `LearningProgress` component with percent and current topic
   - Profile saved to `~/.north/projects/<hash>/profile.md`
4. **Profile Injection** (in `buildMessagesForClaude()`):
   - If `projectProfileText` exists, inject after cursor rules
   - Format: markdown with H2 sections for each topic
   - Position: Cursor rules → Project profile → Rolling summary → Transcript
Discovery Topics:
- Project Summary - What it is, who it's for, workflows, what it doesn't do
- Architecture Map - Major modules, entry points, structure
- Code Style and Conventions - Naming, layout, formatting, lint rules
- Domain Model Vocabulary - Key concepts, terms, canonical locations
- Data Flow and State - Persistence, caches, data paths
- External Dependencies - Frameworks, libraries, services, config
- Build, Run, and Test Workflow - Commands and workflows
- Hot Spots and Change Patterns - Frequently changed areas
- Common Tasks Playbook - Where to implement common changes
- Safety Rails and Footguns - Known pitfalls and constraints
Storage:
- Profile: `~/.north/projects/<hash>/profile.md`
- Declined marker: `~/.north/projects/<hash>/declined.json`
- Hash: SHA-256 of the repo root path (first 16 chars)
- Format: Markdown with a `# Project Profile` header + H2 sections
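The hash-derived storage path can be sketched as follows. The function names are illustrative; only the SHA-256-prefix scheme and the `~/.north/projects/<hash>/` layout come from the description above.

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join } from "node:path";

// Derive the 16-char hex prefix used as the per-project directory name
function projectHash(repoRoot: string): string {
    return createHash("sha256").update(repoRoot).digest("hex").slice(0, 16);
}

// Resolve the profile path for a given repo root
function profilePath(repoRoot: string): string {
    return join(homedir(), ".north", "projects", projectHash(repoRoot), "profile.md");
}
```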
/learn Command:
- Clears the declined marker via `clearDeclined(repoRoot)`
- Triggers a learning session via `ctx.triggerLearning()`
- Overwrites the existing profile if present
- Use case: manually update profile after major project changes
UI Components:
- `LearningPrompt`: Y/N prompt with border pulse animation while pending
- `LearningProgress`: percent + topic name display during learning
State Management:
- Orchestrator tracks: `learningPromptId`, `learningInProgress`, `learningPercent`, `learningTopic`
- Transcript entries: `learning_prompt` (with status), `learning_progress` (with percent/topic)
- Learning entries are excluded from `buildMessagesForClaude()` (UI-only)
North persists conversations for later resumption using an append-only event log.
Storage Location:
- `~/.north/conversations/<id>.jsonl` - append-only event log
- `~/.north/conversations/<id>.snapshot.json` - optional full snapshot
- `~/.north/conversations/index.json` - conversation metadata index
Event Types:
- `conversation_started`: ID, repoRoot, repoHash, model, timestamp
- `entry_added`: full TranscriptEntry payload
- `entry_updated`: entry ID + partial updates (streaming completion, review decisions)
- `model_changed`: new model ID
- `rolling_summary_set`: StructuredSummary or null
- `conversation_ended`: clean exit marker
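The event log lends itself to a small discriminated union plus a JSONL append helper. This is a sketch based on the event list above; field names beyond those listed are assumptions.

```typescript
import { appendFileSync } from "node:fs";

// Illustrative union matching the event types above
type ConversationEvent =
    | { type: "conversation_started"; id: string; repoRoot: string; repoHash: string; model: string; timestamp: string }
    | { type: "entry_added"; entry: unknown }
    | { type: "entry_updated"; entryId: string; updates: Record<string, unknown> }
    | { type: "model_changed"; model: string }
    | { type: "rolling_summary_set"; summary: unknown | null }
    | { type: "conversation_ended" };

// One JSON object per line (JSONL)
function serializeEvent(event: ConversationEvent): string {
    return JSON.stringify(event) + "\n";
}

// Append-only: the log is never rewritten, only extended
function appendEvent(logPath: string, event: ConversationEvent): void {
    appendFileSync(logPath, serializeEvent(event));
}
```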
Resume Flow:
- `north resume <id>` loads the conversation from the event log
- Validates that repoRoot exists (warns if missing)
- Orchestrator is initialized with `initialState` (transcript, rollingSummary, model)
- Conversation continues normally with logging enabled
Persistence Triggers:
- `addEntry()` → `logEntryAdded()`
- `updateEntry()` → `logEntryUpdated()`
- `setModel()` → `logModelChanged()`
- `setRollingSummary()` → `logRollingSummarySet()`
- `stop()` → `logConversationEnded()` + resolve pending reviews
Pending Review Handling:
- On exit, pending reviews are resolved as cancelled/rejected/denied
- Review status updates are persisted before exit
- Resume never has pending interactive states (deterministic)
CLI Commands:
- `north` - new conversation (generates a 6-char hex ID)
- `north resume <id>` - resume by ID
- `north resume` - interactive picker of recent conversations
- `north conversations` or `north list` - list conversations with metadata
Slash Commands:
- `/conversations` - picker to switch to another conversation
- `/resume <id>` - switch to a conversation by ID directly
Portability:
- Both `repoRoot` (path) and `repoHash` (SHA-256 prefix) are stored
- If repoRoot is missing on resume, warns the user and continues
- User can provide `--path` to specify a new location
The app handles CTRL+C contextually via Ink's `useInput` hook (not `process.on("SIGINT")`):
1. **During processing** (`isProcessing() === true`):
    - Calls `orchestrator.cancel()`
    - Aborts the current AbortController (stops API streaming)
    - Aborts the shell AbortController (kills any running shell command)
    - Resolves any pending reviews as rejected/denied/cancelled
    - Appends `[Cancelled]` to the assistant's message
    - Returns control to the input field
    - App remains running
2. **When idle** (`isProcessing() === false`):
    - Calls `orchestrator.stop()`
    - Disposes all shell services
    - Exits the application
Implementation details:
- `currentAbortController` tracks the active API request
- `shellAbortController` tracks any running shell command (created per command)
- `cancelled` flag checked in the conversation loop
- Provider stream loop checks `signal.aborted` and exits gracefully
- Shell process is killed via `proc.kill()` when the abort signal fires
- Pending write/shell/command reviews auto-resolve on cancel
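The abort-aware stream loop described above can be sketched as follows. This is a minimal illustration, not the actual provider code, which consumes SSE events rather than plain strings.

```typescript
// Consume a stream of text chunks, bailing out as soon as the signal fires
async function streamWithCancel(
    chunks: AsyncIterable<string>,
    signal: AbortSignal,
    onText: (text: string) => void,
): Promise<"done" | "cancelled"> {
    for await (const chunk of chunks) {
        if (signal.aborted) return "cancelled"; // exit gracefully on CTRL+C
        onText(chunk);
    }
    return "done";
}
```

The orchestrator would create one `AbortController` per request and call `controller.abort()` from the CTRL+C handler.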
The orchestrator formats tool names for better readability in the TUI:
- `list_root` → "Listing project files - N entries"
- `find_files` → "Finding pattern - N files" (with a `+` suffix if truncated)
- `read_file` → "Reading filename.ext"
- `get_line_count` → "Checking size of filename.ext"
- `get_file_symbols` → "Extracting symbols from filename.ext"
- `get_file_outline` → "Outlining filename.ext"
- `edit_replace_exact` → "Editing filename.ext (+X/-Y)" after approval
- `edit_insert_at_line` → "Editing filename.ext (+X/-Y)" after approval
- `edit_create_file` → "Creating filename.ext (+X/-Y)" after approval
- `edit_apply_batch` → "Editing N files (+X/-Y)" after approval
- Other tools: shown as-is
Edit Stats Display:
- After an edit is approved (accept/always) or auto-applied, the tool entry is updated to show line statistics
- Format: `+X/-Y`, where X is lines added and Y is lines removed
- Stats are computed from the diff content using `linesAdded` and `linesRemoved` from FileDiff
Implementation split between:
- `formatToolNameForDisplay()` in the orchestrator: extracts the display name from tool arguments
- `computeDiffStats()` in the orchestrator: calculates total added/removed lines from diffs
- `getToolResultSuffix()` in Transcript.tsx: appends result counts for file-listing tools
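As a rough illustration, the same `+X/-Y` statistics can be derived from unified-diff text. The actual implementation reads `linesAdded`/`linesRemoved` from FileDiff objects instead, so this is only a sketch of the counting rule.

```typescript
// Count added/removed lines in unified-diff text, skipping file headers
function computeDiffStats(diff: string): { added: number; removed: number } {
    let added = 0;
    let removed = 0;
    for (const line of diff.split("\n")) {
        if (line.startsWith("+") && !line.startsWith("+++")) added++;
        else if (line.startsWith("-") && !line.startsWith("---")) removed++;
    }
    return { added, removed };
}
```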
North uses subtle, frame-based animations to enhance feedback without overwhelming the terminal:
1. **Streaming Indicator Pulse** (assistant & tool messages):
    - Pulses through magenta shades (magenta → #ff6ec7 → #ff8fd5 → #ffa0dc → back)
    - 500ms interval per color transition
    - Indicates active streaming or processing
2. **Tool Execution Spinner**:
    - Animated spinner frames: ⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏
    - 80ms frame interval for smooth rotation
    - Yellow color to match the tool theme
    - Shown while a tool is executing (`isStreaming: true`)
3. **Pending Review Border Pulse**:
    - Pulses through yellow shades (yellow → #ffff87 → #ffffaf → back)
    - 600ms interval per color transition
    - Applied to DiffReview, ShellReview, and PlanReview when status is "pending"
    - Draws attention to items requiring user action
Implementation Details:
- Custom React hooks (`useSpinner`, `usePulse`, `useBorderPulse`)
- Uses `setInterval` with cleanup on unmount
- Frame rates kept low (12-15 fps) to avoid terminal flicker
- Colors cycle smoothly for a breathing effect
- All animations respect terminal color support
- Conditional timers: animation hooks accept an `active` boolean parameter; timers only run when active
- Auto-disable threshold: animations auto-disable when the transcript exceeds 100 entries
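The conditional-timer pattern can be sketched framework-free. This hypothetical `Spinner` class is not the actual `useSpinner` hook; it only illustrates the rule that the interval exists while `active` is true and is cleared as soon as it is not.

```typescript
const SPINNER_FRAMES = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"];

// Illustrative stand-in for the hook: timer exists only while active
class Spinner {
    private frame = 0;
    private timer: ReturnType<typeof setInterval> | null = null;

    setActive(active: boolean, intervalMs = 80): void {
        if (active && this.timer === null) {
            // Start advancing frames at the configured rate
            this.timer = setInterval(() => {
                this.frame = (this.frame + 1) % SPINNER_FRAMES.length;
            }, intervalMs);
        } else if (!active && this.timer !== null) {
            clearInterval(this.timer); // no zombie timers once inactive
            this.timer = null;
        }
    }

    get glyph(): string {
        return SPINNER_FRAMES[this.frame];
    }
}
```

In the React version, the same start/stop decision lives in a `useEffect` keyed on the `active` parameter, with `clearInterval` in the effect's cleanup.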
North uses an alternate screen buffer (like htop, less, vim) instead of terminal scrollback:
Why alternate screen?
- Ink's differential rendering (cursor moves + line clears) conflicts with terminal scrollback
- When Ink redraws while user scrolls, scrollback can become corrupted
- Different terminals (iTerm2, Terminal.app) handle this inconsistently
- Alternate screen provides a stable, controlled viewport
Architecture:

```
Terminal (alternate screen)
┌────────────────────────────────────┐
│ ScrollableTranscript               │ ← viewport-height, renders line slice
│   - Pre-wrapped lines with ANSI    │
│   - Only visible lines rendered    │
│   - Scroll offset from bottom      │
├────────────────────────────────────┤
│ Interactive entries (reviews)      │ ← Always visible at bottom
├────────────────────────────────────┤
│ Composer                           │ ← Fixed height
├────────────────────────────────────┤
│ StatusLine                         │ ← Fixed height, shows [SCROLL]
└────────────────────────────────────┘
```
State:
- `scrollOffset`: lines from bottom (0 = follow mode)
- `viewportHeight`: terminal rows - composer - status - padding (dynamic, based on composer line count)
- `viewportWidth`: terminal columns - padding
- `composerLineCount`: tracked via a callback from Composer for dynamic height calculation
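The viewport-height arithmetic can be sketched as below; the padding constant and the clamp to one line are assumptions, the subtraction of composer and status rows comes from the state description above.

```typescript
const PADDING_ROWS = 2; // assumed vertical padding around the viewport

// terminal rows - composer - status - padding, never below one line
function computeViewportHeight(
    terminalRows: number,
    composerLineCount: number,
    statusRows = 1,
): number {
    const height = terminalRows - composerLineCount - statusRows - PADDING_ROWS;
    return Math.max(height, 1);
}
```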
Keyboard:
- Up/Down: scroll ±1 line (when composer disabled)
- PageUp/PageDown: scroll ±viewportHeight
- G: jump to bottom
Tradeoff:
- Transcript is not in terminal scrollback after exit
- Future: add an `/export` command to save the transcript to a file
To prevent flickering in large conversations, North implements several Ink-specific optimizations:
1. **Static Rendering with `<Static>`**:
    - Ink's `<Static>` component renders items once and never re-renders them
    - Completed transcript entries (not streaming, not pending review) are rendered inside `<Static>`
    - Only dynamic entries (streaming messages, pending reviews) re-render on state changes
    - This transforms "redraw a 2000-line screen 12x/sec" into "redraw a small dynamic section"
    - Entry uniqueness: a deduplication check prevents the same entry ID from appearing in both sections
    - Review status priority: for review entries, `reviewStatus` determines static vs dynamic, preventing race conditions during state transitions
2. **Conditional Animation Timers**:
    - All animation hooks (`useSpinner`, `usePulse`, `useBorderPulse`) accept an `active` parameter
    - Timers only start when `active === true`
    - Prevents "zombie timers" from completed entries causing unnecessary state updates
    - Example: `useSpinner(entry.isStreaming, 80)` only animates while streaming
3. **Animation Kill Switch**:
    - When the transcript exceeds `ANIMATION_DISABLE_THRESHOLD` (100 entries), animations auto-disable
    - An `animationsEnabled` boolean is passed through the component tree
    - Pending reviews still show correct state, just without pulsing animations
4. **Memoized Components**:
    - All message components wrapped in `React.memo`: `UserMessage`, `AssistantMessage`, `ToolMessage`, `CommandExecutedMessage`, `MessageBlock`, `StaticEntry`
    - Review components also memoized: `DiffReview`, `ShellReview`, `CommandReview`
    - Primitive props preferred over object props where possible
5. **Precomputed Render Data**:
    - `DiffContent` precomputes colored line data in `useMemo`
    - Line styling decisions are made once per diff, not on every render
    - Reduces CPU work during animation frames
6. **Entry Classification**:
    - An `isEntryStatic()` helper determines whether an entry can be rendered statically
    - Criteria: not streaming, not the active review, no pending review status
    - Entries graduate from dynamic to static as their state settles
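The classification rule reads roughly like this; the entry shape is illustrative, only the three criteria come from the description above.

```typescript
// Minimal entry shape for illustration
interface Entry {
    id: string;
    isStreaming?: boolean;
    reviewStatus?: "pending" | "accepted" | "rejected";
}

// An entry is static only once its state has fully settled
function isEntryStatic(entry: Entry, activeReviewId: string | null): boolean {
    if (entry.isStreaming) return false;           // still receiving tokens
    if (entry.id === activeReviewId) return false; // currently under review
    if (entry.reviewStatus === "pending") return false;
    return true; // safe to render once inside <Static>
}
```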
Architecture:

```tsx
<Transcript>
    {/* Completed entries - render once */}
    <Static items={staticEntries}>
        {(entry) => <StaticEntry />}
    </Static>
    {/* Active entries - re-render on changes */}
    {dynamicEntries.map((entry) => (
        <MessageBlock />
    ))}
</Transcript>
```
This architecture ensures that only the actively changing portion of the transcript triggers redraws, keeping the terminal responsive even in very long conversations.
The ignore checker:
- Always ignores common directories (node_modules, .git, etc.)
- Parses .gitignore if present at repo root
- Supports glob patterns: `*`, `**`, `?`
- Supports negation patterns: `!important.log`
- Supports directory-only patterns: `logs/`
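A minimal glob-to-regex translation for the three wildcard forms might look like this. Real `.gitignore` semantics (anchoring, negation, directory-only matching) have more edge cases, so treat this as a sketch of the core idea only.

```typescript
// Translate *, **, and ? into an anchored regular expression:
//   *  matches within one path segment, ** crosses segments, ? is one char
function globToRegExp(pattern: string): RegExp {
    const escaped = pattern
        .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metachars
        .replace(/\*\*/g, "\u0000")           // placeholder for **
        .replace(/\*/g, "[^/]*")              // * stays inside a segment
        .replace(/\u0000/g, ".*")             // ** crosses segments
        .replace(/\?/g, "[^/]");              // ? matches one char
    return new RegExp(`^${escaped}$`);
}
```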
`search_text` uses ripgrep if available (faster, better output), with a fallback to a pure-JS implementation:
- Ripgrep: spawns `rg --json` for structured output
- Fallback: walks files and searches line by line
All tools enforce limits to prevent context overflow:
- `read_file`: 500 lines or 100KB max, optional head/tail inclusion
- `search_text`: 50 matches default, 200 max; supports file-specific and line-range searches
- `find_files`: 50 files default, 500 max
- `get_line_count`: no limits, quick stat check
- `get_file_symbols`: returns all detected symbols (functions, classes, types, etc.)
- `get_file_outline`: returns hierarchical structure with line ranges
- `read_readme`: 8KB max
- `hotfiles`: 10 files default, 50 max

Truncation is always explicit, with `truncated: true` in results.
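The fallback line scanner with an explicit truncation flag can be sketched as follows (names are illustrative; the real tool also handles file walking and line ranges):

```typescript
interface Match {
    line: number; // 1-based line number
    text: string;
}

// Scan content line by line, stopping at maxMatches with an explicit flag
function searchContent(
    content: string,
    query: string,
    maxMatches = 50,
): { matches: Match[]; truncated: boolean } {
    const matches: Match[] = [];
    const lines = content.split("\n");
    for (let i = 0; i < lines.length; i++) {
        if (lines[i].includes(query)) {
            if (matches.length === maxMatches) return { matches, truncated: true };
            matches.push({ line: i + 1, text: lines[i] });
        }
    }
    return { matches, truncated: false };
}
```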
| Package | Version | Purpose |
|---|---|---|
| `@anthropic-ai/sdk` | ^0.39.0 | Claude API client |
| `ink` | ^5.1.0 | Terminal UI framework |
| `react` | ^18.3.1 | UI component model |
| `wrap-ansi` | ^9.0.2 | ANSI-aware text wrapping for scroll viewport |
| `string-width` | ^8.1.0 | Unicode-aware string width calculation |
| Package | Version | Purpose |
|---|---|---|
| `eslint` | ^9.17.0 | Code linting |
| `typescript-eslint` | ^8.18.1 | TypeScript ESLint support |
| `eslint-plugin-react` | ^7.37.2 | React-specific linting |
| `eslint-plugin-react-hooks` | ^5.1.0 | React Hooks linting |
| `prettier` | ^3.4.2 | Code formatting |
| `typescript` | ^5.7.2 | Type checking |
The project uses ESLint and Prettier for code quality enforcement.
```bash
bun run lint          # Run ESLint
bun run lint:fix      # Run ESLint with auto-fix
bun run format        # Format code with Prettier
bun run format:check  # Check Prettier formatting
bun run typecheck     # Run TypeScript type checking
bun run check         # Run all checks (typecheck + lint + format:check)
```

ESLint configuration:
- Uses flat config format (`eslint.config.js`)
- TypeScript support via `typescript-eslint`
- React and React Hooks plugins
- Key rules:
    - `@typescript-eslint/consistent-type-imports`: enforces `type` imports
    - `@typescript-eslint/no-unused-vars`: errors on unused variables (allows `_` prefix)
    - `@typescript-eslint/no-explicit-any`: warns on `any` type usage
    - `react-hooks/exhaustive-deps`: enforces correct hook dependencies
Prettier settings:
- 4-space indentation
- Double quotes
- Semicolons required
- 100 character line width
- ES5 trailing commas
Pre-commit hooks are configured in `.githooks/pre-commit`. The hook runs:
- TypeScript type checking
- ESLint linting
- Prettier format verification
To enable hooks after cloning:
```bash
bun run prepare   # or: git config core.hooksPath .githooks
```

The `prepare` script runs automatically on `bun install`.
```bash
# Development
bun run dev

# With options
bun run dev --path /some/repo --log-level debug

# Build for distribution
bun run build                  # builds to dist/
bun run link                   # symlinks for global 'north' command

# Build standalone binaries
bun run build:binary           # current platform
bun run build:binary:mac-arm
bun run build:binary:linux
```

The build pipeline is simple and straightforward:
- JavaScript bundling: `bun build` compiles TypeScript to `dist/index.js`
- Binary compilation: `bun build --compile` creates a standalone executable with the Bun runtime embedded
- No native dependencies: uses only Bun's built-in APIs (`Bun.spawn()`) for shell commands
The compiled binary is completely self-contained and can be distributed as a single file with no external dependencies.
Uses Bun's built-in test runner:
```bash
bun test                    # run all tests
bun test --watch            # watch mode
bun test tests/openai*.ts   # run specific tests
```

Test coverage:
- `tests/openai-provider.test.ts`: OpenAI provider tests
    - Tool schema conversion (verifies Responses API format)
    - Provider factory and message builders
    - SSE streaming event parsing
    - Error handling
- `tests/storage.test.ts`: Storage layer tests
    - Allowlist storage (per-project command allowlist)
    - AutoAccept storage (per-project auto-accept settings)
    - Global config storage (user preferences)
- `tests/tools-read.test.ts`: Read tool tests
    - `get_file_outline` HTML embedded block tests (style/script parsing)
    - `get_file_symbols` HTML/CSS redirect hint tests
    - `search_text` contextLines tests
- `tests/tools-edit.test.ts`: Edit tool tests
    - Prepare contract tests
    - Trailing newline preservation
    - Failure diagnostic tests (whitespace, near-miss, hints)
- `tests/tools-find-code-block.test.ts`: Find code block tool tests
    - CSS selector and `@media`/`@keyframes` detection
    - HTML embedded style/script block parsing
    - Helpful hints for HTML/CSS files
    - Nested block deduplication
- `tests/tools-find-blocks.test.ts`: Find blocks tool tests
    - Mixed HTML parsing (embedded style/script)
    - CSS rules inside style blocks
    - JS symbols inside script blocks
    - Kind filtering
    - C#/PHP/Java symbol detection (namespaces, classes, methods, traits)
- `tests/tools-workflow.test.ts`: Integration-style workflow tests
    - Mixed HTML navigation patterns
    - Edit failure diagnostics workflow
    - Structure-first editing patterns
    - CSS selector pre-checking
- `tests/tools-security.test.ts`: Path traversal and symlink security tests
- `tests/tools-shell.test.ts`: Shell service tests
- `tests/rules-cursor.test.ts`: Cursor rules loader tests
Test Isolation:
Tests that interact with user storage use environment variable overrides to prevent modifying actual user data:
- Config tests: set `NORTH_CONFIG_DIR` to a temporary directory instead of manipulating `HOME`
- Repo-scoped tests: use a `createTempRepo()` helper to create isolated temporary repositories
- `afterEach` hooks ensure cleanup of temporary directories and restoration of environment variables
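The override pattern can be sketched as a standalone helper. The real suite uses `afterEach` hooks for restoration; this illustrative `withTempConfigDir` wrapper shows the same set-then-restore discipline around `NORTH_CONFIG_DIR`.

```typescript
import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Run fn with NORTH_CONFIG_DIR pointing at a fresh temp dir, then restore
function withTempConfigDir<T>(fn: (dir: string) => T): T {
    const dir = mkdtempSync(join(tmpdir(), "north-test-"));
    const previous = process.env.NORTH_CONFIG_DIR;
    process.env.NORTH_CONFIG_DIR = dir;
    try {
        return fn(dir);
    } finally {
        // Restore env and clean up, mirroring the afterEach hooks
        if (previous === undefined) delete process.env.NORTH_CONFIG_DIR;
        else process.env.NORTH_CONFIG_DIR = previous;
        rmSync(dir, { recursive: true, force: true });
    }
}
```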
Required:
- `ANTHROPIC_API_KEY`: for Claude models
- `OPENAI_API_KEY`: for GPT models

At least one of the two is required.
Location: `~/.local/state/north/north.log`
Example entries:
```jsonl
{"timestamp":"2025-12-08T10:00:00.000Z","level":"info","event":"app_start","data":{"version":"0.1.0","projectPath":"/path/to/repo","cwd":"/path/to/repo"}}
{"timestamp":"2025-12-08T10:00:05.000Z","level":"info","event":"user_prompt","data":{"length":42}}
{"timestamp":"2025-12-08T10:00:05.001Z","level":"info","event":"model_request_start","data":{"requestId":"req-123-abc","model":"claude-sonnet-4-20250514"}}
{"timestamp":"2025-12-08T10:00:06.000Z","level":"info","event":"tool_call_start","data":{"toolName":"search_text","argsSummary":{"query":"useState","path":"src"}}}
{"timestamp":"2025-12-08T10:00:06.150Z","level":"info","event":"tool_call_complete","data":{"toolName":"search_text","durationMs":150,"ok":true}}
{"timestamp":"2025-12-08T10:00:08.500Z","level":"info","event":"model_request_complete","data":{"requestId":"req-123-abc","durationMs":3499}}
```

Technical debt and improvements are tracked in the `todo/` folder. Each file represents a single actionable item with:
- Severity level (Major/Minor/Trivial)
- Affected location(s)
- Problem description
- Solution approach
- Implementation notes
Files are numbered by priority (01-xx = Major, 05-xx = Minor, 10-xx+ = Trivial).
Delete each file after completing the task: `rm todo/XX-filename.md`