docs: singularity-crush SPEC.md + design docs by mikkihugo · Pull Request #1 · singularity-ng/crush

mikkihugo · 2026-04-29T04:26:28Z

Summary

Adds SPEC.md — authoritative 1721-line specification for singularity-crush with RFC 2119 normative language and a 55-item conformance checklist across 5 tiers
Adds harness.md — engineering practices and design notes (research working doc)
Adds migrate.md — migration notes, model routing, knowledge layer design, dispatch scheduling

What SPEC.md covers (26 sections)

Phase state machine, orchestration loop, worker attempt lifecycle, context budget, supervision + circuit breaker, hook pipeline, workspace management (symlink-aware path containment), worktree isolation, verification gates, configuration + dynamic reload, model routing, knowledge layer (Hindsight + memory tiers + anti-pattern library), persistent agents, inter-agent messaging, observability, failure taxonomy, trust boundary, distributed SSH execution, plugin extension points, secret management (Vault), CLI commands, conformance checklist.

🤖 Generated with Claude Code

SPEC.md is the authoritative language-agnostic specification (1721 lines, RFC 2119 normative language, 55-item conformance checklist) covering all 26 design sections: phase state machine, orchestration loop, worker attempt lifecycle, context budget, supervision, hook pipeline, workspace management, verification gates, model routing, knowledge layer (Hindsight), persistent agents, inter-agent messaging, observability, failure taxonomy, trust boundary, distributed SSH execution, plugin extension points, and secret management.

harness.md and migrate.md are the research/working notes from which SPEC.md was synthesised. docs/hooks/FUTURE.md adds the SF hook event table.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

qodo-code-review · 2026-04-29T04:26:51Z

Review Summary by Qodo

Add comprehensive singularity-crush specification, harness design, and migration documentation

📝 Documentation

Walkthroughs

Description

• Adds comprehensive SPEC.md (1721 lines) — authoritative specification for singularity-crush with
  RFC 2119 normative language covering 26 sections including phase state machine, orchestration loop,
  worker lifecycle, knowledge layer, persistent agents, and 55-item conformance checklist across 5
  tiers
• Adds harness.md (899 lines) — engineering practices document defining clean architectural
  boundaries and 20 core harness concerns with RFC 2119 normative rules for budget management, phase
  transitions, hook pipelines, supervision, and distributed execution
• Adds migrate.md — migration guide from SF TypeScript to singularity-crush Go, detailing
  architecture decisions, prompt template contracts, HTTP observability API, git-aware revert
  protocol, dispatch scheduling, Hindsight memory integration, and plugin extension points
• Adds docs/hooks/FUTURE.md — SF harness hook events specification documenting 7 new event types
  (PreDispatch, PostUnit, PhaseChange, AutoLoop, AgentWake, AgentIdle, AgentMessage)
  with payload structures and aggregation behavior
• Adds mise.toml — tool configuration file with environment variables

Diagram

flowchart LR
  SF["SF TypeScript<br/>Codebase"]
  SPEC["SPEC.md<br/>1721-line Spec"]
  HARNESS["harness.md<br/>Engineering Practices"]
  MIGRATE["migrate.md<br/>Migration Guide"]
  HOOKS["docs/hooks/FUTURE.md<br/>Hook Events"]
  SC["singularity-crush<br/>Go Implementation"]
  
  SF -- "migration path" --> MIGRATE
  MIGRATE -- "references" --> SPEC
  MIGRATE -- "references" --> HARNESS
  SPEC -- "defines" --> SC
  HARNESS -- "defines" --> SC
  HOOKS -- "extends" --> SC

File Changes

1. SPEC.md 📝 Documentation +1721/-0

Comprehensive singularity-crush specification with phase machine and orchestration

• Adds comprehensive 1721-line specification for singularity-crush with RFC 2119 normative language
 covering 26 sections
• Defines phase state machine (10 phases), orchestration loop, worker attempt lifecycle, and context
 budget management
• Specifies data model with SQLite schema, supervision system with 9 built-in checks, and hook
 pipeline architecture
• Details knowledge layer (Hindsight integration), persistent agents with memory blocks, inter-agent
 messaging, and model routing with three tiers
• Includes 55-item conformance checklist across 5 tiers (core, knowledge layer, model routing,
 persistent agents, extensions)

SPEC.md

2. migrate.md 📝 Documentation +779/-0

Migration guide from SF TypeScript to singularity-crush Go with architecture decisions

• Explains singularity-crush as Crush on autopilot, mapping existing codebases and driving
 autonomous execution through research → plan → execute → verify → complete phases
• Details what Crush already provides (agent loop, LLM multi-provider, MCP, SQLite, TUI) and what SF
 adds (planning system, phase dispatch, git/worktree management, session state)
• Specifies prompt template contract with strict variable checking (unit, attempt, phase,
 session_id) and continuation turn guidance-only prompts
• Covers HTTP observability API, git-aware revert protocol, dispatch scheduling with priority
 ordering and blocker-aware dispatch, and model routing with three tiers and benchmarking
• Describes Hindsight memory integration (two-bank pattern, anti-pattern library, pattern
 maturation, confidence decay) replacing SF's flat KNOWLEDGE.md
• Lists plugin extension points (SupervisorCheck, Shipper, VCS, Store, Notifier) and effort estimate
 (~11-14 weeks for working SF-equivalent with persistent agents)

migrate.md

3. docs/hooks/FUTURE.md 📝 Documentation +62/-0

SF harness hook events for unit lifecycle and persistent agents

• Adds SF harness hook events section documenting 7 new event types: PreDispatch, PostUnit,
 PhaseChange, AutoLoop, AgentWake, AgentIdle, AgentMessage
• Specifies PostUnit payload structure with unit metadata, verdict, duration, tokens, cost, model,
 and learnings
• Details aggregation behavior: PreDispatch and AgentWake follow PreToolUse semantics with
 allow/deny/halt; PostUnit and AutoLoop are notification-only
• Explains AgentWake/AgentIdle hooks for persistent agent lifecycle and AgentMessage hooks for
 enforcing routing policy between agents

docs/hooks/FUTURE.md


4. mise.toml ⚙️ Configuration changes +2/-0

Add mise tool configuration

• Adds mise configuration file with environment variable for OpenAI Codex latest version

mise.toml

5. harness.md 📝 Documentation +899/-0

Harness engineering practices and architectural boundaries specification

• Adds comprehensive 899-line engineering practices document for singularity-crush harness layer
• Defines clean architectural boundaries between agent loop, orchestration logic, and planning
 layers
• Specifies 20 core harness concerns: context budget, phase transitions, unit lifecycle hooks,
 session contract, observability, supervision, tool sandboxing, configuration, knowledge layer
 integration, post-unit hooks, worktree isolation, model routing, verification gates, environment
 variables, persistent agents, inter-agent messaging, failure taxonomy, worker attempt loop,
 distributed SSH execution, and trust boundaries
• Establishes RFC 2119 normative rules for budget compaction, phase state machines, hook pipelines,
 session recovery, structured logging with span-based tracing, supervisor checks (stuck loop,
 timeout, abandon detection, circuit breaker), tool response contracts, and symlink-aware path
 containment

harness.md

qodo-code-review · 2026-04-29T04:26:53Z

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)

1. ~~Phase flow inconsistent~~ ☑ 🐞 Bug ≡ Correctness

Description

Multiple docs define different canonical phase sequences (5-phase vs 8-phase), making the state
machine ambiguous and likely to be implemented incorrectly.

Code

SPEC.md[46]

+singularity-crush is an autopilot layer built on top of [charmbracelet/crush](https://github.com/charmbracelet/crush). Crush is an interactive coding agent — a human drives it turn by turn. singularity-crush adds a harness that drives Crush autonomously through a structured phase sequence (research → plan → execute → verify → complete) without human intervention per unit, while the human watches or steers.

Evidence

SPEC’s Overview describes a 5-phase loop, while the Phase State Machine section defines an 8-phase
“Standard flow”; harness.md and migrate.md also repeat differing shorthand sequences. This creates
an ambiguous source of truth for the orchestrator’s state machine.

SPEC.md[44-47]
SPEC.md[271-274]
harness.md[54-57]
migrate.md[5-8]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The docs define conflicting “canonical” phase sequences (5-phase vs 8-phase). Because these documents are meant to guide implementation, the phase/state-machine contract needs a single, consistent definition.
### Issue Context
- SPEC.md Overview uses: research → plan → execute → verify → complete.
- SPEC.md Phase State Machine defines: Research → Plan → Execute → TDD → Verify → Review → Merge → Complete.
- harness.md and migrate.md also include shorthand phase sequences that don’t match the standard flow.
### Fix Focus Areas
- SPEC.md[44-47]
- SPEC.md[271-274]
- harness.md[54-57]
- migrate.md[5-8]
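
One way to give all four documents a single source of truth is to define the sequence once in code and render the prose form from it. The sketch below is illustrative only: it assumes the eight-phase "Standard flow" quoted above is the canonical one, and the `Phase` type, constant names, and `StandardFlow` helper are hypothetical rather than taken from SPEC.md. Phases outside the standard flow are deliberately left out.

```go
package main

import "fmt"

// Phase is one step of the standard flow quoted in the review:
// Research → Plan → Execute → TDD → Verify → Review → Merge → Complete.
// The type and constant names are assumptions made for this sketch.
type Phase int

const (
	PhaseResearch Phase = iota
	PhasePlan
	PhaseExecute
	PhaseTDD
	PhaseVerify
	PhaseReview
	PhaseMerge
	PhaseComplete
)

var phaseNames = [...]string{
	"research", "plan", "execute", "tdd", "verify", "review", "merge", "complete",
}

// String returns the lower-case phase name used in the docs.
func (p Phase) String() string {
	if p < PhaseResearch || p > PhaseComplete {
		return fmt.Sprintf("phase(%d)", int(p))
	}
	return phaseNames[p]
}

// Next returns the following phase; ok is false at the terminal phase
// and for values outside the defined range.
func (p Phase) Next() (next Phase, ok bool) {
	if p < PhaseResearch || p >= PhaseComplete {
		return p, false
	}
	return p + 1, true
}

// StandardFlow renders the whole sequence so every document can quote
// the same string instead of maintaining its own shorthand.
func StandardFlow() string {
	s := PhaseResearch.String()
	for p, ok := PhaseResearch.Next(); ok; p, ok = p.Next() {
		s += " → " + p.String()
	}
	return s
}
```

A documentation check could then compare the sequences in SPEC.md, harness.md, and migrate.md against `StandardFlow()` and fail when any of them drifts.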


2. ~~Prompt vars inconsistent~~ ☑ 🐞 Bug ≡ Correctness

Description

SPEC.md defines canonical execute-task template variables, but harness.md and migrate.md document
different variable sets, increasing the risk of template/code drift and strict-render startup
panics.

Code

migrate.md[R67-77]

+## Prompt template contract
+
+Every dispatch renders the unit's prompt template with a strict variable checker — unknown variables fail rendering immediately (not silently). Template input variables:
+
+| Variable | Type | Value |
+|---|---|---|
+| `unit` | object | Full unit record: id, type, phase, title, description, labels, blockers |
+| `attempt` | integer or null | `null` on first dispatch; `1+` on retry or continuation |
+| `phase` | string | Current phase name (`execute`, `tdd`, etc.) |
+| `session_id` | string | Stable session UUID |
+

Evidence

SPEC.md lists canonical variables (including unit_id, unit_type, issue, last_error, etc.),
while migrate.md documents a unit object and fewer fields, and harness.md omits some SPEC-listed
variables. With strict rendering (“unknown variable MUST cause loadPrompt to panic”), divergent
documentation is likely to produce broken templates/tests.

SPEC.md[487-505]
harness.md[131-145]
migrate.md[67-77]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Prompt template variable documentation diverges across the new docs. Given strict template rendering (unknown vars panic at startup), this inconsistency is likely to cause broken templates/tests during implementation.
### Issue Context
- SPEC.md defines the canonical execute-task variables.
- harness.md and migrate.md present different variable sets/structures.
### Fix Focus Areas
- SPEC.md[487-505]
- harness.md[131-145]
- migrate.md[67-77]
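
The strict-rendering behaviour is what makes the documentation drift dangerous, and it can be reproduced with Go's standard `text/template` package. This is a minimal sketch, not the project's `loadPrompt`: the `renderStrict` helper is hypothetical, and it returns an error where the spec says the real loader panics. It relies on the `missingkey=error` option, which makes execution fail when a template references a key that is absent from a map.

```go
package main

import (
	"strings"
	"text/template"
)

// renderStrict renders a prompt template and fails on any variable that
// is missing from vars, instead of silently printing "<no value>".
// The helper name and signature are assumptions for this sketch.
func renderStrict(name, body string, vars map[string]any) (string, error) {
	tmpl, err := template.New(name).Option("missingkey=error").Parse(body)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, vars); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

With this option, a template written against migrate.md's variable names but rendered with SPEC.md's variable set fails on the first unknown key, which is the failure mode the review warns about.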


3. PostUnit semantics ambiguous 🐞 Bug ⚙ Maintainability

Description

docs/hooks/FUTURE.md simultaneously says PostUnit hooks can abort a session on non-zero exit, and
also says PostUnit is “notification-only” and cannot block, leaving implementers without a coherent
contract.

Code

docs/hooks/FUTURE.md[R415-438]

+`PostUnit` hooks that exit non-zero signal `SignalAbort` — the harness stops
+the session. Hooks that time out (default 30s) are killed and logged but do
+not block the next dispatch. This is the primary hook for: git commit/push,
+hermes-memory feedback, test gate execution, custom notifications.
+
+### AgentWake / AgentIdle hooks
+
+These fire per persistent agent, not per session. A hook on `AgentWake` can
+gate which agents are allowed to start (e.g. enforce a fleet size limit). A
+hook on `AgentIdle` is the natural place for post-turn git operations scoped
+to that agent's workspace.
+
+`AgentMessage` hooks fire before the message is delivered to the inbox. A
+`deny` decision drops the message and returns an error to the calling agent's
+`send_message` tool. Use this to enforce routing policy (e.g. an agent cannot
+message outside its designated group).
+
+### Aggregation behaviour for new events
+
+`PreDispatch` and `AgentWake` follow `PreToolUse` semantics: any `deny` or
+`halt` blocks the dispatch/wake. `PostUnit`, `AgentIdle`, and `AutoLoop` are
+notification-only — hooks cannot block these events, only observe them.
+`PhaseChange` and `AgentMessage` support `deny` to block the transition or
+message delivery respectively.

Evidence

One section states PostUnit non-zero exits stop the session (abort behavior), while the aggregation
section classifies PostUnit as non-blocking/notification-only. These statements need reconciliation
or explicit definitions (e.g., whether abort is considered ‘blocking’).

docs/hooks/FUTURE.md[415-418]
docs/hooks/FUTURE.md[432-438]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The document gives two incompatible interpretations of PostUnit hooks:
- non-zero exit aborts the session
- PostUnit is notification-only and cannot block
### Issue Context
Implementers need an explicit contract for whether PostUnit hooks are allowed to influence control flow (abort/pause) or are strictly observational.
### Fix Focus Areas
- docs/hooks/FUTURE.md[415-438]
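
One possible reconciliation is to separate "can veto the event" from "can abort the session" and state both per event. The sketch below encodes that reading; it is one interpretation of the quoted text, not the contract the authors chose. The `Contract` type, the `Outcome` function, and the returned strings are hypothetical, while the event names come from docs/hooks/FUTURE.md.

```go
package main

// Contract states, per event, what a hook is allowed to influence.
// Both fields are assumptions introduced to make the two quoted
// statements about PostUnit compatible.
type Contract struct {
	Vetoable       bool // a deny/halt decision blocks the event itself
	AbortOnNonZero bool // a non-zero exit raises SignalAbort for the session
}

var contracts = map[string]Contract{
	"PreDispatch":  {Vetoable: true},
	"AgentWake":    {Vetoable: true},
	"PhaseChange":  {Vetoable: true},
	"AgentMessage": {Vetoable: true},
	"PostUnit":     {AbortOnNonZero: true}, // cannot veto the finished unit, can stop the session
	"AgentIdle":    {},
	"AutoLoop":     {},
}

// Outcome reports what the harness would do after running hooks for an
// event, given whether any hook denied and the worst exit code seen.
func Outcome(event string, denied bool, exitCode int) string {
	c := contracts[event]
	switch {
	case c.Vetoable && denied:
		return "blocked"
	case c.AbortOnNonZero && exitCode != 0:
		return "abort"
	default:
		return "continue"
	}
}
```

Under this reading, "notification-only" means the completed unit cannot be vetoed, while a failing `PostUnit` hook can still end the session.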


gemini-code-assist

Code Review

This pull request introduces the specification, migration notes, and harness engineering practices for singularity-crush, an autonomous autopilot layer for the Crush coding agent. The changes define the data model, phase state machine, orchestration loop, and knowledge layer integration. My feedback identifies several inconsistencies in the SQLite schema definitions regarding timestamp types and vector storage, as well as a documentation discrepancy regarding log truncation limits and a redundant interface definition.

gemini-code-assist · 2026-04-29T04:29:11Z

+CREATE TABLE memories (
+    id            TEXT PRIMARY KEY,
+    content       TEXT NOT NULL,
+    embedding     F32_BLOB(2560),        -- Qwen3-Embedding-4B; NULL until indexed


The F32_BLOB(2560) type and the USING libsql_vector_idx index syntax (line 200) are specific to libsql. Since the project overview (line 54) mentions ncruces/go-sqlite3 and migrate.md (line 443) mentions sqlite-vec, it would be more accurate to use a standard BLOB type or the specific syntax required by sqlite-vec to ensure compatibility with the chosen database engine.

Suggested change

-    embedding F32_BLOB(2560), -- Qwen3-Embedding-4B; NULL until indexed
+    embedding BLOB, -- Vector embedding; NULL until indexed

gemini-code-assist · 2026-04-29T04:29:11Z

+    last_accessed TEXT,
+    valid_until   TEXT,
+    superseded_by TEXT,
+    created_at    TEXT NOT NULL,
+    updated_at    TEXT NOT NULL


The memories table uses TEXT for several timestamp fields (last_accessed, valid_until, created_at, updated_at), whereas all other tables in this specification (e.g., sessions, units, agents) use INTEGER for Unix timestamps. For consistency and to simplify date-based queries (such as the decay logic in section 16.6), these should be changed to INTEGER.

Suggested change

-    last_accessed TEXT,
-    valid_until   TEXT,
+    last_accessed INTEGER,
+    valid_until   INTEGER,
     superseded_by TEXT,
-    created_at    TEXT NOT NULL,
-    updated_at    TEXT NOT NULL
+    created_at    INTEGER NOT NULL,
+    updated_at    INTEGER NOT NULL

gemini-code-assist · 2026-04-29T04:29:11Z

+type SupervisorCheck interface {
+    Name() string
+    Check(ctx context.Context, state SupervisorState) SupervisorSignal
+}


This definition of the SupervisorCheck interface is redundant as it was already defined identically in section 9.1 (lines 581-584).

gemini-code-assist · 2026-04-29T04:29:11Z

+| Phase transitions | `from=`, `to=`, `reason=` |
+| Gate execution | `gate=`, `attempt=`, `passed=` |
+
+Include action outcome in the message: `completed`, `failed`, `retrying`, `canceled`. Never log large raw payloads — truncate at 512 bytes and note `[truncated]`. If a log sink fails, continue running and emit a warning through any remaining sink.


The truncation limit for raw payloads is specified as 512 bytes here, but line 270 in this file and several places in SPEC.md (e.g., lines 1240, 1669) specify 2 KB. These should be synchronized to avoid confusion during implementation.

Suggested change

-Include action outcome in the message: `completed`, `failed`, `retrying`, `canceled`. Never log large raw payloads — truncate at 512 bytes and note `[truncated]`. If a log sink fails, continue running and emit a warning through any remaining sink.
+Include action outcome in the message: `completed`, `failed`, `retrying`, `canceled`. MUST NOT log large raw payloads — truncate hook output at 2 KB and append `(truncated)`. If a log sink fails, continue running and emit a warning through any remaining sink.
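
A single shared helper is one way to keep the limit from diverging between files again. The following is a minimal sketch under the 2 KB rule from the suggestion above; the function name, the constant, and the leading space before the marker are assumptions.

```go
package main

// maxLoggedPayload is the 2 KB limit from the suggested wording.
const maxLoggedPayload = 2 * 1024

// truncateForLog returns the payload unchanged when it fits, otherwise
// the first 2 KB followed by the "(truncated)" marker.
// Note: cutting at a byte offset can split a multi-byte UTF-8 character;
// a production version may want to back up to a rune boundary.
func truncateForLog(payload []byte) string {
	if len(payload) <= maxLoggedPayload {
		return string(payload)
	}
	return string(payload[:maxLoggedPayload]) + " (truncated)"
}
```

Every log sink would call the same function, so the limit is defined in exactly one place.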

gemini-code-assist · 2026-04-29T04:29:11Z

+### Storage
+
+**sqlite-vec** with `F32_BLOB(2560)` columns in the same SQLite DB Crush already uses (`ncruces/go-sqlite3`). No extra processes, no separate index files. FTS5 virtual table alongside for BM25 hybrid search — FTS5 is the fallback when the embedding endpoint is unreachable.
+


sqlite-vec does not use the F32_BLOB syntax; that is a libsql specific type. For consistency with the chosen extension, this should refer to BLOB or the specific float[N] syntax if applicable.

Suggested change

sqlite-vec with BLOB columns in the same SQLite DB Crush already uses (ncruces/go-sqlite3).
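
If the column is a plain `BLOB`, the application has to serialise the embedding itself. The sketch below packs a `[]float32` as little-endian 4-byte values; that layout is an assumption here and should be checked against the documentation of whichever vector extension is used. The function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"math"
)

// encodeVector packs an embedding as consecutive little-endian float32
// values, 4 bytes per dimension. The byte layout is an assumption.
func encodeVector(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return buf
}

// decodeVector reverses encodeVector; trailing bytes that do not form a
// full 4-byte value are ignored.
func decodeVector(b []byte) []float32 {
	v := make([]float32, len(b)/4)
	for i := range v {
		v[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return v
}
```

For the 2560-dimension embedding named in the schema comment, this yields a 10240-byte value per row.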

gemini-code-assist · 2026-04-29T04:29:11Z

+CREATE TABLE memories (
+    id           TEXT PRIMARY KEY,
+    content      TEXT NOT NULL,
+    embedding    F32_BLOB(2560),  -- Qwen3-Embedding-4B
+    decay_factor REAL DEFAULT 1.0,
+    confidence   REAL DEFAULT 0.7,     -- affects half-life
+    maturity     TEXT DEFAULT 'candidate',
+    is_negative  INTEGER DEFAULT 0,
+    helpful_hits INTEGER DEFAULT 0,
+    harmful_hits INTEGER DEFAULT 0,
+    access_count INTEGER DEFAULT 0,
+    collection   TEXT DEFAULT 'default',
+    tags         TEXT,                 -- JSON array
+    last_accessed TEXT,
+    valid_until  TEXT,
+    superseded_by TEXT,
+    created_at   TEXT NOT NULL,
+    updated_at   TEXT NOT NULL


Similar to the memories table in SPEC.md, the timestamp fields here should use INTEGER for consistency with the rest of the schema, and F32_BLOB should be replaced with a type compatible with sqlite-vec (e.g., BLOB).

Suggested change

 CREATE TABLE memories (
     id TEXT PRIMARY KEY,
     content TEXT NOT NULL,
-    embedding F32_BLOB(2560), -- Qwen3-Embedding-4B
+    embedding BLOB, -- Vector embedding
     decay_factor REAL DEFAULT 1.0,
     confidence REAL DEFAULT 0.7, -- affects half-life
     maturity TEXT DEFAULT 'candidate',
     is_negative INTEGER DEFAULT 0,
     helpful_hits INTEGER DEFAULT 0,
     harmful_hits INTEGER DEFAULT 0,
     access_count INTEGER DEFAULT 0,
     collection TEXT DEFAULT 'default',
     tags TEXT, -- JSON array
-    last_accessed TEXT,
-    valid_until TEXT,
+    last_accessed INTEGER,
+    valid_until INTEGER,
     superseded_by TEXT,
-    created_at TEXT NOT NULL,
-    updated_at TEXT NOT NULL
+    created_at INTEGER NOT NULL,
+    updated_at INTEGER NOT NULL
+);

…vs-units

Critical-item resolution from v0.1 review.

Removed:
- Local sqlite-vec memories table: Hindsight is the sole knowledge backend. Drops F32_BLOB / libsql_vector_idx / vector_top_k references that wouldn't have compiled against ncruces/go-sqlite3.
- specs.check as a user-project runtime gate. Moved to project CI on the singularity-crush repo itself.

Added:
- §3.3 Task Tracker Integration: Tracker interface, lifecycle states (active|blocked|done|cancelled|unknown), built-in adapters (linear, github, jira, sqlite), (tracker_kind, tracker_id) unique key, failure handling.
- §4.7 Crash Recovery, concrete model: in-memory state lost; running units marked interrupted on startup; fresh re-dispatch from last persisted phase boundary; tool calls NOT replayed.
- §17.1 Agent-vs-unit comparison table: defines what's shared (worker attempt lifecycle, supervisor checks) and what's different (no phase machine, no gates, persistent budget) for persistent agent runs.
- circuit_breakers and schema_migrations tables to §3.1.
- parent_id, claim_holder, claim_until, tracker_kind, tracker_id, interrupted status to units schema.
- Claim and Tracker definitions to §2.

Fixed:
- Attempt is 1-indexed (default 1, not 0); retry table aligned with formula.
- Hook timeout default unified to 60s (was 30s in §10.3, 60s in config).
- Doc-sync is now a sub-step of PhaseMerge (was undefined placement).
- max_turns_per_attempt, turn_timeout, stall_timeout, hot_cache_turns, max_attempts added to canonical config schema.

Conformance: added C-41..C-46 covering tracker, crash recovery, doc-sync placement, and SQLite-orchestration-only constraint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…weep, more

Iteration 1 of self-paced review loop. Score moved 0.79 → ~0.88.

Adversarial fixes:
- A1: Tracker BlockedBy translation now defined (placeholder rows for upstream not-yet-fetched).
- A2: Claim expiry sweep is now an explicit orchestrator responsibility on every poll tick.
- A4: Memory-block "last-writer-wins" rule was wrong: agents don't share blocks; replaced with the actual concurrency story (serial tool calls within a turn, commit between turns).
- A6: SignalAbort now has a 5s/3s/SIGKILL escalation for in-flight tool calls. Documented tool_abort_grace and tool_abort_kill config.
- agent_inbox/messages now have a 30d retention sweep with archive to .sf/archive/agents/{id}/inbox-{YYYY-MM}.jsonl.

Architectural fixes:
- B2: Run is now a first-class abstraction. New `runs` table unifies unit_attempt and agent_run with ULID id, run_kind, outcome, error_code, token/cost columns. Trace and billing key on run_id.
- B7: local_anti_patterns SQLite mirror: anti-patterns survive Hindsight outage. The one knowledge category small/critical enough to dual-store.
- B8: Unit ID format now defined in §2 (milestone/m{n}, slice/m{n}/s{n}, task/m{n}/s{n}/t{n}). Self-describing in logs.
- B9: Per-phase unit_timeout via [harness.unit_timeout_by_phase]; default 10m was too tight for reasoning-tier phases.
- B10: Tier names enumerated as fixed (fast | standard | reasoning); custom tier names not supported.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
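
The unit ID format from B8 (milestone/m{n}, slice/m{n}/s{n}, task/m{n}/s{n}/t{n}) is regular enough to validate with one pattern. The parser below is a sketch: the `UnitID` struct and `ParseUnitID` function are hypothetical, and treating `{n}` as one to nine decimal digits is an assumption, since the commit message does not say how `{n}` is constrained.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// UnitID is the parsed form of a self-describing unit identifier.
type UnitID struct {
	Kind      string // "milestone", "slice", or "task"
	Milestone int
	Slice     int // zero for milestones
	Task      int // zero for milestones and slices
}

// unitIDPattern accepts exactly the three shapes listed in the commit
// message. Limiting {n} to 1-9 digits is an assumption of this sketch.
var unitIDPattern = regexp.MustCompile(
	`^(?:milestone/m(\d{1,9})|slice/m(\d{1,9})/s(\d{1,9})|task/m(\d{1,9})/s(\d{1,9})/t(\d{1,9}))$`)

// ParseUnitID validates s and extracts its numeric components.
func ParseUnitID(s string) (UnitID, error) {
	m := unitIDPattern.FindStringSubmatch(s)
	if m == nil {
		return UnitID{}, fmt.Errorf("invalid unit id %q", s)
	}
	num := func(digits string) int {
		n, _ := strconv.Atoi(digits) // cannot fail: the pattern admits 1-9 digits only
		return n
	}
	switch {
	case m[1] != "":
		return UnitID{Kind: "milestone", Milestone: num(m[1])}, nil
	case m[2] != "":
		return UnitID{Kind: "slice", Milestone: num(m[2]), Slice: num(m[3])}, nil
	default:
		return UnitID{Kind: "task", Milestone: num(m[4]), Slice: num(m[5]), Task: num(m[6])}, nil
	}
}
```

For example, `ParseUnitID("task/m1/s2/t3")` yields a task with milestone 1, slice 2, and task 3, while `slice/m1` is rejected because a slice ID must carry its own `s{n}` component.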

…mination

Score moved 0.88 → ~0.93.

Adversarial fixes:
- C1: units.attempt vs runs.attempt authority resolved: units.attempt is current counter, runs is historical; both updated in same transaction.
- C2: runs CHECK constraint enforces XOR between unit_attempt and agent_run.
- C3: outcome enum split: unit_timeout | turn_timeout | stalled distinct.
- C5: atomic claim via conditional UPDATE (rows_affected = 1 gates dispatch); safe under multi-orchestrator even if run.lock is missing.
- C6: task_blockers gets FK references with ON DELETE CASCADE.
- C7: PhaseUAT trigger defined: workflow require_uat = true; /sf uat-approve resumes; release.toml example added.
- A11: last_error injected only on TurnFirst of attempt >= 2; clarifies the continuation/retry interaction.
- A12: tracker `unknown` mid-run does NOT cancel; only blocks new dispatch. `blocked` mid-run also non-cancelling. Protects against flaky tracker APIs.
- C11: Hindsight retain failures queue in pending_retain table with exponential backoff; flush to lost-learnings.jsonl after 7d. No silent knowledge loss to outage.

Architectural fixes:
- D1: runs aggregate columns documented as end-of-run rollup; spans remain authoritative.
- D8: /sf rate target now precise: last completed run in current session.
- D9: workflow selection at dispatch defined: tracker label → default → fallback. Pinned at first dispatch, immutable for retries.
- D10/A8: agent run termination conditions enumerated. Hot cache NOT preserved across agent runs; durable memory blocks and messages ARE.

Schema:
- units.workflow column added (pinned per-unit).
- runs CHECK constraint on run_kind XOR.
- task_blockers FK with cascade.
- pending_retain table for Hindsight outage queue.

Conformance: C-47 through C-55 added.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
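
The atomic claim in C5 can be pictured as a compare-and-set expressed in SQL: the UPDATE only matches an unclaimed or expired row, and the affected-row count tells the caller whether it won. The sketch below uses the standard `database/sql` interfaces. The column names `claim_holder` and `claim_until` come from the earlier commit message; storing `claim_until` as integer Unix seconds, the exact WHERE clause, and every Go identifier here are assumptions. The `stubDB` type exists only so the function can be exercised without a database.

```go
package main

import (
	"context"
	"database/sql"
)

// execer is the subset of *sql.DB this sketch needs; *sql.DB and
// *sql.Tx both satisfy it.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// claimSQL only matches a unit that is unclaimed or whose claim expired.
// Treating claim_until as Unix seconds is an assumption of this sketch.
const claimSQL = `UPDATE units
   SET claim_holder = ?, claim_until = ?
 WHERE id = ?
   AND (claim_holder IS NULL OR claim_until < ?)`

// TryClaim returns true only when exactly one row was updated, meaning
// this orchestrator won the claim and may dispatch the unit.
func TryClaim(ctx context.Context, db execer, unitID, holder string, now, ttlSeconds int64) (bool, error) {
	res, err := db.ExecContext(ctx, claimSQL, holder, now+ttlSeconds, unitID, now)
	if err != nil {
		return false, err
	}
	n, err := res.RowsAffected()
	if err != nil {
		return false, err
	}
	return n == 1, nil
}

// stubDB is a test double that reports a fixed rows-affected count.
type stubDB struct{ affected int64 }

type stubResult int64

func (r stubResult) LastInsertId() (int64, error) { return 0, nil }
func (r stubResult) RowsAffected() (int64, error) { return int64(r), nil }

func (s stubDB) ExecContext(context.Context, string, ...any) (sql.Result, error) {
	return stubResult(s.affected), nil
}
```

Because the database serialises the two competing UPDATE statements, a second orchestrator issuing the same statement sees zero affected rows and skips the unit, which is why the commit message describes this as safe even when run.lock is missing.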

Score moved 0.93 → ~0.96.

Adversarial fixes:
- E1: ULID consistency required for all runtime PKs.
- E2: workspace authority: runs.workspace per-attempt, units.workspace = latest.
- E4: parent_id depth via CHECK on type + harness validation.
- E5: lock is per-project at <project>/.sf/run.lock; multiple projects can run auto.
- E7: per-hook-type timeouts table; before_run 120s, post_unit 60s, doc_sync 5m.
- E9/F6: soft-delete instead of cascade: units.archived_at, agents.archived_at; runs FK uses ON DELETE SET NULL; runs.unit_id_snap and agent_name_snap preserve forensics across entity deletion.
- E3: agent compaction preserves wake message + recent 3 inbox arrivals + durable memory blocks; thread continuity guaranteed.
- C8: PreToolUse hooks outrank auto_approve list (deny wins).
- C10: SSH disconnect handling: error_code "ssh_disconnected", remote zombie cleanup via marker pgrep, host quarantine on cleanup failure, orphaned workspace preserved.
- A3: PhaseChange documented as non-vetoable; veto semantics on PreDispatch.

Architectural fixes:
- F2: Binary integration model: sf is single fork binary; Crush internal/ re-used directly; /sf <subcommand> namespace extends slash-command router.
- F3+F4: Project vs Session boundary defined; per-project DB at <project>/.sf/sf.db; sessions are project-scoped, ULID, 30d inactivity TTL.
- F7: PhaseReassess behavior: three outcomes (Re-plan / Abandon / Escalate); reasoning tier with Think; max_reassess decrements only on Re-plan.
- B6: Slice merge ordering: dependency-aware via code_depends_on; serial merge gate per project; falls back to created_at order.
- C13: Canonical .sf/ directory layout documented in new §14.5: config, workflows, hooks, gates, sf.db, locks, active/, archive/, log/, runtime/, trace/.
- D6: Three-pass review: establish-context → parallel chunked review → synthesis. Cross-file context no longer blind.
- D7: Doc-sync runs at end of last code-mutating phase, not just Merge. Spike workflows that adopt new dependencies now get doc updates.

Schema:
- units.archived_at, agents.archived_at for soft-delete.
- agents.capabilities (JSON), max_turns_per_run.
- runs.unit_id_snap, agent_name_snap for forensic preservation.
- CHECK constraints on enums (units.type, units.phase_status, agents.state).

Conformance: C-56 through C-68.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… handoff

Score moved 0.96 → ~0.98.

Adversarial fixes:
- G2: session_blockers resolution rules table (auto vs user vs command).
- G3: cost stored as INTEGER micro-USD (cost_micro_usd, cost_per_1k_micro_usd); float drift eliminated.
- G5: HTTP API ?session=<id> filter; multi-session DB returns sessions array.
- G6: runs.outcome includes 'interrupted' with CHECK enum.
- G7: SSH auth model: agent / key / key+agent; ssh_known_hosts MUST verify.
- G8: UAT timeout = 0 (infinite); advance via /sf uat-approve.
- G10: max_agents_by_phase.review default = 4; merge = 1 (serial).
- G11: last_error capped at 4 KB head+tail; full payload at last-error-full.txt.

Architectural fixes:
- H2: projectHash derivation defined: git remote SHA-256 → path fallback; cached in .sf/runtime/project-hash.json so same project hits same Hindsight bank from any clone.
- H3: workflow content pinning via workflow_pins table + units.workflow_hash. In-flight units locked to pinned content; on-disk template changes affect only new units.
- H5: HTTP API auth via Bearer token at .sf/runtime/api.token (mode 0600).
- H6: /sf doctor exit code spec; --json structured output.
- B5: Per-turn semantic outcome via <turn_status> marker (complete | blocked | giving_up); checkpoint between turns without phase boundary.
- B4: trace_index SQL table for /sf forensics; spans-on-disk + SQL pointer layer for fast lookup.
- B1: handoff supports capability:tag1,tag2 form with round-robin matching; ErrNoCapableAgent if none.
- A10: dynamic reload of session-immutable fields warns + keeps in-process value + surfaces in /sf status as drift; does NOT crash.
- C12: providers section in canonical config; vault:// required, plaintext rejected at startup.
- D5: trace JSONL _meta first-line record with trace_schema_version.
- F1: conformance items tagged [REQUIRED] / [STRONG] / [OPTIONAL].

Schema:
- runs.cost_micro_usd INTEGER (was REAL).
- benchmark_results.cost_per_1k_micro_usd INTEGER (was REAL).
- session_blockers.resolved_by column.
- units.workflow_hash column.
- workflow_pins table.
- trace_index table.

Errors: ErrNoCapableAgent, ErrSshDisconnected, ErrCanceledBySupervisor added.

Conformance: C-69 through C-83.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
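
The move to integer micro-USD in G3 means cost arithmetic stays exact: one micro-USD is 10⁻⁶ USD, so sums over many runs never accumulate binary floating-point error. The sketch below shows the arithmetic. The function names are hypothetical, and both the round-half-up rule and the assumption that inputs are non-negative are choices made for illustration; the commit message does not specify a rounding rule.

```go
package main

import "fmt"

// CostMicroUSD computes the cost of a run from a token count and a price
// expressed in micro-USD per 1000 tokens, using integer arithmetic only.
// Rounding half up is an assumption of this sketch; inputs are assumed
// to be non-negative.
func CostMicroUSD(tokens, costPer1kMicroUSD int64) int64 {
	return (tokens*costPer1kMicroUSD + 500) / 1000
}

// FormatUSD renders a non-negative micro-USD amount as dollars with six
// decimal places, without converting through a float.
func FormatUSD(microUSD int64) string {
	return fmt.Sprintf("$%d.%06d", microUSD/1_000_000, microUSD%1_000_000)
}
```

For 1500 tokens at 3000 micro-USD per 1000 tokens, `CostMicroUSD` returns 4500, which `FormatUSD` renders as `$0.004500`.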

…t, CLI completion

Score moved 0.97 → ~0.98+.

Final-tier fixes:
- Gate script protocol: env vars table, stdin shape, exit codes 0/1/2/3, output expectations, timeout escalation.
- Gate retry counter is now distinct from units.attempt; resets on phase transition.
- plan.md format: frontmatter (unit_id, created_at, written_by, plan_version) + required Goal/Approach/Deliverables/Verification/Notes sections; validated at PhasePlan exit.
- Hindsight client interface formalised: Recall/Retain/Feedback/Validate/Health methods + RecallOpts and Memory types. The wrapper is the seam for testing.
- SF tool registration: SF tools register through Crush's internal/agent/tools rather than a parallel registry; PreToolUse and auto_approve apply uniformly.
- Missing CLI commands added to §25: reassess-resolve, force-clear, merge-resolve, uat-approve, uat-reject, agent {list, run, reset, delete, inspect, history}, history (with filter syntax), clean.
- Trace_index archive-rotation: transactional with file_path UPDATE; interrupted move is repairable.
- agent_capabilities indexed table: capability lookup is now O(log n) instead of full-scan over agents.capabilities JSON.
- Rate-limit observability-only: justified (per-provider semantics differ; router + circuit breaker handle reactive retry).
- Version policy: SemVer; v1.0 freezes §§3, 4, 6, 10, 14, 26.
- Internal/ usage clarified: fork model satisfies Go's internal/ rule.

Conformance: C-84 through C-93.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… layer

These are the working notes that SPEC.md was synthesised from. Both now have a top-of-file callout pointing readers to SPEC.md as authoritative.

Specifically scrubbed:
- harness.md §9: hermes-memory + hermes_memory_* tools + local embedding/reranker pipeline replaced with a redirect to SPEC.md § 16 (Hindsight is the sole knowledge backend).
- harness.md §10: PostUnit hook line updated from "hermes-memory feedback" to "Hindsight feedback via the client wrapper."
- migrate.md "Memory and knowledge" section (~170 lines): the entire sqlite-vec + FTS5 + RRF + Qwen3-Reranker-0.6B/4B + memories schema + retrieval pipeline replaced with a superseded note pointing to SPEC.md.
- migrate.md sf init step: "FTS5 + Qwen3-Embedding-4B vectors" → "Hindsight project bank".

What survives in both docs: phase state machine, hooks, supervisor, worktree, distributed execution, plugin extension points, Vault, skills, /sf revert, dispatch scheduling. Those are broadly aligned with SPEC.md; SPEC.md has the canonical wording.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

After singularity-ng/singularity-memory#1 was merged, the engine formerly known as Hindsight was assimilated into our codebase under singularity_memory_server/ (MIT-attributed). From sf's perspective there is no upstream Hindsight service; Singularity Memory IS the engine.

Changes:
- §2: definition of Singularity Memory (sm); embedded vs remote modes; cross-tool sharing.
- §16.1: rewritten architecture; the "Hindsight is the sole knowledge backend" framing replaced with sm-as-our-engine; embedded-vs-remote table.
- §16.1.1: client interface renamed `Hindsight` → `Memory`; type renamed `Memory` (the struct) → `Entry` to avoid collision; client library is `singularity-memory-client-go` generated from openapi.yaml.
- §3.1: comments updated; local_anti_patterns mirror still applies, just against sm not Hindsight.
- §16.3: two-bank pattern preserved verbatim; references updated.
- §16.7: retrieval delegated to sm rather than Hindsight.
- §14.2: new [memory] config block (mode, url, api_key).
- §25 /sf doctor: checks Singularity Memory connectivity.
- §19.4 chapters / §16.8 sf init: sm references.

What did NOT change:
- Schema (no SQLite changes).
- Two-bank pattern semantics.
- pending_retain queue.
- local_anti_patterns mirror behavior on outage.
- Anti-pattern decay rules.
- Conformance gates (with name updates only).

Conformance: C-94 through C-96 added.

The remaining `hindsight` strings in SPEC.md are MIT attribution links to vectorize-io/hindsight, which stay.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
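To illustrate the renamed client seam, here is a language-neutral sketch in Python of a `Memory` interface with the five methods named in the commits (Recall, Retain, Feedback, Validate, Health), an `Entry` record, and an in-memory fake of the kind a test suite might swap in. The method signatures, the `Entry` and `RecallOpts` fields, and the `FakeMemory` class are assumptions for illustration; SPEC.md §16.1.1 holds the real definitions.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Entry:
    id: str
    text: str
    score: float = 0.0  # hypothetical field


@dataclass
class RecallOpts:
    bank: str = "project"  # hypothetical default bank name
    limit: int = 5         # hypothetical result cap


class Memory(Protocol):
    """The seam: callers depend on this shape, not on a concrete client."""

    def recall(self, query: str, opts: RecallOpts) -> list[Entry]: ...
    def retain(self, bank: str, text: str) -> str: ...
    def feedback(self, entry_id: str, useful: bool) -> None: ...
    def validate(self, entry_id: str) -> bool: ...
    def health(self) -> bool: ...


class FakeMemory:
    """In-memory stand-in for tests; recall is naive substring matching."""

    def __init__(self) -> None:
        self._banks: dict[str, list[Entry]] = {}
        self._feedback: dict[str, int] = {}

    def recall(self, query: str, opts: RecallOpts) -> list[Entry]:
        entries = self._banks.get(opts.bank, [])
        hits = [e for e in entries if query.lower() in e.text.lower()]
        return hits[: opts.limit]

    def retain(self, bank: str, text: str) -> str:
        entries = self._banks.setdefault(bank, [])
        entry = Entry(id=f"{bank}-{len(entries) + 1}", text=text)
        entries.append(entry)
        return entry.id

    def feedback(self, entry_id: str, useful: bool) -> None:
        delta = 1 if useful else -1
        self._feedback[entry_id] = self._feedback.get(entry_id, 0) + delta

    def validate(self, entry_id: str) -> bool:
        return any(e.id == entry_id for es in self._banks.values() for e in es)

    def health(self) -> bool:
        return True
```

Because `Memory` is a structural protocol, `FakeMemory` satisfies it without inheriting from it, which keeps test doubles independent of the generated client.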

Now that singularity-memory lives at singularity-ng/, the spec points explicitly at:
- repo: github.com/singularity-ng/singularity-memory
- Go client: github.com/singularity-ng/singularity-memory-client-go (auto-generated)
- OpenAPI source: the running sm server's /openapi.json (not a checked-in YAML)

Last point matters because the SM repo doesn't ship a static openapi.yaml; FastAPI generates it at runtime. Anyone regenerating the Go client points at a running instance, not a file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
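Since the schema only exists on a running server, a regeneration step would first download it. The sketch below shows one way to do that with the standard library; the helper names and the sanity check are assumptions, and the base URL used in the usage line is a placeholder rather than a documented sm address.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def openapi_url(base_url: str) -> str:
    """Build the schema URL for a running server (path taken from the commit)."""
    return urljoin(base_url.rstrip("/") + "/", "openapi.json")


def check_openapi(doc: dict) -> None:
    """Cheap sanity check before handing the document to a client generator."""
    for key in ("openapi", "paths"):
        if key not in doc:
            raise ValueError(f"not an OpenAPI document: missing {key!r}")


def fetch_openapi(base_url: str, timeout: float = 10.0) -> dict:
    """Download and sanity-check the schema; requires a reachable server."""
    with urlopen(openapi_url(base_url), timeout=timeout) as resp:
        doc = json.load(resp)
    check_openapi(doc)
    return doc
```

A caller would run something like `fetch_openapi("http://localhost:8000")` (hypothetical address) and write the result to a temporary file for the generator.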

Major rewrite. The spec was previously aimed at a Go fork of Crush; that direction was reconsidered after recognising sf already has gen-2 harness control via the vendored pi-mono SDK packages. Forking Crush would mostly duplicate what pi-mono already provides (agent loop, multi-provider via pi-ai, hooks, TUI primitives), pay a fork-merge tax, and ignore that the daemon model already absorbs Node.js cold-start cost.

Implementation target now: the next major version of singularity-forge (sf, formerly Get-Shit-Done / GSD), built directly on the existing packages/pi-* vendored modules. TypeScript, not Go.

External tracker integration dropped entirely. sf's SQLite DB is the sole source of work units. The Symphony-style poll-Linear-and-pull model doesn't fit sf's "user states a goal, sf decomposes" pattern. External visibility (GH Issues, Slack, etc.) is achieved via PostUnit hook scripts (recipe in §10.5.1) rather than core integration.

Major changes:
- §1: implementation target changed from Crush fork to sf v3 on pi-mono. References to packages/daemon, packages/mcp-server, packages/native, packages/rpc-client added; all already exist in sf.
- §2: Tracker definition replaced with Plan definition.
- §3: tracker_kind/tracker_id columns removed from units; metadata JSON column added for arbitrary user-side links.
- §3.3: Tracker Integration replaced with "No external tracker" rationale.
- §4.6 PhaseReassess Abandon: tracker write removed; visibility flows through PostUnit hooks.
- §4.8 crash recovery: tracker reconciliation step removed.
- §6 worker loop: between-turn fetch is local DB read, not tracker call.
- §9 supervisor: ReconciliationCancel check removed.
- §10.5.1 added: PostUnit hook recipe for GH Issues publishing.
- §20 failure taxonomy: tracker class removed; ErrCanceledByOperator replaces ErrCanceledByReconciliation.
- §21 hardening: tracker tool reference replaced with plan_unit.
- §21.3: PreToolUse precedence text de-Crushed.
- §23: plugin loading model changed (TS dynamic import, not Caddy).
- §25: /sf plan and /sf abandon added.
- §26: C-41 / C-42 / C-50 / E-04 / E-05 reframed; E-04 is now plan_unit (agent self-refines plan), not tracker_query.
- All "Crush", "internal/", "Bubbletea", "fantasy", "catwalk" references rewritten to pi-mono / pi-coding-agent / pi-ai / pi-tui equivalents.

What survives unchanged: phase state machine, persistent agents, inter-agent messaging, Singularity Memory integration (§16), workspace path safety (§11), worktree isolation (§12), verification gates (§13), trust boundary (§21), distributed SSH workers (§22), Vault secret management (§24), observability (§19), failure taxonomy structure (§20). The conformance checklist's bones survive, just retargeted from Go-on-Crush to TS-on-sf.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
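The §3 change replaces dedicated tracker columns with a free-form metadata JSON column on units. A minimal sketch of how user-side links could live in that column follows; only the table name `units` and the column name `metadata` come from the commit message, while the other columns, the helper names, and the `github_issue` key are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE units (
        id       TEXT PRIMARY KEY,            -- hypothetical column
        title    TEXT NOT NULL,               -- hypothetical column
        metadata TEXT NOT NULL DEFAULT '{}'   -- JSON object of user-side links
    )
    """
)


def set_link(conn: sqlite3.Connection, unit_id: str, key: str, value: str) -> None:
    """Merge one key into the unit's metadata JSON without touching other keys."""
    row = conn.execute("SELECT metadata FROM units WHERE id = ?", (unit_id,)).fetchone()
    if row is None:
        raise KeyError(unit_id)
    meta = json.loads(row[0])
    meta[key] = value
    conn.execute("UPDATE units SET metadata = ? WHERE id = ?", (json.dumps(meta), unit_id))


def get_links(conn: sqlite3.Connection, unit_id: str) -> dict:
    """Return the unit's metadata as a dict."""
    row = conn.execute("SELECT metadata FROM units WHERE id = ?", (unit_id,)).fetchone()
    if row is None:
        raise KeyError(unit_id)
    return json.loads(row[0])


conn.execute("INSERT INTO units (id, title) VALUES (?, ?)", ("u-1", "Add spec"))
# A PostUnit hook script could record where it published the unit externally.
set_link(conn, "u-1", "github_issue", "https://example.invalid/issues/1")
```

Keeping the links in an opaque JSON object means the core schema never needs to know which external systems a user's hooks publish to.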

gemini-code-assist Bot reviewed Apr 29, 2026


qodo-code-review Bot reviewed Apr 29, 2026


Comment thread SPEC.md Outdated

Comment thread migrate.md

mikkihugo and others added 10 commits April 29, 2026 07:09

mikkihugo wants to merge 11 commits into main from docs/spec


	embedding F32_BLOB(2560), -- Qwen3-Embedding-4B; NULL until indexed
	embedding BLOB, -- Vector embedding; NULL until indexed
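As an illustration of the plain `BLOB` variant, the sketch below stores a float32 vector in a nullable column and finds rows that are still waiting to be indexed. It uses only the Python standard library and does not involve sqlite-vec or any vector search; the `memories` table layout, the `content` column, and the helper names are assumptions.

```python
import sqlite3
from array import array


def pack_vector(values: list[float]) -> bytes:
    """Serialise floats as native-endian float32 bytes."""
    return array("f", values).tobytes()


def unpack_vector(blob: bytes) -> list[float]:
    """Inverse of pack_vector."""
    vec = array("f")
    vec.frombytes(blob)
    return vec.tolist()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memories (
        id        INTEGER PRIMARY KEY,
        content   TEXT NOT NULL,   -- hypothetical column
        embedding BLOB             -- Vector embedding; NULL until indexed
    )
    """
)
conn.execute("INSERT INTO memories (content) VALUES (?)", ("prefer worktrees",))

# Rows without an embedding are the indexing backlog.
pending = [row[0] for row in conn.execute("SELECT id FROM memories WHERE embedding IS NULL")]

# Once an embedding is available, fill the column in.
conn.execute(
    "UPDATE memories SET embedding = ? WHERE id = ?",
    (pack_vector([0.5, -1.0, 2.0]), pending[0]),
)
stored = conn.execute("SELECT embedding FROM memories WHERE id = ?", (pending[0],)).fetchone()[0]
```

A generic `BLOB` leaves the vector dimension unenforced by the schema, so a real implementation would likely validate the byte length before writing.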

	Include action outcome in the message: `completed`, `failed`, `retrying`, `canceled`. Never log large raw payloads — truncate at 512 bytes and note `[truncated]`. If a log sink fails, continue running and emit a warning through any remaining sink.
	Include action outcome in the message: `completed`, `failed`, `retrying`, `canceled`. MUST NOT log large raw payloads — truncate hook output at 2 KB and append `(truncated)`. If a log sink fails, continue running and emit a warning through any remaining sink.
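The truncation rule in the suggested wording can be sketched as a small helper. This assumes that "2 KB" means 2048 bytes of UTF-8 and that the marker is appended directly after the kept text; the function name is hypothetical.

```python
LIMIT_BYTES = 2048      # assumption: 2 KB interpreted as 2048 bytes
MARKER = "(truncated)"  # marker text from the suggested wording


def truncate_for_log(text: str, limit: int = LIMIT_BYTES) -> str:
    """Cap hook output at `limit` UTF-8 bytes and flag that it was cut."""
    raw = text.encode("utf-8")
    if len(raw) <= limit:
        return text
    # errors="ignore" drops a multi-byte character that the slice cut in half,
    # so the result is always valid text.
    return raw[:limit].decode("utf-8", errors="ignore") + MARKER
```

Measuring in bytes rather than characters keeps the cap meaningful for log sinks that bill or rotate by size.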

		### Storage

		sqlite-vec with `F32_BLOB(2560)` columns in the same SQLite DB Crush already uses (`ncruces/go-sqlite3`). No extra processes, no separate index files. FTS5 virtual table alongside for BM25 hybrid search — FTS5 is the fallback when the embedding endpoint is unreachable.


	sqlite-vec with BLOB columns in the same SQLite DB Crush already uses (ncruces/go-sqlite3).
