Patchwork is an agent-native version control system designed for concurrent, multi-agent development workflows. Unlike traditional VCS implementations built around human-centric branching models, Patchwork models all mutations as an append-only operation log, materializes file trees on demand through composable views, and gates integration through a speculative submit queue.
This document covers the storage model, core algorithms, data flow, and complexity characteristics in sufficient detail to reproduce the design or contribute to it.
```
                            +-----------+
                            |    CLI    |
                            +-----+-----+
                                  |
                            +-----v-----+
                            | HTTP API  |
                            +-----+-----+
                                  |
                            +-----v-----+
                            |  Server   |
                            +-----+-----+
                                  |
    +---------+---------+--------+---------+---------+----------+
    |         |         |        |         |         |          |
+---v---+ +---v---+ +---v---+ +--v----+ +--v----+ +--v----+ +---v-----+
| OpLog | | Blob  | |Change | | View  | | Event | | Queue | |Workspace|
|       | | Store | | Store | |Engine | |  Bus  | |       | |         |
+---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+-----+
    |         |         |         |         |         |         |
+---v---------v---------v---------v---------v---------v---------v-----+
|                          SQLite (WAL mode)                          |
+---------------------------------------------------------------------+
|                         Filesystem (blobs)                          |
+---------------------------------------------------------------------+
```
Dependency graph between packages:
```
types (no dependencies -- pure data definitions)
  ^
  |--- blob      (types)
  |--- log       (types)
  |--- events    (types)
  |--- diff      (no patchwork dependencies)
  |--- change    (types)
  |--- view      (log, blob, change, diff, types)
  |--- workspace (blob, events, types)
  |--- server    (log, blob, change, view, events, workspace, config, types)
```
The Server struct is the composition root:
```go
type Server struct {
	Root    string
	OpLog   *log.OpLog
	Blobs   *blob.Store
	Changes *change.Store
	Views   *view.Engine
	Events  *events.Bus
	Config  *config.Store

	db *sql.DB

	mu         sync.Mutex
	workspaces map[string]*workspace.Workspace

	queue   []*types.QueueEntry
	queueMu sync.Mutex
	Queue   *SubmitQueue

	SnapshotInterval uint64
	lastSnapshotSeq  uint64
}
```

Two SQLite databases are used: `patchwork.db` (changes, views, view cache) and `oplog.db` (operations, snapshots).
Both databases are opened with WAL journaling and a 5-second busy timeout:
```
dbPath + "?_journal_mode=WAL&_busy_timeout=5000"
```
WAL mode rationale. Write-Ahead Logging permits concurrent readers while a single writer holds the write lock. This is critical for Patchwork because view materialization (read-heavy) must not block on operation appends (write path), and multiple agents may be reading views simultaneously while a workspace scan appends operations.
```sql
-- oplog.db
CREATE TABLE IF NOT EXISTS operations (
    seq          INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp    TEXT NOT NULL,
    agent_id     TEXT NOT NULL,
    change_id    TEXT NOT NULL,
    version      INTEGER NOT NULL,
    op_type      TEXT NOT NULL,
    path         TEXT NOT NULL,
    content_hash TEXT,
    diff_hash    TEXT,
    old_path     TEXT,
    parent_seq   INTEGER NOT NULL,
    metadata     TEXT
);
CREATE INDEX IF NOT EXISTS idx_ops_change ON operations(change_id, version);
CREATE INDEX IF NOT EXISTS idx_ops_path ON operations(path);

CREATE TABLE IF NOT EXISTS snapshots (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    seq        INTEGER NOT NULL UNIQUE,
    tree       TEXT NOT NULL, -- JSON: { path: content_hash, ... }
    created_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_snapshots_seq ON snapshots(seq DESC);

-- patchwork.db
CREATE TABLE IF NOT EXISTS changes (
    id          TEXT PRIMARY KEY,
    title       TEXT NOT NULL,
    author      TEXT NOT NULL,
    target      TEXT NOT NULL DEFAULT 'main',
    status      TEXT NOT NULL DEFAULT 'draft',
    current_ver INTEGER NOT NULL DEFAULT 1,
    deps        TEXT, -- JSON array of change IDs
    metadata    TEXT, -- JSON object
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS change_versions (
    change_id   TEXT NOT NULL,
    version     INTEGER NOT NULL,
    base_seq    INTEGER NOT NULL,
    message     TEXT,
    created_at  TEXT NOT NULL,
    test_status TEXT,
    test_output TEXT,
    PRIMARY KEY (change_id, version),
    FOREIGN KEY (change_id) REFERENCES changes(id)
);

CREATE TABLE IF NOT EXISTS version_operations (
    change_id TEXT NOT NULL,
    version   INTEGER NOT NULL,
    op_seq    INTEGER NOT NULL,
    PRIMARY KEY (change_id, version, op_seq),
    FOREIGN KEY (change_id, version) REFERENCES change_versions(change_id, version)
);

CREATE TABLE IF NOT EXISTS views (
    id              TEXT PRIMARY KEY,
    owner           TEXT NOT NULL,
    base_type       TEXT NOT NULL DEFAULT 'main_at_latest',
    base_seq        INTEGER NOT NULL DEFAULT 0,
    conflict_policy TEXT NOT NULL DEFAULT 'warn',
    created_at      TEXT NOT NULL,
    updated_at      TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS view_overlays (
    view_id    TEXT NOT NULL,
    change_id  TEXT NOT NULL,
    version    INTEGER NOT NULL,
    pin        INTEGER NOT NULL DEFAULT 0,
    sort_order INTEGER NOT NULL,
    PRIMARY KEY (view_id, change_id),
    FOREIGN KEY (view_id) REFERENCES views(id)
);

CREATE TABLE IF NOT EXISTS view_cache (
    view_id      TEXT NOT NULL,
    path         TEXT NOT NULL,
    seq          INTEGER NOT NULL,
    content_hash TEXT NOT NULL,
    PRIMARY KEY (view_id, path)
);
```

The blob store is a content-addressed filesystem store using SHA-256. File contents and diffs are stored under a two-level directory fan-out:
```
.patchwork/blobs/{hash[0:2]}/{hash[2:4]}/{hash}
```

For example, a blob with hash `a1b2c3d4e5...` is stored at:

```
.patchwork/blobs/a1/b2/a1b2c3d4e5...
```
Implementation from blob/store.go:
```go
func (s *Store) path(hash string) string {
	if len(hash) < 4 {
		return filepath.Join(s.root, hash)
	}
	return filepath.Join(s.root, hash[0:2], hash[2:4], hash)
}
```

Write atomicity. Blobs are written under a `.tmp` suffix first, then atomically renamed into place. Duplicate writes are a no-op (checked via `os.Stat` before writing). The store is protected by a `sync.RWMutex`: writes take the exclusive lock, reads share the read lock.
```go
func (s *Store) Put(data []byte) (string, error) {
	hash := Hash(data)
	s.mu.Lock()
	defer s.mu.Unlock()
	p := s.path(hash)
	if _, err := os.Stat(p); err == nil {
		return hash, nil // already exists
	}
	// Elided in the original excerpt; sketch of the described behavior:
	// write to a .tmp sibling, then rename atomically into place.
	if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
		return "", err
	}
	tmp := p + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return "", err
	}
	return hash, os.Rename(tmp, p)
}
```

The OpLog is the central data structure: an append-only, totally ordered log of file-level operations.
```go
type Operation struct {
	ID          uint64         `json:"id"` // seq -- assigned by AUTOINCREMENT
	Timestamp   time.Time      `json:"timestamp"`
	AgentID     string         `json:"agent_id"`
	ChangeID    string         `json:"change_id"`
	Version     uint32         `json:"version"`
	OpType      OpType         `json:"op_type"` // create | modify | delete | rename | copy
	Path        string         `json:"path"`
	ContentHash string         `json:"content_hash,omitempty"`
	DiffHash    string         `json:"diff_hash,omitempty"`
	OldPath     string         `json:"old_path,omitempty"`
	ParentSeq   uint64         `json:"parent_seq"`
	Metadata    map[string]any `json:"metadata,omitempty"`
}
```

- Total ordering. `seq` is the `INTEGER PRIMARY KEY AUTOINCREMENT`. Every operation receives a unique, monotonically increasing sequence number. This is the global clock of the system.
- Append-only invariant. The `OpLog` exposes only `Append()`. There are no update or delete methods on operations. Once appended, an operation is immutable.
- Mutex-serialized writes. `OpLog.mu sync.Mutex` serializes all calls to `Append()`. Reads (`Get`, `GetRange`, `GetSince`, `GetByChange`, `GetByPath`) do not acquire the mutex and rely on SQLite WAL for snapshot isolation.
- Indexes. Two secondary indexes accelerate the common access patterns: `idx_ops_change(change_id, version)`, used by `GetByChange()` during overlay materialization, and `idx_ops_path(path)`, used by `GetByPath()` for per-file history.
| Method | SQL predicate | Use case |
|---|---|---|
| `Get(seq)` | `WHERE seq = ?` | Point lookup by sequence number |
| `GetRange(from, to)` | `WHERE seq >= ? AND seq <= ?` | Materialization, rebase conflict detection |
| `GetSince(afterSeq)` | `WHERE seq > ?` | Incremental updates |
| `GetByChange(changeID, version)` | `WHERE change_id = ? AND version = ?` | Overlay materialization |
| `GetByPath(path)` | `WHERE path = ?` | File history |
| `Head()` | `SELECT MAX(seq)` | Current log tip |
A View is a composable projection of the codebase. It consists of a base (the "main" timeline up to some sequence number) plus zero or more overlays (pending changes layered on top).
```go
type View struct {
	ID             string         `json:"id"`
	Owner          string         `json:"owner"`
	Base           ViewBase       `json:"base"`
	Overlays       []Overlay      `json:"overlays"`
	ConflictPolicy ConflictPolicy `json:"conflict_policy"`
}

type ViewBase struct {
	Type ViewBaseType `json:"type"` // "main_at_seq" | "main_at_latest"
	Seq  uint64       `json:"seq,omitempty"`
}

type Overlay struct {
	ChangeID string `json:"change_id"`
	Version  uint32 `json:"version"`
	Pin      bool   `json:"pin"`
}
```

materialize(view) -> FileTree:
```
// Step 1: Resolve the base sequence number.
if view.base.type == "main_at_latest":
    base_seq = oplog.Head()
else:
    base_seq = view.base.seq

// Step 2: Build the base tree from landed operations.
tree = {}  // map[path] -> content_hash
start_seq = 1
snapshot = oplog.GetLatestSnapshot(before=base_seq)
if snapshot != nil:
    tree = snapshot.tree  // O(f) files
    start_seq = snapshot.seq + 1
if start_seq <= base_seq:
    ops = oplog.GetRange(start_seq, base_seq)
    for op in ops:
        change = changes.Get(op.change_id)
        if change.status != "landed":
            continue  // skip non-landed operations
        apply_op(tree, op)

// Step 3: Layer overlays in order.
conflicts = []
for overlay in view.overlays:
    ops = oplog.GetByChange(overlay.change_id, overlay.version)
    for op in ops:
        if op.op_type == "modify" and op.path in tree:
            conflict = detect_conflict(op, tree[op.path], overlay, blobs)
            if conflict != nil:
                if conflict.type == "clean_merge":
                    tree[op.path] = conflict.merged_hash
                    continue
                switch view.conflict_policy:
                case "warn":
                    conflicts.append(conflict)
                    tree[op.path] = conflict.merged_hash  // with markers
                    continue
                case "last_write":
                    // fall through to apply_op
                case "manual":
                    return error("conflict requires manual resolution")
        apply_op(tree, op)

return FileTree{entries: tree, seq: base_seq, conflicts: conflicts}
```
The function that mutates the in-memory tree:
```go
func applyOp(tree map[string]string, op *types.Operation, blobs *blob.Store) {
	switch op.OpType {
	case OpFileCreate:
		tree[op.Path] = op.ContentHash
	case OpFileModify:
		if op.ContentHash != "" {
			tree[op.Path] = op.ContentHash // full replacement
		} else if op.DiffHash != "" {
			// Diff-based modify: fetch the stored diff and derive the new
			// content hash. (Shown simplified; error handling and the actual
			// diff application are elided in this excerpt.)
			diffData, _ := blobs.Get(op.DiffHash)
			resultHash, _ := blobs.Put(diffData)
			tree[op.Path] = resultHash
		}
	case OpFileDelete:
		delete(tree, op.Path)
	case OpFileRename:
		if hash, ok := tree[op.OldPath]; ok {
			tree[op.Path] = hash
			delete(tree, op.OldPath)
		}
	case OpFileCopy:
		if hash, ok := tree[op.OldPath]; ok {
			tree[op.Path] = hash
		}
	}
}
```

For persisted views without overlays, the engine uses a SQLite-backed cache (the `view_cache` table) to avoid full replay:

- Look up the cached `(path, content_hash, seq)` entries for this view.
- If `cachedSeq >= baseSeq`, return the cached tree directly.
- Otherwise, replay only the operations in `(cachedSeq, baseSeq]` against the cached tree.
- Store the updated tree back to the cache.

Views with overlays always do a full materialization because overlay operations can change between calls (new versions, rebase).
The merge algorithm lives in diff/merge.go and is invoked during overlay materialization when an overlay modifies a file that was also modified in the base tree.
The three versions for the merge:

- base: the content the overlay was originally based on (stored in `op.Metadata["base_hash"]`)
- ours: the current tree content (what has landed on main)
- theirs: the overlay's new content
```
Merge3(base, ours, theirs, oursLabel, theirsLabel) -> MergeResult:
    baseLines = splitLines(base)
    oursLines = splitLines(ours)
    theirsLines = splitLines(theirs)

    // Step 1: Compute line-level changes from base to each side.
    oursChanges = computeLineChanges(baseLines, oursLines)
    theirsChanges = computeLineChanges(baseLines, theirsLines)

    // Step 2: Walk both change streams, detect overlaps.
    // Each lineChange has { baseStart, baseEnd, newLines }.
    result = []
    while changes remain:
        pick the change with the smallest baseStart
        if no overlap with the other side:
            apply cleanly (copy base lines up to change, insert newLines)
        else:  // overlapping regions
            if both sides produced identical newLines:
                apply once (convergent edit)
            else:
                emit conflict markers:
                    <<<<<<< oursLabel
                    <ours content>
                    =======
                    <theirs content>
                    >>>>>>> theirsLabel
    return MergeResult{ content, clean, conflicts }
```
Both Compute() (unified diff) and computeLineChanges() (for 3-way merge) use the same LCS algorithm:
```go
func lcsEditScript(a, b []string) []edit {
	n, m := len(a), len(b)
	// Build LCS table: dp[i][j] = length of LCS of a[0..i-1] and b[0..j-1]
	dp := make([][]int, n+1)
	for i := range dp {
		dp[i] = make([]int, m+1)
	}
	for i := 1; i <= n; i++ {
		for j := 1; j <= m; j++ {
			if a[i-1] == b[j-1] {
				dp[i][j] = dp[i-1][j-1] + 1
			} else if dp[i-1][j] >= dp[i][j-1] {
				dp[i][j] = dp[i-1][j]
			} else {
				dp[i][j] = dp[i][j-1]
			}
		}
	}
	// Backtrack from dp[n][m] to produce the edit script (editEqual,
	// editInsert, editRemove): matching lines emit editEqual; otherwise
	// follow the larger neighbor, emitting editRemove (move up in the
	// table) or editInsert (move left). The script is built back-to-front
	// and reversed at the end. (Elided here.)
	// ...
}
```

The LCS table is O(n * m) in both time and space, where n and m are the line counts of the two inputs.
The edit script is converted to a list of lineChange structs, each representing a contiguous modified region:
```go
type lineChange struct {
	baseStart int      // start index in base (inclusive)
	baseEnd   int      // end index in base (exclusive)
	newLines  []string // replacement lines
}
```

Contiguous runs of `editRemove` and `editInsert` are collapsed into a single `lineChange`. A run of `editEqual` terminates the current change.
Two changes overlap when each begins before the other ends: `ours.baseStart < theirs.baseEnd && theirs.baseStart < ours.baseEnd`. When overlap is detected:

- Compute `overlapStart = min(ours.baseStart, theirs.baseStart)` and `overlapEnd = max(ours.baseEnd, theirs.baseEnd)`.
- Copy base lines up to `overlapStart`.
- Compare `oursContent` and `theirsContent`. If identical, apply once (convergent edit). Otherwise, emit conflict markers.
- Advance `baseIdx` to `overlapEnd`.
```
<<<<<<< landed
<content from the landed/main side>
=======
<content from the overlay/change side>
>>>>>>> change-id
```
If no base_hash metadata is available on the operation, the engine falls back to a simple hash comparison: if the existing tree hash differs from the overlay's content hash, a conflict event is emitted without attempting a merge.
The submit queue gates the integration of changes into main. It is implemented in server/queue.go.
```go
type QueueConfig struct {
	TestCmd        string        // e.g., "go test ./..."
	TestDir        string        // working directory for tests
	MaxSpeculative int           // max entries to test in parallel (1 = linear)
	PollInterval   time.Duration // default: 2s
}
```

Linear mode (`MaxSpeculative == 1`): one entry at a time. Dequeue the first pending entry, run tests, land or reject.
Speculative mode (`MaxSpeculative > 1`, Zuul-style): test up to `MaxSpeculative` entries in parallel. Each entry is tested against a stacked view that assumes all preceding entries will pass.
```go
func (q *SubmitQueue) loop() {
	ticker := time.NewTicker(q.pollInterval)
	for {
		select {
		case <-q.stopCh:
			return
		case <-ticker.C:
			q.processQueue()
		}
	}
}
```

`processQueue()` executes each poll cycle:
```
processQueue():
    entries = first N pending entries (N = maxSpeculative)
    if len(entries) == 0:
        return
    mark all as "running"
    if len(entries) == 1:
        processLinear(entry)         // test and handle result
    else:
        processSpeculative(entries)
```
For speculative mode, each entry N is tested against a view containing:

```
base     = main@latest
overlays = [entries[0], entries[1], ..., entries[N]]
```

This is implemented in `runTest()`:
```go
func (q *SubmitQueue) runTest(entry *QueueEntry, preceding []*QueueEntry) (bool, string) {
	var overlays []types.Overlay
	for _, prev := range preceding {
		overlays = append(overlays, types.Overlay{
			ChangeID: prev.ChangeID,
			Version:  prev.Version,
		})
	}
	// ch is the change record fetched for entry.ChangeID
	// (the lookup is elided from this excerpt).
	overlays = append(overlays, types.Overlay{
		ChangeID: entry.ChangeID,
		Version:  ch.CurrentVersion,
	})
	v := &types.View{
		ID:             "queue-test-" + entry.ChangeID,
		Owner:          "submit-queue",
		Base:           types.ViewBase{Type: types.ViewBaseMainAtLatest},
		Overlays:       overlays,
		ConflictPolicy: types.ConflictPolicyLastWrite,
	}
	ft, _ := q.server.Views.MaterializeView(v)
	// Write ft to testDir, execute testCmd via sh -c
	// 5-minute timeout via context.WithTimeout
}
```

Results are processed in order:
```
processSpeculative(entries, indices):
    results = parallel_test(entries)  // WaitGroup
    for i, result in results:
        if result.passed:
            land(entries[i])
            remove from queue
        else:
            reject(entries[i])
            remove from queue
            // Cascade: reset all subsequent entries to "pending"
            for j = i+1 to len(entries):
                entries[j].test_status = "pending"
            break  // stop processing; subsequent entries will be retested
```
When an entry fails, all entries that were tested speculatively after it are reset to pending. They assumed the failed entry would pass, so their test results are invalid. On the next poll cycle, they will be retested with a corrected stacked view.
On pass (land):

- `change.status` is updated to `landed`.
- A `change_landed` event is published.
- A `test_result` (passed) event is published.
- The entry is removed from the queue; positions are recomputed.

On failure (reject):

- `change.status` is reverted to `draft`.
- A `test_result` (failed) event is published.
- The entry is removed from the queue; positions are recomputed.
The event bus (events/bus.go) provides in-process pub/sub with typed subscriptions.
```go
type Handler func(event types.Event)

type Subscription struct {
	id      uint64
	types   []types.EventType // empty = all events
	handler Handler
}

type Bus struct {
	mu     sync.RWMutex
	subs   map[uint64]*Subscription
	nextID uint64
}
```

Synchronous: `Publish(event)` copies the subscriber list under a read lock, then invokes each matching handler sequentially in the caller's goroutine. The lock is released before handler invocation to avoid deadlocks.
```go
func (b *Bus) Publish(event types.Event) {
	b.mu.RLock()
	var handlers []Handler
	for _, sub := range b.subs {
		if matchesTypes(sub.types, event.Type) {
			handlers = append(handlers, sub.handler)
		}
	}
	b.mu.RUnlock()
	for _, h := range handlers {
		h(event)
	}
}
```

Asynchronous: `PublishAsync(event)` launches each handler in its own goroutine:
```go
for _, h := range handlers {
	go h(event)
}
```

`PublishAsync` is used for all events published from the server's write path (`AppendOp`, `UpdateChangeStatus`, `handleResult`) to avoid blocking the critical section.
Bus.Channel() returns a buffered channel and a cancel function. Events are delivered via a non-blocking send; if the channel buffer is full, events are dropped (the subscriber is too slow).
```go
func (b *Bus) Channel(eventTypes []types.EventType, bufSize int) (<-chan types.Event, func()) {
	ch := make(chan types.Event, bufSize)
	id := b.Subscribe(eventTypes, func(e types.Event) {
		select {
		case ch <- e:
		default: // drop if buffer full
		}
	})
	cancel := func() {
		b.Unsubscribe(id)
		close(ch)
	}
	return ch, cancel
}
```

This is the primitive used for SSE bridges in the HTTP API: a handler calls `Channel()`, then reads from the channel in a loop, writing each event as an SSE frame.
| Event | Published when | Key data fields |
|---|---|---|
| `op_appended` | Operation appended to log | `seq`, `change_id`, `path`, `op_type`, `agent_id` |
| `change_landed` | Change status set to landed | `change_id`, `version` |
| `conflict_detected` | 3-way merge detects conflict | `path`, `your_change`, `your_version`, `conflicting_op` |
| `test_result` | Queue test completes | `change_id`, `version`, `status`, `output` |
| `queue_position` | Queue position changes | (position data) |
A workspace is a local directory backed by a Patchwork view. It bridges the gap between the operation log (abstract) and the filesystem (concrete).
```go
type Workspace struct {
	ID, Dir, ChangeID, ViewID, AgentID string

	opAppender OpAppender
	viewReader ViewReader
	blobs      *blob.Store
	eventBus   *events.Bus

	mu         sync.Mutex
	watching   bool
	stopCh     chan struct{}
	knownFiles map[string]string // path -> content_hash
	version    uint32
	autoSyncID uint64
}
```

`knownFiles` is the workspace's understanding of what is on disk. It is the baseline against which changes are detected.
This is the filesystem-to-oplog bridge. It runs on a polling interval or can be called explicitly.
```
ScanAndCapture():
    head = oplog.Head()
    diskFiles = {}
    ops = []
    walk(workspace.dir):
        skip hidden files/dirs (prefix ".")
        for each file:
            relPath = relative(workspace.dir, file)
            diskFiles[relPath] = true
            hash = SHA-256(file.content)
            if relPath not in knownFiles:
                // New file
                contentHash = blobs.Put(file.content)
                op = Operation{
                    type: "create", path: relPath,
                    content_hash: contentHash, parent_seq: head
                }
                ops.append(opAppender.AppendOp(op))
                knownFiles[relPath] = hash
            else if knownFiles[relPath] != hash:
                // Modified file
                contentHash = blobs.Put(file.content)
                op = Operation{
                    type: "modify", path: relPath,
                    content_hash: contentHash, parent_seq: head
                }
                ops.append(opAppender.AppendOp(op))
                knownFiles[relPath] = hash
    // Detect deletions
    for path in knownFiles:
        if path not in diskFiles:
            op = Operation{
                type: "delete", path: path, parent_seq: head
            }
            ops.append(opAppender.AppendOp(op))
            delete(knownFiles, path)
    return ops
```
Sync() materializes the workspace's view and writes files to disk:
```
Sync():
    ft = viewReader.Materialize(viewID)
    viewFiles = {}
    for entry in ft.entries:
        viewFiles[entry.path] = true
        if knownFiles[entry.path] == entry.content_hash:
            continue  // already up to date
        content = blobs.Get(entry.content_hash)
        writeFile(workspace.dir / entry.path, content)
        knownFiles[entry.path] = entry.content_hash
    // Remove files no longer in the view
    for path in knownFiles:
        if path not in viewFiles:
            removeFile(workspace.dir / path)
            delete(knownFiles, path)
```
EnableAutoSync() subscribes to change_landed events on the event bus. When another change lands (not the workspace's own change), the workspace calls Sync() to re-materialize its view, pulling in the latest main state.
```go
ws.autoSyncID = ws.eventBus.Subscribe(
	[]types.EventType{types.EventChangeLanded},
	func(event types.Event) {
		if changeID, ok := event.Data["change_id"].(string); ok && changeID == ws.ChangeID {
			return // don't sync on our own landing
		}
		ws.Sync()
	},
)
```

Without snapshots, materializing a view requires replaying every landed operation from `seq=1`. Snapshots are periodic captures of the full file-tree state at a given sequence number.
Creation trigger. After a change lands, Server.maybeSnapshot() checks if head - lastSnapshotSeq >= SnapshotInterval (default: 100 operations). If so, it materializes the current main tree and saves it:
```go
func (s *Server) maybeSnapshot() {
	head, _ := s.OpLog.Head()
	if head-s.lastSnapshotSeq < s.SnapshotInterval {
		return
	}
	// Materialize main, save snapshot
	s.OpLog.CreateSnapshot(head, tree)
	s.lastSnapshotSeq = head
	s.OpLog.PruneSnapshots(5) // keep only the 5 most recent
}
```

Usage during materialization. The view engine calls `GetLatestSnapshot(beforeSeq)` to find the most recent snapshot at or before the target sequence. Replay begins from `snapshot.seq + 1` instead of `seq=1`.
```go
snap, err := e.oplog.GetLatestSnapshot(baseSeq)
if snap != nil {
	tree = snap.Tree
	startSeq = snap.Seq + 1
}
```

Pruning. `PruneSnapshots(keep)` deletes all but the `keep` most recent snapshots:
```sql
DELETE FROM snapshots WHERE id NOT IN (
    SELECT id FROM snapshots ORDER BY seq DESC LIMIT ?
)
```

The `view_cache` table stores `(view_id, path, content_hash, seq)` tuples. For base-only views (no overlays):
- If `cachedSeq >= baseSeq`, return the cached tree (zero replay).
- Otherwise, replay only `(cachedSeq, baseSeq]` against the cached tree (incremental replay).
- Invalidate and repopulate the cache with the new state.

Cache invalidation occurs on:

- `AddOverlay` or `RemoveOverlay` on a view (full view invalidation)
- `DeleteView` (full view invalidation)
- An explicit `InvalidateCache(viewID, path)` call
The system keeps N=5 most recent snapshots. With a default interval of 100 operations, this means the worst-case replay for a base-only view is at most 100 operations (from the latest snapshot to head). The 5-snapshot window provides recovery points if the latest snapshot is somehow stale.
Every operation has a unique, monotonically increasing sequence number assigned by SQLite's AUTOINCREMENT. This is the system's logical clock. All queries that reconstruct state (GetRange, GetSince) rely on this ordering.
Replaying the same sequence of operations against an empty tree (or a snapshot) always produces the same FileTree. Operations are deterministic: create sets, modify replaces, delete removes, rename moves, copy copies. There is no operation that depends on external state beyond the tree itself and the blob store.
ParentSeq on each operation tracks the causal dependency -- the log head at the time the operation was created. This is used by:
- Rebase: when `latestVersion.BaseSeq < head`, the change's operations are replayed with `ParentSeq` updated to the current head.
- Conflict detection: operations landed between `BaseSeq+1` and `head` are checked for path overlap with the change's operations.
The submit queue ensures only tested changes land. The flow is enforced by state machine transitions:
```
draft -> ready -> testing -> landed
                        \-> draft (on failure)
      -> abandoned
```
No change can reach landed without passing through the queue (or an explicit LandChange bypass intended for testing).
Multi-row mutations use SQLite transactions:
- `CreateView`: inserts into `views` and `view_overlays` in a single transaction.
- `AddVersion`: inserts into `change_versions` and `version_operations`, then updates `changes.current_ver`, all in a single transaction.
- `DeleteView`: deletes from `view_cache`, `view_overlays`, and `views` in a single transaction.
| Operation | Time Complexity | Space Complexity |
|---|---|---|
| Append operation | O(1) amortized (single INSERT) | O(1) |
| Get operation by seq | O(1) (primary key lookup) | O(1) |
| Materialize view (from snapshot) | O(n) ops since snapshot + O(m) overlay ops | O(f) files in tree |
| Materialize view (no snapshot) | O(N) all landed ops + O(m) overlay ops | O(f) files in tree |
| Materialize view (cache hit, no overlays) | O(1) if cache is current; O(delta) if stale | O(f) files in tree |
| 3-way merge (LCS) | O(n * m) where n, m = line counts | O(n * m) for DP table |
| Blob store put | O(content size) for hash + write | O(content size) |
| Blob store get | O(content size) for read | O(content size) |
| Blob store has | O(1) (stat syscall) | O(1) |
| Submit queue process (linear) | O(1) entries * O(T) test time | O(f) files for test dir |
| Submit queue process (speculative) | O(k) entries * O(T) test time (parallel) | O(k * f) for k test dirs |
| ScanAndCapture | O(d) files on disk * O(file size) for hashing | O(d) for diskFiles map |
| GetDependencyChain | O(V + E) graph traversal | O(V) visited set |
| CreateSnapshot | O(f) files to serialize | O(f) JSON blob |
| PruneSnapshots | O(S) total snapshots | O(1) |
Where:
- `n` = operations since the last snapshot (or since the beginning if no snapshot)
- `N` = total operations in the log
- `m` = operations in the overlay(s)
- `f` = number of files in the materialized tree
- `d` = number of files in the workspace directory
- `k` = number of speculative queue entries
- `T` = wall-clock time to run the test command
- `V` = number of changes in the dependency graph; `E` = number of dependency edges
- `S` = total number of snapshots
The complete pipeline from an agent editing a file to that change being integrated into main:
```
Agent edits file        Workspace detects        Blob store
in workspace dir        change (polling)         (SHA-256)
      |                        |                      |
      v                        v                      v
1. write file  ----->  2. ScanAndCapture()  --->  3. blobs.Put(content)
                               |                     returns hash
                               v
                       4. OpLog.Append(op)
                          {type: "modify", path: "...", content_hash: hash}
                               |
                               v
                       5. EventBus.PublishAsync(op_appended)
                               |
                               v
                       6. Agent calls AddVersion(changeID, baseSeq, msg, opSeqs)
                          -- snapshots the operations into a version
                               |
                               v
                       7. Agent calls Submit(changeID)
                          -- change.status -> "testing"
                          -- QueueEntry{changeID, version, pending} added
                               |
                               v
                       8. Queue.processQueue() on next tick:
                          a. Build stacked view: main@latest + [preceding entries] + this entry
                          b. MaterializeView -> FileTree
                          c. Write files to temp dir
                          d. exec("sh", "-c", testCmd) with 5-min timeout
                               |
                               v
                       9. On pass:
                          a. change.status -> "landed"
                          b. Entry removed from queue
                          c. maybeSnapshot() -- create snapshot if interval elapsed
                               |
                               v
                      10. EventBus.PublishAsync(change_landed)
                          -> Other workspaces with auto-sync call Sync()
                          -> Their views re-materialize with the new main state
```
1. Agent edits file in workspace directory. The agent (or any process) writes to a file under the workspace's directory.
2. Workspace scan detects change. If `StartWatching()` is active, a periodic ticker calls `ScanAndCapture()`. The function walks the directory, computes SHA-256 hashes, and compares them to `knownFiles`. New files produce `create` operations, modified files produce `modify` operations, missing files produce `delete` operations.
3. Content stored in blob store. File contents are written to the content-addressed blob store. The SHA-256 hash serves as both the address and the deduplication key. If the blob already exists, the write is a no-op.
4. Operation appended to log. The operation (with `content_hash`, `path`, `parent_seq`, `change_id`, `agent_id`) is validated by `ValidateOp()` (relative paths only, no directory traversal, required fields per op type), then appended to the log with a mutex-serialized `INSERT`. The operation receives its globally unique sequence number.
5. Event published. An `op_appended` event is published asynchronously. Any subscriber (other workspaces, UI, SSE clients) receives notification.
6. Agent creates a version. `AddVersion()` groups a set of operation sequence numbers into a named version of the change. This is recorded in the `change_versions` and `version_operations` tables within a single transaction. The change's `current_ver` is incremented atomically.
7. Agent submits change to queue. `Submit()` validates the change status (must be `draft` or `ready`), checks that all dependencies are landed, transitions the status to `testing`, and appends a `QueueEntry` to the queue.
8. Queue creates stacked view and runs tests. On the next poll cycle, the queue builds a `View` with `base = main@latest` and overlays for all preceding entries plus this entry. It materializes this view to a `FileTree`, writes all files to a temporary directory, and executes the test command via `sh -c` with a 5-minute timeout.
9. On pass: change lands. The change status transitions to `landed`. Its operations are now part of the "main" timeline; future view materializations with `main_at_latest` will include them. `maybeSnapshot()` checks whether the snapshot interval has elapsed and, if so, takes a snapshot to accelerate future materializations.
10. Event published; workspaces sync. The `change_landed` event triggers auto-sync in other workspaces. Each workspace re-materializes its view (which now includes the newly landed operations in its base) and writes the updated files to disk.