Skip to content

feat: git-like revision model — commit chain, server-reconstructed delta, agent read-back (ADR 0088/0089/0090)#529

Merged
isuttell merged 14 commits into
mainfrom
worktree-idempotent-juggling-music
Jun 15, 2026
Merged

feat: git-like revision model — commit chain, server-reconstructed delta, agent read-back (ADR 0088/0089/0090)#529
isuttell merged 14 commits into
mainfrom
worktree-idempotent-juggling-music

Conversation

@isuttell

@isuttell isuttell commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Git-like revision model (Stages 1–5) — ADR 0088 / 0089 / 0090

Makes Revisions behave like Git commits so an agent can express "change this file"
instead of re-uploading the whole tree, and lets an agent without a working directory
read a stored file back to produce a correct diff. This is the foundation the upcoming
multi-edit (edit / multi_edit) engine builds on.

What's in the stack (8 commits)

  • Stages 1–2 — revision commit chain (revisions.parent_revision_id) + partial-manifest
    upload contract (base_revision_id, deleted_paths); unlisted paths inherit the parent
    tree by reference.
  • Stage 3 — tree inheritance materialized at finalize (api-side merge).
  • Stage 4 — synchronous intra-file patch reconstruction at finalize in upload: a changed
    file may be sent as a verified unified diff (whole-blob fallback for binary); a patch that
    can't apply fails the same publish call with an agent-visible patch_conflict (422) — a broken
    patch never becomes a servable draft.
  • Stage 5 — agent file read-back: optional sha256 on AgentViewFile, a member-authed
    GET /v1/artifacts/{id}/file-content route (api decrypts member plaintext; oversize/binary
    return metadata only), an MCP read_file tool, the CLI diff client (per-artifact manifest cache
    • working-dir diff → partial manifest with verified diffs; stale-base → full-publish fallback),
      and an agent-paste pull verb.

Boundary note (Stage 5)

api now decrypts artifact bytes and returns plaintext to the owning Member only. This does
not widen the ADR 0063 confidentiality boundary — the Member already owns the artifact and can
fetch the same bytes via the signed content URL; encryption defends the platform tier, not the
owner. The blob key is derived from the RLS-scoped row's sha256 + the actor's workspace, never
client input. content is untouched.

ADR numbering

This branch was rebased onto latest main after #528 merged ADR 0087 = public-artifacts.
The revision stack was renumbered up one to avoid the collision and main's 0087 index row was
restored:

ADR Topic
0088 workspace-scoped content-addressed blob dedup
0089 revision commit chain + tree inheritance + server-reconstructed delta
0090 agent file read-back (api decrypts member plaintext)

All in-tree references (code, migrations, specs, CI, CONTEXT.md) use the correct 0088/0089/0090.
Some mid-stack commit subjects still read "ADR 0087 stage N" — historical message text only;
the content is correct. Not rewritten to avoid a cosmetic full-stack history rewrite.

Verification

  • pnpm verify
  • pnpm smoke:local:patch ✅ — base → patch → read-back → diff-from-read-back reconstructs byte-exact; conflict path returns patch_conflict (422)
  • pnpm test:coverage ✅ — 90.75% stmts / 82.39% branches / 91.9% funcs / 90.95% lines (above 88/82/88/88 floors)

🤖 Generated with Claude Code


Note

High Risk
Touches authenticated decrypt in api, upload finalize reconstruction, and publish/read contracts—core artifact integrity and member data paths—with broad CLI/MCP surface area; mitigated by tests, strict MCP contracts, and new smoke gates.

Overview
Delivers ADR 0088–0090: workspace blob dedup is documented retroactively; revisions gain a parent chain and upload sessions accept base_revision_id, deleted_paths, and per-file patches, with tree merge and synchronous decrypt→apply diff→verify→re-encrypt at finalize in upload (patch_conflict on failure).

api adds artifacts.fileContent (member auth, R2 get, decrypt via @agent-paste/storage, 10 MiB inline cap, 503 on crypto/storage errors). MCP exposes read_file. CLI adds pull, a per-artifact manifest cache, working-dir revise (verified unified diffs + stale-base full republish), and refactors help/publish formatting. Finalize surfaces base-unusable repository kinds as invalid_request with the kind in the message for client self-heal.

CI runs pnpm smoke:local:patch and hosted smoke-local-patch.mjs on PR preview. Specs, CONTEXT.md, and ADRs 0088–0091 (0091 planned only) are updated.

Reviewed by Cursor Bugbot for commit e5be3b6. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • New Features

    • Added pull CLI command to read artifact file contents.
    • Added read_file MCP tool for agents to retrieve stored files.
    • CLI now caches per-artifact manifests to publish only changed files via unified diffs, reducing bandwidth on subsequent edits.
    • Revisions now support parent-child relationships for inherited file trees.
  • Documentation

    • Added ADRs for blob deduplication, revision inheritance, and file read-back APIs.

isuttell and others added 8 commits June 14, 2026 18:22
…tructed delta)

ADR 0086 retroactively captures the shipped workspace-scoped content-addressed
blob dedup (no prior ADR existed; it was only in data-model.md/api.md + commit
dea091f). ADR 0087 decides the next step: revisions.parent_revision_id + tree
inheritance (partial-manifest publish, unlisted paths inherit the parent tree by
reference) so an agent can express "change this file" instead of the whole tree,
plus server-reconstructed intra-file delta (unified diff for text, whole-blob
fallback for binary).

Reconstruction runs in jobs, not upload: upload is write-only against R2 today
(sole op ARTIFACTS.put), while jobs already does the
read-decrypt-transform-reencrypt-write shape for Bundle generation. This keeps
content and the ADR 0063 encryption boundary untouched. Chunk stores, per-block
AEAD, Range serving, global dedup, and dropping encryption are deferred.

Spec/CONTEXT edits land with the implementation, per the spec-is-source-of-truth
rule. Staged plan in docs/ops/git-like-revisions-todo.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ad contract (ADR 0087 stages 1-2)

Stage 1 (schema): revisions.parent_revision_id nullable column with a deferrable
composite self-FK on (workspace_id, artifact_id, parent_revision_id) ->
revisions(workspace_id, artifact_id, id), ON DELETE SET NULL (parent_revision_id),
plus revisions_parent_idx. The composite target structurally pins a parent to the
same Workspace and Artifact; the column-list SET NULL nulls only the pointer
(plain SET NULL would violate the NOT NULL workspace_id/artifact_id). Deferrable
because claim-reparent bulk-rewrites workspace_id across all revisions inside
deferred constraints. Threaded through the Revision type, insert mapper, and
mapRevision; draft creation writes NULL (Stage 3 populates from base_revision_id).
Migration 0024 is idempotent (journal-less runner) and verified on PGlite +
snapshot regenerated.

Stage 2 (contract): CreateUploadSessionRequest gains optional base_revision_id,
deleted_paths, and a per-file patch descriptor {base_sha256, format:"unified",
result_sha256}. A superRefine enforces the structural rules (patch/deleted_paths
require base_revision_id; deleted_paths unique; a path cannot be both uploaded and
deleted; format must be unified). Stateful checks and the tree-inheritance merge /
diff reconstruction are deferred to Stages 3-4. OpenAPI golden regenerated; round-
trip tests added.

Also fixes a pre-existing Sha256Hex /u-flag leak that serialized an invalid
"^...$/u" pattern into the published upload OpenAPI (now clean in all 6 spots),
and folds the ADR 0087 spec-source-of-truth updates into data-model.md (column +
index), api.md (request fields + rules), and CONTEXT.md (Revision parent
relationship).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…(ADR 0087 stage 3)

Publishing against base_revision_id now inherits the base Revision's unchanged
blob-backed files instead of re-uploading them: a one-file change yields a full
artifact_files tree with one new blob and parent_revision_id set. The merge runs
at finalize (mergeBaseRevisionTree), recomputes file_count/size_bytes from the
merged tree, and re-runs validateUpload (caps + entrypoint) against it.

Stateful validation deferred from the Stage 2 contract now lands server-side
(6 new repo error codes -> invalid_request): published-base-only, same workspace
(base_revision_not_found) and same artifact (base_revision_artifact_mismatch,
fired before the composite parent FK would 500), deleted-path-in-base,
patch base_sha256 match, and blob-backed-only inheritance (a revision-scoped base
path is not refcount-protected, so it is rejected rather than dangled).

Patch descriptors (patch_base_sha256/patch_result_sha256) are recorded and
validated on upload_session_files with the diff uploaded as a revision object
(sha256 omitted from the signed PUT), but finalize fails loud
(patch_reconstruction_unavailable) until jobs reconstruction lands in Stage 4 -
otherwise the raw diff bytes would be served as the file. A file may not declare
both a whole-file sha256 and a patch.

Carriers: upload_sessions.base_revision_id + deleted_paths, the two patch columns
on upload_session_files (migration 0025, idempotent). Specs updated alongside.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…at finalize (ADR 0087 stage 4)

Apply an agent-uploaded unified diff to the base blob synchronously at finalize,
in the upload worker, before the new Revision commits. A patch that cannot apply
fails the finalize call with patch_conflict (HTTP 422, "patch_conflict: <path>:
<reason>") so the agent re-submits a corrected diff; a broken patch never becomes
a servable Revision. The reconstructed result is an ordinary content-addressed
blob, so content/bundles/GC are unchanged and no migration is needed.

- packages/storage: hand-rolled byte-exact unified-diff applier (no diff dep:
  jsdiff's UTF-16 round-trip breaks the raw-byte result_sha256 check) +
  workspace-blob read/write helpers (blob AAD) and a revision-file read helper.
- packages/db: RevisionReconstructor adapter (RepositoryOptions, wired in
  createPostgresRuntime + local MVP harness), invoked from mergeBaseRevisionTree
  before any DB write; result blob + content_blobs row commit with the draft;
  caps run on the reconstructed result size; removes the Stage 3
  patch_reconstruction_unavailable gate; conflict -> patch_conflict, infra
  failures -> storage_unavailable.
- contracts/worker-runtime: new patch_conflict ErrorCode (422), MCP status map
  + publishChain, route errors, openapi goldens; also maps the previously
  500-falling-back finalize codes (caps, expired, incomplete) for MCP.
- upload: widen R2 binding with get; surface the conflict path+reason as the
  error message.
- scripts: smoke-local-patch.mjs drives the real reconstruction path (local +
  preview). Verified byte-exact serve + patch_conflict on hosted preview.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… (ADR 0087 stage 4)

Stage 4's unit/integration tests use a fake reconstructor, so the only checks that
exercise the real decrypt -> apply diff -> hash-verify -> re-encrypt -> serve path
are the smoke. Wire it into both gates:

- ci.yml: `pnpm smoke:local:patch` after the existing local smoke (in-memory MVP,
  every PR via Validate).
- pr-preview.yml: `node scripts/smoke-local-patch.mjs pr` against the deployed PR
  preview Workers, reusing the per-PR deploy outputs + harness secret.

smoke-local-patch.mjs now supports local/preview/pr targets with env resolution
mirroring scripts/smoke-hosted.mjs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Merged PR #525 claimed ADR 0086 for "publish is content-only, private-first".
This branch's earlier work also created an 0086 (workspace-scoped blob dedup)
and an 0087 (revision commit chain + reconstructed delta). Renumber this
branch's pair to 0087 (blob dedup) and 0088 (revision delta) so 0086 stays the
merged publish-privacy ADR, and sweep every cross-reference (ADR bodies, specs,
migrations, code comments, CI smoke steps) to the new numbers.

Reference-only: no schema, contract, or logic change. Full suite green
(typecheck 39/39, test 39/39, openapi:check).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…evise diff client (ADR 0089)

Stage 5 of the git-like revision model: give an agent without a working copy a
way to read back exactly what's stored so it can produce a correct unified-diff
revise, and make the CLI send only what changed.

- sha256 on Agent View file entries (optional, non-breaking add)
- new member-authed GET /v1/artifacts/{id}/file-content in the api worker: it
  decrypts the owning member's plaintext and returns
  { path, sha256, size_bytes, content_type, is_binary, body? }. Oversize text and
  binary files omit the body; oversize short-circuits before touching R2. Any
  decrypt/storage failure maps to storage_unavailable (503), never a 500.
- MCP read_file tool forwarding to it; api-client artifacts.readFile()
- CLI: per-artifact manifest cache (0600), a byte-exact unified-diff generator
  that self-checks (re-applies its own diff and verifies the digest before
  attaching it; a generator bug degrades to a whole-blob upload, never a finalize
  conflict), incremental revise wiring, a `pull` verb, and single-shot
  full-republish fallback when the cached base is unusable
- finalize now carries the precise base-* repository kind as the error detail so
  the CLI self-heal fires for all base-unusable conditions, not just patch
  conflicts (the 5 base-* kinds collapse to invalid_request on the wire)

ADR 0089 records the api-decrypts-member-plaintext trust-boundary decision.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ublic-artifacts)

Main merged ADR 0087 (public-artifacts-and-unlisted-share-links, #528) while
this revision stack was off-branch, so the prior renumber onto 0087/0088
collided. Shift the stack up one and restore main's 0087 index row:

  workspace-scoped blob dedup        0087 -> 0088
  revision commit chain + delta      0088 -> 0089
  agent file read-back               0089 -> 0090

Renames the three ADR files and rewrites every in-tree reference (filename
tokens + bare "ADR 00NN" comments across code, migrations, specs, CI, and
CONTEXT.md). Bare "ADR 0087" references that mean main's public-artifacts ADR
(CONTEXT.md, project-status.md, 0086) are preserved untouched. README index
re-adds the 0087 public-artifacts row dropped during the rebase.

Also drops four dead imports in apps/cli/src/index.ts (PublishResultShape,
formatBytes, hyperlink, paint) left unused by the Stage 5 work.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds partial-manifest revision publishing with unified-diff reconstruction, a decrypted artifact file-content read route and MCP tool, CLI pull and revise-cache support, database and contract updates for parent/base revisions and patch metadata, plus smoke tests, workflows, and documentation.

Changes

Revision patching and file readback

Layer / File(s) Summary
Contracts and storage primitives
packages/contracts/src/{artifacts.ts,uploadSessions.ts,openapi/*,routes/*,mcp/*,primitives.ts,common.ts,agentView.ts}, packages/storage/src/{unified-diff.ts,workspace-blob-bytes.ts,index.ts}, packages/api-client/src/publish.ts, packages/api-client/src/publish.test.ts
Adds Sha256Hex, ArtifactFileContent, patch-capable upload session schemas, OpenAPI/query helpers, route and MCP error declarations, unified-diff apply logic, workspace blob read/write helpers, and patch-aware publish request encoding.
Revision inheritance and finalize reconstruction
packages/db/{migrations/*,snapshot/schema.sql,src/schema.ts,src/types.ts,src/queries/*,src/repository/*,src/postgres/*,src/index.ts}, apps/upload/src/{create-session.ts,env.ts,finalize.ts,finalize.test.ts}, packages/worker-runtime/src/{errors.ts,route-repository-errors.ts}
Stores parent_revision_id, base_revision_id, deleted_paths, and patch hash metadata; reconstructs patched files during finalize; merges inherited trees; wires a revision reconstructor into runtimes; and maps new base/patch error cases.
API and MCP file-content read path
apps/api/src/{env.ts,index.ts,routes/artifact-file-content.ts,routes/artifact-file-content.test.ts}, packages/api-client/src/index.ts, packages/contracts/src/mcp/{schemas.ts,tool-schemas.ts,registry.ts}, apps/mcp/src/{tools.ts,tools.test.ts,transport.test.ts}
Adds GET /v1/artifacts/{artifact_id}/file-content, decrypts small files by workspace blob sha256 lookup, returns metadata-only responses for oversized files, and exposes the same path through MCP read_file and the API client.
CLI revise, pull, and cache flow
apps/cli/src/{index.ts,help.ts,publish-format.ts,manifest-cache.ts,revise.ts,unified-diff-gen.ts,local.ts}, apps/cli/src/*.test.ts, apps/cli/test/index.test.ts
Adds pull, manifest-cache persistence, revise planning with diff-or-whole fallback, local diff self-check generation, cached-base retry behavior, and centralized help/output formatting.
Smoke coverage and documentation
scripts/{smoke-local-patch.mjs,local-mvp-server.mjs}, .github/workflows/{ci.yml,pr-preview.yml}, package.json, docs/..., CONTEXT.md
Adds local and hosted patch smoke coverage, wires reconstruction into the local MVP server, adds the smoke:local:patch script, and documents revision inheritance, patch reconstruction, file read-back, and related ADR/spec updates.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant UploadAPI as uploadSessions
  participant DB
  participant R2

  CLI->>UploadAPI: create session with base_revision_id and patch
  UploadAPI->>DB: finalize patched revision
  DB->>R2: read base blob and diff object
  DB->>R2: write reconstructed workspace blob
  DB-->>UploadAPI: merged revision metadata
  UploadAPI-->>CLI: finalized revision
Loading
sequenceDiagram
  participant MCP
  participant API
  participant Repository
  participant R2

  MCP->>API: read_file(artifact_id, path, revision_id)
  API->>Repository: build agent view
  API->>R2: read decrypted blob by sha256
  API-->>MCP: file metadata and optional body
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • zaks-io/agent-paste#479: Also changes the CLI publish path in apps/cli/src/index.ts around file digest handling for upload session manifests.
  • zaks-io/agent-paste#520: Overlaps with the CLI publish/revise surface and publish-result formatting added here.
  • zaks-io/agent-paste#15: Connects to the OpenAPI registry and schema registration changes used for the new artifacts.fileContent endpoint.

Poem

🐇 I patched a page with whiskered care,
then chased its bytes through blobby air.
I pulled one file, exact and bright,
by moonlit hash and UTF-8 light.
In revision burrows, hop by hop,
the diff came home and did not drop.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch worktree-idempotent-juggling-music

@isuttell

Copy link
Copy Markdown
Contributor Author

bugbot run

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First-pass review (Cursor Automation)

Risk: high
Decision: needs human review

Ticket triage

  • Intended change: Git-like revision model (ADR 0088–0090): parent revision chain, partial-manifest publish with tree inheritance, synchronous unified-diff reconstruction at upload finalize, member file read-back (GET …/file-content), MCP read_file, and CLI revise/pull tooling.
  • Scope match: Yes — the PR description, ADRs, specs, migrations, and staged plan in docs/ops/git-like-revisions-todo.md align. This is the backend/agent foundation AP-324 called for, delivered as a single stacked PR rather than a single Linear slice.

Review findings

Blocking: none found in static review.

Non-blocking:

  • docs/ops/git-like-revisions-todo.md still opens with “Status: design accepted, not yet implemented” — stale after this PR; worth a one-line status update on merge.
  • No Linear issue is linked in the PR body/labels; consider attaching the driving ticket for traceability.
  • CI was still running (Validate, Postgres smoke) at review time — merge should wait for green checks.

Security / architecture notes (why human review is required)

This PR crosses several automation escalation boundaries:

  • Auth + encryption: api now decrypts workspace blobs for authenticated members (artifact-file-content.ts). Access is scoped via getAgentView + RLS-derived sha256; blob keys are never taken from client input. ADR 0090 documents the intentional boundary shift for owning members.
  • Publish/finalize hot path: mergeBaseRevisionTree + synchronous patch reconstruction changes artifact integrity semantics for all incremental publishes. Failure modes look correct (patch_conflict before any DB write; inherited paths must be blob-backed).
  • Schema migrations: 0024 / 0025 add revision parent chain and patch carriers — production migration review warranted.

Implementation quality looks strong: extensive unit/integration coverage (packages/db, packages/storage, route tests), pnpm smoke:local:patch wired into CI, and clear fail-loud invariants (never serve raw diff bytes).

Merge checklist

Check Status
Ticket linked ⚠️ No explicit Linear issue linked
Scope matches
Checks green ⏳ Pending at review time
Tests/docs appropriate ✅ Strong
No blocking findings
No high-risk areas ❌ Auth, encryption, migrations, publish path
Merge-safe for automation

Recommendation: Do not auto-approve. Human reviewer should confirm ADR 0090 member-decrypt boundary, migration rollout, and hosted patch-smoke on preview before merge.

Open in Web View Automation 

Sent by Cursor Automation: First Pass PR Reviewer

Comment thread apps/cli/src/revise.ts
isuttell and others added 2 commits June 14, 2026 19:03
…t tools

Design for @agent-paste/revise-core: a pure applyEdits core, a RevisionReader
read-side seam (twin of PublishTransport), and a reviseOnePath orchestrator that
both the CLI `edit` verb and an MCP `multi_edit` tool drive, plus a rebuilt MCP
`add_revision` that preserves the artifact title (fixing the "Revision" overwrite
bug) and sends a verified patch. Strict fail-fast; moves diffWithSelfCheck out of
apps/cli so MCP can share it; finalize render_mode inheritance invariant.

Records the planned spec in docs/ops/git-like-revisions-todo.md (live cli/mcp
specs update when the code lands). Builds on ADR 0090; reverses its
"diff half stays CLI-only" deferral.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…vision

A partial-manifest revise where every remaining file is unchanged but some
paths are deleted produced an empty files manifest. Both CreateUploadSessionRequest
(files.min(1)) and validateUpload (files.length === 0 -> file_count_cap_exceeded)
rejected it, so delete-only revises failed instead of inheriting the base tree and
dropping the paths.

Make the min-1 file rule conditional on the publish kind: a whole publish (no
base_revision_id) still requires at least one file; a base delta may send zero
files as long as it deletes at least one path. validateUpload mirrors this for the
partial-manifest path; the merged tree is still re-checked with the whole-tree caps
(entrypoint + total size) at finalize.

Bugbot finding on PR #529.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@isuttell

Copy link
Copy Markdown
Contributor Author

bugbot run

Comment thread apps/cli/src/index.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
packages/contracts/src/mcp/error-codes.ts (1)

17-43: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include the full new finalize/base error surface in MCP publish mappings.

At Line 17 and Line 64, the mapping still omits newly reachable finalize errors (patch_base_mismatch, deleted_path_not_in_base, base_revision_not_found, base_revision_artifact_mismatch, base_revision_not_publishable, inherited_path_not_blob_backed). Those can currently fall through to generic internal handling instead of returning actionable publish-chain errors.

Suggested fix shape
 publishChain: [
+  "base_revision_not_found",
+  "base_revision_artifact_mismatch",
+  "base_revision_not_publishable",
+  "deleted_path_not_in_base",
+  "inherited_path_not_blob_backed",
+  "patch_base_mismatch",
   "patch_conflict",
   ...
 ]

 MCP_API_ERROR_HTTP_STATUS: {
+  // Add explicit mappings for the same new finalize/base errors above,
+  // matching the API-layer status semantics to avoid 500 fallbacks.
   ...
 }

Also applies to: 64-99

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/contracts/src/mcp/error-codes.ts` around lines 17 - 43, The
publishChain error arrays at both locations (around line 17 and line 64) are
missing six newly reachable finalize/base error codes that should be included in
the MCP publish mappings. Add the following error codes to both publishChain
arrays: patch_base_mismatch, deleted_path_not_in_base, base_revision_not_found,
base_revision_artifact_mismatch, base_revision_not_publishable, and
inherited_path_not_blob_backed. This ensures these errors are properly mapped in
the publish chain instead of falling through to generic internal error handling.
packages/db/src/validation.ts (1)

12-16: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Allow delete-only partial manifests when wholeTree is false.

Line 14 still rejects files.length === 0 for all cases. With the new partial-manifest path, a delta containing only deleted_paths is a valid change but gets blocked as file_count_cap_exceeded.

💡 Suggested fix
 export function validateUpload(
   files: Array<{ path: string; size_bytes: number }>,
   usagePolicy: Pick<UsagePolicyConfig, "file_count_cap" | "file_size_cap_bytes" | "artifact_size_cap_bytes">,
   entrypoint = "index.html",
   options: { wholeTree?: boolean } = { wholeTree: true },
 ) {
-  if (files.length === 0 || files.length > usagePolicy.file_count_cap) {
+  const allowEmptyDelta = options.wholeTree === false;
+  if ((!allowEmptyDelta && files.length === 0) || files.length > usagePolicy.file_count_cap) {
     repositoryError("file_count_cap_exceeded");
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/src/validation.ts` around lines 12 - 16, The validation condition
in the options parameter block is rejecting files.length === 0 in all cases, but
when wholeTree is false, a delete-only partial manifest with zero added files
but containing only deleted_paths should be allowed. Modify the condition to
only reject empty files when wholeTree is true, while still rejecting files that
exceed usagePolicy.file_count_cap regardless of the wholeTree value. Check the
wholeTree option from the options parameter and adjust the logic accordingly to
permit this valid delete-only partial manifest scenario.
apps/upload/src/finalize.test.ts (1)

42-52: ⚠️ Potential issue | 🔴 Critical

Add get() method to ARTIFACTS test mocks to satisfy the R2Bucket contract.

The ARTIFACTS mocks at lines 42-52 and 201-208 in finalize.test.ts are missing the required get(key: string): Promise<R2ObjectBody | null> method declared in the R2Bucket type (apps/upload/src/env.ts, line 38). This creates a TypeScript structural typing contract violation that will fail type checking. The same issue exists in other test ARTIFACTS mocks across the suite (route-rls-boundary.test.ts, put-file-encryption.test.ts, put-hardening.test.ts).

Suggested fixes

For completeArtifacts() at lines 42-52:

 function completeArtifacts(): NonNullable<Env["ARTIFACTS"]> {
   const storedSize = ciphertextByteLengthForPlaintext(FILE_SIZE_BYTES);
   return {
     async head(key) {
       return key.endsWith("index.html") ? { size: storedSize } : null;
     },
+    async get() {
+      return null;
+    },
     async put() {
       return {};
     },
   };
 }

For the ARTIFACTS mock at lines 201-208:

         ARTIFACTS: {
           async head() {
             return null;
           },
+          async get() {
+            return null;
+          },
           async put() {
             return {};
           },
         },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/upload/src/finalize.test.ts` around lines 42 - 52, The ARTIFACTS test
mocks are missing the required `get(key: string): Promise<R2ObjectBody | null>`
method that is declared in the R2Bucket type contract in env.ts. Add the `get()`
method to the ARTIFACTS mock in the `completeArtifacts()` function at lines
42-52 in finalize.test.ts, implementing it to return null or an appropriate
R2ObjectBody for test purposes. Apply the same fix to the ARTIFACTS mock at
lines 201-208 in finalize.test.ts. Additionally, add the `get()` method to all
other ARTIFACTS test mocks in route-rls-boundary.test.ts,
put-file-encryption.test.ts, and put-hardening.test.ts to satisfy the full
R2Bucket contract across the entire test suite.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/api/src/routes/artifact-file-content.ts`:
- Around line 51-58: The oversize file response handler in the branch where
file.size_bytes exceeds Mebibytes.ten incorrectly hardcodes is_binary to false.
Instead, retrieve the actual binary status from the file object (likely
file.is_binary) and use that value in the responders.respondJson call to ensure
large binary files are correctly classified.

In `@apps/cli/src/index.ts`:
- Around line 399-424: The error message at line 423 and the comment at the top
of the pull function both claim that binary content is provided as base64 in
JSON output, but the actual code at line 417 only includes body when file.body
is defined, and for binary files body is absent. Fix the throw statement's error
message to remove the misleading claim that --json provides base64 encoding for
binary files, and update the comment describing the pull function's behavior to
accurately reflect that binary content is excluded from JSON output rather than
included as base64.

In `@apps/cli/src/publish-format.ts`:
- Around line 79-86: The token-in-query detection in the
assertClaimTokenNotInPublicUrls function is checking if claimToken appears
anywhere in claimUrl when a query string exists, but since claimToken
legitimately appears in the hash portion of the URL, this causes false
positives. Fix this by extracting only the query string portion of the claimUrl
(the part between ? and # if a hash exists, or from ? to the end otherwise) and
then check if claimToken appears specifically within that query string segment,
not in the entire URL.

In `@apps/cli/src/unified-diff-gen.ts`:
- Around line 39-51: The lcsOps function allocates a table with size (n+1)*(m+1)
cells which can cause out-of-memory errors for large files with many lines. Add
a guard clause at the beginning of the lcsOps function to check if the total
allocation would exceed a reasonable limit (e.g., comparing the product of (n+1)
and (m+1) against a cap like 10 million cells), and return null if the limit
would be exceeded. Update the function's return type from Op[] to Op[] | null to
reflect that it can now return null when the table size cap is exceeded,
allowing callers to handle overflow by falling back to whole-blob upload.

In `@apps/cli/test/index.test.ts`:
- Around line 682-718: Add two new test cases to cover the missing "text but too
large to inline" behavior. First, add a test for plain mode where readFile
returns a response with body undefined and is_binary false, and assert that main
with ["pull", artifactId, "filename"] throws an oversize error. Second, add a
test for JSON mode with the same readFile response, and assert that main with
["pull", artifactId, "filename", "--json"] emits metadata without a body field.
Both tests should use the same pattern as the existing tests by creating a
fakeClient with a mocked readFile, calling main, and then verifying the expected
error or output behavior.

In `@apps/mcp/src/tools.test.ts`:
- Around line 214-237: The test case for read_file only covers the scenario
where revision_id is provided. Add a second test case that calls callMcpTool
with read_file but without providing the revision_id parameter in the input
object, and then use routeCall to verify that the resulting URL does not include
the revision_id query parameter. This ensures the optional query parameter
behavior is correctly enforced and prevents regressions where revision_id might
be incorrectly forwarded when not provided.

In `@docs/specs/api.md`:
- Around line 114-122: The patched-file example in the API specification shows
both a `sha256` field and a `patch` field on the same file object, but the
request validator rejects this combination. Remove the `sha256` field from file
objects that contain a `patch` field, since the `result_sha256` inside the patch
object already serves to identify the reconstructed file. Apply this fix at all
affected locations in docs/specs/api.md (at lines 114-122 and also at lines
147-181) to align the examples with the actual validator contract and prevent
clients from sending invalid payloads.

In `@packages/contracts/src/uploadSessions.test.ts`:
- Around line 86-101: The test "rejects a non-unified patch format" is not
properly isolated because the file object contains both sha256 and patch fields,
which can cause the parse to fail due to mutual-exclusion rules rather than
specifically testing that format: "binary" is invalid. Remove the sha256 field
from the file object in the test payload so that the only reason for rejection
is the format: "binary" being invalid for a unified patch format.

In `@packages/db/src/repository/upload-session-lifecycle.ts`:
- Around line 469-481: The fast path for finalized sessions (referenced at line
440) currently returns session.file_count and session.size_bytes, which are
delta values for base-revision uploads. This causes retried finalize operations
to return inconsistent counts/sizes compared to the initial response. Update the
fast path to return the committed tree metrics (the same treeFileCount and
treeSizeBytes logic shown in lines 479-480) instead of the raw session delta
values, ensuring both the fast path and the merged-tree path return consistent,
committed tree metrics.

In `@packages/db/src/schema.ts`:
- Around line 295-299: The foreignKey definition for revisions_parent_fk in the
revisions table uses .onDelete("set null") on a composite key that includes NOT
NULL columns (workspaceId and artifactId), which would cause runtime constraint
violations. Since Drizzle ORM does not support column-scoped ON DELETE actions,
you must choose between two approaches: (1) remove the .onDelete("set null")
from the foreignKey definition in schema.ts and add a comment documenting that
the migration file
(packages/db/migrations/0024_revisions_parent_revision_id.sql) contains the
authoritative constraint definition with correct column-scoped ON DELETE SET
NULL (parent_revision_id) semantics, or (2) implement a post-schema hook that
executes raw SQL to alter or replace the constraint with the correct
column-scoped syntax. Choose option 1 if the migration is already deployed and
correct; choose option 2 if you need the schema definition to be the single
source of truth.

In `@packages/storage/src/unified-diff.ts`:
- Around line 59-63: The decodeUtf8Strict function's TextDecoder call does not
preserve the UTF-8 BOM, causing the round-trip validation check (bytesEqual
comparison) to fail for valid UTF-8 input with a leading BOM. The decoded text
will have the BOM stripped by default, so when re-encoded, the BOM is absent,
making the bytes not equal to the original. Pass an options object with
ignoreBOM set to true to the TextDecoder constructor to preserve the BOM in the
decoded output, allowing the round-trip check to pass for BOM-prefixed UTF-8
sequences.

---

Outside diff comments:
In `@apps/upload/src/finalize.test.ts`:
- Around line 42-52: The ARTIFACTS test mocks are missing the required `get(key:
string): Promise<R2ObjectBody | null>` method that is declared in the R2Bucket
type contract in env.ts. Add the `get()` method to the ARTIFACTS mock in the
`completeArtifacts()` function at lines 42-52 in finalize.test.ts, implementing
it to return null or an appropriate R2ObjectBody for test purposes. Apply the
same fix to the ARTIFACTS mock at lines 201-208 in finalize.test.ts.
Additionally, add the `get()` method to all other ARTIFACTS test mocks in
route-rls-boundary.test.ts, put-file-encryption.test.ts, and
put-hardening.test.ts to satisfy the full R2Bucket contract across the entire
test suite.

In `@packages/contracts/src/mcp/error-codes.ts`:
- Around line 17-43: The publishChain error arrays at both locations (around
line 17 and line 64) are missing six newly reachable finalize/base error codes
that should be included in the MCP publish mappings. Add the following error
codes to both publishChain arrays: patch_base_mismatch,
deleted_path_not_in_base, base_revision_not_found,
base_revision_artifact_mismatch, base_revision_not_publishable, and
inherited_path_not_blob_backed. This ensures these errors are properly mapped in
the publish chain instead of falling through to generic internal error handling.

In `@packages/db/src/validation.ts`:
- Around line 12-16: The validation condition in the options parameter block is
rejecting files.length === 0 in all cases, but when wholeTree is false, a
delete-only partial manifest with zero added files but containing only
deleted_paths should be allowed. Modify the condition to only reject empty files
when wholeTree is true, while still rejecting files that exceed
usagePolicy.file_count_cap regardless of the wholeTree value. Check the
wholeTree option from the options parameter and adjust the logic accordingly to
permit this valid delete-only partial manifest scenario.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 45aa9339-cd00-48ab-b80a-5060740c6a5c

📥 Commits

Reviewing files that changed from the base of the PR and between 863bfb4 and 0855333.

⛔ Files ignored due to path filters (4)
  • packages/contracts/openapi/api.json is excluded by !packages/contracts/openapi/*.json
  • packages/contracts/openapi/content.json is excluded by !packages/contracts/openapi/*.json
  • packages/contracts/openapi/upload.json is excluded by !packages/contracts/openapi/*.json
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (84)
  • .github/workflows/ci.yml
  • .github/workflows/pr-preview.yml
  • CONTEXT.md
  • apps/api/package.json
  • apps/api/src/env.ts
  • apps/api/src/index.ts
  • apps/api/src/routes/artifact-file-content.test.ts
  • apps/api/src/routes/artifact-file-content.ts
  • apps/cli/src/help.ts
  • apps/cli/src/index.ts
  • apps/cli/src/local.ts
  • apps/cli/src/manifest-cache.test.ts
  • apps/cli/src/manifest-cache.ts
  • apps/cli/src/publish-format.ts
  • apps/cli/src/revise.test.ts
  • apps/cli/src/revise.ts
  • apps/cli/src/unified-diff-gen.test.ts
  • apps/cli/src/unified-diff-gen.ts
  • apps/cli/test/index.test.ts
  • apps/mcp/src/tools.test.ts
  • apps/mcp/src/tools.ts
  • apps/mcp/src/transport.test.ts
  • apps/upload/src/create-session.ts
  • apps/upload/src/env.ts
  • apps/upload/src/finalize.test.ts
  • apps/upload/src/finalize.ts
  • docs/adr/0088-workspace-scoped-content-addressed-blob-deduplication.md
  • docs/adr/0089-revision-commit-chain-tree-inheritance-and-server-reconstructed-delta.md
  • docs/adr/0090-agent-file-read-back-api-decrypts-member-plaintext.md
  • docs/adr/README.md
  • docs/development.md
  • docs/ops/git-like-revisions-todo.md
  • docs/specs/api.md
  • docs/specs/cli.md
  • docs/specs/data-model.md
  • package.json
  • packages/api-client/src/index.ts
  • packages/api-client/src/publish.test.ts
  • packages/api-client/src/publish.ts
  • packages/contracts/src/agentView.ts
  • packages/contracts/src/artifacts.ts
  • packages/contracts/src/common.ts
  • packages/contracts/src/mcp.test.ts
  • packages/contracts/src/mcp/error-codes.ts
  • packages/contracts/src/mcp/registry.ts
  • packages/contracts/src/mcp/schemas.ts
  • packages/contracts/src/mcp/tool-schemas.ts
  • packages/contracts/src/mvp-contracts.test.ts
  • packages/contracts/src/openapi/api.artifacts.ts
  • packages/contracts/src/openapi/api.helpers.ts
  • packages/contracts/src/openapi/shared.ts
  • packages/contracts/src/primitives.ts
  • packages/contracts/src/routes/registry.artifacts.ts
  • packages/contracts/src/routes/registry.storage.ts
  • packages/contracts/src/uploadSessions.test.ts
  • packages/contracts/src/uploadSessions.ts
  • packages/db/migrations/0024_revisions_parent_revision_id.sql
  • packages/db/migrations/0025_upload_session_base_revision_and_patch.sql
  • packages/db/snapshot/schema.sql
  • packages/db/src/agent-view.ts
  • packages/db/src/index.test.ts
  • packages/db/src/index.ts
  • packages/db/src/local-mvp-sql-executor.test.ts
  • packages/db/src/postgres/revision-reconstructor.test.ts
  • packages/db/src/postgres/revision-reconstructor.ts
  • packages/db/src/postgres/worker-runtime.ts
  • packages/db/src/queries/revisions.test.ts
  • packages/db/src/queries/revisions.ts
  • packages/db/src/queries/upload-sessions.ts
  • packages/db/src/repository-error.ts
  • packages/db/src/repository/upload-session-lifecycle.ts
  • packages/db/src/repository/workflows/upload-publish-workflow.ts
  • packages/db/src/schema.ts
  • packages/db/src/types.ts
  • packages/db/src/validation.ts
  • packages/storage/src/index.ts
  • packages/storage/src/unified-diff.test.ts
  • packages/storage/src/unified-diff.ts
  • packages/storage/src/workspace-blob-bytes.test.ts
  • packages/storage/src/workspace-blob-bytes.ts
  • packages/worker-runtime/src/errors.ts
  • packages/worker-runtime/src/route-repository-errors.ts
  • scripts/local-mvp-server.mjs
  • scripts/smoke-local-patch.mjs

Comment thread apps/api/src/routes/artifact-file-content.ts
Comment thread apps/cli/src/index.ts
Comment thread apps/cli/src/publish-format.ts
Comment thread apps/cli/src/unified-diff-gen.ts
Comment thread apps/cli/test/index.test.ts
Comment thread docs/specs/api.md
Comment thread packages/contracts/src/uploadSessions.test.ts
Comment thread packages/db/src/repository/upload-session-lifecycle.ts
Comment thread packages/db/src/schema.ts
Comment thread packages/storage/src/unified-diff.ts
isuttell and others added 4 commits June 14, 2026 19:24
… leak + LCS blowup

- runPublish drops the base and sends a whole-blob manifest when the revise
  plan produces no changed files and no deletions, so an unchanged working
  tree revise no longer sends an empty delta the server rejects (bugbot).
- publish-format only treats the claim token in the URL query string as a leak,
  not a coincidental fragment substring (CodeRabbit).
- unified-diff-gen returns null instead of attempting an LCS over more than
  MAX_LCS_CELLS cells, so a pathological diff falls back to a whole blob rather
  than pinning a core (CodeRabbit).
- Corrected the pull/read-back comment that wrongly claimed base64 bytes ride
  in the JSON; oversize text/binary is metadata-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ent path

An oversize file is returned as metadata without reading R2, so its bytes are
never inspected. Previously is_binary defaulted to false, mislabeling an
oversize binary as text. Now the flag is derived from the stored content type
(non-text/* => binary), so a client keying on the flag does not try to inline
binary bytes. body stays absent on this branch (CodeRabbit).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ment FK scope

- The finalized-session fast path now reads the committed revision and reports
  its file_count/size_bytes, so a repeat finalize returns the merged-tree
  counts rather than the pre-merge session counts (CodeRabbit).
- Documented that migration 0024's column-scoped ON DELETE SET NULL on
  parent_revision_id is authoritative and that Drizzle cannot express the
  column list, so the snapshot's unscoped form is expected drift, not a bug to
  "fix" toward the snapshot (CodeRabbit).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tch sha256 spec

- decodeUtf8Strict keeps a leading UTF-8 BOM (ignoreBOM) so valid BOM-prefixed
  text round-trips and is not rejected as binary; fatal is passed explicitly
  for the Worker TS lib option type (CodeRabbit).
- api.md: a patched per-file entry's size_bytes is the diff byte length and the
  entry carries no whole-file sha256; the sha256 rule now scopes to whole-file
  entries (CodeRabbit).
- Test-only: cover the BOM round-trip + invalid-UTF-8 reject, read_file omits
  revision_id from the query when absent, and isolate the non-unified-patch
  contract test from sha256.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@isuttell

Copy link
Copy Markdown
Contributor Author

bugbot run

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit e5be3b6. Configure here.

@isuttell isuttell merged commit 4ff11ce into main Jun 15, 2026
11 checks passed
@isuttell isuttell deleted the worktree-idempotent-juggling-music branch June 15, 2026 02:32
@github-actions

Copy link
Copy Markdown

agent-paste PR preview resources were cleaned up. The shared Preview GitHub Environment is retained for future preview deploys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant