docs: define planned public artifact semantics #528
Merged
Issue: AP-330
First-pass review
Risk: simple
Decision: approve
Ticket triage
- Intended change: Document the planned split between unlisted Share Links and true Public Artifacts (stable /p/{publicId} URL, frozen Public Version, soft Public Offline, cacheable Public Version Assets, Platform Lockdown as hard takedown) before implementation.
- Scope match: Yes. AP-330 is linked in the PR body and commit; the diff adds ADR 0087, expands CONTEXT.md, and retargets shipped-spec wording without touching runtime code.
Review findings
- Blocking: None.
- Non-blocking: A few peripheral specs still use legacy "public" phrasing (docs/specs/ephemeral-publish.md, docs/specs/content-rendering.md). ADR 0087 already flags make_public/make-public naming debt for a follow-up rename. Fine to land as-is.
Merge checklist
- Ticket linked: yes (AP-330)
- Scope matches: yes
- Checks green: yes (CI Validate + CodeQL succeeded)
- Tests/docs appropriate: yes (docs-only; author ran verify/smoke/coverage)
- No blocking findings: yes
- No high-risk areas: yes (no auth, billing, migrations, or deploy changes)
- Merge-safe: yes
Docs-only change with clear shipped-vs-planned boundaries. ADR 0087 is marked Planned, specs say Share Links are the current no-login handoff, and CONTEXT.md scope language aligns with the unified publish/read model in contracts.
Sent by Cursor Automation: First Pass PR Reviewer
agent-paste PR preview resources were cleaned up. The shared Preview GitHub Environment is retained for future preview deploys.
isuttell added a commit that referenced this pull request Jun 15, 2026
…lta, agent read-back (ADR 0088/0089/0090) (#529)

* docs(adr): record Git-like revision model (commit chain + jobs-reconstructed delta)

  ADR 0086 retroactively captures the shipped workspace-scoped content-addressed blob dedup (no prior ADR existed; it was only in data-model.md/api.md + commit dea091f).

  ADR 0087 decides the next step: revisions.parent_revision_id + tree inheritance (partial-manifest publish, unlisted paths inherit the parent tree by reference) so an agent can express "change this file" instead of the whole tree, plus server-reconstructed intra-file delta (unified diff for text, whole-blob fallback for binary).

  Reconstruction runs in jobs, not upload: upload is write-only against R2 today (sole op ARTIFACTS.put), while jobs already does the read-decrypt-transform-reencrypt-write shape for Bundle generation. This keeps content and the ADR 0063 encryption boundary untouched. Chunk stores, per-block AEAD, Range serving, global dedup, and dropping encryption are deferred.

  Spec/CONTEXT edits land with the implementation, per the spec-is-source-of-truth rule. Staged plan in docs/ops/git-like-revisions-todo.md.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(db,contracts): add revision commit chain + partial-manifest upload contract (ADR 0087 stages 1-2)

  Stage 1 (schema): revisions.parent_revision_id nullable column with a deferrable composite self-FK on (workspace_id, artifact_id, parent_revision_id) -> revisions(workspace_id, artifact_id, id), ON DELETE SET NULL (parent_revision_id), plus revisions_parent_idx. The composite target structurally pins a parent to the same Workspace and Artifact; the column-list SET NULL nulls only the pointer (plain SET NULL would violate the NOT NULL workspace_id/artifact_id). Deferrable because claim-reparent bulk-rewrites workspace_id across all revisions inside deferred constraints. Threaded through the Revision type, insert mapper, and mapRevision; draft creation writes NULL (Stage 3 populates from base_revision_id). Migration 0024 is idempotent (journal-less runner) and verified on PGlite + snapshot regenerated.

  Stage 2 (contract): CreateUploadSessionRequest gains optional base_revision_id, deleted_paths, and a per-file patch descriptor {base_sha256, format:"unified", result_sha256}. A superRefine enforces the structural rules (patch/deleted_paths require base_revision_id; deleted_paths unique; a path cannot be both uploaded and deleted; format must be unified). Stateful checks and the tree-inheritance merge / diff reconstruction are deferred to Stages 3-4. OpenAPI golden regenerated; round-trip tests added.

  Also fixes a pre-existing Sha256Hex /u-flag leak that serialized an invalid "^...$/u" pattern into the published upload OpenAPI (now clean in all 6 spots), and folds the ADR 0087 spec-source-of-truth updates into data-model.md (column + index), api.md (request fields + rules), and CONTEXT.md (Revision parent relationship).

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(api,db): tree inheritance at finalize for revision commit chain (ADR 0087 stage 3)

  Publishing against base_revision_id now inherits the base Revision's unchanged blob-backed files instead of re-uploading them: a one-file change yields a full artifact_files tree with one new blob and parent_revision_id set. The merge runs at finalize (mergeBaseRevisionTree), recomputes file_count/size_bytes from the merged tree, and re-runs validateUpload (caps + entrypoint) against it.

  Stateful validation deferred from the Stage 2 contract now lands server-side (6 new repo error codes -> invalid_request): published-base-only, same workspace (base_revision_not_found) and same artifact (base_revision_artifact_mismatch, fired before the composite parent FK would 500), deleted-path-in-base, patch base_sha256 match, and blob-backed-only inheritance (a revision-scoped base path is not refcount-protected, so it is rejected rather than dangled).

  Patch descriptors (patch_base_sha256/patch_result_sha256) are recorded and validated on upload_session_files with the diff uploaded as a revision object (sha256 omitted from the signed PUT), but finalize fails loud (patch_reconstruction_unavailable) until jobs reconstruction lands in Stage 4 - otherwise the raw diff bytes would be served as the file. A file may not declare both a whole-file sha256 and a patch.

  Carriers: upload_sessions.base_revision_id + deleted_paths, the two patch columns on upload_session_files (migration 0025, idempotent). Specs updated alongside.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(upload,db,storage): synchronous intra-file patch reconstruction at finalize (ADR 0087 stage 4)

  Apply an agent-uploaded unified diff to the base blob synchronously at finalize, in the upload worker, before the new Revision commits. A patch that cannot apply fails the finalize call with patch_conflict (HTTP 422, "patch_conflict: <path>: <reason>") so the agent re-submits a corrected diff; a broken patch never becomes a servable Revision. The reconstructed result is an ordinary content-addressed blob, so content/bundles/GC are unchanged and no migration is needed.

  - packages/storage: hand-rolled byte-exact unified-diff applier (no diff dep: jsdiff's UTF-16 round-trip breaks the raw-byte result_sha256 check) + workspace-blob read/write helpers (blob AAD) and a revision-file read helper.
  - packages/db: RevisionReconstructor adapter (RepositoryOptions, wired in createPostgresRuntime + local MVP harness), invoked from mergeBaseRevisionTree before any DB write; result blob + content_blobs row commit with the draft; caps run on the reconstructed result size; removes the Stage 3 patch_reconstruction_unavailable gate; conflict -> patch_conflict, infra failures -> storage_unavailable.
  - contracts/worker-runtime: new patch_conflict ErrorCode (422), MCP status map + publishChain, route errors, openapi goldens; also maps the previously 500-falling-back finalize codes (caps, expired, incomplete) for MCP.
  - upload: widen R2 binding with get; surface the conflict path+reason as the error message.
  - scripts: smoke-local-patch.mjs drives the real reconstruction path (local + preview).

  Verified byte-exact serve + patch_conflict on hosted preview.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ci: run the patch reconstruction smoke locally and against PR preview (ADR 0087 stage 4)

  Stage 4's unit/integration tests use a fake reconstructor, so the only checks that exercise the real decrypt -> apply diff -> hash-verify -> re-encrypt -> serve path are the smoke. Wire it into both gates:

  - ci.yml: `pnpm smoke:local:patch` after the existing local smoke (in-memory MVP, every PR via Validate).
  - pr-preview.yml: `node scripts/smoke-local-patch.mjs pr` against the deployed PR preview Workers, reusing the per-PR deploy outputs + harness secret.

  smoke-local-patch.mjs now supports local/preview/pr targets with env resolution mirroring scripts/smoke-hosted.mjs.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs: renumber Git-revision ADRs to 0087/0088 after merge collision

  Merged PR #525 claimed ADR 0086 for "publish is content-only, private-first". This branch's earlier work also created an 0086 (workspace-scoped blob dedup) and an 0087 (revision commit chain + reconstructed delta). Renumber this branch's pair to 0087 (blob dedup) and 0088 (revision delta) so 0086 stays the merged publish-privacy ADR, and sweep every cross-reference (ADR bodies, specs, migrations, code comments, CI smoke steps) to the new numbers.

  Reference-only: no schema, contract, or logic change. Full suite green (typecheck 39/39, test 39/39, openapi:check).

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(api,cli,mcp,contracts): agent file read-back + CLI incremental-revise diff client (ADR 0089)

  Stage 5 of the git-like revision model: give an agent without a working copy a way to read back exactly what's stored so it can produce a correct unified-diff revise, and make the CLI send only what changed.

  - sha256 on Agent View file entries (optional, non-breaking add)
  - new member-authed GET /v1/artifacts/{id}/file-content in the api worker: it decrypts the owning member's plaintext and returns { path, sha256, size_bytes, content_type, is_binary, body? }. Oversize text and binary files omit the body; oversize short-circuits before touching R2. Any decrypt/storage failure maps to storage_unavailable (503), never a 500.
  - MCP read_file tool forwarding to it; api-client artifacts.readFile()
  - CLI: per-artifact manifest cache (0600), a byte-exact unified-diff generator that self-checks (re-applies its own diff and verifies the digest before attaching it; a generator bug degrades to a whole-blob upload, never a finalize conflict), incremental revise wiring, a `pull` verb, and single-shot full-republish fallback when the cached base is unusable
  - finalize now carries the precise base-* repository kind as the error detail so the CLI self-heal fires for all base-unusable conditions, not just patch conflicts (the 5 base-* kinds collapse to invalid_request on the wire)

  ADR 0089 records the api-decrypts-member-plaintext trust-boundary decision.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(adr): renumber revision stack to 0088/0089/0090 (0087 taken by public-artifacts)

  Main merged ADR 0087 (public-artifacts-and-unlisted-share-links, #528) while this revision stack was off-branch, so the prior renumber onto 0087/0088 collided. Shift the stack up one and restore main's 0087 index row:

  - workspace-scoped blob dedup: 0087 -> 0088
  - revision commit chain + delta: 0088 -> 0089
  - agent file read-back: 0089 -> 0090

  Renames the three ADR files and rewrites every in-tree reference (filename tokens + bare "ADR 00NN" comments across code, migrations, specs, CI, and CONTEXT.md). Bare "ADR 0087" references that mean main's public-artifacts ADR (CONTEXT.md, project-status.md, 0086) are preserved untouched. README index re-adds the 0087 public-artifacts row dropped during the rebase.

  Also drops four dead imports in apps/cli/src/index.ts (PublishResultShape, formatBytes, hyperlink, paint) left unused by the Stage 5 work.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(adr): record ADR 0091 - shared revise engine + literal multi-edit tools

  Design for @agent-paste/revise-core: a pure applyEdits core, a RevisionReader read-side seam (twin of PublishTransport), and a reviseOnePath orchestrator that both the CLI `edit` verb and an MCP `multi_edit` tool drive, plus a rebuilt MCP `add_revision` that preserves the artifact title (fixing the "Revision" overwrite bug) and sends a verified patch. Strict fail-fast; moves diffWithSelfCheck out of apps/cli so MCP can share it; finalize render_mode inheritance invariant.

  Records the planned spec in docs/ops/git-like-revisions-todo.md (live cli/mcp specs update when the code lands). Builds on ADR 0090; reverses its "diff half stays CLI-only" deferral.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(upload): allow delete-only revise (empty files) against a base revision

  A partial-manifest revise where every remaining file is unchanged but some paths are deleted produced an empty files manifest. Both CreateUploadSessionRequest (files.min(1)) and validateUpload (files.length === 0 -> file_count_cap_exceeded) rejected it, so delete-only revises failed instead of inheriting the base tree and dropping the paths.

  Make the min-1 file rule conditional on the publish kind: a whole publish (no base_revision_id) still requires at least one file; a base delta may send zero files as long as it deletes at least one path. validateUpload mirrors this for the partial-manifest path; the merged tree is still re-checked with the whole-tree caps (entrypoint + total size) at finalize.

  Bugbot finding on PR #529.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(cli): fall back to full publish on no-op delta; guard claim-token leak + LCS blowup

  - runPublish drops the base and sends a whole-blob manifest when the revise plan produces no changed files and no deletions, so an unchanged working tree revise no longer sends an empty delta the server rejects (bugbot).
  - publish-format only treats the claim token in the URL query string as a leak, not a coincidental fragment substring (CodeRabbit).
  - unified-diff-gen returns null instead of attempting an LCS over more than MAX_LCS_CELLS cells, so a pathological diff falls back to a whole blob rather than pinning a core (CodeRabbit).
  - Corrected the pull/read-back comment that wrongly claimed base64 bytes ride in the JSON; oversize text/binary is metadata-only.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(api): infer is_binary from content type on the oversize file-content path

  An oversize file is returned as metadata without reading R2, so its bytes are never inspected. Previously is_binary defaulted to false, mislabeling an oversize binary as text. Now the flag is derived from the stored content type (non-text/* => binary), so a client keying on the flag does not try to inline binary bytes. body stays absent on this branch (CodeRabbit).

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(db): return committed revision counts on finalize fast path; document FK scope

  - The finalized-session fast path now reads the committed revision and reports its file_count/size_bytes, so a repeat finalize returns the merged-tree counts rather than the pre-merge session counts (CodeRabbit).
  - Documented that migration 0024's column-scoped ON DELETE SET NULL on parent_revision_id is authoritative and that Drizzle cannot express the column list, so the snapshot's unscoped form is expected drift, not a bug to "fix" toward the snapshot (CodeRabbit).

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(storage,docs): preserve UTF-8 BOM in decodeUtf8Strict; tighten patch sha256 spec

  - decodeUtf8Strict keeps a leading UTF-8 BOM (ignoreBOM) so valid BOM-prefixed text round-trips and is not rejected as binary; fatal is passed explicitly for the Worker TS lib option type (CodeRabbit).
  - api.md: a patched per-file entry's size_bytes is the diff byte length and the entry carries no whole-file sha256; the sha256 rule now scopes to whole-file entries (CodeRabbit).
  - Test-only: cover the BOM round-trip + invalid-UTF-8 reject, read_file omits revision_id from the query when absent, and isolate the non-unified-patch contract test from sha256.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
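For readers following the upload contract, the structural rules named in the Stage 2 commit (plus the later "no sha256 with a patch" and delete-only rules) can be sketched as a plain validator. This is an illustrative sketch only: the field names come from the commit message, but the interfaces, the `checkStructuralRules` helper, and the message strings are assumptions, and the real contract is described as a `superRefine`, which is not reproduced here.

```typescript
// Hypothetical request shape: field names follow the commit message, types are assumed.
interface PatchDescriptor {
  base_sha256: string;
  format: string;
  result_sha256: string;
}

interface UploadFile {
  path: string;
  sha256?: string;
  patch?: PatchDescriptor;
}

interface UploadRequest {
  files: UploadFile[];
  base_revision_id?: string;
  deleted_paths?: string[];
}

// Returns rule violations; an empty array means the request is structurally valid.
// Stateful checks (does the base exist, does base_sha256 match) are out of scope here.
function checkStructuralRules(req: UploadRequest): string[] {
  const issues: string[] = [];
  const hasBase = req.base_revision_id !== undefined;
  const deleted = req.deleted_paths ?? [];
  const deletedSet = new Set(deleted);

  if (!hasBase && deleted.length > 0) {
    issues.push("deleted_paths requires base_revision_id");
  }
  if (deletedSet.size !== deleted.length) {
    issues.push("deleted_paths must be unique");
  }

  for (const file of req.files) {
    if (deletedSet.has(file.path)) {
      issues.push(`${file.path}: cannot be both uploaded and deleted`);
    }
    if (file.patch !== undefined) {
      if (!hasBase) issues.push(`${file.path}: patch requires base_revision_id`);
      if (file.patch.format !== "unified") issues.push(`${file.path}: format must be unified`);
      if (file.sha256 !== undefined) issues.push(`${file.path}: sha256 and patch are exclusive`);
    }
  }

  // A whole publish needs at least one file; a base delta may send zero files
  // as long as it deletes at least one path.
  if (req.files.length === 0 && !(hasBase && deleted.length > 0)) {
    issues.push("files must not be empty unless deleting from a base revision");
  }
  return issues;
}
```

Under these assumptions, a delete-only revise such as `{ files: [], base_revision_id: "rev_1", deleted_paths: ["old.css"] }` yields no issues, while the same request without `base_revision_id` is rejected.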
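The tree-inheritance merge described in the Stage 3 commit can be pictured with a small in-memory model. The sketch below is an assumption-laden illustration, not `mergeBaseRevisionTree` itself: it models a tree as a map from path to blob sha256, and the `mergeTreeSketch` name and error text are hypothetical.

```typescript
// A revision tree modeled as path -> blob sha256 (an assumed simplification).
type Tree = Map<string, string>;

// Start from the base tree, drop deleted paths, then overlay uploaded files.
// Unlisted paths are inherited from the base by reference (same sha256).
function mergeTreeSketch(base: Tree, uploaded: Tree, deletedPaths: string[]): Tree {
  const merged: Tree = new Map(base);
  for (const path of deletedPaths) {
    if (!merged.has(path)) {
      // Mirrors the "deleted-path-in-base" check named in the commit message.
      throw new Error(`deleted path not in base: ${path}`);
    }
    merged.delete(path);
  }
  for (const [path, sha256] of uploaded) {
    merged.set(path, sha256);
  }
  return merged;
}

const baseTree: Tree = new Map([
  ["index.html", "aaa"],
  ["app.js", "bbb"],
  ["old.css", "ccc"],
]);
const mergedTree = mergeTreeSketch(baseTree, new Map([["app.js", "ddd"]]), ["old.css"]);
console.log([...mergedTree.entries()]);
```

In this model a one-file change produces a full tree in which only `app.js` points at a new blob, which is the property the commit message describes; caps and entrypoint validation would then run against the merged result.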
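The BOM fix in the last commit relies on two standard `TextDecoder` options: `fatal` makes invalid UTF-8 throw, and `ignoreBOM: true` keeps a leading byte order mark in the decoded string instead of stripping it. A minimal sketch follows, assuming the helper returns `null` on invalid input; the actual `decodeUtf8Strict` signature and failure mode are not shown in this PR, so the `decodeUtf8StrictSketch` name and return type are hypothetical.

```typescript
// Strict UTF-8 decode that preserves a leading BOM so the bytes round-trip.
// Returning null for invalid input is an assumption made for this sketch.
function decodeUtf8StrictSketch(bytes: Uint8Array): string | null {
  const decoder = new TextDecoder("utf-8", { fatal: true, ignoreBOM: true });
  try {
    return decoder.decode(bytes);
  } catch {
    return null; // invalid UTF-8: the caller would treat the content as binary
  }
}

const bomBytes = new Uint8Array([0xef, 0xbb, 0xbf, 0x68, 0x69]); // BOM + "hi"
const decodedText = decodeUtf8StrictSketch(bomBytes);
const reencoded = decodedText === null ? null : new TextEncoder().encode(decodedText);
const invalidResult = decodeUtf8StrictSketch(new Uint8Array([0xff]));
```

Because the BOM survives decoding as U+FEFF, re-encoding yields the original five bytes, which matters when a digest such as `result_sha256` is computed over raw bytes.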


@coderabbitai ignore
Summary
Documents AP-330's split between unlisted Share Links and planned true Public Artifacts.
Linear: https://linear.app/zaks-io/issue/AP-330/document-public-url-unlisted-and-cdn-backed-public-semantics
Changes
- Adds ADR 0087 covering /p/{publicId}, frozen Public Version, soft Public Offline, cacheable Public Version Assets, and Platform Lockdown as the hard takedown path.
- Expands CONTEXT.md with the new planned public-domain terms and clarifies that shipped CLI/MCP behavior still creates Share Links.
Risk
Low. Docs-only change. No runtime code, schema, route, or deploy behavior changes.
Test Plan
- pnpm format:docs:check
- git diff --check
- pnpm verify
- pnpm smoke:local
- pnpm test:coverage (Statements 90.75%, Branches 82.48%, Functions 91.7%, Lines 90.99%)