docs: define planned public artifact semantics#528

Merged
isuttell merged 1 commit into main from codex/ap-330-public-url-semantics on Jun 15, 2026

Conversation

@isuttell

Contributor

@coderabbitai ignore

Summary

Documents AP-330's split between unlisted Share Links and planned true Public Artifacts.

Linear: https://linear.app/zaks-io/issue/AP-330/document-public-url-unlisted-and-cdn-backed-public-semantics

Changes

  • Adds ADR 0087 for planned Public Artifact semantics: stable ID-only /p/{publicId}, frozen Public Version, soft Public Offline, cacheable Public Version Assets, and Platform Lockdown as the hard takedown path.
  • Updates CONTEXT.md with the new planned public-domain terms and clarifies that shipped CLI/MCP behavior still creates Share Links.
  • Tightens current specs and ADR index wording so shipped Share Links are described as unlisted no-login handoff, not true public distribution.
  • Updates project status with the current-vs-planned boundary.

Risk

Low. Docs-only change. No runtime code, schema, route, or deploy behavior changes.

Test Plan

  • pnpm format:docs:check
  • git diff --check
  • pnpm verify
  • pnpm smoke:local
  • pnpm test:coverage (Statements 90.75%, Branches 82.48%, Functions 91.7%, Lines 90.99%)

@cursor cursor Bot left a comment

First-pass review

Risk: simple
Decision: approve

Ticket triage

  • Intended change: Document the planned split between unlisted Share Links and true Public Artifacts (stable /p/{publicId} URL, frozen Public Version, soft Public Offline, cacheable Public Version Assets, Platform Lockdown as hard takedown) before implementation.
  • Scope match: Yes. AP-330 is linked in the PR body and commit; the diff adds ADR 0087, expands CONTEXT.md, and retargets shipped-spec wording without touching runtime code.

Review findings

  • Blocking: None.
  • Non-blocking: A few peripheral specs still use legacy "public" phrasing (docs/specs/ephemeral-publish.md, docs/specs/content-rendering.md). ADR 0087 already flags make_public / make-public naming debt for a follow-up rename. Fine to land as-is.

Merge checklist

  • Ticket linked: yes (AP-330)
  • Scope matches: yes
  • Checks green: yes (CI Validate + CodeQL succeeded)
  • Tests/docs appropriate: yes (docs-only; author ran verify/smoke/coverage)
  • No blocking findings: yes
  • No high-risk areas: yes (no auth, billing, migrations, or deploy changes)
  • Merge-safe: yes

Docs-only change with clear shipped-vs-planned boundaries. ADR 0087 is marked Planned, specs say Share Links are the current no-login handoff, and CONTEXT.md scope language aligns with the unified publish/read model in contracts.

Sent by Cursor Automation: First Pass PR Reviewer

@isuttell isuttell merged commit 863bfb4 into main Jun 15, 2026
9 checks passed
@isuttell isuttell deleted the codex/ap-330-public-url-semantics branch June 15, 2026 00:22
@github-actions

agent-paste PR preview resources were cleaned up. The shared Preview GitHub Environment is retained for future preview deploys.

isuttell added a commit that referenced this pull request Jun 15, 2026
…lta, agent read-back (ADR 0088/0089/0090) (#529)

* docs(adr): record Git-like revision model (commit chain + jobs-reconstructed delta)

ADR 0086 retroactively captures the shipped workspace-scoped content-addressed
blob dedup (no prior ADR existed; it was only in data-model.md/api.md + commit
dea091f). ADR 0087 decides the next step: revisions.parent_revision_id + tree
inheritance (partial-manifest publish, unlisted paths inherit the parent tree by
reference) so an agent can express "change this file" instead of the whole tree,
plus server-reconstructed intra-file delta (unified diff for text, whole-blob
fallback for binary).

Reconstruction runs in jobs, not upload: upload is write-only against R2 today
(sole op ARTIFACTS.put), while jobs already does the
read-decrypt-transform-reencrypt-write shape for Bundle generation. This keeps
content and the ADR 0063 encryption boundary untouched. Chunk stores, per-block
AEAD, Range serving, global dedup, and dropping encryption are deferred.

Spec/CONTEXT edits land with the implementation, per the spec-is-source-of-truth
rule. Staged plan in docs/ops/git-like-revisions-todo.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(db,contracts): add revision commit chain + partial-manifest upload contract (ADR 0087 stages 1-2)

Stage 1 (schema): revisions.parent_revision_id nullable column with a deferrable
composite self-FK on (workspace_id, artifact_id, parent_revision_id) ->
revisions(workspace_id, artifact_id, id), ON DELETE SET NULL (parent_revision_id),
plus revisions_parent_idx. The composite target structurally pins a parent to the
same Workspace and Artifact; the column-list SET NULL nulls only the pointer
(plain SET NULL would violate the NOT NULL workspace_id/artifact_id). Deferrable
because claim-reparent bulk-rewrites workspace_id across all revisions inside
deferred constraints. Threaded through the Revision type, insert mapper, and
mapRevision; draft creation writes NULL (Stage 3 populates from base_revision_id).
Migration 0024 is idempotent (journal-less runner) and verified on PGlite +
snapshot regenerated.

Stage 2 (contract): CreateUploadSessionRequest gains optional base_revision_id,
deleted_paths, and a per-file patch descriptor {base_sha256, format:"unified",
result_sha256}. A superRefine enforces the structural rules (patch/deleted_paths
require base_revision_id; deleted_paths unique; a path cannot be both uploaded and
deleted; format must be unified). Stateful checks and the tree-inheritance merge /
diff reconstruction are deferred to Stages 3-4. OpenAPI golden regenerated; round-
trip tests added.
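
The structural rules listed above can be sketched as a plain TypeScript check. This is an illustration only: it does not use the project's actual schema or `superRefine` callback, the interface shapes are assumed from the field names in this commit message, and `checkStructure` is a hypothetical helper.

```typescript
// Hypothetical shapes: field names follow the commit message, the rest is assumed.
interface PatchDescriptor {
  base_sha256: string;
  format: string; // only "unified" is accepted
  result_sha256: string;
}

interface UploadFile {
  path: string;
  patch?: PatchDescriptor;
}

interface UploadSessionRequestSketch {
  files: UploadFile[];
  base_revision_id?: string;
  deleted_paths?: string[];
}

// Returns a list of structural problems; an empty list means the request passes.
function checkStructure(req: UploadSessionRequestSketch): string[] {
  const issues: string[] = [];
  const deleted = req.deleted_paths ?? [];
  const hasBase = req.base_revision_id !== undefined;

  if (!hasBase && deleted.length > 0) {
    issues.push("deleted_paths requires base_revision_id");
  }
  if (new Set(deleted).size !== deleted.length) {
    issues.push("deleted_paths must be unique");
  }

  const deletedSet = new Set(deleted);
  for (const file of req.files) {
    if (file.patch && !hasBase) {
      issues.push(`${file.path}: patch requires base_revision_id`);
    }
    if (file.patch && file.patch.format !== "unified") {
      issues.push(`${file.path}: patch format must be unified`);
    }
    if (deletedSet.has(file.path)) {
      issues.push(`${file.path}: path is both uploaded and deleted`);
    }
  }
  return issues;
}
```

Only stateless checks appear here; anything that needs the base Revision itself (existence, ownership, hash match) is the stateful validation deferred to later stages.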

Also fixes a pre-existing Sha256Hex /u-flag leak that serialized an invalid
"^...$/u" pattern into the published upload OpenAPI (now clean in all 6 spots),
and folds the ADR 0087 spec-source-of-truth updates into data-model.md (column +
index), api.md (request fields + rules), and CONTEXT.md (Revision parent
relationship).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(api,db): tree inheritance at finalize for revision commit chain (ADR 0087 stage 3)

Publishing against base_revision_id now inherits the base Revision's unchanged
blob-backed files instead of re-uploading them: a one-file change yields a full
artifact_files tree with one new blob and parent_revision_id set. The merge runs
at finalize (mergeBaseRevisionTree), recomputes file_count/size_bytes from the
merged tree, and re-runs validateUpload (caps + entrypoint) against it.
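
A minimal sketch of the tree merge, assuming an in-memory list of entries rather than the real `artifact_files` rows. `TreeEntry` and `mergeTreeSketch` are hypothetical names; the actual `mergeBaseRevisionTree` is not shown in this PR and may differ.

```typescript
// Hypothetical entry shape; the real rows carry more columns.
interface TreeEntry {
  path: string;
  sha256: string;
  size_bytes: number;
}

interface MergedTree {
  files: TreeEntry[];
  file_count: number;
  size_bytes: number;
}

// Start from the base tree, drop deleted paths, then overlay uploaded files.
function mergeTreeSketch(
  base: TreeEntry[],
  uploaded: TreeEntry[],
  deletedPaths: string[],
): MergedTree {
  const byPath = new Map<string, TreeEntry>();
  for (const entry of base) byPath.set(entry.path, entry);

  for (const path of deletedPaths) {
    // A deleted path must exist in the base tree.
    if (!byPath.has(path)) throw new Error(`deleted path not in base: ${path}`);
    byPath.delete(path);
  }

  // Uploaded entries replace inherited ones at the same path.
  for (const entry of uploaded) byPath.set(entry.path, entry);

  const files = Array.from(byPath.values()).sort((a, b) =>
    a.path < b.path ? -1 : a.path > b.path ? 1 : 0,
  );
  // Counts are recomputed from the merged tree, not from the partial manifest.
  const size_bytes = files.reduce((sum, f) => sum + f.size_bytes, 0);
  return { files, file_count: files.length, size_bytes };
}
```

With a three-file base, one replaced file, and one deleted path, the result has two files and a total size taken from the merged entries, which is the tree the caps and entrypoint checks would then run against.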

Stateful validation deferred from the Stage 2 contract now lands server-side
(6 new repo error codes -> invalid_request): published-base-only, same workspace
(base_revision_not_found) and same artifact (base_revision_artifact_mismatch,
fired before the composite parent FK would 500), deleted-path-in-base,
patch base_sha256 match, and blob-backed-only inheritance (a revision-scoped base
path is not refcount-protected, so it is rejected rather than dangled).

Patch descriptors (patch_base_sha256/patch_result_sha256) are recorded and
validated on upload_session_files with the diff uploaded as a revision object
(sha256 omitted from the signed PUT), but finalize fails loud
(patch_reconstruction_unavailable) until jobs reconstruction lands in Stage 4 -
otherwise the raw diff bytes would be served as the file. A file may not declare
both a whole-file sha256 and a patch.

Carriers: upload_sessions.base_revision_id + deleted_paths, the two patch columns
on upload_session_files (migration 0025, idempotent). Specs updated alongside.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(upload,db,storage): synchronous intra-file patch reconstruction at finalize (ADR 0087 stage 4)

Apply an agent-uploaded unified diff to the base blob synchronously at finalize,
in the upload worker, before the new Revision commits. A patch that cannot apply
fails the finalize call with patch_conflict (HTTP 422, "patch_conflict: <path>:
<reason>") so the agent re-submits a corrected diff; a broken patch never becomes
a servable Revision. The reconstructed result is an ordinary content-addressed
blob, so content/bundles/GC are unchanged and no migration is needed.

- packages/storage: hand-rolled byte-exact unified-diff applier (no diff dep:
  jsdiff's UTF-16 round-trip breaks the raw-byte result_sha256 check) +
  workspace-blob read/write helpers (blob AAD) and a revision-file read helper.
- packages/db: RevisionReconstructor adapter (RepositoryOptions, wired in
  createPostgresRuntime + local MVP harness), invoked from mergeBaseRevisionTree
  before any DB write; result blob + content_blobs row commit with the draft;
  caps run on the reconstructed result size; removes the Stage 3
  patch_reconstruction_unavailable gate; conflict -> patch_conflict, infra
  failures -> storage_unavailable.
- contracts/worker-runtime: new patch_conflict ErrorCode (422), MCP status map
  + publishChain, route errors, openapi goldens; also maps the previously
  500-falling-back finalize codes (caps, expired, incomplete) for MCP.
- upload: widen R2 binding with get; surface the conflict path+reason as the
  error message.
- scripts: smoke-local-patch.mjs drives the real reconstruction path (local +
  preview). Verified byte-exact serve + patch_conflict on hosted preview.
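
To illustrate the conflict behavior, here is a simplified hunk applier. It is a sketch under loud assumptions: it works on already-parsed hunks and on lines of strings, whereas the applier described above is byte-exact; it skips the `result_sha256` verification; and it assumes every hunk covers at least one base line. `Hunk`, `HunkLine`, and `applyHunks` are hypothetical names.

```typescript
// Hypothetical pre-parsed hunk; parsing the "@@ -a,b +c,d @@" header is omitted.
type HunkLine = { op: " " | "-" | "+"; text: string };

interface Hunk {
  oldStart: number; // 1-based index of the first base line the hunk covers
  lines: HunkLine[];
}

function applyHunks(path: string, baseLines: string[], hunks: Hunk[]): string[] {
  // Error text mirrors the "patch_conflict: <path>: <reason>" message format.
  const conflict = (reason: string) => new Error(`patch_conflict: ${path}: ${reason}`);
  const out: string[] = [];
  let cursor = 0; // index of the next unread base line

  for (const hunk of hunks) {
    const start = hunk.oldStart - 1;
    if (start < cursor || start > baseLines.length) {
      throw conflict(`hunk at line ${hunk.oldStart} is out of order or out of range`);
    }
    // Copy untouched base lines that precede the hunk.
    out.push(...baseLines.slice(cursor, start));
    cursor = start;

    for (const line of hunk.lines) {
      if (line.op === "+") {
        out.push(line.text);
        continue;
      }
      // Context and removal lines must match the base exactly.
      if (cursor >= baseLines.length || baseLines[cursor] !== line.text) {
        throw conflict(`base line ${cursor + 1} does not match the patch`);
      }
      if (line.op === " ") out.push(line.text);
      cursor += 1;
    }
  }
  // Copy whatever remains after the last hunk.
  out.push(...baseLines.slice(cursor));
  return out;
}
```

Failing before any output is committed is the point of the design described in this commit: a diff that does not apply cleanly never turns into a servable Revision.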

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ci: run the patch reconstruction smoke locally and against PR preview (ADR 0087 stage 4)

Stage 4's unit/integration tests use a fake reconstructor, so the only check that
exercises the real decrypt -> apply diff -> hash-verify -> re-encrypt -> serve path
is the smoke script. Wire it into both gates:

- ci.yml: `pnpm smoke:local:patch` after the existing local smoke (in-memory MVP,
  every PR via Validate).
- pr-preview.yml: `node scripts/smoke-local-patch.mjs pr` against the deployed PR
  preview Workers, reusing the per-PR deploy outputs + harness secret.

smoke-local-patch.mjs now supports local/preview/pr targets with env resolution
mirroring scripts/smoke-hosted.mjs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs: renumber Git-revision ADRs to 0087/0088 after merge collision

Merged PR #525 claimed ADR 0086 for "publish is content-only, private-first".
This branch's earlier work also created an 0086 (workspace-scoped blob dedup)
and an 0087 (revision commit chain + reconstructed delta). Renumber this
branch's pair to 0087 (blob dedup) and 0088 (revision delta) so 0086 stays the
merged publish-privacy ADR, and sweep every cross-reference (ADR bodies, specs,
migrations, code comments, CI smoke steps) to the new numbers.

Reference-only: no schema, contract, or logic change. Full suite green
(typecheck 39/39, test 39/39, openapi:check).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(api,cli,mcp,contracts): agent file read-back + CLI incremental-revise diff client (ADR 0089)

Stage 5 of the git-like revision model: give an agent without a working copy a
way to read back exactly what's stored so it can produce a correct unified-diff
revise, and make the CLI send only what changed.

- sha256 on Agent View file entries (optional, non-breaking add)
- new member-authed GET /v1/artifacts/{id}/file-content in the api worker: it
  decrypts the owning member's plaintext and returns
  { path, sha256, size_bytes, content_type, is_binary, body? }. Oversize text and
  binary files omit the body; oversize short-circuits before touching R2. Any
  decrypt/storage failure maps to storage_unavailable (503), never a 500.
- MCP read_file tool forwarding to it; api-client artifacts.readFile()
- CLI: per-artifact manifest cache (0600), a byte-exact unified-diff generator
  that self-checks (re-applies its own diff and verifies the digest before
  attaching it; a generator bug degrades to a whole-blob upload, never a finalize
  conflict), incremental revise wiring, a `pull` verb, and single-shot
  full-republish fallback when the cached base is unusable
- finalize now carries the precise base-* repository kind as the error detail so
  the CLI self-heal fires for all base-unusable conditions, not just patch
  conflicts (the 5 base-* kinds collapse to invalid_request on the wire)

ADR 0089 records the api-decrypts-member-plaintext trust-boundary decision.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(adr): renumber revision stack to 0088/0089/0090 (0087 taken by public-artifacts)

Main merged ADR 0087 (public-artifacts-and-unlisted-share-links, #528) while
this revision stack was off-branch, so the prior renumber onto 0087/0088
collided. Shift the stack up one and restore main's 0087 index row:

  workspace-scoped blob dedup        0087 -> 0088
  revision commit chain + delta      0088 -> 0089
  agent file read-back               0089 -> 0090

Renames the three ADR files and rewrites every in-tree reference (filename
tokens + bare "ADR 00NN" comments across code, migrations, specs, CI, and
CONTEXT.md). Bare "ADR 0087" references that mean main's public-artifacts ADR
(CONTEXT.md, project-status.md, 0086) are preserved untouched. README index
re-adds the 0087 public-artifacts row dropped during the rebase.

Also drops four dead imports in apps/cli/src/index.ts (PublishResultShape,
formatBytes, hyperlink, paint) left unused by the Stage 5 work.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(adr): record ADR 0091 — shared revise engine + literal multi-edit tools

Design for @agent-paste/revise-core: a pure applyEdits core, a RevisionReader
read-side seam (twin of PublishTransport), and a reviseOnePath orchestrator that
both the CLI `edit` verb and an MCP `multi_edit` tool drive, plus a rebuilt MCP
`add_revision` that preserves the artifact title (fixing the "Revision" overwrite
bug) and sends a verified patch. Strict fail-fast; moves diffWithSelfCheck out of
apps/cli so MCP can share it; finalize render_mode inheritance invariant.

Records the planned spec in docs/ops/git-like-revisions-todo.md (live cli/mcp
specs update when the code lands). Builds on ADR 0090; reverses its
"diff half stays CLI-only" deferral.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(upload): allow delete-only revise (empty files) against a base revision

A partial-manifest revise where every remaining file is unchanged but some
paths are deleted produced an empty files manifest. Both CreateUploadSessionRequest
(files.min(1)) and validateUpload (files.length === 0 -> file_count_cap_exceeded)
rejected it, so delete-only revises failed instead of inheriting the base tree and
dropping the paths.

Make the min-1 file rule conditional on the publish kind: a whole publish (no
base_revision_id) still requires at least one file; a base delta may send zero
files as long as it deletes at least one path. validateUpload mirrors this for the
partial-manifest path; the merged tree is still re-checked with the whole-tree caps
(entrypoint + total size) at finalize.
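
The conditional rule can be sketched as a small predicate. `checkFileCount` is a hypothetical helper written for illustration; the real rule lives in the request schema and in `validateUpload`, neither of which is reproduced here.

```typescript
// Returns an error message, or null when the manifest size is acceptable.
function checkFileCount(
  fileCount: number,
  deletedCount: number,
  hasBaseRevision: boolean,
): string | null {
  // Any manifest with at least one file passes this particular rule.
  if (fileCount > 0) return null;
  // A whole publish (no base) still needs at least one file.
  if (!hasBaseRevision) return "a whole publish requires at least one file";
  // A base delta may send zero files only if it deletes something.
  if (deletedCount === 0) return "an empty base delta must delete at least one path";
  return null;
}
```

A delete-only revise (`fileCount` 0, `deletedCount` 1, base present) now passes, while an empty whole publish and an empty no-op delta are both still rejected.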

Bugbot finding on PR #529.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(cli): fall back to full publish on no-op delta; guard claim-token leak + LCS blowup

- runPublish drops the base and sends a whole-blob manifest when the revise
  plan produces no changed files and no deletions, so an unchanged working
  tree revise no longer sends an empty delta the server rejects (bugbot).
- publish-format only treats the claim token in the URL query string as a leak,
  not a coincidental fragment substring (CodeRabbit).
- unified-diff-gen returns null instead of attempting an LCS over more than
  MAX_LCS_CELLS cells, so a pathological diff falls back to a whole blob rather
  than pinning a core (CodeRabbit).
- Corrected the pull/read-back comment that wrongly claimed base64 bytes ride
  in the JSON; oversize text/binary is metadata-only.
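
The LCS guard in the list above can be sketched as follows. Only the `MAX_LCS_CELLS` name comes from the commit; its value here, the `maxCells` parameter, and the `lcsLengthOrNull` function are assumptions for illustration, and the real generator produces a diff rather than just a length.

```typescript
// Hypothetical limit; the actual value is not stated in this PR.
const MAX_LCS_CELLS = 4000000;

// Returns the LCS length of two line arrays, or null when the table would be too large.
function lcsLengthOrNull(
  a: string[],
  b: string[],
  maxCells: number = MAX_LCS_CELLS,
): number | null {
  const rows = a.length + 1;
  const cols = b.length + 1;
  // Refuse before allocating: the caller falls back to a whole-blob upload.
  if (rows * cols > maxCells) return null;

  const table: number[][] = [];
  for (let i = 0; i < rows; i++) {
    table.push(new Array<number>(cols).fill(0));
  }
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      table[i][j] =
        a[i - 1] === b[j - 1]
          ? table[i - 1][j - 1] + 1
          : Math.max(table[i - 1][j], table[i][j - 1]);
    }
  }
  return table[a.length][b.length];
}
```

Returning `null` instead of throwing keeps the fallback decision with the caller, which matches the "degrade to a whole blob" behavior described above.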

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(api): infer is_binary from content type on the oversize file-content path

An oversize file is returned as metadata without reading R2, so its bytes are
never inspected. Previously is_binary defaulted to false, mislabeling an
oversize binary as text. Now the flag is derived from the stored content type
(non-text/* => binary), so a client keying on the flag does not try to inline
binary bytes. body stays absent on this branch (CodeRabbit).
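
A sketch of the read-back branches after this fix, under stated assumptions: `MAX_INLINE_BYTES`, `StoredFileMeta`, `buildFileContent`, and the `readText` callback (returning `null` when the bytes are not valid UTF-8 text) are all hypothetical, and decryption and R2 access are stubbed out entirely.

```typescript
interface StoredFileMeta {
  path: string;
  sha256: string;
  size_bytes: number;
  content_type: string;
}

// Response fields follow the shape listed for GET /v1/artifacts/{id}/file-content.
interface FileContentResponse extends StoredFileMeta {
  is_binary: boolean;
  body?: string;
}

// Hypothetical inline limit; the real threshold is not stated in this PR.
const MAX_INLINE_BYTES = 256 * 1024;

function buildFileContent(
  meta: StoredFileMeta,
  readText: () => string | null,
  maxInline: number = MAX_INLINE_BYTES,
): FileContentResponse {
  if (meta.size_bytes > maxInline) {
    // Oversize: storage is never read, so infer the flag from the content type.
    const isText = meta.content_type.indexOf("text/") === 0;
    return { ...meta, is_binary: !isText };
  }
  const text = readText();
  // Bytes that do not decode as text are reported as binary, without a body.
  if (text === null) return { ...meta, is_binary: true };
  return { ...meta, is_binary: false, body: text };
}
```

A client can then key on `is_binary` and the presence of `body` without distinguishing why the body was omitted.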

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(db): return committed revision counts on finalize fast path; document FK scope

- The finalized-session fast path now reads the committed revision and reports
  its file_count/size_bytes, so a repeat finalize returns the merged-tree
  counts rather than the pre-merge session counts (CodeRabbit).
- Documented that migration 0024's column-scoped ON DELETE SET NULL on
  parent_revision_id is authoritative and that Drizzle cannot express the
  column list, so the snapshot's unscoped form is expected drift, not a bug to
  "fix" toward the snapshot (CodeRabbit).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(storage,docs): preserve UTF-8 BOM in decodeUtf8Strict; tighten patch sha256 spec

- decodeUtf8Strict keeps a leading UTF-8 BOM (ignoreBOM) so valid BOM-prefixed
  text round-trips and is not rejected as binary; fatal is passed explicitly
  for the Worker TS lib option type (CodeRabbit).
- api.md: a patched per-file entry's size_bytes is the diff byte length and the
  entry carries no whole-file sha256; the sha256 rule now scopes to whole-file
  entries (CodeRabbit).
- Test-only: cover the BOM round-trip + invalid-UTF-8 reject, read_file omits
  revision_id from the query when absent, and isolate the non-unified-patch
  contract test from sha256.
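
For reference, a strict decoder with these two options could look like the sketch below. It uses the standard `TextDecoder` API; the function name and the choice to return `null` on invalid input (rather than throw) are assumptions, not the project's actual `decodeUtf8Strict`.

```typescript
// Decodes bytes as UTF-8, keeping a leading BOM; returns null for invalid sequences.
function decodeUtf8StrictSketch(bytes: Uint8Array): string | null {
  // fatal: throw on malformed input instead of substituting U+FFFD.
  // ignoreBOM: keep a leading BOM in the output so bytes round-trip exactly.
  const decoder = new TextDecoder("utf-8", { fatal: true, ignoreBOM: true });
  try {
    return decoder.decode(bytes);
  } catch {
    return null;
  }
}
```

Keeping the BOM matters for a byte-exact pipeline: if decoding stripped it, re-encoding the text would yield different bytes and a different sha256 than the stored blob.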

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>