Skip to content

Tarball input mode for verify (content-addressed source submission) (STE-55)#7

Merged
yvesfracari merged 2 commits into
mainfrom
pedro/ste-55-tarball-input-mode-for-verify-content-addressed-source
Jun 12, 2026
Merged

Tarball input mode for verify (content-addressed source submission) (STE-55)#7
yvesfracari merged 2 commits into
mainfrom
pedro/ste-55-tarball-input-mode-for-verify-content-addressed-source

Conversation

@yvesfracari

Copy link
Copy Markdown
Contributor

Closes STE-55.

What

Adds the SEP-58 content-addressed source mode to verify: supply a source tarball plus its expected SHA-256 instead of a checked-out directory.

soroscan-verify verify --id <CONTRACT_ID> --tarball src.tar.gz --tarball-sha256 <hex> [--docker]
scripts/verify.sh <CONTRACT_ID> --tarball src.tar.gz --tarball-sha256 <hex> [--docker]

How it works

  1. Digest gate first: the tarball's SHA-256 must equal the supplied digest, or verification returns ERROR (non-zero exit) before any unpacking — never build from an unverified tarball.
  2. On match, the archive's entry names are validated (absolute paths and .. components rejected) before extraction into a fresh temp dir, which is cleaned up afterwards.
  3. The unpacked source is rebuilt — local pinned toolchain by default, or the pinned network-isolated Docker image with --docker — and byte-compared against the on-chain WASM exactly as the existing verify does.
  4. The result carries sourceMode: "tarball" and the verified tarballSha256, tying wasm hash ↔ source artifact digest.

Tests

TDD, 6 new tests in reader/test/verify-tarball.test.ts: digest mismatch (no unpack/build), happy path with a fixture tarball of the sample contract, ..-component rejection, absolute-path rejection, malformed archive, and build-failure → ERROR verdict (not a crash). Hostile archives are hand-rolled ustar blocks since tar CLIs refuse to create them. Full suite: 63 passed + 3 live testnet integration tests passed; typecheck clean.

Verified end-to-end against the deployed testnet fixture: tarball of contracts/ → digest gate → rebuild → FULL_MATCH (exit 0); digest mismatch → ERROR (exit 1); existing directory-based scripts/verify.sh flow still reports FULL_MATCH.

Notes

  • VerificationResult.rebuiltSha256/rebuiltByteLength are now optional: tarball-mode failures can error before any WASM exists. verifyById always sets them, as before.
  • Local (non-Docker) builds use the ambient toolchain, same as the existing directory mode; version managers that resolve per-directory (e.g. asdf) won't see the repo's .tool-versions from the temp dir. --docker is the pinned, environment-independent path.

…n (STE-55)

verify --tarball <path> --tarball-sha256 <hex> gates on the tarball's
SHA-256 (SEP-58 tarball_sha256 commitment) before unpacking, extracts to
a fresh temp dir with path-traversal validation, rebuilds with the local
toolchain or the pinned Docker image, and byte-compares as usual. The
result carries sourceMode "tarball" and the verified digest, tying the
wasm hash to the source artifact digest. scripts/verify.sh gains the
matching tarball variant. Directory-based verify is unchanged.
@linear-code

linear-code Bot commented Jun 12, 2026

Copy link
Copy Markdown

STE-55

… bytes

Review fixes from PR #7:

- Reject symlink/hardlink entries with our own header walk instead of
  relying on the host tar's default protections (name validation alone
  cannot stop a link whose TARGET escapes the extraction dir). Same rule
  cargo package enforces for crates.
- Feed tar's stdin from the exact in-memory bytes the digest gate hashed,
  closing the hash/extract TOCTOU window — the file is never re-read.
- Decompress in-process with a gzip magic check, so plain .tar archives
  work and behavior no longer depends on bsdtar-vs-GNU -z semantics.

Three new tests: symlink rejected, hardlink rejected, uncompressed .tar
accepted — assertions pinned to our error messages so host-tar stderr
can't satisfy them.
@yvesfracari

Copy link
Copy Markdown
Contributor Author

Applied the review's cheap fixes in a36cac7:

  1. Link entries rejected by our own check — a header walk over the verified bytes rejects symlink/hardlink entries before extraction (typeflags 1/2), instead of relying on host-tar defaults. Notably, the new symlink/hardlink tests passed before the fix on macOS because bsdtar happens to refuse link extraction — i.e. the protection was real but platform-dependent. Assertions are now pinned to our error messages (symlink entries are not allowed) so host-tar stderr can't satisfy them.
  2. TOCTOU closedunpackTarball now takes the in-memory bytes the digest gate hashed and feeds them to tar -xf - via stdin; the tarball file is never re-read after hashing.
  3. Compression auto-detect — gzip magic check + in-process gunzipSync, plain .tar fallback. This also removes the bsdtar-vs-GNU -z-on-read divergence (the uncompressed-tar test would have failed on Linux with the old tar -tzf approach).

Suite: 66 passed, typecheck clean, real testnet E2E re-run from the tarball → FULL_MATCH.

Not addressed (tracked as follow-ups, per review): Docker Desktop tmpdir mount note, multi-contract workspace selector for the default builder.

@yvesfracari yvesfracari merged commit fc2ca5c into main Jun 12, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant