Merged
22 changes: 21 additions & 1 deletion .github/workflows/python.yml
@@ -96,6 +96,26 @@ jobs:
         run: maturin build --release --out dist
       - name: install the built wheel
         shell: bash
-        run: python -m pip install ordvec-python/dist/*.whl
+        run: |
+          set -euo pipefail
+          WHEEL="$(python - <<'PY'
+          from pathlib import Path
+          wheels = sorted(Path("ordvec-python/dist").glob("*.whl"))
+          if len(wheels) != 1:
+              raise SystemExit(f"expected exactly one wheel, found {wheels}")
+          print(wheels[0])
+          PY
+          )"
+          REQ_FILE="${RUNNER_TEMP:?RUNNER_TEMP must be set}/ordvec-wheel-requirements.txt"
+          python - <<'PY' "$WHEEL" > "$REQ_FILE"
+          import hashlib
+          import sys
+          from pathlib import Path
+
+          wheel = Path(sys.argv[1]).resolve()
+          digest = hashlib.sha256(wheel.read_bytes()).hexdigest()
+          print(f"ordvec @ {wheel.as_uri()} --hash=sha256:{digest}")
+          PY
+          python -m pip install --require-hashes --no-index --no-deps -r "$REQ_FILE"
       - name: pytest
         run: python -m pytest ordvec-python/tests -q
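The install step in this hunk pins the freshly built wheel by its SHA-256 digest, so that `pip install --require-hashes --no-index --no-deps` can only install those exact bytes. As a rough standalone sketch of the requirement-line construction (the function name `hash_pinned_requirement` and its `project` parameter are hypothetical, introduced here for illustration; the workflow itself uses an inline heredoc):

```python
import hashlib
from pathlib import Path


def hash_pinned_requirement(wheel: Path, project: str = "ordvec") -> str:
    """Return a pip requirement line that pins `wheel` by its SHA-256 digest.

    Hypothetical helper mirroring the inline heredoc in the workflow step.
    """
    wheel = wheel.resolve()  # as_uri() requires an absolute path
    digest = hashlib.sha256(wheel.read_bytes()).hexdigest()
    # Direct-reference form: "<name> @ file:///abs/path --hash=sha256:<hex>"
    return f"{project} @ {wheel.as_uri()} --hash=sha256:{digest}"
```

Writing that single line to a requirements file and installing with `--require-hashes` makes pip reject the wheel if its bytes change between the digest computation and the install.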
299 changes: 207 additions & 92 deletions .github/workflows/release.yml

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -43,6 +43,8 @@ exclude = [
     "ordvec-python/",
     "tests/__pycache__/",
     "tests/release_environment_settings.sh",
+    "tests/release_pypi_canonical_dist.py",
+    "tests/release_pypi_canonical_dist_tests.py",
     "tests/release_publish_invariants.py",
     "tests/release_publish_invariants.sh",
     "tests/release_signed_release_invariants.sh",
48 changes: 32 additions & 16 deletions RELEASING.md
@@ -9,9 +9,9 @@

 `ordvec` (the Rust crate) and `ordvec` on PyPI (the PyO3 wheel built from
 `ordvec-python/`) are released by **pushing a `vMAJOR.MINOR.PATCH` tag** to a
-commit on `main`. The release workflow handles build, attestation, SLSA
-provenance, Release-asset attach, and un-draft automatically; only the two
-registry pushes are manual.
+commit on `main`. The release workflow handles build, canonical Python artifact
+selection, attestation, SLSA provenance, Release-asset attach, and un-draft
+automatically; only the two registry gates are manual.

 ## Release pipeline controls

@@ -25,11 +25,20 @@ The unified `release.yml`:
   (a *successful* run for that exact SHA on `main`);
 - publishes via **OIDC trusted publishing** (no long-lived crates.io / PyPI
   tokens in the repo);
+- canonicalizes the Python dist before attestation and release upload: for a
+  new PyPI version it uses the current run's wheels/sdist; if PyPI already owns
+  that immutable version during recovery, it downloads the exact PyPI-served
+  files, verifies their SHA-256 digests from PyPI JSON, and uses those bytes as
+  the GitHub Release assets;
 - emits **GitHub SLSA build provenance** (`actions/attest-build-provenance`)
   and a **SLSA-generator `*.intoto.jsonl`** attached to the GitHub Release
   **before** the gated publishes — a failed attestation fails the release
-  closed, so nothing ships without provenance recorded;
-- stages the **`.crate`, wheels, sdist, `*.sigstore.json` bundle, and
+  closed, so nothing ships without provenance recorded. In recovery mode where
+  PyPI files already exist, the GitHub/SLSA subjects are deliberately limited
+  to the crate built by the current run; the Python files are verified immutable
+  PyPI bytes from the earlier Trusted Publishing upload, not falsely claimed as
+  rebuilt by the recovery run;
+- stages the **`.crate`, canonical wheels, canonical sdist, `*.sigstore.json` bundle, and
   `*.intoto.jsonl` provenance** on the GitHub Release while it is still **a
   DRAFT** (`release-assets-draft` is the sole Release-asset writer — no manual
   attach, which is what v0.2.0's manual step missed);
@@ -56,7 +65,10 @@ The unified `release.yml`:
   `persist-credentials: false`, and defaults to `permissions: contents: read`.

 The PyPI publish step additionally produces **PEP 740** attestations via
-Trusted Publishing (served from PyPI's Integrity API).
+Trusted Publishing (served from PyPI's Integrity API) on a fresh upload. If the
+version already exists on PyPI during recovery, the job skips upload and instead
+verifies that PyPI-served wheel/sdist hashes match the canonical files staged on
+the GitHub Release.

 ### Environment protection (configured in repo settings, not in code)

@@ -145,23 +157,27 @@ filename. Until either is updated, the corresponding gated publish fails
    ```

    `release.yml` triggers automatically. It builds the `.crate`, wheels, and
-   sdist; attests them (GitHub attestation store + `*.sigstore.json`);
-   generates the SLSA `*.intoto.jsonl`; and stages every artifact, the
-   attestation bundle, and the provenance on the GitHub Release — **as a
-   DRAFT**. It then pauses at the two registry environment gates.
+   sdist; selects the canonical Python dist (current build for a new PyPI
+   version, verified PyPI bytes for an existing immutable version); attests the
+   files this run can honestly attest (GitHub attestation store +
+   `*.sigstore.json`); generates the SLSA `*.intoto.jsonl`; and stages every
+   artifact, the attestation bundle, and the provenance on the GitHub Release
+   — **as a DRAFT**. It then pauses at the two registry environment gates.
 7. **Approve the two publish environments** when they pause in the Actions UI
    (one for `crates-io`, one for `pypi`). The required-reviewer approval is
    what authorises the registry push.
    - `publish-crate` first sha256-compares its repackaged `.crate` to the
      SLSA-attested artifact — if they diverge (toolchain drift, etc.) the job
      fails closed BEFORE the OIDC token is minted, so nothing reaches
      crates.io. Re-run / investigate.
-   - Once **both** publishes succeed, `publish-github-release` un-drafts the
-     GitHub Release automatically. If one publish fails, the Release stays
-     DRAFT — re-run the failed job, the un-draft then completes.
-   - `publish-pypi` also queries PyPI after upload and compares every served
-     wheel/sdist SHA-256 digest against the staged `dist/` files before the
-     GitHub Release can un-draft.
+   - Once **both** registry gates succeed, `publish-github-release` un-drafts
+     the GitHub Release automatically. If one gate fails, the Release stays
+     DRAFT — investigate and re-run from a fixed workflow rather than approving
+     the other registry into another partial state.
+   - `publish-pypi` either uploads the fresh canonical dist or, if PyPI already
+     serves that version, skips upload and verifies the existing files. In both
+     modes it compares every PyPI-served wheel/sdist SHA-256 digest against the
+     canonical `dist/` files before the GitHub Release can un-draft.
 8. Verify each published artifact and its provenance:
    - crates.io / docs.rs;
    - PyPI (confirm the post-publish hash-verification log, optionally
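The RELEASING.md changes above describe comparing every PyPI-served wheel/sdist SHA-256 digest against the canonical `dist/` files. The script that does this (`tests/release_pypi_canonical_dist.py`) is not shown in this diff, so the following is only a minimal sketch of such a comparison. It assumes the input is the parsed body of PyPI's per-version JSON endpoint (`https://pypi.org/pypi/<project>/<version>/json`), whose `urls` list carries a `filename` and a `digests.sha256` per served file. The names `sha256_file` and `digest_mismatches` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Any


def sha256_file(path: Path) -> str:
    """Hex SHA-256 of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def digest_mismatches(dist_dir: Path, pypi_payload: dict[str, Any]) -> list[str]:
    """List every disagreement between local dist files and PyPI's digests.

    Hypothetical helper; `pypi_payload` is assumed to be the parsed per-version
    JSON document from PyPI. An empty result means both sides agree.
    """
    served = {
        entry["filename"]: entry["digests"]["sha256"]
        for entry in pypi_payload.get("urls", [])
    }
    local = {
        p.name: sha256_file(p)
        for p in sorted(dist_dir.iterdir())
        if p.name.endswith((".whl", ".tar.gz"))
    }
    problems: list[str] = []
    # Walk the union so a file missing on either side is reported too.
    for name in sorted(served.keys() | local.keys()):
        if name not in local:
            problems.append(f"{name}: served by PyPI but missing from dist")
        elif name not in served:
            problems.append(f"{name}: in dist but not served by PyPI")
        elif served[name] != local[name]:
            problems.append(f"{name}: sha256 mismatch")
    return problems
```

Comparing the union of filenames, not only the intersection, matters for a fail-closed gate: a file that exists on only one side is as much a divergence as a digest mismatch.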
199 changes: 152 additions & 47 deletions tests/release_publish_invariants.py
@@ -13,6 +13,7 @@


 WORKFLOW_PATH = os.environ.get("RELEASE_WORKFLOW_PATH", ".github/workflows/release.yml")
+PYTHON_WORKFLOW_PATH = os.environ.get("PYTHON_WORKFLOW_PATH", ".github/workflows/python.yml")


 def fail(message: str) -> None:
@@ -93,79 +94,180 @@ def empty(value: Any) -> bool:
     return value is None or value == ""


+def has_need(job: dict[str, Any], needed: str) -> bool:
+    needs = job.get("needs")
+    if isinstance(needs, str):
+        return needs == needed
+    if isinstance(needs, list):
+        return needed in needs
+    return False
+
+
+def contains_text(value: Any, needle: str) -> bool:
+    return isinstance(value, str) and needle in value
+
+
+def read_text(path: str) -> str:
+    try:
+        with open(path, encoding="utf-8") as fh:
+            return fh.read()
+    except OSError as exc:
+        fail(f"{path}: could not read workflow: {exc}")
+
+
+def check_hash_requirement_temp_paths(paths: list[str]) -> None:
+    for path in paths:
+        workflow_text = read_text(path)
+        if "/tmp/ordvec-" in workflow_text:
+            fail(f"{path}: hash requirement files must be written under ${{RUNNER_TEMP}}, not /tmp")
+
+
+def check_aarch64_smoke_selector(workflow: dict[str, Any], path: str) -> None:
+    jobs = mapping(workflow.get("jobs"), f"{path}: jobs")
+    job = mapping(jobs.get("smoke-linux-aarch64-wheel"), f"{path}: jobs.smoke-linux-aarch64-wheel")
+    steps = sequence(job.get("steps"), f"{path}: jobs.smoke-linux-aarch64-wheel.steps")
+
+    matching_steps: list[dict[str, Any]] = []
+    for raw_step in steps:
+        step = mapping(raw_step, f"{path}: jobs.smoke-linux-aarch64-wheel.steps[]")
+        if step.get("name") == "Install exact wheel and run tiny RankQuant/Bitmap smoke":
+            matching_steps.append(step)
+
+    if len(matching_steps) != 1:
+        fail(f"{path}: smoke-linux-aarch64-wheel must have exactly one install/smoke step")
+
+    run = matching_steps[0].get("run")
+    if not isinstance(run, str):
+        fail(f"{path}: smoke-linux-aarch64-wheel install/smoke step must be a run step")
+    if "manylinux_2_17_aarch64" in run:
+        fail(f"{path}: linux/aarch64 wheel selector must not pin a specific manylinux policy tag")
+    if not all(needle in run for needle in ('"aarch64"', '"manylinux"', '"musllinux"', "len(wheels) != 1")):
+        fail(f"{path}: linux/aarch64 wheel selector must match architecture and assert exactly one wheel")
+
+
+def check_pypi_canonical_dist(workflow: dict[str, Any], path: str) -> None:
+    jobs = mapping(workflow.get("jobs"), f"{path}: jobs")
+    job = mapping(jobs.get("pypi-canonical-dist"), f"{path}: jobs.pypi-canonical-dist")
+    steps = sequence(job.get("steps"), f"{path}: jobs.pypi-canonical-dist.steps")
+
+    for needed in ("build-wheels", "build-sdist"):
+        if not has_need(job, needed):
+            fail(f"{path}: pypi-canonical-dist must need {needed}")
+
+    outputs = mapping(job.get("outputs"), f"{path}: jobs.pypi-canonical-dist.outputs")
+    if outputs.get("source") != "${{ steps.canonicalize.outputs.source }}":
+        fail(f"{path}: pypi-canonical-dist must expose the canonical source output")
+
+    wheels_downloads: list[int] = []
+    sdist_downloads: list[int] = []
+    canonicalize_steps: list[dict[str, Any]] = []
+    uploads: list[tuple[int, dict[str, Any], dict[str, Any]]] = []
+
+    for index, raw_step in enumerate(steps):
+        step = mapping(raw_step, f"{path}: jobs.pypi-canonical-dist.steps[{index}]")
+        action = action_name(step)
+        if action == "actions/download-artifact":
+            with_map = mapping(step.get("with", {}), f"{path}: {step_label(index, step)} with")
+            artifact_path = norm_path(with_map.get("path"))
+            if with_map.get("pattern") == "wheels-*" and boolish_true(with_map.get("merge-multiple")):
+                if artifact_path != "built-dist":
+                    fail(f"{path}: canonical wheel download must target built-dist")
+                wheels_downloads.append(index)
+            elif with_map.get("name") == "sdist":
+                if artifact_path != "built-dist":
+                    fail(f"{path}: canonical sdist download must target built-dist")
+                sdist_downloads.append(index)
+        elif action == "actions/upload-artifact":
+            with_map = mapping(step.get("with", {}), f"{path}: {step_label(index, step)} with")
+            if with_map.get("name") == "pypi-canonical-dist":
+                uploads.append((index, step, with_map))
+
+        run = step.get("run")
+        if contains_text(run, "tests/release_pypi_canonical_dist.py canonicalize"):
+            canonicalize_steps.append(step)
+            if "--built-dir built-dist" not in run or "--out-dir canonical-dist" not in run:
+                fail(f"{path}: canonicalize step must read built-dist and write canonical-dist")
+
+    if len(wheels_downloads) != 1:
+        fail(f"{path}: pypi-canonical-dist must download exactly one wheels-* artifact set")
+    if len(sdist_downloads) != 1:
+        fail(f"{path}: pypi-canonical-dist must download exactly one sdist artifact")
+    if len(canonicalize_steps) != 1:
+        fail(f"{path}: pypi-canonical-dist must run release_pypi_canonical_dist.py canonicalize")
+    if len(uploads) != 1:
+        fail(f"{path}: pypi-canonical-dist must upload exactly one pypi-canonical-dist artifact")
+
+    _, _, upload_with = uploads[0]
+    upload_path = upload_with.get("path")
+    if not (
+        contains_text(upload_path, "canonical-dist/*.whl")
+        and contains_text(upload_path, "canonical-dist/*.tar.gz")
+    ):
+        fail(f"{path}: pypi-canonical-dist upload must include canonical wheels and sdist")
+
+
 def check_publish_pypi(workflow: dict[str, Any], path: str) -> None:
     jobs = mapping(workflow.get("jobs"), f"{path}: jobs")
     job = mapping(jobs.get("publish-pypi"), f"{path}: jobs.publish-pypi")
     steps = sequence(job.get("steps"), f"{path}: jobs.publish-pypi.steps")

+    if not has_need(job, "pypi-canonical-dist"):
+        fail(f"{path}: publish-pypi must need pypi-canonical-dist")
+
     publish_steps: list[tuple[int, dict[str, Any]]] = []
-    artifact_downloads: list[tuple[int, dict[str, Any], dict[str, Any]]] = []
+    canonical_downloads: list[tuple[int, dict[str, Any], dict[str, Any]]] = []
+    verify_steps: list[dict[str, Any]] = []

     for index, raw_step in enumerate(steps):
         step = mapping(raw_step, f"{path}: jobs.publish-pypi.steps[{index}]")
         action = action_name(step)
         if action == "pypa/gh-action-pypi-publish":
             publish_steps.append((index, step))
-        if action != "actions/download-artifact":
-            continue
+        if action == "actions/download-artifact":
+            with_block = step.get("with", {})
+            with_map = mapping(with_block, f"{path}: {step_label(index, step)} with")
+            if with_map.get("name") == "pypi-canonical-dist":
+                canonical_downloads.append((index, step, with_map))
+            elif norm_path(with_map.get("path")) == "dist":
+                fail(f"{path}: {step_label(index, step)} downloads a non-canonical artifact into dist")

-        with_block = step.get("with", {})
-        with_map = mapping(with_block, f"{path}: {step_label(index, step)} with")
-        artifact_downloads.append((index, step, with_map))
+        run = step.get("run")
+        if contains_text(run, "tests/release_pypi_canonical_dist.py verify"):
+            verify_steps.append(step)
+            if "--dist-dir dist" not in run:
+                fail(f"{path}: PyPI verify step must verify dist")

     if len(publish_steps) != 1:
         fail(f"{path}: publish-pypi must have exactly one pypa/gh-action-pypi-publish step")

     publish_index, publish_step = publish_steps[0]
+    if publish_step.get("if") != "needs.pypi-canonical-dist.outputs.source == 'build'":
+        fail(f"{path}: PyPI publish step must only run when canonical source is the current build")
     publish_with = mapping(
         publish_step.get("with", {}), f"{path}: {step_label(publish_index, publish_step)} with"
     )
     if norm_path(publish_with.get("packages-dir")) != "dist":
         fail(f"{path}: PyPI publish step must upload packages-dir: dist")
-    if not boolish_true(publish_with.get("skip-existing")):
-        fail(
-            f"{path}: PyPI publish step must set skip-existing: true so a recovery "
-            "rerun is idempotent after PyPI has already accepted the version"
-        )

-    wheels: list[int] = []
-    sdists: list[int] = []
-    for index, step, with_map in artifact_downloads:
-        label = step_label(index, step)
-        artifact_path = norm_path(with_map.get("path"))
-        if artifact_path != "dist":
-            fail(
-                f"{path}: {label} downloads artifacts to {artifact_path or 'the default path'!r}; "
-                "publish-pypi may only download wheels-* and sdist into dist"
-            )
-        if index > publish_index:
-            fail(f"{path}: {label} downloads into dist after the PyPI publish step")
-
-        name = with_map.get("name")
-        pattern = with_map.get("pattern")
-        is_wheels = (
-            pattern == "wheels-*"
-            and empty(name)
-            and boolish_true(with_map.get("merge-multiple"))
-        )
-        is_sdist = name == "sdist" and empty(pattern)
-
-        if is_wheels:
-            wheels.append(index)
-            continue
-        if is_sdist:
-            sdists.append(index)
-            continue
+    if len(canonical_downloads) != 1:
+        fail(f"{path}: publish-pypi must download exactly one pypi-canonical-dist artifact")
+    download_index, download_step, download_with = canonical_downloads[0]
+    if download_index > publish_index:
+        fail(f"{path}: {step_label(download_index, download_step)} must run before the PyPI publish step")
+    if norm_path(download_with.get("path")) != "dist":
+        fail(f"{path}: publish-pypi must download pypi-canonical-dist into dist")

-        fail(
-            f"{path}: {label} downloads into dist but is not the allowed "
-            "'pattern: wheels-*' or 'name: sdist' artifact"
-        )
+    if len(verify_steps) != 1:
+        fail(f"{path}: publish-pypi must run release_pypi_canonical_dist.py verify exactly once")

-    if len(wheels) != 1:
-        fail(f"{path}: publish-pypi must download exactly one wheels-* artifact set into dist")
-    if len(sdists) != 1:
-        fail(f"{path}: publish-pypi must download exactly one sdist artifact into dist")
+    for index, step in enumerate(steps):
+        if action_name(step) != "actions/download-artifact":
+            continue
+        with_map = mapping(step.get("with", {}), f"{path}: {step_label(index, step)} with")
+        label = step_label(index, step)
+        artifact_path = norm_path(with_map.get("path"))
+        if artifact_path == "dist" and with_map.get("name") != "pypi-canonical-dist":
+            fail(f"{path}: {label} must not place non-canonical artifacts in dist")


 def check_publish_crate(workflow: dict[str, Any], path: str) -> None:
@@ -225,6 +327,9 @@ def check_publish_crate(workflow: dict[str, Any], path: str) -> None:

 def main() -> None:
     workflow = load_workflow(WORKFLOW_PATH)
+    check_hash_requirement_temp_paths([WORKFLOW_PATH, PYTHON_WORKFLOW_PATH])
+    check_aarch64_smoke_selector(workflow, WORKFLOW_PATH)
+    check_pypi_canonical_dist(workflow, WORKFLOW_PATH)
     check_publish_crate(workflow, WORKFLOW_PATH)
     check_publish_pypi(workflow, WORKFLOW_PATH)

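The new job-dependency checks rely on `has_need`, which accepts both spellings GitHub Actions allows for `needs`: a single string or a list of job ids. The snippet below repeats that helper from the diff and exercises it on a few made-up job mappings (the `scalar_job`, `list_job`, and `bare_job` values are illustrative, not taken from `release.yml`):

```python
from typing import Any


def has_need(job: dict[str, Any], needed: str) -> bool:
    """Return True when `job` declares `needed` in its `needs` key."""
    needs = job.get("needs")
    if isinstance(needs, str):  # scalar form, e.g. needs: build-wheels
        return needs == needed
    if isinstance(needs, list):  # list form, e.g. needs: [build-wheels, build-sdist]
        return needed in needs
    return False  # key absent or of an unexpected type


# Hypothetical job mappings, shaped like parsed workflow YAML.
scalar_job = {"needs": "pypi-canonical-dist"}
list_job = {"needs": ["build-wheels", "build-sdist"]}
bare_job = {"runs-on": "ubuntu-latest"}

print(has_need(scalar_job, "pypi-canonical-dist"))  # True
print(has_need(list_job, "build-sdist"))            # True
print(has_need(bare_job, "build-wheels"))           # False
```

The scalar branch uses equality, not substring matching, so a job that needs `pypi-canonical-dist-extra` would not satisfy a check for `pypi-canonical-dist`.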