
CI: shrink checkouts via shallow / treeless partial clones #2177

Merged
leofang merged 1 commit into NVIDIA:main from leofang:leofang/shallow-clone-ci on Jun 8, 2026


leofang (Member) commented Jun 7, 2026

Summary

Closes #2091.

15 workflow checkouts used fetch-depth: 0 (full clone with every blob in history). The audit in #2091 showed only 7 actually need the commit graph (setuptools-scm, git merge-base, git worktree), and none need historical blobs. The remaining 8 either do no local git operations or only need tag resolution. This PR applies the three patterns from that audit.

Patterns applied

  • Plain shallow clone: fetch-depth: 1 (6 sites, no local git ops)

    • ci.yml ci-vars
    • build-docs.yml
    • release-upload.yml
    • release.yml check-tag
    • release-cuda-pathfinder.yml upload-assets
    • coverage.yml coverage-windows
  • Shallow + tags: fetch-depth: 1 + fetch-tags: true (2 sites, lookup-run-id does git rev-parse <tag>)

    • release.yml determine-run-id
    • release-cuda-pathfinder.yml prepare
  • Treeless: fetch-depth: 0 + filter: blob:none (7 sites, commit graph needed but not historical blobs)

    • build-wheel.yml (setuptools-scm)
    • ci.yml detect-changes (git merge-base + git diff --name-only)
    • cleanup-pr-previews.yml (git worktree add gh-pages)
    • coverage.yml coverage-linux (setuptools-scm)
    • coverage.yml build-wheel-windows (setuptools-scm)
    • test-sdist-linux.yml (python -m build → setuptools-scm)
    • test-sdist-windows.yml (same)
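
For illustration, the three patterns above might look like the following in a workflow file. This is a minimal sketch rather than the PR's actual diff: the workflow name, job names, trigger, and the `actions/checkout@v4` pin are assumptions, while `fetch-depth`, `fetch-tags`, and `filter` are the checkout inputs named in the list. Note that Git's own terminology usually calls `blob:none` a blobless partial clone; the sketch keeps this PR's "treeless" label.

```yaml
# Hypothetical workflow; names, trigger, and action version are assumptions.
name: checkout-patterns-sketch
on: pull_request

jobs:
  no-local-git-ops:
    runs-on: ubuntu-latest
    steps:
      # Pattern 1: plain shallow clone, only the tip commit is fetched.
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1

  tag-lookup:
    runs-on: ubuntu-latest
    steps:
      # Pattern 2: shallow history plus tags, so a tag name can be
      # resolved to a SHA with `git rev-parse <tag>`.
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1
          fetch-tags: true

  needs-commit-graph:
    runs-on: ubuntu-latest
    steps:
      # Pattern 3 ("treeless" in this PR): full commit graph, but blobs
      # are downloaded only when a checkout or command needs them.
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          filter: blob:none
```

The third job is the shape that tools walking history (setuptools-scm, git merge-base, git worktree) would run under, since every commit is present even though historical file contents are not.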

This should noticeably reduce checkout time, especially on Windows, where git operations are slower.

Test plan

  • CI passes on this PR (exercises all three patterns: detect-changes, build-wheel, coverage, test-sdist)
  • release dry-run / next release exercises release.yml and release-cuda-pathfinder.yml paths (these aren't triggered by PR CI)

-- Leo's bot

Of the 15 workflow checkouts that used fetch-depth: 0, only 7 actually
need the commit graph (setuptools-scm, git merge-base, git worktree),
and none of them need historical blobs. The remaining 8 either do no
git operations or only need tag resolution.

- 6 sites with no local git ops: fetch-depth: 1
  (ci.yml ci-vars, build-docs.yml, release-upload.yml,
   release.yml check-tag, release-cuda-pathfinder.yml upload-assets,
   coverage.yml coverage-windows)
- 2 sites that resolve a tag to a SHA via lookup-run-id:
  fetch-depth: 1 + fetch-tags: true
  (release.yml determine-run-id, release-cuda-pathfinder.yml prepare)
- 7 sites that need the commit graph but not blobs:
  fetch-depth: 0 + filter: blob:none
  (build-wheel.yml, ci.yml detect-changes, cleanup-pr-previews.yml,
   coverage.yml coverage-linux, coverage.yml build-wheel-windows,
   test-sdist-linux.yml, test-sdist-windows.yml)

Closes NVIDIA#2091

copy-pr-bot (Bot, Contributor) commented Jun 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions Bot added the CI/CD CI/CD infrastructure label Jun 7, 2026
leofang (Member, Author) commented Jun 7, 2026

/ok to test e673fe1

@leofang leofang self-assigned this Jun 7, 2026
@leofang leofang added this to the cuda.core v1.1.0 milestone Jun 7, 2026

@leofang leofang added the P0 High priority - Must do! label Jun 7, 2026
leofang (Member, Author) commented Jun 7, 2026

CI is fully green, and the checkout-step speedup is significant. Sampling representative jobs from this PR's run (27097717939) against the latest main baseline (27079959023, commit 7ea8a46) — same workflow infra, same runner types:

| Job | Runner | Before | After | Savings |
|---|---|---|---|---|
| ci-vars | ubuntu-latest | 133s | 1s | −132s (−99%) |
| detect-changes | ubuntu-latest | 108s | 4s | −104s (−96%) |
| Build linux-64 / py3.12 | linux-amd64-cpu8 | 86s | 3s | −83s (−97%) |
| Build win-64 / py3.12 | windows-2022 | 149s | 7s | −142s (−95%) |
| Test sdist win-64 | windows-2022 | 133s | 8s | −125s (−94%) |

Every checkout now takes seconds rather than minutes. Cumulative compute time saved per CI run across the full matrix (6 linux-64 + 6 linux-aarch64 + 6 win-64 builds, 2 sdist jobs, ci-vars, detect-changes, etc.) is on the order of 30+ minutes. On the critical path (ci-vars → detect-changes → build → test), wall-clock time drops by roughly 3 minutes at the start. Windows benefits the most, as expected.

No correctness issues observed: setuptools-scm (git describe), detect-changes (git merge-base + git diff --name-only), and the release-archive flows (git archive, git worktree) all work correctly under the new clone shapes.
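
A check like the one described above could be made explicit with a small diagnostic step placed after the checkout. The sketch below is hypothetical and not part of this PR; the step name is made up, and whether Git records the filter under `remote.origin.partialclonefilter` after the action's fetch is an assumption, which is why that command is allowed to fail.

```yaml
# Hypothetical diagnostic step; add under a job's `steps:` after checkout.
- name: Inspect clone shape
  shell: bash
  run: |
    # Prints "true" for a shallow clone (fetch-depth: 1), "false" otherwise.
    git rev-parse --is-shallow-repository
    # Prints the partial-clone filter (e.g. blob:none) if Git recorded one.
    git config --get remote.origin.partialclonefilter || true
    # Nearest reachable tag, similar to what setuptools-scm derives versions from.
    git describe --tags --long || echo "no tag reachable from HEAD"
```

Logging the clone shape this way would make it easier to spot a job that silently needs more history than its checkout provides.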

-- Leo's bot

@leofang leofang requested a review from mdboom June 8, 2026 13:43

mdboom (Contributor) left a comment

Very nice.

@leofang leofang merged commit e462bbc into NVIDIA:main Jun 8, 2026
189 of 194 checks passed
leofang (Member, Author) commented Jun 8, 2026

Thanks, Mike!

@leofang leofang deleted the leofang/shallow-clone-ci branch June 8, 2026 21:09
github-actions Bot pushed a commit that referenced this pull request Jun 9, 2026
Removed preview folders for the following PRs:
- PR #2177
rwgk added a commit that referenced this pull request Jun 12, 2026
PR #2177 made the release workflow checkout shallow, so the dry-run tag validation now needs tags fetched explicitly while keeping history shallow.
rwgk added a commit that referenced this pull request Jun 13, 2026
* Check release notes for planned backports

Require mainline cuda-bindings and cuda-python releases to explicitly declare a planned backport tag or mark it not planned. Keep actual backport releases unblocked while surfacing missing notes as warnings, and preserve docs builds for older tags that still use ci/versions.json.

* Add dry-run release workflow mode

Validate release docs, archives, and wheels without publishing to GitHub Releases, GitHub Pages, TestPyPI, or PyPI.

* Allow dry-run docs branch deployment

Add an explicit dry-run docs branch input so release dry-runs can optionally write generated docs to a seeded non-production branch while keeping artifact-only dry-runs as the default.

* Default release workflow dispatches to dry-run

Make dry-run the first and default release action so production publishing must be deliberately selected for manual release workflow runs.

* Constrain dry-run docs deploy branches

Require optional dry-run docs deployments to target a non-production gh-pages-* branch so manual release dry-runs cannot accidentally publish docs to production or source branches.

* CI: fetch tags for release dry-run validation

PR #2177 made the release workflow checkout shallow, so the dry-run tag validation now needs tags fetched explicitly while keeping history shallow.

* CI: document release dry-run retest matrix

Add workflow-local guidance for validating non-trivial release workflow changes with focused dry-run coverage.

Labels

CI/CD CI/CD infrastructure P0 High priority - Must do! performance

Development

Successfully merging this pull request may close these issues.

CI: Optimize fetch-depth: 0 usage across workflows

2 participants