[Test] Compile-only mode for tests on non-supporting GPUs by yaoyaoding · Pull Request #153 · NVIDIA/tilus

yaoyaoding · 2026-05-04T17:33:19Z

Adds InstantiatedScript.compile(*args, **kwargs) -> JitInstance, a public API that transpiles + builds every schedule for the given arguments without executing the kernel, benchmarking, or persisting a dispatch choice. Adds tilus.target.scope(target) as a context manager for temporarily overriding the build target.
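A scoped target override of this kind is typically a small context manager that saves the current target, installs the new one, and restores the old one on exit, even if the body raises. The sketch below is not the tilus implementation: `_current_target`, `get_current_target`, and this `scope` are hypothetical stand-ins that only illustrate the save/override/restore pattern.

```python
import contextlib

# Hypothetical module-level state standing in for the library's build target.
_current_target = "sm_89"


def get_current_target():
    """Return the target that builds would currently be compiled for."""
    return _current_target


@contextlib.contextmanager
def scope(target):
    """Temporarily override the build target; restore the previous one on exit."""
    global _current_target
    previous = _current_target
    _current_target = target
    try:
        yield target
    finally:
        # Runs on normal exit and on exceptions, so the override never leaks.
        _current_target = previous
```

Restoring in `finally` matters here because a failed build inside the scope would otherwise leave later tests compiling for the wrong target.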

Changes tilus.testing.requires.X behavior: when the current GPU does not support X, the test now runs in compile-only mode instead of being hard-skipped. The build target is scoped to X, InstantiatedScript.__call__ is patched to delegate to compile() and raise an internal sentinel, and the wrapper catches the sentinel so a successful compile counts as a passing test. This lets CI on older arches (e.g. sm89) cover compilation paths for newer arches (e.g. sm100a) without requiring matching hardware.

Adds InstantiatedScript.compile(*args, **kwargs) -> JitInstance, a public API that transpiles + builds every schedule for the given arguments without executing the kernel, benchmarking, or persisting a dispatch choice. Adds tilus.target.scope(target) as a context manager for temporarily overriding the build target.

Changes tilus.testing.requires.X behavior: when the current GPU does not support X, the test now runs in compile-only mode instead of being hard-skipped -- the build target is scoped to X, InstantiatedScript.__call__ is patched to delegate to compile() and raise an internal sentinel, and the wrapper catches the sentinel so a successful compile counts as a passing test. Lets CI on older arches (e.g. sm89) cover compilation paths for newer arches (e.g. sm100a) without requiring matching hardware.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>

The CI runner has an L4 (sm89) but tests for newer-arch instructions need to compile against compute_100 / compute_100a. The docker image was nvidia/cuda:12.6.2-devel-ubuntu22.04, whose nvcc is 12.6 and does not know compute_100. Bump to nvidia/cuda:13.0.0-devel-ubuntu22.04 so the compile-only paths can build sm_100/sm_100a kernels (matches the torch 13.0 binaries already pulled at runtime).

Also tighten two test annotations whose kernels emit instructions unsupported below sm_100a:

- test_copy_async_tensor_cta uses cp.async.bulk.tensor with the .cta_group::1 modifier (sm_100+); was annotated sm_90.
- test_cluster_launch_control uses clusterlaunchcontrol.try_cancel with the multicast variant (sm_100a only); was annotated sm_100.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>
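One way to check whether a given image's nvcc knows a target before relying on it is to query `nvcc --list-gpu-arch`, which prints one virtual architecture name per line. The helper names below (`parse_arch_list`, `nvcc_supports`) are hypothetical and not part of this PR. The sketch only checks the names nvcc reports, so whether arch-specific variants such as compute_100a appear in that list should be verified against the installed toolkit.

```python
import shutil
import subprocess


def parse_arch_list(text):
    """Return the set of architecture names from one-name-per-line output."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def nvcc_supports(arch, nvcc="nvcc"):
    """Return True/False if nvcc lists `arch`, or None if nvcc is not on PATH."""
    if shutil.which(nvcc) is None:
        return None
    result = subprocess.run(
        [nvcc, "--list-gpu-arch"],
        capture_output=True,
        text=True,
        check=True,
    )
    return arch in parse_arch_list(result.stdout)
```

A CI step could call `nvcc_supports("compute_100")` and fail early with a clear message instead of surfacing an unsupported-architecture error from deep inside a test.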

yaoyaoding and others added 2 commits May 4, 2026 13:32

yaoyaoding merged commit 3a7ee97 into main May 5, 2026
8 of 10 checks passed

github-actions Bot added a commit that referenced this pull request May 5, 2026

Deploy docs: latest (2026-05-05) #153

df389f2

yaoyaoding merged 2 commits into main from compile-only-tests
