fix(ci): cudaRoundMode typing failure in FP8 test by kaeun97 · Pull Request #834 · NVIDIA/numba-cuda

kaeun97 · 2026-03-11T22:13:14Z

Attempt to fix this issue.

cuda-bindings 13.2.0 changed cudaRoundMode from a standard Python IntEnum to a FastEnumMetaclass type. Numba's type inference cannot resolve FastEnumMetaclass types, causing three FP8 tests to fail (ref).

This PR adds a local IntEnum in cuda_fp8.py, matching the pattern already used for `saturation_t` and `fp8_interpretation_t`.

If this works, let's file an issue on the Numbast side (if needed).
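As a rough illustration of the workaround described above, a local IntEnum mirroring cudaRoundMode might look like the sketch below. The member names and values follow the CUDA runtime's `cudaRoundMode` enum; the actual definition added to cuda_fp8.py may differ in detail, so treat this as an assumption rather than the PR's diff.

```python
from enum import IntEnum


class cudaRoundMode(IntEnum):
    """Local stand-in for the cuda-bindings enum (illustrative sketch)."""

    cudaRoundNearest = 0
    cudaRoundZero = 1
    cudaRoundPosInf = 2
    cudaRoundMinInf = 3


# Members are real ints, so existing integer-based code keeps working.
mode = cudaRoundMode.cudaRoundZero
mode_value = int(mode)
```

Because the class derives from the standard library's IntEnum, it does not depend on how cuda-bindings implements its own enums.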

copy-pr-bot · 2026-03-11T22:13:18Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

kaeun97 · 2026-03-11T22:13:56Z

Thanks in advance - unable to reproduce the error locally so would have to use CI to test this. @gmarkall

leofang · 2026-03-12T04:15:22Z

    E5M2 = 1


+class cudaRoundMode(IntEnum):


@gmarkall @isVoid what does it take for numba-cuda to understand cudaRoundMode from cuda-bindings as a legal enum, without numba-cuda having to repeat the definitions?

On the cuda-bindings side, we switched to a custom fast enum from the builtin IntEnum to reduce Python overhead (cc @mdboom), and we can evaluate if a patch on cuda-bindings makes better sense, or if we could just register the type in numba-cuda (or both).

If it only needs a simple patch to cuda-bindings (and @mdboom deems it acceptable), it'd be better to fix it on the cuda-bindings side and publish 13.2.1/12.9.7, instead of breaking numba-cuda.

cc @rparolin for vis

We'd need to add typing to Numba for the FastEnumMetaclass type for Numba to recognise it.

Would it help if we add __int__ to the fast enum class?

I don't think Numba-CUDA will see that. Apart from a small set of cases:

numba-cuda/numba_cuda/numba/cuda/typing/typeof.py

Lines 50 to 97 in 893dde7

    @singledispatch
    def typeof_impl(val, c):
        """
        Generic typeof() implementation.
        """
        tp = getattr(val, "_numba_type_", None)
        if tp is not None:
            return tp

        # Check for dlpack objects
        dlpack = getattr(val, "__dlpack__", None)
        if dlpack is not None:
            tp = _typeof_dlpack(dlpack, c)
            if tp is not None:
                return tp

        # Check for __cuda_array_interface__ objects (third-party device arrays)
        # Numba's own DeviceNDArray is handled above via _numba_type_.
        cai = getattr(val, "__cuda_array_interface__", None)
        if cai is not None:
            tp = _typeof_cuda_array_interface(cai, c)
            if tp is not None:
                return tp

        tp = _typeof_buffer(val, c)
        if tp is not None:
            return tp

        # cffi is handled here as it does not expose a public base class
        # for exported functions or CompiledFFI instances.
        from numba.cuda.typing import cffi_utils

        if cffi_utils.SUPPORTED:
            if cffi_utils.is_cffi_func(val):
                return cffi_utils.make_function_type(val)
            if cffi_utils.is_ffi_instance(val):
                return types.ffi

        if HAS_NUMBA:
            # Fallback to Numba's typeof_impl for third-party registrations
            from numba.core.typing.typeof import typeof_impl as core_typeof_impl

            tp = core_typeof_impl(val, c)
            if tp is not None:
                return tp

        return None

Numba-CUDA looks at the Python type to map to the Numba type.

Maybe we could / should add recognition of __int__ as well, but it would need a Numba-CUDA change in addition to adding __int__ to the fast enum class.
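To make the point about type-based mapping concrete, here is a minimal sketch using only the standard library. It is not Numba-CUDA's code: `toy_typeof`, `Color`, and `DuckInt` are hypothetical names, and the dispatcher only imitates the `@singledispatch` pattern shown in the snippet above.

```python
from enum import IntEnum
from functools import singledispatch


@singledispatch
def toy_typeof(val):
    # Generic fallback: no registration matched the value's Python type.
    return None


@toy_typeof.register(IntEnum)
def _typeof_int_enum(val):
    # Dispatch keys on the class hierarchy, so any IntEnum subclass lands here.
    return f"IntEnumMember({type(val).__name__})"


class Color(IntEnum):
    RED = 1


class DuckInt:
    """Hypothetical enum-like value that only defines __int__."""

    def __init__(self, value):
        self._value = value

    def __int__(self):
        return self._value


recognised = toy_typeof(Color.RED)
unrecognised = toy_typeof(DuckInt(1))
```

In this sketch `DuckInt` converts cleanly with `int()`, yet the dispatcher never consults `__int__`, so the value falls through to the fallback. That matches the observation that duck typing alone would not be enough without a change on the dispatching side.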

Ah, it's done by the same function I just touched recently... Got it. So this is a corner case where Python types do matter and duck-typing does not work.

I would suggest that we merge this PR as a workaround to unblock the CI, and discuss a long-term fix. (My 2c is if the surface area grows to other enums we should register the fast enum type, but one-off patches like this are not bad.) WDYT?

Sorry I got around to this a bit late. I saw this error yesterday in my Numbast CI but didn't trace it to the bottom. The cuda_fp8.py file was auto-generated by Numbast. If we decide on a fast enum type for CUDA types, we should probably update Numbast to generate the corresponding logic.

leofang · 2026-03-12T14:43:59Z

/ok to test cfc3787

To support the new `FastEnum` class in `cuda_bindings` 13.2, this adds new type registrations to support them. These instances are otherwise 100% API-compatible with `enum.IntEnum`, so there is no new logic. This should hopefully be a more sustainable solution than overriding individual enums, so this also reverts #834.
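The difference between overriding individual enums and registering a type once can be sketched with a toy dispatcher. Everything here is hypothetical (`ToyFastEnum`, `ToyRoundMode`, `ToySaturation`, `toy_typeof`) and it does not reproduce the registrations added in #837; it only shows why a single registration on a shared base class scales better.

```python
from functools import singledispatch


class ToyFastEnum:
    """Hypothetical base class standing in for a non-IntEnum enum type."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class ToyRoundMode(ToyFastEnum):
    """Hypothetical enum-like type built on the shared base."""


class ToySaturation(ToyFastEnum):
    """A second hypothetical enum-like type on the same base."""


@singledispatch
def toy_typeof(val):
    # Fallback: the value's Python type has no registration.
    return None


@toy_typeof.register(ToyFastEnum)
def _typeof_toy_fast_enum(val):
    # One registration on the base class covers every subclass.
    return ("int-enum-member", type(val).__name__)


round_kind = toy_typeof(ToyRoundMode("nearest", 0))
sat_kind = toy_typeof(ToySaturation("finite", 1))
plain_kind = toy_typeof(object())
```

Both enum-like types are recognised through the one registration, while unrelated objects still fall through to the fallback.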

- bump version to 0.29.0
- fix: normalize numpy integer types to python int to prevent overflow errors (#774)
- Support cuda.core.GraphBuilder as a kernel-launch stream (#836)
- Support cuda_bindings FastEnum (#837)
- fix(ci): cudaRoundMode typing failure in FP8 test (#834)
- Use `cuda-python` for `nvvm` bindings (#818)
- Fix mixed-IR liveness for inline overload DCE (#795)
- Use dbg.declare for scalar kernel parameters (#828)
- Fix FP8 uint64 cast flake on Windows (#829)
- Extend dbg.value coverage to loadvar for scalar kernel parameters (#813)

Co-authored-by: Michael Wang <isVoid@users.noreply.github.com>

fix: add cudaRoundMode enum

cfc3787

leofang reviewed Mar 12, 2026


gmarkall approved these changes Mar 12, 2026


gmarkall merged commit 84360da into NVIDIA:main Mar 12, 2026
104 checks passed

mdboom mentioned this pull request Mar 13, 2026

Support cuda_bindings FastEnum #837

Merged

isVoid mentioned this pull request Mar 17, 2026

Bump Version to 0.29.0 #838

Merged

gmarkall merged 1 commit into NVIDIA:main from kaeun97:kaeun97/unblock-ci

4 participants
