
Avoid unordered_map for runtime datatype mapping #3223

Open
LwhJesse wants to merge 1 commit into NVIDIA:main from LwhJesse:perf/runtime-datatype-switch

LwhJesse commented May 11, 2026

Summary

Replace the per-call std::unordered_map construction that the GEMM operation wrappers use to convert RuntimeDatatype to cute::UMMA::MXF8F6F4Format with a switch-based local helper.

The mapping is fixed and small, so this avoids constructing a temporary hash map during argument update while preserving the supported mappings for:

  • RuntimeDatatype::kE4M3
  • RuntimeDatatype::kE5M2
  • RuntimeDatatype::kE3M2
  • RuntimeDatatype::kE2M1

This also fixes the unsupported runtime datatype path by replacing the previous assert string expression with a real debug assertion and an explicit Status::kErrorInvalidProblem return.
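
For reviewers who want the shape of the change without opening the diff, here is a minimal standalone sketch of the pattern described above. It is not the code in this PR: the three enums are simplified stand-ins for the CUTLASS types, the `kStatic` enumerator is assumed only to have an unsupported value to test with, and the helper names `to_mxf8f6f4_format` and `update_runtime_format` are hypothetical. It also illustrates why an assert on a bare string literal never fires (the literal converts to a non-null pointer, so the condition is always true), whereas `assert(false && "...")` does fire in debug builds.

```cpp
#include <cassert>

// Simplified stand-ins for the CUTLASS types (assumptions for illustration only).
enum class RuntimeDatatype { kStatic, kE4M3, kE5M2, kE3M2, kE2M1 };
enum class MXF8F6F4Format { E4M3, E5M2, E3M2, E2M1 };
enum class Status { kSuccess, kErrorInvalidProblem };

// Hypothetical helper: maps a runtime datatype to its format with a switch,
// so no container is constructed per call. Returns false (and leaves `out`
// untouched) for any unsupported value.
inline bool to_mxf8f6f4_format(RuntimeDatatype in, MXF8F6F4Format& out) {
  switch (in) {
    case RuntimeDatatype::kE4M3: out = MXF8F6F4Format::E4M3; return true;
    case RuntimeDatatype::kE5M2: out = MXF8F6F4Format::E5M2; return true;
    case RuntimeDatatype::kE3M2: out = MXF8F6F4Format::E3M2; return true;
    case RuntimeDatatype::kE2M1: out = MXF8F6F4Format::E2M1; return true;
    default: return false;
  }
}

// Hypothetical caller: shows the unsupported-datatype path.
// assert("message") would always pass; assert(false && "message") aborts in
// debug builds, and the explicit return covers builds with NDEBUG defined.
inline Status update_runtime_format(RuntimeDatatype in, MXF8F6F4Format& out) {
  if (!to_mxf8f6f4_format(in, out)) {
    assert(false && "invalid runtime datatype");
    return Status::kErrorInvalidProblem;
  }
  return Status::kSuccess;
}
```

In this sketch, a caller would check the returned Status and propagate Status::kErrorInvalidProblem instead of continuing with an unset format.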

Changed files

  • tools/library/src/gemm_operation_3x.hpp
  • tools/library/src/sparse_gemm_operation_3x.hpp
  • tools/library/src/blockwise_gemm_operation_3x.hpp

Local validation

  • git diff --check HEAD~1..HEAD
  • Verified that no RuntimeDatatype std::unordered_map mapping remains in the touched files
  • Verified the corrected debug assertion path appears once in each touched file

Full local CUTLASS builds are not practical on my machine, so I am relying on project CI and maintainer review for full validation.

LwhJesse marked this pull request as ready for review May 11, 2026 08:06
LwhJesse (Author) commented

Hi maintainers, gentle ping on this runtime datatype mapping cleanup.

The change is intentionally small: it replaces repeated temporary std::unordered_map construction with a fixed switch helper across the three GEMM operation wrappers, and keeps the unsupported datatype path explicit with Status::kErrorInvalidProblem.

I have done local diff/text validation, but a full CUTLASS build is not practical on my machine, so I would appreciate CI/maintainer review when someone has bandwidth.

Is there a specific owner I should route this to?
