Refactor kernel #237
…load
- builtin.py: _install_sdpa() now only runs when torch_npu is importable, preventing the NPU (boolean-mask-inverting) SDPA impl from contaminating the global ALL_ATTENTION_FUNCTIONS['sdpa'] registry on CUDA/CPU hosts.
- builtin.py: drop dead _SdpaPatchSentinel + add/pop scaffolding.
- fla.py: flip is_flash_linear_attention_available only after the MindSpeed kernel imports successfully; previously a MindSpeed-missing NPU host would be left with FLA flagged available but no kernel installed -> Qwen3.5 runtime failure.
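The two guards described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in builtin.py or fla.py: `ATTENTION_FUNCTIONS`, `FLAGS`, `install_sdpa`, and `install_fla` are hypothetical stand-ins, and the registry values are plain strings rather than real attention callables. Only `importlib.util.find_spec` and `importlib.import_module` are real standard-library calls.

```python
import importlib
import importlib.util

# Hypothetical stand-ins for the global attention registry and the FLA flag.
ATTENTION_FUNCTIONS = {"sdpa": "default_sdpa"}
FLAGS = {"fla_available": False}


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def install_sdpa(backend_module: str = "torch_npu") -> bool:
    """Swap in the NPU SDPA entry only when the backend module is importable."""
    if not module_available(backend_module):
        # Leave the shared registry untouched on hosts without the backend.
        return False
    ATTENTION_FUNCTIONS["sdpa"] = "npu_sdpa"
    return True


def install_fla(kernel_module: str) -> bool:
    """Flip the availability flag only after the kernel module imports."""
    try:
        importlib.import_module(kernel_module)
    except ImportError:
        # No kernel installed, so the flag must stay False.
        return False
    FLAGS["fla_available"] = True
    return True
```

The ordering in `install_fla` is the point of the fix: the flag is set after the import succeeds, so a host missing the kernel package never advertises a capability it cannot serve.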
This reverts commit 126efc3.
Code Review
This pull request refactors the Twinkle kernel module to introduce a mapping-driven kernel replacement API, exposing kernelize, hub, and npu_builtin while removing legacy registration and patch helpers. It also modularizes NPU-specific optimizations under src/twinkle/kernel/npu_impls/ and updates documentation and tests. A critical issue was identified in src/twinkle/kernel/core.py where the helper function _infer_device is missing, causing an ImportError in the test suite.
PR type
PR information
Refactor `twinkle.kernel` from registry-based API to a minimal mapping-driven API. Public surface reduced to `kernelize`, `hub`, `npu_builtin`.
- `kernelize(model, mappings)` applies class/attr replacements onto a live model (exact `type(m) is target_cls` match)
- `hub(*entries)` declares kernels resolved lazily from the optional `kernels` package
- `npu_builtin()` returns the standard NPU bundle (RMSNorm, rotary, swiglu, SDPA, MoE, FLA); GMM opts in manually

Deleted legacy `registry.py`, `function.py`, `layer.py`, `base.py`, `monkey_patch_npu.py`; added `npu_impls/` package. Migrated `cookbook/transformers/{fsdp2,sp_fsdp_dense,ep_fsdp2_lora_qwen3_5_moe}.py` and rewrote zh/en Kernel docs.
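To illustrate what an exact `type(m) is target_cls` match implies, here is a toy sketch of mapping-driven class replacement. It is not the `twinkle.kernel` implementation: `Module`, `RMSNorm`, `CustomRMSNorm`, `FusedRMSNorm`, and `toy_kernelize` are hypothetical names, the tree is plain Python rather than `torch.nn.Module`, and swapping `__class__` is only one assumed way to replace a class on a live object.

```python
class Module:
    """Tiny stand-in for a model node (hypothetical, not torch.nn.Module)."""

    def __init__(self, **children):
        self.children = dict(children)

    def walk(self):
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children.values():
            yield from child.walk()

    def forward(self, x):
        return x


class RMSNorm(Module):
    def forward(self, x):
        return f"eager_rmsnorm({x})"


class CustomRMSNorm(RMSNorm):
    """A user subclass that an exact-type match must leave alone."""

    def forward(self, x):
        return f"custom_rmsnorm({x})"


class FusedRMSNorm(RMSNorm):
    """Stand-in for an optimized replacement kernel."""

    def forward(self, x):
        return f"fused_rmsnorm({x})"


def toy_kernelize(model, mappings):
    """Swap the class of each node whose exact type is a key in `mappings`.

    Returns the number of nodes replaced. Subclasses of a target are skipped
    because the check is `type(m) is target_cls`, not `isinstance`.
    """
    replaced = 0
    for m in model.walk():
        for target_cls, replacement_cls in mappings.items():
            if type(m) is target_cls:
                m.__class__ = replacement_cls
                replaced += 1
                break  # do not re-match the node under its new class
    return replaced
```

Calling `toy_kernelize(model, {RMSNorm: FusedRMSNorm})` on a tree holding one `RMSNorm` and one `CustomRMSNorm` replaces only the former. An `isinstance` check would also rewrite the subclass and silently discard its overridden `forward`, which is presumably why the description calls out the exact match.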