Skip to content

Add name override for bind_cxx_struct and register_cxx_type API#320

Open
isVoid wants to merge 3 commits intomainfrom
fea-struct-name-override
Open

Add name override for bind_cxx_struct and register_cxx_type API#320
isVoid wants to merge 3 commits intomainfrom
fea-struct-name-override

Conversation

@isVoid
Copy link
Copy Markdown
Collaborator

@isVoid isVoid commented Apr 13, 2026

Summary

  • Add optional name parameter to bind_cxx_struct for overriding the C++ name used in shims and type registration
  • Add register_cxx_type() public API for registering external C++ types in numbast's type registry

Motivation

When binding class template specializations, struct_decl.name is the unqualified template name (e.g. "Matrix"), but the shim code and type registry need the fully qualified specialization name (e.g. "Namespace::Matrix<float, 3, 1>"). Without a name override, users must wrap the declaration object — the name parameter provides a clean way to do this.

Similarly, when binding functions that use types from external C++ libraries, those types may not be parseable by ast_canopy but their memory layout is known. register_cxx_type() lets users register these types so to_numba_type() returns the correct Numba type instead of Opaque.

Changes

bind_cxx_struct(name=...)struct.py

New keyword-only parameter name: str | None = None. When provided, overrides struct_decl.name throughout:

  • Numba type name
  • Python API class name
  • CTYPE_MAPS registration key
  • Shim template code (constructor, method, conversion operator shims)

Plumbed through bind_cxx_struct_ctors, bind_cxx_struct_regular_methods, and bind_cxx_struct_conversion_opeartors via a new struct_name parameter.

register_cxx_type(cxx_name, numba_type)types.py

Simple public function that registers a C++ type name → Numba type mapping in CTYPE_MAPS. Follows the same pattern as the existing register_enum_type().

Test plan

  • Existing numbast tests pass (no behavioral change when name is not provided)
  • bind_cxx_struct(..., name="Foo::Bar<int>") uses the override in shims and type registry

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Increased parser robustness: problematic declarations are skipped instead of crashing; unsupported template arguments and null/ dependent types are handled gracefully; mangling falls back when signatures are dependent.
  • New Features

    • Optional override for generated C++ struct name in the binding API.
    • New public helper to register custom C++→Numba type mappings.
  • Tests

    • Added tests and a sample source exercising complex template hierarchies.

isVoid and others added 2 commits April 13, 2026 14:08
ast_canopy segfaulted or threw unhandled exceptions when parsing headers
with deep template metaprogramming (CRTP hierarchies, dependent types,
template template parameters, expression-based template arguments).

Root causes and fixes:

1. function.cpp: mangleName() segfaults on dependent (uninstantiated)
   template types. Added has_dependent_signature() check to skip mangling
   for methods with dependent parameter/return types. Also fixed a memory
   leak (ItaniumMangleContext now uses unique_ptr).

2. type.cpp: getCanonicalType() crashes on dependent types. Added null-
   QualType guard and skip canonicalization for dependent types.

3. record.cpp: getTypeSize()/getTypeAlign() crash on dependent/incomplete
   types. Added isDependentType()/isIncompleteType() guards and try-catch
   around child declaration construction.

4. typedef.cpp: record_id_to_name->at() throws for template instantiations
   not in the map. Replaced with safe find() + fallback to Clang's name.
   Also added null guard for getAsCXXRecordDecl().

5. template_param.cpp: TemplateTemplateParmDecl constructor threw. Now
   handles gracefully with kind=template_.

6. class_template_specialization.cpp: Unsupported template argument kinds
   (Pack, Template, Expression) threw. Now produces "<unsupported>"
   placeholder.

7. All matchers (7 files): Added try-catch around push_back calls so a
   single bad declaration is skipped instead of crashing the entire parse.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two changes to support binding class template specializations and
externally-defined C++ types:

1. bind_cxx_struct: add optional `name` parameter

   When binding a class template specialization, struct_decl.name is
   the unqualified template name (e.g. "Matrix"), but the shim code
   and type registry need the fully qualified specialization name
   (e.g. "Namespace::Matrix<float, 3, 1>"). The new `name` parameter
   overrides struct_decl.name throughout: Numba type name, Python API
   class name, CTYPE_MAPS registration key, and shim templates.

   Plumbed through bind_cxx_struct_ctors, bind_cxx_struct_regular_methods,
   and bind_cxx_struct_conversion_opeartors.

2. register_cxx_type: new public API in types.py

   Allows registering external C++ types in numbast's type registry so
   that to_numba_type() returns the correct Numba type instead of
   Opaque. Useful for types whose headers are too complex to parse
   directly but whose layout is known.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 679a329c-0581-4af8-b6c1-cd2a3d0791a2

📥 Commits

Reviewing files that changed from the base of the PR and between 52db994 and f6cebfb.

📒 Files selected for processing (2)
  • ast_canopy/cpp/src/record.cpp
  • ast_canopy/cpp/src/type.cpp

📝 Walkthrough

Walkthrough

Multiple C++ AST parsing components and matchers now tolerate construction failures by catching exceptions, template/type handling was made more defensive (placeholders, null/ dependent guards), record and function processing avoid unsafe operations, new complex-template tests were added, and Numbast struct/type binding APIs gained naming and registration capabilities.

Changes

Cohort / File(s) Summary
Matcher exception hardening
ast_canopy/cpp/src/detail/matchers/class_template_matcher.cpp, ast_canopy/cpp/src/detail/matchers/class_template_specialization_matcher.cpp, ast_canopy/cpp/src/detail/matchers/enum_matcher.cpp, ast_canopy/cpp/src/detail/matchers/function_matcher.cpp, ast_canopy/cpp/src/detail/matchers/function_template_matcher.cpp, ast_canopy/cpp/src/detail/matchers/record_matcher.cpp, ast_canopy/cpp/src/detail/matchers/typedef_matcher.cpp
Wrapped non-implicit declaration push_back/emplace operations in broad try/catch (...) blocks; exceptions are swallowed and problematic declarations are omitted rather than propagating.
Class template specialization & template handling
ast_canopy/cpp/src/class_template_specialization.cpp, ast_canopy/cpp/src/template.cpp, ast_canopy/cpp/src/template_param.cpp
Changed loop index type, replaced throwing on unrecognized template argument/parameter kinds with placeholder or skip behavior ("<unsupported>" or skip/initialize instead of std::runtime_error), and replaced std::transform with explicit loops to populate template parameters.
Function mangling / dependent signatures
ast_canopy/cpp/src/function.cpp
Added detection for template-dependent function signatures; when dependent, skip Itanium mangling and use qualified name as mangled fallback; refactored mangling to use std::unique_ptr and conditional path.
Type, typedef, and record robustness
ast_canopy/cpp/src/type.cpp, ast_canopy/cpp/src/typedef.cpp, ast_canopy/cpp/src/record.cpp
Guard against null QualType, set "<null-type>" early; conditional canonicalization heuristics preserved; typedef resolution now checks record_id_to_name and falls back to decl names/qualified strings; record member/ nested entity emplaces wrapped in try/catch (...); sizeof/align only computed for non-dependent, non-incomplete types.
Tests: complex template fixtures
ast_canopy/tests/data/sample_complex_templates.cu, ast_canopy/tests/test_complex_template_parsing.py
Added a CUDA C++ fixture exercising CRTP, template-typedefs, template-template params, dependent-return function templates, deep CRTP chains, and non-type default-expression specializations; added tests that parse this fixture with bypassed parse errors and assert expected declarations are present.
Numbast struct binding & type registration
numbast/src/numbast/struct.py, numbast/src/numbast/types.py
Extended bind_cxx_struct(...) API with optional `name: str

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I nibble on templates with soft little paws,
I catch noisy errors and patch up the flaws.
Dependent names bowed, null-types tucked in tight,
Parsers keep hopping through day and night. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.10% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main additions of the PR: the name override feature for bind_cxx_struct and the new register_cxx_type API, which are the primary new features implemented across the numbast module.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fea-struct-name-override

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ast_canopy/cpp/src/detail/matchers/function_matcher.cpp`:
- Around line 30-36: The unconditional catch around constructing Function(FD)
swallows failures and silently omits entries from payload->decls->functions;
change this so failures are only bypassed when the caller explicitly opted into
best-effort parsing (the parse/bypass flag used by the caller), otherwise
rethrow or record the error. Locate the try/catch in the block guarded by
FD->isImplicit() and: 1) check the explicit best-effort/bypass flag before
swallowing exceptions, 2) if bypass is enabled, log a diagnostic entry (e.g.,
record the FD source location/name) indicating the dropped declaration instead
of silently ignoring it, and 3) if bypass is not enabled, propagate the
exception so the caller sees the failure (or convert it to a clear diagnostic)
rather than leaving payload->decls->functions partially populated.

In `@ast_canopy/cpp/src/detail/matchers/record_matcher.cpp`:
- Around line 56-64: The code updates record_id_map (via
payload->record_id_to_name) before constructing the Record, which can leave a
dangling map entry if Record(RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE,
rename_for_unnamed) throws; move the map assignment record_id_map[id] =
rename_for_unnamed inside the try block immediately before pushing into
payload->decls->records (or, if you must assign early, add an
erase(record_id_map, id) in the catch block) so map state and
payload->decls->records remain consistent; update references to record_id_map,
payload->record_id_to_name, payload->decls->records, Record, id, and
rename_for_unnamed accordingly.

In `@ast_canopy/cpp/src/record.cpp`:
- Around line 57-62: Replace the silent catch(...) blocks around the
field/method/nested record construction (e.g. the block that contains
fields.emplace_back(Field(FD, access))) with a two-tiered handler that logs what
was skipped: first catch(const std::exception& e) that emits a debug message
including the declaration kind and name (e.g. "field", FD->getNameAsString())
and e.what(), then a fallback catch(...) that emits a debug message with the
declaration kind and name but no exception text; apply the same pattern to the
corresponding try/catch sites for methods, templated_methods,
nested_class_templates, and nested_records so skipped declarations are reported
for debugging.

In `@ast_canopy/cpp/src/template.cpp`:
- Around line 21-38: The loop over clang::NamedDecl* in template.cpp currently
skips unrecognized template parameter kinds, causing template_parameters.size()
to differ from the original template arity; update the final else branch to
append a sentinel/placeholder TemplateParam (e.g., a default-constructed or
explicit "unknown" TemplateParam) to the template_parameters vector so
positional indexes remain stable; ensure the placeholder is distinguishable (via
an is_unknown flag or type) so downstream logic that uses template_parameters
and num_min_required_args (from the full TemplateParameterList) can detect and
handle unknown params without crashing.

In `@ast_canopy/tests/test_complex_template_parsing.py`:
- Around line 25-30: Create a module-scoped fixture named `decls` that calls the
current TU parse routine once (using the existing `source_path` fixture and the
same parse args currently duplicated across tests) and returns the parsed
declarations; update tests that currently reparse the CUDA TU to accept/use
`decls` instead of reparsing so the translation unit is parsed only once per
module (ensure the fixture name `decls` matches where tests expect parsed
declarations and reuse any helper like `parse_translation_unit` or the
project-specific parse function you currently call in each test).

In `@numbast/src/numbast/struct.py`:
- Around line 232-240: The shim symbol is being derived from struct_name (e.g.,
"Namespace::Matrix<float,3,1>") which injects invalid characters into the
generated extern "C" shim name; update bind_cxx_struct_conversion_opeartor and
the caller loop over struct_decl.conversion_operators() so the shim identifier
is generated from a safe canonical token (e.g., a mangled name, a stable hash of
s_type/struct_decl, or an incremental index) instead of raw struct_name; ensure
the new symbol contains only C identifier characters (letters, digits,
underscore) and use that sanitized symbol when emitting the extern "C"
__device__ function declaration in ShimWriter.
- Around line 479-482: The current registration only assigns C2N[struct_name] =
s_type which misses cases where bind_cxx_struct was called with a name override;
update bind_cxx_struct to also register the original declaration name by adding
C2N[struct_decl.name] = s_type when struct_decl.name != struct_name (and still
apply the aliases loop), mirroring the pattern used in class_template.py so
unqualified self-type AST references resolve correctly.

In `@numbast/src/numbast/types.py`:
- Around line 94-108: register_cxx_type currently overwrites entries in
CTYPE_MAPS silently; add a safety guard in register_cxx_type that checks if
cxx_name is already present and raises a clear exception (e.g., ValueError)
unless an explicit allow_overwrite / force boolean parameter is passed and true;
include the existing mapped type in the error message for clarity and update the
function signature/docstring to document the new parameter and behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 74b65c68-382c-45e1-bc94-a879c0b01820

📥 Commits

Reviewing files that changed from the base of the PR and between f29a975 and 52db994.

📒 Files selected for processing (18)
  • ast_canopy/cpp/src/class_template_specialization.cpp
  • ast_canopy/cpp/src/detail/matchers/class_template_matcher.cpp
  • ast_canopy/cpp/src/detail/matchers/class_template_specialization_matcher.cpp
  • ast_canopy/cpp/src/detail/matchers/enum_matcher.cpp
  • ast_canopy/cpp/src/detail/matchers/function_matcher.cpp
  • ast_canopy/cpp/src/detail/matchers/function_template_matcher.cpp
  • ast_canopy/cpp/src/detail/matchers/record_matcher.cpp
  • ast_canopy/cpp/src/detail/matchers/typedef_matcher.cpp
  • ast_canopy/cpp/src/function.cpp
  • ast_canopy/cpp/src/record.cpp
  • ast_canopy/cpp/src/template.cpp
  • ast_canopy/cpp/src/template_param.cpp
  • ast_canopy/cpp/src/type.cpp
  • ast_canopy/cpp/src/typedef.cpp
  • ast_canopy/tests/data/sample_complex_templates.cu
  • ast_canopy/tests/test_complex_template_parsing.py
  • numbast/src/numbast/struct.py
  • numbast/src/numbast/types.py

Comment on lines +30 to +36
if (!FD->isImplicit()) {
try {
payload->decls->functions.push_back(Function(FD));
} catch (...) {
// Skip functions that cannot be processed (e.g. those with
// dependent types or other issues during construction).
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't silently drop functions unless the caller opted into best-effort parsing.

This catch is unconditional, so a failed Function(FD) now yields a partial decls->functions result even when the caller did not request bypass behavior, and nothing records which symbol was skipped. Please gate the swallow path on the parse/bypass flag and/or persist a diagnostic for the dropped declaration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ast_canopy/cpp/src/detail/matchers/function_matcher.cpp` around lines 30 -
36, The unconditional catch around constructing Function(FD) swallows failures
and silently omits entries from payload->decls->functions; change this so
failures are only bypassed when the caller explicitly opted into best-effort
parsing (the parse/bypass flag used by the caller), otherwise rethrow or record
the error. Locate the try/catch in the block guarded by FD->isImplicit() and: 1)
check the explicit best-effort/bypass flag before swallowing exceptions, 2) if
bypass is enabled, log a diagnostic entry (e.g., record the FD source
location/name) indicating the dropped declaration instead of silently ignoring
it, and 3) if bypass is not enabled, propagate the exception so the caller sees
the failure (or convert it to a clear diagnostic) rather than leaving
payload->decls->functions partially populated.

Comment on lines 56 to +64
auto &record_id_map = *payload->record_id_to_name;
record_id_map[id] = rename_for_unnamed;
payload->decls->records.push_back(Record(
RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed));
try {
payload->decls->records.push_back(Record(
RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed));
} catch (...) {
// Skip records that cannot be fully processed (e.g. records with
// dependent child declarations that trigger issues).
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential inconsistency: record_id_map updated before Record construction may fail.

If Record construction throws, the record_id_map[id] entry persists but no corresponding Record is added to payload->decls->records. This could cause downstream code (like typedef resolution using record_id_to_name) to reference records that don't exist.

🛠️ Proposed fix: Move map update inside try block or remove on failure
       auto &record_id_map = *payload->record_id_to_name;
-      record_id_map[id] = rename_for_unnamed;
       try {
+        record_id_map[id] = rename_for_unnamed;
         payload->decls->records.push_back(Record(
             RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed));
       } catch (...) {
         // Skip records that cannot be fully processed (e.g. records with
         // dependent child declarations that trigger issues).
       }

Alternatively, erase the entry in the catch block if insertion into record_id_map must happen early for some reason.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
auto &record_id_map = *payload->record_id_to_name;
record_id_map[id] = rename_for_unnamed;
payload->decls->records.push_back(Record(
RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed));
try {
payload->decls->records.push_back(Record(
RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed));
} catch (...) {
// Skip records that cannot be fully processed (e.g. records with
// dependent child declarations that trigger issues).
}
auto &record_id_map = *payload->record_id_to_name;
try {
record_id_map[id] = rename_for_unnamed;
payload->decls->records.push_back(Record(
RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed));
} catch (...) {
// Skip records that cannot be fully processed (e.g. records with
// dependent child declarations that trigger issues).
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ast_canopy/cpp/src/detail/matchers/record_matcher.cpp` around lines 56 - 64,
The code updates record_id_map (via payload->record_id_to_name) before
constructing the Record, which can leave a dangling map entry if Record(RD,
RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed) throws; move the
map assignment record_id_map[id] = rename_for_unnamed inside the try block
immediately before pushing into payload->decls->records (or, if you must assign
early, add an erase(record_id_map, id) in the catch block) so map state and
payload->decls->records remain consistent; update references to record_id_map,
payload->record_id_to_name, payload->decls->records, Record, id, and
rename_for_unnamed accordingly.

Comment on lines +57 to +62
try {
fields.emplace_back(Field(FD, access));
} catch (...) {
// Skip fields that cannot be processed (e.g. dependent types in
// uninstantiated templates).
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider adding debug logging for skipped declarations.

The broad catch (...) pattern is appropriate for parser resilience, but silently swallowing exceptions makes debugging harder when legitimate declarations are unexpectedly skipped.

🔧 Optional: Add debug logging for skipped declarations
       if (auto const *FD = clang::dyn_cast<clang::FieldDecl>(D)) {
         try {
           fields.emplace_back(Field(FD, access));
         } catch (...) {
-          // Skip fields that cannot be processed (e.g. dependent types in
-          // uninstantiated templates).
+#ifndef NDEBUG
+          std::cerr << "Skipped field: " << FD->getNameAsString()
+                    << " in " << name << std::endl;
+#endif
         }
       }

Apply similar logging to the other catch blocks for methods, templated_methods, nested_class_templates, and nested_records.

Also applies to: 71-76, 80-84, 88-92, 96-100

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ast_canopy/cpp/src/record.cpp` around lines 57 - 62, Replace the silent
catch(...) blocks around the field/method/nested record construction (e.g. the
block that contains fields.emplace_back(Field(FD, access))) with a two-tiered
handler that logs what was skipped: first catch(const std::exception& e) that
emits a debug message including the declaration kind and name (e.g. "field",
FD->getNameAsString()) and e.what(), then a fallback catch(...) that emits a
debug message with the declaration kind and name but no exception text; apply
the same pattern to the corresponding try/catch sites for methods,
templated_methods, nested_class_templates, and nested_records so skipped
declarations are reported for debugging.

Comment on lines +21 to +38
for (const clang::NamedDecl *ND : *TPL) {
if (const clang::TemplateTypeParmDecl *TPD =
clang::dyn_cast<clang::TemplateTypeParmDecl>(ND)) {
template_parameters.push_back(TemplateParam(TPD));
} else if (const clang::NonTypeTemplateParmDecl *TPD =
clang::dyn_cast<clang::NonTypeTemplateParmDecl>(ND)) {
template_parameters.push_back(TemplateParam(TPD));
} else if (const clang::TemplateTemplateParmDecl *TPD =
clang::dyn_cast<clang::TemplateTemplateParmDecl>(ND)) {
// Template template parameters (e.g. template<template<class> class X>).
// Use the TemplateTemplateParmDecl overload which now handles this
// gracefully instead of throwing.
template_parameters.push_back(TemplateParam(TPD));
} else {
// Unknown template parameter kind -- skip it rather than crashing.
// This can happen with future Clang additions or unusual AST nodes.
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve template parameter arity when the kind is unknown.

num_min_required_args still comes from the full TemplateParameterList, but this loop now omits unrecognized parameters entirely. That makes template_parameters.size() diverge from the source template’s actual arity and can break any downstream positional logic. Push an explicit placeholder/sentinel TemplateParam instead of skipping the entry.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ast_canopy/cpp/src/template.cpp` around lines 21 - 38, The loop over
clang::NamedDecl* in template.cpp currently skips unrecognized template
parameter kinds, causing template_parameters.size() to differ from the original
template arity; update the final else branch to append a sentinel/placeholder
TemplateParam (e.g., a default-constructed or explicit "unknown" TemplateParam)
to the template_parameters vector so positional indexes remain stable; ensure
the placeholder is distinguishable (via an is_unknown flag or type) so
downstream logic that uses template_parameters and num_min_required_args (from
the full TemplateParameterList) can detect and handle unknown params without
crashing.

Comment on lines +25 to +30
@pytest.fixture(scope="module")
def source_path():
current_directory = os.path.dirname(os.path.abspath(__file__))
return os.path.join(
current_directory, "data", "sample_complex_templates.cu"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Share a parsed declarations fixture across this module.

Every test reparses the same CUDA TU with the same arguments. A module-scoped decls fixture would keep the coverage but cut the clang parse cost to once per module, which should make this suite noticeably faster and less flaky in CI.

Also applies to: 41-145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ast_canopy/tests/test_complex_template_parsing.py` around lines 25 - 30,
Create a module-scoped fixture named `decls` that calls the current TU parse
routine once (using the existing `source_path` fixture and the same parse args
currently duplicated across tests) and returns the parsed declarations; update
tests that currently reparse the CUDA TU to accept/use `decls` instead of
reparsing so the translation unit is parsed only once per module (ensure the
fixture name `decls` matches where tests expect parsed declarations and reuse
any helper like `parse_translation_unit` or the project-specific parse function
you currently call in each test).

Comment on lines +232 to +240
struct_decl: Struct,
struct_name: str,
s_type: nbtypes.Type,
shim_writer: ShimWriter,
):
"""Bind all conversion operators for a C++ struct."""
for conv in struct_decl.conversion_operators():
bind_cxx_struct_conversion_opeartor(
conv, struct_decl.name, s_type, shim_writer
conv, struct_name, s_type, shim_writer
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Trace conversion-operator symbol generation"
rg -nC4 'bind_cxx_struct_conversion_opeartor|deduplicate_overloads|shim_func_name|struct_name' numbast/src/numbast/struct.py

echo
echo "Quick identifier sanity check for specialization-style names"
python - <<'PY'
import re
struct_name = "Namespace::Matrix<float,3,1>"
candidate = f"__{struct_name}__Zdummy"
print("candidate:", candidate)
print("is_c_identifier:", bool(re.match(r'^[A-Za-z_][A-Za-z0-9_]*$', candidate)))
PY

Repository: NVIDIA/numbast

Length of output: 7607


🏁 Script executed:

rg -A5 'struct_method_shim_layer_template\s*=' numbast/src/numbast/struct.py

Repository: NVIDIA/numbast

Length of output: 245


Do not derive shim symbol identifiers from struct_name.

With name="Namespace::Matrix<float,3,1>", the conversion-operator path feeds non-identifier characters (:, <, ,, >) into the generated shim function name via extern "C" __device__ int {func_name}(...), causing C/C++ compile failures.

Safe symbol-generation fix
 def bind_cxx_struct_conversion_opeartor(
     conv: StructMethod,
     struct_name: str,
     s_type: nbtypes.Type,
     shim_writer: ShimWriter,
 ):
     casted_type = to_numba_type(conv.return_type.unqualified_non_ref_type_name)
 
     # Lowering:
-    mangled_name = deduplicate_overloads(
-        f"__{struct_name}__{conv.mangled_name}"
-    )
+    # Keep user-facing C++ type spelling for shim signature (`name=struct_name`),
+    # but use a symbol-safe key for generated function identifiers.
+    mangled_name = deduplicate_overloads(f"__{conv.mangled_name}")
     shim_func_name = f"{mangled_name}_nbst"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@numbast/src/numbast/struct.py` around lines 232 - 240, The shim symbol is
being derived from struct_name (e.g., "Namespace::Matrix<float,3,1>") which
injects invalid characters into the generated extern "C" shim name; update
bind_cxx_struct_conversion_opeartor and the caller loop over
struct_decl.conversion_operators() so the shim identifier is generated from a
safe canonical token (e.g., a mangled name, a stable hash of s_type/struct_decl,
or an incremental index) instead of raw struct_name; ensure the new symbol
contains only C identifier characters (letters, digits, underscore) and use that
sanitized symbol when emitting the extern "C" __device__ function declaration in
ShimWriter.

Comment on lines +479 to 482
C2N[struct_name] = s_type
if struct_name in aliases:
for alias in aliases[struct_name]:
C2N[alias] = s_type
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Show where unqualified names are used for type resolution in struct binding"
rg -nC3 'unqualified_non_ref_type_name|to_numba_arg_type|to_numba_type' numbast/src/numbast/struct.py numbast/src/numbast/types.py

echo
echo "Find current tests/uses of bind_cxx_struct name override"
rg -nC2 'bind_cxx_struct\(|name=' --glob '**/*test*.py' --glob '**/*.py'

Repository: NVIDIA/numbast

Length of output: 50370


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full function signature and first few lines of bind_cxx_struct
echo "=== bind_cxx_struct function signature and context ==="
sed -n '420,450p' numbast/src/numbast/struct.py

echo
echo "=== Where struct_name variable is assigned within bind_cxx_struct ==="
sed -n '420,490p' numbast/src/numbast/struct.py | grep -n 'struct_name'

echo
echo "=== Look for usages of bind_cxx_struct with explicit 'name' parameter ==="
rg -A2 'bind_cxx_struct\s*\(' numbast/tests/ numbast/src/numbast/ | head -30

Repository: NVIDIA/numbast

Length of output: 2459


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the actual lines around 479-482 to see full context
echo "=== Full context around lines 479-482 of struct.py ==="
sed -n '465,490p' numbast/src/numbast/struct.py

echo
echo "=== Look for 'name' parameter usage in bind_cxx_struct ==="
sed -n '420,530p' numbast/src/numbast/struct.py | grep -n 'name\s*='

Repository: NVIDIA/numbast

Length of output: 1196


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check function signature more explicitly
echo "=== Full bind_cxx_struct signature ==="
sed -n '420,445p' numbast/src/numbast/struct.py

echo
echo "=== Check if struct_decl has a 'name' attribute being used elsewhere ==="
rg 'struct_decl\.name' numbast/src/numbast/struct.py | head -20

Repository: NVIDIA/numbast

Length of output: 1576


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for calls to bind_cxx_struct with explicit name parameter
echo "=== Calls to bind_cxx_struct with explicit 'name=' parameter ==="
rg 'bind_cxx_struct.*name\s*=' --multiline numbast/src/numbast/

echo
echo "=== Check class_template.py for name override usage ==="
rg -C5 'bind_cxx_struct' numbast/src/numbast/class_template.py | head -50

Repository: NVIDIA/numbast

Length of output: 122


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for where struct_decl.name might be used in type resolution after bind_cxx_struct
echo "=== Search for unqualified self-type resolution in templated structs ==="
grep -n 'qual_name\|unqualified_non_ref_type_name' numbast/src/numbast/class_template.py | head -20

echo
echo "=== Check if struct_decl.name is referenced in method/ctor binding within struct.py ==="
sed -n '420,600p' numbast/src/numbast/struct.py | grep -n 'struct_decl\.' | head -15

Repository: NVIDIA/numbast

Length of output: 2211


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify if the 'name' parameter to bind_cxx_struct is ever actually used
echo "=== All calls to bind_cxx_struct in the entire codebase ==="
rg -B2 -A5 'bind_cxx_struct\(' numbast/

echo
echo "=== Check static/struct.py for how it handles struct names ==="
rg 'struct_name|qual_name' numbast/src/numbast/static/struct.py | head -30

Repository: NVIDIA/numbast

Length of output: 2980


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for tests that might verify name override behavior
echo "=== Search test files for bind_cxx_struct usage ==="
find numbast/tests -name '*.py' -exec grep -l 'bind_cxx_struct' {} \;

echo
echo "=== Check if there are any tests with name parameter ==="
grep -r 'bind_cxx_struct.*name\s*=' numbast/tests/ || echo "No matches found"

echo
echo "=== Verify the actual implementation path for the 'name' parameter ==="
sed -n '463,475p' numbast/src/numbast/struct.py

Repository: NVIDIA/numbast

Length of output: 643


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check class_template.py more carefully to see if bind_cxx_struct is called there
echo "=== Occurrences of bind_cxx_struct in class_template.py ==="
grep -n 'bind_cxx_struct' numbast/src/numbast/class_template.py

echo
echo "=== Look for any dynamic calls or indirect invocations ==="
rg 'bind_cxx_struct' numbast/src/numbast/ --type py

echo
echo "=== Check if the 'name' parameter docstring is actually documented ==="
sed -n '420,455p' numbast/src/numbast/struct.py | tail -20

Repository: NVIDIA/numbast

Length of output: 4425


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for any scenario where struct_decl.name might differ from the registered struct_name
echo "=== Check if there are comment or type hints suggesting name override scenarios ==="
sed -n '420,465p' numbast/src/numbast/struct.py

echo
echo "=== Verify if methods/ctors in struct use struct_decl.name anywhere ==="
rg 'struct_decl\.name' numbast/src/numbast/struct.py

echo
echo "=== Check how static binding handles this ==="
grep -n 'struct_name\|python_struct_name' numbast/src/numbast/static/struct.py | head -10

Repository: NVIDIA/numbast

Length of output: 3224


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if methods/constructors emit unqualified self-type references
echo "=== Look for where self-type (unqualified struct name) might be emitted in method bindings ==="
sed -n '250,290p' numbast/src/numbast/struct.py

echo
echo "=== Check return types in method binding ==="
sed -n '265,280p' numbast/src/numbast/struct.py

Repository: NVIDIA/numbast

Length of output: 2738


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for how class template specializations handle struct names in practice
echo "=== Check if class_template actually uses the name parameter ==="
grep -n 'bind_cxx_struct(' numbast/src/numbast/class_template.py | head -5

echo
echo "=== Look for class template specialization handling ==="
sed -n '850,900p' numbast/src/numbast/class_template.py | head -40

Repository: NVIDIA/numbast

Length of output: 122


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Let's search for how class templates actually handle struct binding
echo "=== How does bind_cxx_structs (plural) work - does it pass name parameter? ==="
sed -n '560,610p' numbast/src/numbast/struct.py

echo
echo "=== Check if static struct binding has similar structure ==="
grep -n 'bind.*struct\|struct_name\|qual_name' numbast/src/numbast/static/struct.py | grep -A2 -B2 'qual_name'

Repository: NVIDIA/numbast

Length of output: 2042


🏁 Script executed:

#!/bin/bash
set -euo zipfail

# Verify: does a method returning the struct type need to look up struct_decl.name?
echo "=== Check constructor that returns self-type ==="
sed -n '55,75p' numbast/src/numbast/struct.py

echo
echo "=== If methods return unqualified struct name, how is it resolved? ==="
grep -n 'return.*struct\|return_type' numbast/src/numbast/struct.py | head -20

Repository: NVIDIA/numbast

Length of output: 110


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Direct test: create scenario where struct_name != struct_decl.name
# and see if C2N lookup would fail for struct_decl.name
echo "=== Trace through: if struct_name='Foo<int>' but struct_decl.name='Foo', what gets registered? ==="
sed -n '479,485p' numbast/src/numbast/struct.py

echo
echo "=== If a method body references unqualified 'Foo' type, would it resolve? ==="
echo "This would only fail if:"
echo "1. struct_name is overridden (name != struct_decl.name)"
echo "2. Method AST emits unqualified struct_decl.name reference"
echo "3. Code tries to lookup that unqualified name in C2N"

echo
echo "=== Search for any mechanism that would use struct_decl.name for type lookup ==="
rg 'struct_decl\.name' numbast/src/numbast/ --type py

Repository: NVIDIA/numbast

Length of output: 1580


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Critical finding: class_template.py DOES register struct_decl.name!
echo "=== COMPARISON: struct.py vs class_template.py C2N registration ==="
echo "--- struct.py lines 479-485 ---"
sed -n '479,485p' numbast/src/numbast/struct.py

echo
echo "--- class_template.py C2N registration ---"
rg -A3 'C2N\[struct_decl\.name\]' numbast/src/numbast/class_template.py

echo
echo "=== Where does class_template.py define this? Context around that registration ==="
rg -B10 -A5 'C2N\[struct_decl\.name\]' numbast/src/numbast/class_template.py | head -30

Repository: NVIDIA/numbast

Length of output: 1186


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if bind_cxx_struct name param is ever actually used in the codebase
echo "=== Final check: is the 'name' parameter ever supplied to bind_cxx_struct? ==="
rg 'bind_cxx_struct\(' numbast/ -A3 | grep -E 'name\s*='

echo
echo "=== If no matches, the 'name' feature is documented but never used ==="
echo "This confirms the gap: name parameter exists but isn't used, and struct_decl.name isn't registered separately"

Repository: NVIDIA/numbast

Length of output: 136


Register struct_decl.name separately to support name override scenarios.

The name parameter to bind_cxx_struct allows overriding the registered struct name (documented for class template specializations). When struct_name differs from struct_decl.name, only registering C2N[struct_name] misses unqualified self-type references that may be emitted by the AST. The pattern in class_template.py correctly registers both names separately.

Backward-compatible registration tweak
     C2N[struct_name] = s_type
+    if struct_name != struct_decl.name:
+        C2N[struct_decl.name] = s_type
     if struct_name in aliases:
         for alias in aliases[struct_name]:
             C2N[alias] = s_type
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@numbast/src/numbast/struct.py` around lines 479 - 482, The current
registration only assigns C2N[struct_name] = s_type which misses cases where
bind_cxx_struct was called with a name override; update bind_cxx_struct to also
register the original declaration name by adding C2N[struct_decl.name] = s_type
when struct_decl.name != struct_name (and still apply the aliases loop),
mirroring the pattern used in class_template.py so unqualified self-type AST
references resolve correctly.

Comment on lines +94 to +108
def register_cxx_type(cxx_name: str, numba_type: nbtypes.Type):
"""
Register an external C++ type in numbast's type registry.

After registration, ``to_numba_type(cxx_name)`` returns *numba_type*
instead of ``Opaque``. This is useful for binding types that were not
parsed by ast_canopy (e.g. third-party library types whose layout is
known but whose headers are too complex to parse).

Parameters:
cxx_name (str): The C++ type name as it appears in function signatures.
numba_type (nbtypes.Type): The Numba type to map it to.
"""
CTYPE_MAPS[cxx_name] = numba_type

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add an overwrite guard for registry safety.

register_cxx_type currently allows silent replacement of existing keys in CTYPE_MAPS. A typo like "int" would globally remap built-ins and break downstream bindings.

Suggested hardening
-def register_cxx_type(cxx_name: str, numba_type: nbtypes.Type):
+def register_cxx_type(
+    cxx_name: str,
+    numba_type: nbtypes.Type,
+    *,
+    allow_override: bool = False,
+) -> None:
@@
-    CTYPE_MAPS[cxx_name] = numba_type
+    if not allow_override and cxx_name in CTYPE_MAPS:
+        raise ValueError(f"C++ type '{cxx_name}' is already registered")
+    CTYPE_MAPS[cxx_name] = numba_type
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def register_cxx_type(cxx_name: str, numba_type: nbtypes.Type):
"""
Register an external C++ type in numbast's type registry.
After registration, ``to_numba_type(cxx_name)`` returns *numba_type*
instead of ``Opaque``. This is useful for binding types that were not
parsed by ast_canopy (e.g. third-party library types whose layout is
known but whose headers are too complex to parse).
Parameters:
cxx_name (str): The C++ type name as it appears in function signatures.
numba_type (nbtypes.Type): The Numba type to map it to.
"""
CTYPE_MAPS[cxx_name] = numba_type
def register_cxx_type(
cxx_name: str,
numba_type: nbtypes.Type,
*,
allow_override: bool = False,
) -> None:
"""
Register an external C++ type in numbast's type registry.
After registration, ``to_numba_type(cxx_name)`` returns *numba_type*
instead of ``Opaque``. This is useful for binding types that were not
parsed by ast_canopy (e.g. third-party library types whose layout is
known but whose headers are too complex to parse).
Parameters:
cxx_name (str): The C++ type name as it appears in function signatures.
numba_type (nbtypes.Type): The Numba type to map it to.
"""
if not allow_override and cxx_name in CTYPE_MAPS:
raise ValueError(f"C++ type '{cxx_name}' is already registered")
CTYPE_MAPS[cxx_name] = numba_type
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@numbast/src/numbast/types.py` around lines 94 - 108, register_cxx_type
currently overwrites entries in CTYPE_MAPS silently; add a safety guard in
register_cxx_type that checks if cxx_name is already present and raises a clear
exception (e.g., ValueError) unless an explicit allow_overwrite / force boolean
parameter is passed and true; include the existing mapped type in the error
message for clarity and update the function signature/docstring to document the
new parameter and behavior.

…Type guard

- record.cpp: Add #include <limits> for std::numeric_limits used by
  INVALID_SIZE_OF/INVALID_ALIGN_OF constants. The header was transitively
  included in conda builds but not in the manylinux wheel build container.

- type.cpp: Revert the isDependentType() guard on getCanonicalType().
  Template type parameters (e.g. T) are dependent types in Clang's type
  system, so the guard was preventing canonicalization for ALL template
  parameters, changing their printed name from "type-parameter-0-0" to "T"
  and breaking 6 existing tests. The null-type guard is retained. The
  actual crash cases (mangling, size computation) are already guarded in
  function.cpp and record.cpp respectively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@copy-pr-bot
Copy link
Copy Markdown
Contributor

copy-pr-bot bot commented Apr 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@isVoid
Copy link
Copy Markdown
Collaborator Author

isVoid commented Apr 15, 2026

/ok to test f6cebfb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant