Add name override for bind_cxx_struct and register_cxx_type API#320
Add name override for bind_cxx_struct and register_cxx_type API#320
Conversation
ast_canopy segfaulted or threw unhandled exceptions when parsing headers with deep template metaprogramming (CRTP hierarchies, dependent types, template template parameters, expression-based template arguments). Root causes and fixes: 1. function.cpp: mangleName() segfaults on dependent (uninstantiated) template types. Added has_dependent_signature() check to skip mangling for methods with dependent parameter/return types. Also fixed a memory leak (ItaniumMangleContext now uses unique_ptr). 2. type.cpp: getCanonicalType() crashes on dependent types. Added null- QualType guard and skip canonicalization for dependent types. 3. record.cpp: getTypeSize()/getTypeAlign() crash on dependent/incomplete types. Added isDependentType()/isIncompleteType() guards and try-catch around child declaration construction. 4. typedef.cpp: record_id_to_name->at() throws for template instantiations not in the map. Replaced with safe find() + fallback to Clang's name. Also added null guard for getAsCXXRecordDecl(). 5. template_param.cpp: TemplateTemplateParmDecl constructor threw. Now handles gracefully with kind=template_. 6. class_template_specialization.cpp: Unsupported template argument kinds (Pack, Template, Expression) threw. Now produces "<unsupported>" placeholder. 7. All matchers (7 files): Added try-catch around push_back calls so a single bad declaration is skipped instead of crashing the entire parse. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two changes to support binding class template specializations and externally-defined C++ types: 1. bind_cxx_struct: add optional `name` parameter When binding a class template specialization, struct_decl.name is the unqualified template name (e.g. "Matrix"), but the shim code and type registry need the fully qualified specialization name (e.g. "Namespace::Matrix<float, 3, 1>"). The new `name` parameter overrides struct_decl.name throughout: Numba type name, Python API class name, CTYPE_MAPS registration key, and shim templates. Plumbed through bind_cxx_struct_ctors, bind_cxx_struct_regular_methods, and bind_cxx_struct_conversion_opeartors. 2. register_cxx_type: new public API in types.py Allows registering external C++ types in numbast's type registry so that to_numba_type() returns the correct Numba type instead of Opaque. Useful for types whose headers are too complex to parse directly but whose layout is known. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughMultiple C++ AST parsing components and matchers now tolerate construction failures by catching exceptions, template/type handling was made more defensive (placeholders, null/ dependent guards), record and function processing avoid unsafe operations, new complex-template tests were added, and Numbast struct/type binding APIs gained naming and registration capabilities. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ast_canopy/cpp/src/detail/matchers/function_matcher.cpp`:
- Around line 30-36: The unconditional catch around constructing Function(FD)
swallows failures and silently omits entries from payload->decls->functions;
change this so failures are only bypassed when the caller explicitly opted into
best-effort parsing (the parse/bypass flag used by the caller), otherwise
rethrow or record the error. Locate the try/catch in the block guarded by
FD->isImplicit() and: 1) check the explicit best-effort/bypass flag before
swallowing exceptions, 2) if bypass is enabled, log a diagnostic entry (e.g.,
record the FD source location/name) indicating the dropped declaration instead
of silently ignoring it, and 3) if bypass is not enabled, propagate the
exception so the caller sees the failure (or convert it to a clear diagnostic)
rather than leaving payload->decls->functions partially populated.
In `@ast_canopy/cpp/src/detail/matchers/record_matcher.cpp`:
- Around line 56-64: The code updates record_id_map (via
payload->record_id_to_name) before constructing the Record, which can leave a
dangling map entry if Record(RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE,
rename_for_unnamed) throws; move the map assignment record_id_map[id] =
rename_for_unnamed inside the try block immediately before pushing into
payload->decls->records (or, if you must assign early, add an
erase(record_id_map, id) in the catch block) so map state and
payload->decls->records remain consistent; update references to record_id_map,
payload->record_id_to_name, payload->decls->records, Record, id, and
rename_for_unnamed accordingly.
In `@ast_canopy/cpp/src/record.cpp`:
- Around line 57-62: Replace the silent catch(...) blocks around the
field/method/nested record construction (e.g. the block that contains
fields.emplace_back(Field(FD, access))) with a two-tiered handler that logs what
was skipped: first catch(const std::exception& e) that emits a debug message
including the declaration kind and name (e.g. "field", FD->getNameAsString())
and e.what(), then a fallback catch(...) that emits a debug message with the
declaration kind and name but no exception text; apply the same pattern to the
corresponding try/catch sites for methods, templated_methods,
nested_class_templates, and nested_records so skipped declarations are reported
for debugging.
In `@ast_canopy/cpp/src/template.cpp`:
- Around line 21-38: The loop over clang::NamedDecl* in template.cpp currently
skips unrecognized template parameter kinds, causing template_parameters.size()
to differ from the original template arity; update the final else branch to
append a sentinel/placeholder TemplateParam (e.g., a default-constructed or
explicit "unknown" TemplateParam) to the template_parameters vector so
positional indexes remain stable; ensure the placeholder is distinguishable (via
an is_unknown flag or type) so downstream logic that uses template_parameters
and num_min_required_args (from the full TemplateParameterList) can detect and
handle unknown params without crashing.
In `@ast_canopy/tests/test_complex_template_parsing.py`:
- Around line 25-30: Create a module-scoped fixture named `decls` that calls the
current TU parse routine once (using the existing `source_path` fixture and the
same parse args currently duplicated across tests) and returns the parsed
declarations; update tests that currently reparse the CUDA TU to accept/use
`decls` instead of reparsing so the translation unit is parsed only once per
module (ensure the fixture name `decls` matches where tests expect parsed
declarations and reuse any helper like `parse_translation_unit` or the
project-specific parse function you currently call in each test).
In `@numbast/src/numbast/struct.py`:
- Around line 232-240: The shim symbol is being derived from struct_name (e.g.,
"Namespace::Matrix<float,3,1>") which injects invalid characters into the
generated extern "C" shim name; update bind_cxx_struct_conversion_opeartor and
the caller loop over struct_decl.conversion_operators() so the shim identifier
is generated from a safe canonical token (e.g., a mangled name, a stable hash of
s_type/struct_decl, or an incremental index) instead of raw struct_name; ensure
the new symbol contains only C identifier characters (letters, digits,
underscore) and use that sanitized symbol when emitting the extern "C"
__device__ function declaration in ShimWriter.
- Around line 479-482: The current registration only assigns C2N[struct_name] =
s_type which misses cases where bind_cxx_struct was called with a name override;
update bind_cxx_struct to also register the original declaration name by adding
C2N[struct_decl.name] = s_type when struct_decl.name != struct_name (and still
apply the aliases loop), mirroring the pattern used in class_template.py so
unqualified self-type AST references resolve correctly.
In `@numbast/src/numbast/types.py`:
- Around line 94-108: register_cxx_type currently overwrites entries in
CTYPE_MAPS silently; add a safety guard in register_cxx_type that checks if
cxx_name is already present and raises a clear exception (e.g., ValueError)
unless an explicit allow_overwrite / force boolean parameter is passed and true;
include the existing mapped type in the error message for clarity and update the
function signature/docstring to document the new parameter and behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 74b65c68-382c-45e1-bc94-a879c0b01820
📒 Files selected for processing (18)
ast_canopy/cpp/src/class_template_specialization.cppast_canopy/cpp/src/detail/matchers/class_template_matcher.cppast_canopy/cpp/src/detail/matchers/class_template_specialization_matcher.cppast_canopy/cpp/src/detail/matchers/enum_matcher.cppast_canopy/cpp/src/detail/matchers/function_matcher.cppast_canopy/cpp/src/detail/matchers/function_template_matcher.cppast_canopy/cpp/src/detail/matchers/record_matcher.cppast_canopy/cpp/src/detail/matchers/typedef_matcher.cppast_canopy/cpp/src/function.cppast_canopy/cpp/src/record.cppast_canopy/cpp/src/template.cppast_canopy/cpp/src/template_param.cppast_canopy/cpp/src/type.cppast_canopy/cpp/src/typedef.cppast_canopy/tests/data/sample_complex_templates.cuast_canopy/tests/test_complex_template_parsing.pynumbast/src/numbast/struct.pynumbast/src/numbast/types.py
| if (!FD->isImplicit()) { | ||
| try { | ||
| payload->decls->functions.push_back(Function(FD)); | ||
| } catch (...) { | ||
| // Skip functions that cannot be processed (e.g. those with | ||
| // dependent types or other issues during construction). | ||
| } |
There was a problem hiding this comment.
Don't silently drop functions unless the caller opted into best-effort parsing.
This catch is unconditional, so a failed Function(FD) now yields a partial decls->functions result even when the caller did not request bypass behavior, and nothing records which symbol was skipped. Please gate the swallow path on the parse/bypass flag and/or persist a diagnostic for the dropped declaration.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ast_canopy/cpp/src/detail/matchers/function_matcher.cpp` around lines 30 -
36, The unconditional catch around constructing Function(FD) swallows failures
and silently omits entries from payload->decls->functions; change this so
failures are only bypassed when the caller explicitly opted into best-effort
parsing (the parse/bypass flag used by the caller), otherwise rethrow or record
the error. Locate the try/catch in the block guarded by FD->isImplicit() and: 1)
check the explicit best-effort/bypass flag before swallowing exceptions, 2) if
bypass is enabled, log a diagnostic entry (e.g., record the FD source
location/name) indicating the dropped declaration instead of silently ignoring
it, and 3) if bypass is not enabled, propagate the exception so the caller sees
the failure (or convert it to a clear diagnostic) rather than leaving
payload->decls->functions partially populated.
| auto &record_id_map = *payload->record_id_to_name; | ||
| record_id_map[id] = rename_for_unnamed; | ||
| payload->decls->records.push_back(Record( | ||
| RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed)); | ||
| try { | ||
| payload->decls->records.push_back(Record( | ||
| RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed)); | ||
| } catch (...) { | ||
| // Skip records that cannot be fully processed (e.g. records with | ||
| // dependent child declarations that trigger issues). | ||
| } |
There was a problem hiding this comment.
Potential inconsistency: record_id_map updated before Record construction may fail.
If Record construction throws, the record_id_map[id] entry persists but no corresponding Record is added to payload->decls->records. This could cause downstream code (like typedef resolution using record_id_to_name) to reference records that don't exist.
🛠️ Proposed fix: Move map update inside try block or remove on failure
auto &record_id_map = *payload->record_id_to_name;
- record_id_map[id] = rename_for_unnamed;
try {
+ record_id_map[id] = rename_for_unnamed;
payload->decls->records.push_back(Record(
RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed));
} catch (...) {
// Skip records that cannot be fully processed (e.g. records with
// dependent child declarations that trigger issues).
}Alternatively, erase the entry in the catch block if insertion into record_id_map must happen early for some reason.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| auto &record_id_map = *payload->record_id_to_name; | |
| record_id_map[id] = rename_for_unnamed; | |
| payload->decls->records.push_back(Record( | |
| RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed)); | |
| try { | |
| payload->decls->records.push_back(Record( | |
| RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed)); | |
| } catch (...) { | |
| // Skip records that cannot be fully processed (e.g. records with | |
| // dependent child declarations that trigger issues). | |
| } | |
| auto &record_id_map = *payload->record_id_to_name; | |
| try { | |
| record_id_map[id] = rename_for_unnamed; | |
| payload->decls->records.push_back(Record( | |
| RD, RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed)); | |
| } catch (...) { | |
| // Skip records that cannot be fully processed (e.g. records with | |
| // dependent child declarations that trigger issues). | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ast_canopy/cpp/src/detail/matchers/record_matcher.cpp` around lines 56 - 64,
The code updates record_id_map (via payload->record_id_to_name) before
constructing the Record, which can leave a dangling map entry if Record(RD,
RecordAncestor::ANCESTOR_IS_NOT_TEMPLATE, rename_for_unnamed) throws; move the
map assignment record_id_map[id] = rename_for_unnamed inside the try block
immediately before pushing into payload->decls->records (or, if you must assign
early, add an erase(record_id_map, id) in the catch block) so map state and
payload->decls->records remain consistent; update references to record_id_map,
payload->record_id_to_name, payload->decls->records, Record, id, and
rename_for_unnamed accordingly.
| try { | ||
| fields.emplace_back(Field(FD, access)); | ||
| } catch (...) { | ||
| // Skip fields that cannot be processed (e.g. dependent types in | ||
| // uninstantiated templates). | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider adding debug logging for skipped declarations.
The broad catch (...) pattern is appropriate for parser resilience, but silently swallowing exceptions makes debugging harder when legitimate declarations are unexpectedly skipped.
🔧 Optional: Add debug logging for skipped declarations
if (auto const *FD = clang::dyn_cast<clang::FieldDecl>(D)) {
try {
fields.emplace_back(Field(FD, access));
} catch (...) {
- // Skip fields that cannot be processed (e.g. dependent types in
- // uninstantiated templates).
+#ifndef NDEBUG
+ std::cerr << "Skipped field: " << FD->getNameAsString()
+ << " in " << name << std::endl;
+#endif
}
}Apply similar logging to the other catch blocks for methods, templated_methods, nested_class_templates, and nested_records.
Also applies to: 71-76, 80-84, 88-92, 96-100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ast_canopy/cpp/src/record.cpp` around lines 57 - 62, Replace the silent
catch(...) blocks around the field/method/nested record construction (e.g. the
block that contains fields.emplace_back(Field(FD, access))) with a two-tiered
handler that logs what was skipped: first catch(const std::exception& e) that
emits a debug message including the declaration kind and name (e.g. "field",
FD->getNameAsString()) and e.what(), then a fallback catch(...) that emits a
debug message with the declaration kind and name but no exception text; apply
the same pattern to the corresponding try/catch sites for methods,
templated_methods, nested_class_templates, and nested_records so skipped
declarations are reported for debugging.
| for (const clang::NamedDecl *ND : *TPL) { | ||
| if (const clang::TemplateTypeParmDecl *TPD = | ||
| clang::dyn_cast<clang::TemplateTypeParmDecl>(ND)) { | ||
| template_parameters.push_back(TemplateParam(TPD)); | ||
| } else if (const clang::NonTypeTemplateParmDecl *TPD = | ||
| clang::dyn_cast<clang::NonTypeTemplateParmDecl>(ND)) { | ||
| template_parameters.push_back(TemplateParam(TPD)); | ||
| } else if (const clang::TemplateTemplateParmDecl *TPD = | ||
| clang::dyn_cast<clang::TemplateTemplateParmDecl>(ND)) { | ||
| // Template template parameters (e.g. template<template<class> class X>). | ||
| // Use the TemplateTemplateParmDecl overload which now handles this | ||
| // gracefully instead of throwing. | ||
| template_parameters.push_back(TemplateParam(TPD)); | ||
| } else { | ||
| // Unknown template parameter kind -- skip it rather than crashing. | ||
| // This can happen with future Clang additions or unusual AST nodes. | ||
| } | ||
| } |
There was a problem hiding this comment.
Preserve template parameter arity when the kind is unknown.
num_min_required_args still comes from the full TemplateParameterList, but this loop now omits unrecognized parameters entirely. That makes template_parameters.size() diverge from the source template’s actual arity and can break any downstream positional logic. Push an explicit placeholder/sentinel TemplateParam instead of skipping the entry.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ast_canopy/cpp/src/template.cpp` around lines 21 - 38, The loop over
clang::NamedDecl* in template.cpp currently skips unrecognized template
parameter kinds, causing template_parameters.size() to differ from the original
template arity; update the final else branch to append a sentinel/placeholder
TemplateParam (e.g., a default-constructed or explicit "unknown" TemplateParam)
to the template_parameters vector so positional indexes remain stable; ensure
the placeholder is distinguishable (via an is_unknown flag or type) so
downstream logic that uses template_parameters and num_min_required_args (from
the full TemplateParameterList) can detect and handle unknown params without
crashing.
| @pytest.fixture(scope="module") | ||
| def source_path(): | ||
| current_directory = os.path.dirname(os.path.abspath(__file__)) | ||
| return os.path.join( | ||
| current_directory, "data", "sample_complex_templates.cu" | ||
| ) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Share a parsed declarations fixture across this module.
Every test reparses the same CUDA TU with the same arguments. A module-scoped decls fixture would keep the coverage but cut the clang parse cost to once per module, which should make this suite noticeably faster and less flaky in CI.
Also applies to: 41-145
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ast_canopy/tests/test_complex_template_parsing.py` around lines 25 - 30,
Create a module-scoped fixture named `decls` that calls the current TU parse
routine once (using the existing `source_path` fixture and the same parse args
currently duplicated across tests) and returns the parsed declarations; update
tests that currently reparse the CUDA TU to accept/use `decls` instead of
reparsing so the translation unit is parsed only once per module (ensure the
fixture name `decls` matches where tests expect parsed declarations and reuse
any helper like `parse_translation_unit` or the project-specific parse function
you currently call in each test).
| struct_decl: Struct, | ||
| struct_name: str, | ||
| s_type: nbtypes.Type, | ||
| shim_writer: ShimWriter, | ||
| ): | ||
| """Bind all conversion operators for a C++ struct.""" | ||
| for conv in struct_decl.conversion_operators(): | ||
| bind_cxx_struct_conversion_opeartor( | ||
| conv, struct_decl.name, s_type, shim_writer | ||
| conv, struct_name, s_type, shim_writer |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Trace conversion-operator symbol generation"
rg -nC4 'bind_cxx_struct_conversion_opeartor|deduplicate_overloads|shim_func_name|struct_name' numbast/src/numbast/struct.py
echo
echo "Quick identifier sanity check for specialization-style names"
python - <<'PY'
import re
struct_name = "Namespace::Matrix<float,3,1>"
candidate = f"__{struct_name}__Zdummy"
print("candidate:", candidate)
print("is_c_identifier:", bool(re.match(r'^[A-Za-z_][A-Za-z0-9_]*$', candidate)))
PYRepository: NVIDIA/numbast
Length of output: 7607
🏁 Script executed:
rg -A5 'struct_method_shim_layer_template\s*=' numbast/src/numbast/struct.pyRepository: NVIDIA/numbast
Length of output: 245
Do not derive shim symbol identifiers from struct_name.
With name="Namespace::Matrix<float,3,1>", the conversion-operator path feeds non-identifier characters (:, <, ,, >) into the generated shim function name via extern "C" __device__ int {func_name}(...), causing C/C++ compile failures.
Safe symbol-generation fix
def bind_cxx_struct_conversion_opeartor(
conv: StructMethod,
struct_name: str,
s_type: nbtypes.Type,
shim_writer: ShimWriter,
):
casted_type = to_numba_type(conv.return_type.unqualified_non_ref_type_name)
# Lowering:
- mangled_name = deduplicate_overloads(
- f"__{struct_name}__{conv.mangled_name}"
- )
+ # Keep user-facing C++ type spelling for shim signature (`name=struct_name`),
+ # but use a symbol-safe key for generated function identifiers.
+ mangled_name = deduplicate_overloads(f"__{conv.mangled_name}")
shim_func_name = f"{mangled_name}_nbst"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@numbast/src/numbast/struct.py` around lines 232 - 240, The shim symbol is
being derived from struct_name (e.g., "Namespace::Matrix<float,3,1>") which
injects invalid characters into the generated extern "C" shim name; update
bind_cxx_struct_conversion_opeartor and the caller loop over
struct_decl.conversion_operators() so the shim identifier is generated from a
safe canonical token (e.g., a mangled name, a stable hash of s_type/struct_decl,
or an incremental index) instead of raw struct_name; ensure the new symbol
contains only C identifier characters (letters, digits, underscore) and use that
sanitized symbol when emitting the extern "C" __device__ function declaration in
ShimWriter.
| C2N[struct_name] = s_type | ||
| if struct_name in aliases: | ||
| for alias in aliases[struct_name]: | ||
| C2N[alias] = s_type |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Show where unqualified names are used for type resolution in struct binding"
rg -nC3 'unqualified_non_ref_type_name|to_numba_arg_type|to_numba_type' numbast/src/numbast/struct.py numbast/src/numbast/types.py
echo
echo "Find current tests/uses of bind_cxx_struct name override"
rg -nC2 'bind_cxx_struct\(|name=' --glob '**/*test*.py' --glob '**/*.py'Repository: NVIDIA/numbast
Length of output: 50370
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Get the full function signature and first few lines of bind_cxx_struct
echo "=== bind_cxx_struct function signature and context ==="
sed -n '420,450p' numbast/src/numbast/struct.py
echo
echo "=== Where struct_name variable is assigned within bind_cxx_struct ==="
sed -n '420,490p' numbast/src/numbast/struct.py | grep -n 'struct_name'
echo
echo "=== Look for usages of bind_cxx_struct with explicit 'name' parameter ==="
rg -A2 'bind_cxx_struct\s*\(' numbast/tests/ numbast/src/numbast/ | head -30Repository: NVIDIA/numbast
Length of output: 2459
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the actual lines around 479-482 to see full context
echo "=== Full context around lines 479-482 of struct.py ==="
sed -n '465,490p' numbast/src/numbast/struct.py
echo
echo "=== Look for 'name' parameter usage in bind_cxx_struct ==="
sed -n '420,530p' numbast/src/numbast/struct.py | grep -n 'name\s*='Repository: NVIDIA/numbast
Length of output: 1196
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check function signature more explicitly
echo "=== Full bind_cxx_struct signature ==="
sed -n '420,445p' numbast/src/numbast/struct.py
echo
echo "=== Check if struct_decl has a 'name' attribute being used elsewhere ==="
rg 'struct_decl\.name' numbast/src/numbast/struct.py | head -20Repository: NVIDIA/numbast
Length of output: 1576
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for calls to bind_cxx_struct with explicit name parameter
echo "=== Calls to bind_cxx_struct with explicit 'name=' parameter ==="
rg 'bind_cxx_struct.*name\s*=' --multiline numbast/src/numbast/
echo
echo "=== Check class_template.py for name override usage ==="
rg -C5 'bind_cxx_struct' numbast/src/numbast/class_template.py | head -50Repository: NVIDIA/numbast
Length of output: 122
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for where struct_decl.name might be used in type resolution after bind_cxx_struct
echo "=== Search for unqualified self-type resolution in templated structs ==="
grep -n 'qual_name\|unqualified_non_ref_type_name' numbast/src/numbast/class_template.py | head -20
echo
echo "=== Check if struct_decl.name is referenced in method/ctor binding within struct.py ==="
sed -n '420,600p' numbast/src/numbast/struct.py | grep -n 'struct_decl\.' | head -15Repository: NVIDIA/numbast
Length of output: 2211
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Verify if the 'name' parameter to bind_cxx_struct is ever actually used
echo "=== All calls to bind_cxx_struct in the entire codebase ==="
rg -B2 -A5 'bind_cxx_struct\(' numbast/
echo
echo "=== Check static/struct.py for how it handles struct names ==="
rg 'struct_name|qual_name' numbast/src/numbast/static/struct.py | head -30Repository: NVIDIA/numbast
Length of output: 2980
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for tests that might verify name override behavior
echo "=== Search test files for bind_cxx_struct usage ==="
find numbast/tests -name '*.py' -exec grep -l 'bind_cxx_struct' {} \;
echo
echo "=== Check if there are any tests with name parameter ==="
grep -r 'bind_cxx_struct.*name\s*=' numbast/tests/ || echo "No matches found"
echo
echo "=== Verify the actual implementation path for the 'name' parameter ==="
sed -n '463,475p' numbast/src/numbast/struct.pyRepository: NVIDIA/numbast
Length of output: 643
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check class_template.py more carefully to see if bind_cxx_struct is called there
echo "=== Occurrences of bind_cxx_struct in class_template.py ==="
grep -n 'bind_cxx_struct' numbast/src/numbast/class_template.py
echo
echo "=== Look for any dynamic calls or indirect invocations ==="
rg 'bind_cxx_struct' numbast/src/numbast/ --type py
echo
echo "=== Check if the 'name' parameter docstring is actually documented ==="
sed -n '420,455p' numbast/src/numbast/struct.py | tail -20Repository: NVIDIA/numbast
Length of output: 4425
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for any scenario where struct_decl.name might differ from the registered struct_name
echo "=== Check if there are comment or type hints suggesting name override scenarios ==="
sed -n '420,465p' numbast/src/numbast/struct.py
echo
echo "=== Verify if methods/ctors in struct use struct_decl.name anywhere ==="
rg 'struct_decl\.name' numbast/src/numbast/struct.py
echo
echo "=== Check how static binding handles this ==="
grep -n 'struct_name\|python_struct_name' numbast/src/numbast/static/struct.py | head -10Repository: NVIDIA/numbast
Length of output: 3224
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if methods/constructors emit unqualified self-type references
echo "=== Look for where self-type (unqualified struct name) might be emitted in method bindings ==="
sed -n '250,290p' numbast/src/numbast/struct.py
echo
echo "=== Check return types in method binding ==="
sed -n '265,280p' numbast/src/numbast/struct.pyRepository: NVIDIA/numbast
Length of output: 2738
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for how class template specializations handle struct names in practice
echo "=== Check if class_template actually uses the name parameter ==="
grep -n 'bind_cxx_struct(' numbast/src/numbast/class_template.py | head -5
echo
echo "=== Look for class template specialization handling ==="
sed -n '850,900p' numbast/src/numbast/class_template.py | head -40Repository: NVIDIA/numbast
Length of output: 122
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Let's search for how class templates actually handle struct binding
echo "=== How does bind_cxx_structs (plural) work - does it pass name parameter? ==="
sed -n '560,610p' numbast/src/numbast/struct.py
echo
echo "=== Check if static struct binding has similar structure ==="
grep -n 'bind.*struct\|struct_name\|qual_name' numbast/src/numbast/static/struct.py | grep -A2 -B2 'qual_name'Repository: NVIDIA/numbast
Length of output: 2042
🏁 Script executed:
#!/bin/bash
set -euo zipfail
# Verify: does a method returning the struct type need to look up struct_decl.name?
echo "=== Check constructor that returns self-type ==="
sed -n '55,75p' numbast/src/numbast/struct.py
echo
echo "=== If methods return unqualified struct name, how is it resolved? ==="
grep -n 'return.*struct\|return_type' numbast/src/numbast/struct.py | head -20Repository: NVIDIA/numbast
Length of output: 110
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Direct test: create scenario where struct_name != struct_decl.name
# and see if C2N lookup would fail for struct_decl.name
echo "=== Trace through: if struct_name='Foo<int>' but struct_decl.name='Foo', what gets registered? ==="
sed -n '479,485p' numbast/src/numbast/struct.py
echo
echo "=== If a method body references unqualified 'Foo' type, would it resolve? ==="
echo "This would only fail if:"
echo "1. struct_name is overridden (name != struct_decl.name)"
echo "2. Method AST emits unqualified struct_decl.name reference"
echo "3. Code tries to lookup that unqualified name in C2N"
echo
echo "=== Search for any mechanism that would use struct_decl.name for type lookup ==="
rg 'struct_decl\.name' numbast/src/numbast/ --type pyRepository: NVIDIA/numbast
Length of output: 1580
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Critical finding: class_template.py DOES register struct_decl.name!
echo "=== COMPARISON: struct.py vs class_template.py C2N registration ==="
echo "--- struct.py lines 479-485 ---"
sed -n '479,485p' numbast/src/numbast/struct.py
echo
echo "--- class_template.py C2N registration ---"
rg -A3 'C2N\[struct_decl\.name\]' numbast/src/numbast/class_template.py
echo
echo "=== Where does class_template.py define this? Context around that registration ==="
rg -B10 -A5 'C2N\[struct_decl\.name\]' numbast/src/numbast/class_template.py | head -30Repository: NVIDIA/numbast
Length of output: 1186
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if bind_cxx_struct name param is ever actually used in the codebase
echo "=== Final check: is the 'name' parameter ever supplied to bind_cxx_struct? ==="
rg 'bind_cxx_struct\(' numbast/ -A3 | grep -E 'name\s*='
echo
echo "=== If no matches, the 'name' feature is documented but never used ==="
echo "This confirms the gap: name parameter exists but isn't used, and struct_decl.name isn't registered separately"Repository: NVIDIA/numbast
Length of output: 136
Register struct_decl.name separately to support name override scenarios.
The name parameter to bind_cxx_struct allows overriding the registered struct name (documented for class template specializations). When struct_name differs from struct_decl.name, only registering C2N[struct_name] misses unqualified self-type references that may be emitted by the AST. The pattern in class_template.py correctly registers both names separately.
Backward-compatible registration tweak
C2N[struct_name] = s_type
+ if struct_name != struct_decl.name:
+ C2N[struct_decl.name] = s_type
if struct_name in aliases:
for alias in aliases[struct_name]:
C2N[alias] = s_type🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@numbast/src/numbast/struct.py` around lines 479 - 482, The current
registration only assigns C2N[struct_name] = s_type which misses cases where
bind_cxx_struct was called with a name override; update bind_cxx_struct to also
register the original declaration name by adding C2N[struct_decl.name] = s_type
when struct_decl.name != struct_name (and still apply the aliases loop),
mirroring the pattern used in class_template.py so unqualified self-type AST
references resolve correctly.
| def register_cxx_type(cxx_name: str, numba_type: nbtypes.Type): | ||
| """ | ||
| Register an external C++ type in numbast's type registry. | ||
|
|
||
| After registration, ``to_numba_type(cxx_name)`` returns *numba_type* | ||
| instead of ``Opaque``. This is useful for binding types that were not | ||
| parsed by ast_canopy (e.g. third-party library types whose layout is | ||
| known but whose headers are too complex to parse). | ||
|
|
||
| Parameters: | ||
| cxx_name (str): The C++ type name as it appears in function signatures. | ||
| numba_type (nbtypes.Type): The Numba type to map it to. | ||
| """ | ||
| CTYPE_MAPS[cxx_name] = numba_type | ||
|
|
There was a problem hiding this comment.
Add an overwrite guard for registry safety.
register_cxx_type currently allows silent replacement of existing keys in CTYPE_MAPS. A typo like "int" would globally remap built-ins and break downstream bindings.
Suggested hardening
-def register_cxx_type(cxx_name: str, numba_type: nbtypes.Type):
+def register_cxx_type(
+ cxx_name: str,
+ numba_type: nbtypes.Type,
+ *,
+ allow_override: bool = False,
+) -> None:
@@
- CTYPE_MAPS[cxx_name] = numba_type
+ if not allow_override and cxx_name in CTYPE_MAPS:
+ raise ValueError(f"C++ type '{cxx_name}' is already registered")
+ CTYPE_MAPS[cxx_name] = numba_type📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def register_cxx_type(cxx_name: str, numba_type: nbtypes.Type): | |
| """ | |
| Register an external C++ type in numbast's type registry. | |
| After registration, ``to_numba_type(cxx_name)`` returns *numba_type* | |
| instead of ``Opaque``. This is useful for binding types that were not | |
| parsed by ast_canopy (e.g. third-party library types whose layout is | |
| known but whose headers are too complex to parse). | |
| Parameters: | |
| cxx_name (str): The C++ type name as it appears in function signatures. | |
| numba_type (nbtypes.Type): The Numba type to map it to. | |
| """ | |
| CTYPE_MAPS[cxx_name] = numba_type | |
| def register_cxx_type( | |
| cxx_name: str, | |
| numba_type: nbtypes.Type, | |
| *, | |
| allow_override: bool = False, | |
| ) -> None: | |
| """ | |
| Register an external C++ type in numbast's type registry. | |
| After registration, ``to_numba_type(cxx_name)`` returns *numba_type* | |
| instead of ``Opaque``. This is useful for binding types that were not | |
| parsed by ast_canopy (e.g. third-party library types whose layout is | |
| known but whose headers are too complex to parse). | |
| Parameters: | |
| cxx_name (str): The C++ type name as it appears in function signatures. | |
| numba_type (nbtypes.Type): The Numba type to map it to. | |
| """ | |
| if not allow_override and cxx_name in CTYPE_MAPS: | |
| raise ValueError(f"C++ type '{cxx_name}' is already registered") | |
| CTYPE_MAPS[cxx_name] = numba_type |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@numbast/src/numbast/types.py` around lines 94 - 108, register_cxx_type
currently overwrites entries in CTYPE_MAPS silently; add a safety guard in
register_cxx_type that checks if cxx_name is already present and raises a clear
exception (e.g., ValueError) unless an explicit allow_overwrite / force boolean
parameter is passed and true; include the existing mapped type in the error
message for clarity and update the function signature/docstring to document the
new parameter and behavior.
…Type guard - record.cpp: Add #include <limits> for std::numeric_limits used by INVALID_SIZE_OF/INVALID_ALIGN_OF constants. The header was transitively included in conda builds but not in the manylinux wheel build container. - type.cpp: Revert the isDependentType() guard on getCanonicalType(). Template type parameters (e.g. T) are dependent types in Clang's type system, so the guard was preventing canonicalization for ALL template parameters, changing their printed name from "type-parameter-0-0" to "T" and breaking 6 existing tests. The null-type guard is retained. The actual crash cases (mangling, size computation) are already guarded in function.cpp and record.cpp respectively. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ok to test f6cebfb |
Summary
nameparameter tobind_cxx_structfor overriding the C++ name used in shims and type registrationregister_cxx_type()public API for registering external C++ types in numbast's type registryMotivation
When binding class template specializations,
struct_decl.nameis the unqualified template name (e.g."Matrix"), but the shim code and type registry need the fully qualified specialization name (e.g."Namespace::Matrix<float, 3, 1>"). Without a name override, users must wrap the declaration object — thenameparameter provides a clean way to do this.Similarly, when binding functions that use types from external C++ libraries, those types may not be parseable by ast_canopy but their memory layout is known.
register_cxx_type()lets users register these types soto_numba_type()returns the correct Numba type instead ofOpaque.Changes
bind_cxx_struct(name=...)—struct.pyNew keyword-only parameter
name: str | None = None. When provided, overridesstruct_decl.namethroughout:CTYPE_MAPSregistration keyPlumbed through
bind_cxx_struct_ctors,bind_cxx_struct_regular_methods, andbind_cxx_struct_conversion_opeartorsvia a newstruct_nameparameter.register_cxx_type(cxx_name, numba_type)—types.pySimple public function that registers a C++ type name → Numba type mapping in
CTYPE_MAPS. Follows the same pattern as the existingregister_enum_type().Test plan
nameis not provided)bind_cxx_struct(..., name="Foo::Bar<int>")uses the override in shims and type registry🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
New Features
Tests