Handle 0-d ndarrays in scalar isinstance checks #1415

Open
h-mayorquin wants to merge 8 commits into remove_duck_typing_for_array_detection from remove_duck_typing_for_type

@h-mayorquin
Contributor

Chained after #1414 (_is_collection). Motivated by hdmf-dev/hdmf-zarr#325 (zarr v2 to v3 migration).

hdmf's get_type functions infer element dtype by recursively indexing with data[0] until reaching a scalar, then calling type() on it. With numpy and zarr v2, data[0] on a 1-d float array returns a numpy scalar (e.g., numpy.float64), which passes isinstance(val, float) and has no __len__, so all downstream checks work. With zarr v3 (following the Python array API standard), data[0] returns a 0-d ndarray instead. A 0-d ndarray fails isinstance(val, (int, float, str, bool)) and type() returns numpy.ndarray rather than the element dtype. PR #1414 fixed the crash path (__len__ heuristic), but isinstance checks in other parts of the codebase still silently take the wrong branch when they encounter a 0-d ndarray.
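The difference described above can be reproduced with numpy alone. The snippet below is a minimal sketch, not hdmf or zarr code: it assumes that wrapping a numpy scalar with np.asarray is a fair stand-in for the 0-d ndarray that zarr v3 returns from data[0].

```python
import numpy as np

data = np.array([1.5, 2.5, 3.5])

# numpy (and, per the description above, zarr v2): indexing yields a numpy scalar.
np_scalar = data[0]

# Assumption: a 0-d ndarray built with np.asarray stands in for zarr v3's data[0].
zero_d = np.asarray(data[0])

# The numpy scalar is a float subclass, so the usual isinstance check passes.
print(type(np_scalar).__name__, isinstance(np_scalar, float))

# The 0-d ndarray is not a Python scalar, and type() reports ndarray, not the dtype.
print(type(zero_d).__name__, zero_d.shape, isinstance(zero_d, (int, float, str, bool)))

# Calling len() on a 0-d ndarray raises TypeError.
try:
    len(zero_d)
except TypeError as exc:
    print("len() failed:", exc)
```

The element dtype is still available as zero_d.dtype; only type() and the isinstance checks lose it.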

This PR adds a _unwrap_scalar helper in hdmf.utils that converts 0-d ndarrays to Python scalars via .item() and applies it at the remaining isinstance checks that compare against Python scalar types.
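For readers who have not opened the diff, here is a rough sketch of what such a helper could look like. The name unwrap_scalar_sketch and its body are illustrative assumptions; the actual _unwrap_scalar in hdmf.utils may differ. Note that ndarray.item() returns a built-in Python scalar (int, float, str, bool), which is what lets the later isinstance checks pass.

```python
import numpy as np


def unwrap_scalar_sketch(value):
    """Hypothetical stand-in for the PR's _unwrap_scalar helper.

    Returns a Python scalar for 0-d ndarrays and passes everything else through.
    """
    if isinstance(value, np.ndarray) and value.ndim == 0:
        # .item() copies the single element out as a built-in Python object.
        return value.item()
    return value


# 0-d arrays are unwrapped before the isinstance check.
val = unwrap_scalar_sketch(np.array(3))
print(isinstance(val, (int, float, str, bool)))

# Arrays with one or more dimensions, and plain Python values, are returned unchanged.
arr = np.array([1, 2, 3])
print(unwrap_scalar_sketch(arr) is arr)
print(unwrap_scalar_sketch("text"))
```

Doing the unwrap at each call site keeps the existing isinstance checks untouched, at the cost of one extra function call per check.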

Together with #1414, this eliminates the need for the __getitem__ monkey-patch in hdmf-zarr PR #325.

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Does the PR clearly describe the problem and the solution?
  • Have you reviewed our Contributing Guide?
  • Does the PR use "Fix #XXX" notation to tell GitHub to close the relevant issue numbered XXX when the PR is merged?

@codecov

codecov bot commented Mar 3, 2026

Codecov Report

❌ Patch coverage is 84.21053% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.87%. Comparing base (3bf5c8f) to head (dac32ea).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/hdmf/backends/hdf5/h5tools.py | 0.00% | 1 Missing and 1 partial ⚠️ |
| src/hdmf/common/table.py | 85.71% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@                           Coverage Diff                           @@
##           remove_duck_typing_for_array_detection    #1415   +/-   ##
=======================================================================
  Coverage                                   92.87%   92.87%           
=======================================================================
  Files                                          41       41           
  Lines                                       10007    10014    +7     
  Branches                                     2060     2061    +1     
=======================================================================
+ Hits                                         9294     9301    +7     
  Misses                                        435      435           
  Partials                                      278      278           


h-mayorquin and others added 6 commits March 3, 2026 02:20
zarr v3 scalar indexing returns 0-d ndarrays, which fail check_type(arg, int).
Unwrap before the type check so ElementIdentifiers validation works with zarr v3 arrays.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… guards

- container.py: Data.__len__ uses _get_length for zarr v3 Arrays without __len__
- h5tools.py: use _get_length when sizing datasets during export
- objectmapper.py: use _get_length for compound dtype shape, unwrap 0-d rows

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
VectorData.get, DynamicTableRegion.get, DynamicTableRegion.shape,
DynamicTable.add_column, and EnumData.__add_term all call len() on
self.data which fails with zarr v3 Arrays that lack __len__.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
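
Several of the commits above replace len(self.data) with a _get_length helper. The sketch below shows one way such a helper could work. It is an assumption for illustration only: get_length_sketch and ShapeOnlyArray are hypothetical names, and the real _get_length in hdmf may be implemented differently.

```python
import numpy as np


def get_length_sketch(data):
    """Hypothetical stand-in for _get_length: size of the first dimension."""
    shape = getattr(data, "shape", None)
    if shape is None:
        # Lists, tuples and other sized containers without a shape attribute.
        return len(data)
    if len(shape) == 0:
        # Mirror len() on a 0-d ndarray, which also raises TypeError.
        raise TypeError("0-d object has no length")
    return shape[0]


class ShapeOnlyArray:
    """Toy array that exposes shape but no __len__, as the commits describe for zarr v3."""

    def __init__(self, shape):
        self.shape = shape


print(get_length_sketch(ShapeOnlyArray((4, 2))))
print(get_length_sketch(np.zeros((5,))))
print(get_length_sketch([1, 2, 3]))
```

Checking shape before falling back to len() means the same call works for numpy arrays, shape-only array objects, and plain Python sequences.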