Fix #775: handle empty datasets in get_type with informative error #1421

Open
sejalpunwatkar wants to merge 5 commits into hdmf-dev:dev from sejalpunwatkar:fix-empty-dataset-validation

Conversation

@sejalpunwatkar
Contributor

This PR addresses issue #775 where validation of datasets with empty shapes caused an IndexError.

Changes:

  • Updated get_type in validator.py to check for empty/null data before indexing.
  • Replaced the generic IndexError with a more informative EmptyArrayError.
  • Added a unit test in tests/unit/test_validator.py to verify the fix and prevent regressions.
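The guard-before-index pattern described in the list above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `infer_type` and `FakeEmptyDataset` are hypothetical names, and `EmptyArrayError` is redefined locally so the snippet runs without hdmf installed.

```python
import unittest


class EmptyArrayError(Exception):
    """Raised when a type cannot be inferred because the data has no elements."""


class FakeEmptyDataset:
    """Hypothetical stand-in for an on-disk dataset with shape (0,)."""

    shape = (0,)

    def __len__(self):
        return 0

    def __getitem__(self, index):
        # Mirrors the failure mode reported in #775.
        raise IndexError("Index (0) out of range for empty dimension")


def infer_type(data):
    """Simplified, hypothetical analogue of get_type (not the hdmf implementation)."""
    if isinstance(data, str):
        return "utf"
    if isinstance(data, bytes):
        return "ascii"
    if not hasattr(data, "__len__"):
        # Scalar: report the Python type name.
        return type(data).__name__
    if len(data) == 0:
        # Guard before indexing so the caller gets a specific, catchable error.
        raise EmptyArrayError("cannot infer the type of an empty dataset")
    # Non-empty container: recurse into the first element.
    return infer_type(data[0])


class TestInferTypeEmpty(unittest.TestCase):
    def test_empty_dataset_raises_informative_error(self):
        with self.assertRaises(EmptyArrayError):
            infer_type(FakeEmptyDataset())

    def test_non_empty_data_is_still_indexed(self):
        self.assertEqual(infer_type([[1, 2], [3, 4]]), "int")
```

Raising a dedicated exception rather than letting `IndexError` propagate lets the calling validation code distinguish "no data to inspect" from a genuine indexing bug.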

Fixes #775

@codecov

codecov bot commented Mar 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.76%. Comparing base (0897b93) to head (42aa600).

Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1421      +/-   ##
==========================================
- Coverage   92.85%   92.76%   -0.09%     
==========================================
  Files          41       41              
  Lines        9989     9993       +4     
  Branches     2054     2056       +2     
==========================================
- Hits         9275     9270       -5     
- Misses        436      441       +5     
- Partials      278      282       +4     

@sejalpunwatkar force-pushed the fix-empty-dataset-validation branch 3 times, most recently from e57ee4e to 3a9b976 on March 6, 2026 20:53
@sejalpunwatkar
Contributor Author

Hi @rly, I've updated the PR to fix #775. I added the safety check for empty datasets in get_type and included a corresponding unit test.
Note on file changes: You will notice a large number of line changes (1018 insertions, 845 deletions) in the modified files. This is because I ran ruff format to resolve the linting failures in CI and align these specific files with the project's updated style standards. I also simplified the str and bytes checks in get_type to keep the function within the complexity limit enforced by rule C901.
I see that some environment-specific tests are still failing; I am looking into those logs now and will provide an update shortly.

@rly
Contributor

rly commented Mar 6, 2026

The optional and zarr tests are expected to fail until the next pynwb release, so those are OK.

The ruff / style changes are significant and should be moved to a separate PR so that this one is easy to review and self-contained. However, I don't think ruff should be complaining about changing single quotes to double quotes if you are running it from the repo, where it should use the configuration specified in pyproject.toml.

Could you please limit this PR to just the changes relevant to the fix?

@sejalpunwatkar force-pushed the fix-empty-dataset-validation branch from 3a9b976 to 996d7f4 on March 7, 2026 08:04
@sejalpunwatkar
Contributor Author

@rly, I've cleaned up the PR. It now contains only the fix for #775 and the new test (16 lines total). I've left out the bulk formatting changes to keep this PR self-contained. The remaining failures are the Zarr and optional tests that you said are expected to fail.

Development

Successfully merging this pull request may close these issues.

[Bug]: Validation returns "IndexError: Index (0) out of range for empty dimension" for dataset with empty shape

2 participants