Conversation
|
@claude review this PR. |
|
Claude finished @hanaol's task in 1m 8s —— View job PR Review: Add zarr support to dataloaderTasks
Overall, the implementation is clean and minimal. A few bugs and gaps worth addressing before merging. Bugs1.
if not member_list:
raise ValueError(f"Filelist at {datapath} is empty.")
first = member_list[0]2. Missing If a zarr store was written without the if "structure" not in z.attrs:
raise KeyError(f"'structure' attribute missing from zarr store at {path}")
volume = json.loads(z.attrs["structure"])["lattice"]["volume"]Inconsistencies3. The rest of the codebase (e.g., z = zarr.open_group(str(path), mode="r")4. The zarr stores are written as Missing tests5. No tests for There are tests for
Given that the zarr fixture infrastructure already exists in the test file, adding a couple of tests for Minor nit6. This pre-existing check silently rejects | |
Summary
This PR extends the
RhoDatadataloader to support.zarrformat in addition to the existing CHGCAR format, without requiring any changes to the config beyond pointingrootat amp_filelist.txtwhose parent directory includesdata/*.zarrandlabel/*.zarr. The format is auto-detected at runtime based on which file extension (.CHGCARor.zarr) is present in thedata/directory.Files changed
dataloader/utils.py: Addedload_zarr()which readscharge_density_totalfrom a local.zarrstore and normalizes by lattice volume (from the stored structure JSON attribute), matching the CHGCAR normalization convention. Added afmtparameter toload_numpy_rhoto dispatch to the correct loader.dataloader/dataset.py:RhoData.__init__now auto-detects format by checking whether the first filelist entry resolves to a.zarror.CHGCARfile. Raises a clear error if neither is found.