
Bug in Data Parser leads to issue for loading IAD with HIP IDs 1-120 #55

@nicochunger

I've encountered a bug in the data parser that loads IAD for HIP IDs 1 to 120. It arises from the way the parser searches for the corresponding IAD data file within the intermediate_data_directory: the search treats both the intended data file and the directory whose name carries the same ID as valid matches.

Example Case:
When attempting to load data for HIP 32, the parser correctly finds the data file at H000/H000032.d. However, it also matches the directory H032/ as if it were a data file. The parser then reports that multiple files were found and stops.
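
The example case can be reproduced in a throwaway directory. This is only a minimal sketch under assumptions: `digits_only` is re-implemented here on the guess that it keeps just the digit characters, the search pattern is assumed to be the line-62 pattern without a ".d" suffix, the glob is assumed to run with `recursive=True`, and `unfiltered_matches` and `reproduce` are hypothetical names, not part of htof.

```python
import glob
import os
import tempfile


def digits_only(s):
    # Assumed behaviour of htof's helper: keep only the digit characters.
    return ''.join(ch for ch in s if ch.isdigit())


def unfiltered_matches(paths, star_id):
    # The matching rule without any check that the path is a regular file.
    return [f for f in paths
            if digits_only(os.path.basename(f).split('.')[0]).zfill(6) == star_id.zfill(6)]


def reproduce(star_id='32'):
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical miniature of the layout from the example case.
        os.makedirs(os.path.join(root, 'H000'))
        os.makedirs(os.path.join(root, 'H032'))
        open(os.path.join(root, 'H000', 'H000032.d'), 'w').close()
        # Assumed search pattern: any path whose basename contains the ID.
        pattern = os.path.join(os.path.join(root, '**/'), '*' + star_id.lstrip('0') + '*')
        paths = glob.glob(pattern, recursive=True)
        return sorted(os.path.basename(p) for p in unfiltered_matches(paths, star_id))


found = reproduce()
print(found)
```

Under these assumptions, "H032" reduces to the digits "032", which zero-pads to the same six-character string as the star ID, so the directory passes the comparison alongside the data file.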

Fix
The fix is very simple: in line 714 of the htof/parse.py file, add an os.path.isfile(f) check so that only paths that are actual files, not directories, can match.

def match_filename(paths, star_id):
    return [f for f in paths if os.path.isfile(f) and digits_only(os.path.basename(f).split('.')[0]).zfill(6) == star_id.zfill(6)]

This fixes the bug, and HIP IDs 1-120 work as expected.
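
A quick check of the patched function might look like the sketch below. It passes the two candidate paths in directly instead of running the glob search; `digits_only` is again an assumed re-implementation, and `check_fix` is a hypothetical helper.

```python
import os
import tempfile


def digits_only(s):
    # Assumed behaviour of htof's helper: keep only the digit characters.
    return ''.join(ch for ch in s if ch.isdigit())


def match_filename(paths, star_id):
    # Patched version from this issue: os.path.isfile skips directories.
    return [f for f in paths if os.path.isfile(f)
            and digits_only(os.path.basename(f).split('.')[0]).zfill(6) == star_id.zfill(6)]


def check_fix():
    with tempfile.TemporaryDirectory() as root:
        data_file = os.path.join(root, 'H000', 'H000032.d')
        directory = os.path.join(root, 'H032')
        os.makedirs(os.path.dirname(data_file))
        os.makedirs(directory)
        open(data_file, 'w').close()
        # Both candidates reduce to the same ID, but only one is a file.
        matches = match_filename([data_file, directory], '32')
        return [os.path.basename(m) for m in matches]


result = check_fix()
print(result)
```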

Alternatively, the wildcard pattern in line 62 of parse.py can be restricted so that it only matches paths with the ".d" file extension.

filepath = os.path.join(os.path.join(intermediate_data_directory, '**/'), '*' + star_id.lstrip('0') + '*.d')

Either method works.
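
To see the effect of the narrower pattern on its own, the sketch below compares the two globs on the same miniature layout. The broad pattern and the use of `recursive=True` are assumptions about the current code, and `search` is a hypothetical helper.

```python
import glob
import os
import tempfile


def search(root, star_id, suffix=''):
    # Build the search pattern, optionally restricted to a file extension.
    pattern = os.path.join(os.path.join(root, '**/'), '*' + star_id.lstrip('0') + '*' + suffix)
    return sorted(os.path.relpath(p, root) for p in glob.glob(pattern, recursive=True))


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'H000'))
    os.makedirs(os.path.join(root, 'H032'))
    open(os.path.join(root, 'H000', 'H000032.d'), 'w').close()
    broad = search(root, '32')         # assumed current pattern
    narrow = search(root, '32', '.d')  # pattern restricted to ".d"

print(broad)
print(narrow)
```

One difference between the two fixes: the extension filter would still let through a directory whose name happens to end in ".d", and it assumes every IAD file in the directory uses that extension, whereas the os.path.isfile check makes neither assumption.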

Thanks for the tool :)
