I've encountered a bug within the data parser responsible for loading IAD for HIP IDs ranging from 1 to 120. The issue arises from the parser's method of searching for the corresponding IAD data file within the intermediate_data_directory. Specifically, the search algorithm erroneously identifies both the intended data file and the parent directory with matching ID as valid matches.
Example Case:
When attempting to load data for HIP 32, the parser correctly identifies the data file located at H000/H000032.d. However, it incorrectly also matches the parent directory named H032/ as a potential data file. This results in the parser reporting an error indicating that multiple files were found, subsequently halting the process.
Fix
The fix is simple: in line 714 of htof/parse.py, add an os.path.isfile(f) check so that only paths that are actual files, not directories, are matched.
def match_filename(paths, star_id):
    return [f for f in paths if os.path.isfile(f) and digits_only(os.path.basename(f).split('.')[0]).zfill(6) == star_id.zfill(6)]
This fixes the bug, and HIP IDs 1-120 work as expected.
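For anyone who wants to reproduce this without the real catalog, here is a minimal sketch of the problem and the fix. It rests on several assumptions that are mine rather than the library's: `digits_only` is a hypothetical stand-in for the helper in htof/parse.py, the `files_only` flag exists only to compare both behaviours side by side, the glob is assumed to run with `recursive=True`, and the directory layout is a toy one built in a temporary folder.

```python
import glob
import os
import tempfile


def digits_only(text):
    # Hypothetical stand-in for the helper in htof/parse.py:
    # keep only the digit characters of the string.
    return ''.join(ch for ch in text if ch.isdigit())


def match_filename(paths, star_id, files_only):
    # files_only=False mimics the behaviour described in this report,
    # files_only=True adds the proposed os.path.isfile(f) check.
    matches = []
    for f in paths:
        if files_only and not os.path.isfile(f):
            continue
        stem = os.path.basename(f).split('.')[0]
        if digits_only(stem).zfill(6) == star_id.zfill(6):
            matches.append(f)
    return matches


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Toy layout: the data file H000/H000032.d plus a directory H032/.
        os.makedirs(os.path.join(root, 'H000'))
        os.makedirs(os.path.join(root, 'H032'))
        open(os.path.join(root, 'H000', 'H000032.d'), 'w').close()

        # Assumed shape of the unrestricted search pattern (no ".d" suffix).
        pattern = os.path.join(os.path.join(root, '**/'), '*32*')
        paths = glob.glob(pattern, recursive=True)

        without_check = match_filename(paths, '32', files_only=False)
        with_check = match_filename(paths, '32', files_only=True)
        # Return paths relative to the temporary root, sorted for stable output.
        return (sorted(os.path.relpath(p, root) for p in without_check),
                sorted(os.path.relpath(p, root) for p in with_check))


if __name__ == '__main__':
    without_check, with_check = demo()
    print('without isfile check:', without_check)
    print('with isfile check:   ', with_check)
```

In this toy layout the unfiltered list should contain both the H032 directory and the H000032.d file, while the filtered list should contain only the file.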
Alternatively, in line 62 of parse.py the wildcard pattern can be restricted so that it only matches files with the ".d" file extension.
filepath = os.path.join(os.path.join(intermediate_data_directory, '**/'), '*' + star_id.lstrip('0') + '*.d')
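The effect of the narrower pattern can be checked with a similar sketch. The function `find_candidates` and its `extension` parameter are hypothetical names used only for this comparison, the glob is again assumed to run with `recursive=True`, and the layout is a toy one in a temporary folder.

```python
import glob
import os
import tempfile


def find_candidates(intermediate_data_directory, star_id, extension=''):
    # Hypothetical helper: builds the search pattern the same way as the
    # line above, with an optional extension appended to the wildcard.
    pattern = os.path.join(os.path.join(intermediate_data_directory, '**/'),
                           '*' + star_id.lstrip('0') + '*' + extension)
    return glob.glob(pattern, recursive=True)


def compare_patterns():
    with tempfile.TemporaryDirectory() as root:
        # Toy layout: the data file H000/H000032.d plus a directory H032/.
        os.makedirs(os.path.join(root, 'H000'))
        os.makedirs(os.path.join(root, 'H032'))
        open(os.path.join(root, 'H000', 'H000032.d'), 'w').close()

        broad = find_candidates(root, '000032')
        narrow = find_candidates(root, '000032', extension='.d')
        # Return paths relative to the temporary root, sorted for stable output.
        return (sorted(os.path.relpath(p, root) for p in broad),
                sorted(os.path.relpath(p, root) for p in narrow))


if __name__ == '__main__':
    broad, narrow = compare_patterns()
    print('pattern without ".d":', broad)
    print('pattern with ".d":   ', narrow)
```

One difference from the isfile approach is worth noting: the pattern restriction relies on the naming convention, so a directory whose name happened to end in ".d" would still match, whereas os.path.isfile(f) would reject it.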
Either method works.
Thanks for the tool :)