Objective
Validate models on GALAH overlap sample and assess systematic differences between surveys.
Dependencies
- Phase 4: Parameter Regression (requires trained models)
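Tasks
- src/data/galah_loader.py for GALAH data parsing
- src/data/crossmatch.py for coordinate cross-matching
- notebooks/04_cross_validation.ipynb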
Files to Create
| File | Purpose |
| --- | --- |
| src/data/galah_loader.py | Load GALAH data |
| src/data/crossmatch.py | Cross-match utilities |
| src/evaluation/cross_survey.py | Cross-survey metrics |
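As a starting point for the cross-survey metrics, the sketch below summarizes the differences between model predictions and reference values for stars in the overlap sample. The function name `cross_survey_offsets` and the choice of statistics (mean bias, median offset, RMS, MAD-based scatter) are assumptions for illustration, not part of the existing codebase, and the input values are made up.

```python
import numpy as np


def cross_survey_offsets(pred: np.ndarray, ref: np.ndarray) -> dict:
    """Summarize the differences between predictions and reference values."""
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    diff = diff[np.isfinite(diff)]  # drop stars with missing values
    median = float(np.median(diff))
    # 1.4826 * MAD approximates the standard deviation for Gaussian residuals
    robust_scatter = float(1.4826 * np.median(np.abs(diff - median)))
    return {
        "n": int(diff.size),
        "bias": float(np.mean(diff)),
        "median_offset": median,
        "rms": float(np.sqrt(np.mean(diff**2))),
        "robust_scatter": robust_scatter,
    }


# Toy effective temperatures in K (made-up values, not survey data)
teff_model = np.array([5000.0, 5500.0, 6000.0, 4500.0, np.nan])
teff_galah = np.array([4950.0, 5450.0, 5950.0, 4450.0, 5000.0])
stats = cross_survey_offsets(teff_model, teff_galah)
```

A non-zero `bias` with small `robust_scatter` would point to a systematic offset between surveys rather than random error.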
Starter Code
# src/data/crossmatch.py
"""Cross-match utilities for multi-survey data."""
import numpy as np
from astropy.coordinates import SkyCoord
from astropy import units as u


def crossmatch_by_coordinates(
    ra1: np.ndarray, dec1: np.ndarray,
    ra2: np.ndarray, dec2: np.ndarray,
    max_sep_arcsec: float = 2.0
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Cross-match two catalogs by sky coordinates (in degrees).

    Returns indices into catalog 1, the matching indices into catalog 2,
    and the separations in arcseconds.
    """
    coords1 = SkyCoord(ra=ra1*u.deg, dec=dec1*u.deg)
    coords2 = SkyCoord(ra=ra2*u.deg, dec=dec2*u.deg)

    # Nearest catalog-2 neighbour for every catalog-1 source
    idx, sep2d, _ = coords1.match_to_catalog_sky(coords2)

    # Filter by maximum separation
    mask = sep2d.arcsec < max_sep_arcsec
    idx1 = np.where(mask)[0]
    idx2 = idx[mask]
    separations = sep2d[mask].arcsec
    return idx1, idx2, separations
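Because the starter code finds the nearest catalog-2 neighbour for every catalog-1 source, several catalog-1 rows can end up matched to the same catalog-2 row. If a one-to-one match is wanted, a post-processing step along the following lines could be applied to the returned arrays. The helper name `keep_closest_match` is hypothetical, and the sketch uses only NumPy.

```python
import numpy as np


def keep_closest_match(idx1, idx2, separations):
    """For each catalog-2 source, keep only its closest catalog-1 match."""
    idx1 = np.asarray(idx1)
    idx2 = np.asarray(idx2)
    separations = np.asarray(separations, dtype=float)
    # Sort by separation so the first occurrence of each idx2 is the closest
    order = np.argsort(separations, kind="stable")
    _, first = np.unique(idx2[order], return_index=True)
    keep = np.sort(order[first])  # restore the original catalog-1 ordering
    return idx1[keep], idx2[keep], separations[keep]


# Toy example: catalog-1 rows 0 and 2 both match catalog-2 row 7
i1, i2, sep = keep_closest_match(
    idx1=[0, 1, 2], idx2=[7, 3, 7], separations=[1.5, 0.4, 0.2]
)
```

Here row 0 is dropped because row 2 lies closer to the shared catalog-2 source.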
Definition of Done
Technical Notes
- Use 2 arcsec matching radius for cross-match
- Different wavelength coverage between the surveys may cause systematic differences in the derived parameters
- Document any calibration choices for reproducibility
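One way to act on the notes above is to fit a simple trend to the cross-survey differences and store the result together with the matching radius, so the calibration can be reproduced later. The sketch below assumes a linear offset model and a JSON record. The helper `fit_linear_offset`, the record keys, and the toy data are all illustrative assumptions rather than an agreed format.

```python
import json

import numpy as np


def fit_linear_offset(x: np.ndarray, diff: np.ndarray) -> dict:
    """Fit diff ~ slope * x + intercept and return a JSON-serializable dict."""
    slope, intercept = np.polyfit(x, diff, deg=1)
    return {"slope": float(slope), "intercept": float(intercept)}


# Toy data: offsets (model minus GALAH) that grow linearly with Teff
teff = np.array([4500.0, 5000.0, 5500.0, 6000.0])
delta_teff = 0.02 * teff - 80.0

record = {
    "match_radius_arcsec": 2.0,  # matching radius from the notes above
    "parameter": "teff",
    "offset_model": "linear",  # assumed functional form
    "coefficients": fit_linear_offset(teff, delta_teff),
}
record_json = json.dumps(record, indent=2, sort_keys=True)
```

Writing `record_json` to a file next to the evaluation outputs would keep the calibration choices with the results they affect.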
Part of #1 (Meta Issue)