Skip to content

fix: correct _is_phased() early return and prevent KeyError on duplicate variants#227

Open
haoyu-haoyu wants to merge 1 commit intoPharmGKB:mainfrom
haoyu-haoyu:fix/phased-logic-and-keyerror
Open

fix: correct _is_phased() early return and prevent KeyError on duplicate variants#227
haoyu-haoyu wants to merge 1 commit intoPharmGKB:mainfrom
haoyu-haoyu:fix/phased-logic-and-keyerror

Conversation

@haoyu-haoyu
Copy link

Summary

Two bugs fixed in preprocessor/pcat/utilities.py:

1. _is_phased() returns on first sample (line ~879)

The function checks phasing status but returns immediately on the first GT field, never examining subsequent samples.

# Before (bug): returns on first sample
for x in gt_field:
    if '/' in x:
        return False
    else:
        return True  # exits loop on first iteration!

# After (fix): checks all samples
for x in gt_field:
    if '/' in x:
        return False
return True  # only returns True after checking ALL samples

Impact: Multi-sample VCFs may have incorrect phasing determination, leading to wrong star allele calls.

2. KeyError on duplicate normalized variants (fixes #222)

ref_pos_dynamic[pos].pop(ref_alt) crashes when multiple input variants normalize to the same coordinate/allele pair. Changed to .pop(key, None) to handle collisions gracefully.

Test plan

  • Run existing preprocessing tests: cd preprocessor && pytest tests/
  • Test with multi-sample phased VCF to verify phasing preserved

…ate variants

1. _is_phased(): The function returned on the first sample without
   checking all samples. Now iterates through all GT fields —
   returns False if ANY sample is unphased, True only if ALL are
   phased.

2. extract_pgx_variants(): .pop() calls on ref_pos_dynamic can
   raise KeyError when multiple input variants normalize to the
   same coordinate. Changed to .pop(key, None) to handle
   collisions gracefully.

   Fixes PharmGKB#222
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant