Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@divref/divref/haplotype.py`:
- Around line 168-187: The current computation of start/end (using min_variant
and max_variant) produces 1-based inclusive coordinates; to make them true
0-based half-open adjust both endpoints by subtracting 1 from locus positions:
set start = min_variant.locus.position - window_size - 1 and set end =
max_variant.locus.position - 1 + hl.len(max_variant.alleles[0]) + window_size
(and update the docstring to state 0-based half-open). Locate the expressions
that build hl.struct(start=..., end=...) referencing sorted_variants,
min_variant, max_variant, window_size, and max_variant.alleles[0] and apply
these arithmetic changes so downstream code persists correct 0-based half-open
spans (or alternatively, if you prefer the 1-based convention, rename/document
the fields explicitly rather than changing math).
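The arithmetic in this suggestion can be sanity-checked with a plain-Python sketch (the real code uses Hail expressions such as `hl.len` and `hl.struct`; the function name and signature here are hypothetical):

```python
# Hypothetical sketch of the suggested coordinate fix, using plain ints
# in place of Hail expressions.
def haplotype_span(min_pos, max_pos, ref_allele, window_size):
    """Return a 0-based half-open (start, end) span around the variants.

    min_pos / max_pos are 1-based locus positions, as in Hail.
    """
    start = min_pos - window_size - 1                   # 1-based -> 0-based start
    end = max_pos - 1 + len(ref_allele) + window_size   # exclusive end
    return start, end

# A single SNV at 1-based position 100 with window 0 spans [99, 100),
# i.e. exactly one base in 0-based half-open coordinates.
print(haplotype_span(100, 100, "A", 0))  # (99, 100)
```

Note that a 1-based inclusive end and a 0-based half-open end have the same numeric value, so only the start shifts by one; the `- 1` on the end here comes from converting the 1-based position before adding the reference-allele length.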
In `@divref/divref/tools/create_divref_fasta.py`:
- Around line 22-28: The current loop only iterates over contigs present in df
and omits creating files for configured contigs with zero rows; change the
function to accept the expected contig list (e.g., contigs or
configured_contigs) and iterate over that list instead of
sorted(df["contig"].unique().to_list()), then for each contig build df_chrom =
df.filter(df["contig"] == chrom) and always create out_path =
Path(f"{output_base}.{chrom}.fasta"); if df_chrom has rows write the sequences
as before, otherwise write an empty FASTA (or a comment/header line) so the file
exists; update logging (logger.info) to show when an empty FASTA is emitted.
In `@divref/divref/tools/create_duckdb_index.py`:
- Around line 173-176: The argmax computation is using hl.max on a scalar
(hl.max(x.AF)) which is invalid; change the mapping to produce an array of
scalar AFs and feed that to hl.argmax by replacing
hl.argmax(va.gnomad_freqs.map(lambda x: hl.max(x.AF))) with
hl.argmax(va.gnomad_freqs.map(lambda x: x.AF)), so that argmax_pop, used in
va.select for max_pop and max_empirical_AF (va.gnomad_freqs[argmax_pop].AF),
correctly indexes the population with the highest AF.
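The shape of that fix can be illustrated with a plain-Python analogue (`gnomad_freqs` and the AF values below are made-up sample data; `max(range(...), key=...)` stands in for `hl.argmax`):

```python
# Hypothetical analogue of the Hail fix: map each struct to its scalar AF,
# then take the argmax over the resulting array, as in
# hl.argmax(va.gnomad_freqs.map(lambda x: x.AF)).
gnomad_freqs = [
    {"pop": "afr", "AF": 0.01},
    {"pop": "nfe", "AF": 0.12},
    {"pop": "eas", "AF": 0.05},
]

afs = [x["AF"] for x in gnomad_freqs]                 # array of scalar AFs
argmax_pop = max(range(len(afs)), key=afs.__getitem__)  # index of the max AF

# Indexing back into the original array recovers the population
# with the highest AF, matching va.gnomad_freqs[argmax_pop].AF.
print(gnomad_freqs[argmax_pop]["pop"], gnomad_freqs[argmax_pop]["AF"])  # nfe 0.12
```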
In `@workflows/config/config_schema.yml`:
- Around line 60-63: The description for the "Minimum variant allele frequency"
config entry is truncated ("Also applied to"); complete the sentence to
explicitly name the other command(s) or consumer(s) that use this value (e.g.,
append "Also applied to `divref <other-command>`") so generated docs are
unambiguous. Update the description string in the config_schema.yml entry that
currently references `divref extract-gnomad-afs` and `divref compute-haplotypes`
to include the missing consumer(s) wrapped in backticks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: df69de02-d1f4-4b53-8920-a176e33606c7
⛔ Files ignored due to path filters (2)
- `divref/uv.lock` is excluded by `!**/*.lock`
- `pixi.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (10)
- divref/divref/haplotype.py
- divref/divref/main.py
- divref/divref/tools/create_divref_fasta.py
- divref/divref/tools/create_duckdb_index.py
- divref/divref/tools/create_fasta_and_index.py
- divref/pyproject.toml
- divref/tests/test_haplotype.py
- pixi.toml
- workflows/config/config_schema.yml
- workflows/generate_divref.smk
💤 Files with no reviewable changes (2)
- pixi.toml
- divref/divref/tools/create_fasta_and_index.py
```python
for chrom in sorted(df["contig"].unique().to_list()):
    logger.info("Creating FASTA for chromosome %s", chrom)
    df_chrom = df.filter(df["contig"] == chrom)
    out_path = Path(f"{output_base}.{chrom}.fasta")
    with open(out_path, "w") as fasta_out:
        for sequence_id, sequence in df_chrom.select("sequence_id", "sequence").iter_rows():
            fasta_out.write(f">{sequence_id}\n{sequence}\n")
```
Don’t skip configured contigs that end up with zero sequences.
This loop only writes FASTAs for contigs present in df["contig"], but workflows/generate_divref.smk Lines 271-274 declare one output per configured chromosome. If a chromosome is filtered down to zero rows, no file gets created and the workflow fails on missing outputs. Pass the expected contig list into this tool and emit empty FASTAs for absent contigs.
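A minimal sketch of the suggested fix, using a list of `(contig, sequence_id, sequence)` tuples in place of the polars DataFrame and a `print` in place of `logger.info` (the `contigs` parameter and `write_fastas` name are hypothetical):

```python
import os
import tempfile
from pathlib import Path


def write_fastas(rows, output_base, contigs):
    """Write one FASTA per configured contig, even when a contig has no rows."""
    for chrom in contigs:  # iterate over the configured list, not the data
        df_chrom = [r for r in rows if r[0] == chrom]
        out_path = Path(f"{output_base}.{chrom}.fasta")
        # Opening in "w" mode always creates the file, so an empty contig
        # still yields the (empty) output Snakemake expects.
        with open(out_path, "w") as fasta_out:
            for _contig, sequence_id, sequence in df_chrom:
                fasta_out.write(f">{sequence_id}\n{sequence}\n")
        if not df_chrom:
            print(f"Emitted empty FASTA for {chrom}")


# Demo: chr2 has no sequences but still gets a FASTA file.
tmp = tempfile.mkdtemp()
base = os.path.join(tmp, "divref")
write_fastas([("chr1", "hap1", "ACGT")], base, ["chr1", "chr2"])
print(sorted(os.listdir(tmp)))
```

The key change is that the loop variable comes from the configured contig list rather than from `df["contig"].unique()`, so the set of output files no longer depends on which contigs survive filtering.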
Summary by CodeRabbit
- New Features
- Refactor
- Tests