Open
dingyifei commented on Feb 17, 2026
- Remove Selenium; use the NCBI Datasets API for N50
- Add a method to query BV-BRC by taxon ID (for 1a)
- Replace BV-BRC FTP downloads with HTTPS downloads
`get_scaffold_n50_for_species()` used Selenium + Chrome to scrape NCBI web pages, which fails in headless environments (WSL, CI). Replace it with a direct call to the NCBI Datasets v2 REST API endpoint:

https://api.ncbi.nlm.nih.gov/datasets/v2/genome/taxon/{id}/dataset_report

Remove the now-unused selenium, webdriver-manager, and beautifulsoup4 dependencies. Update the test fixture to use the exact API value (4641652) instead of the rounded Selenium scrape (4600000).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
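For reviewers, here is a minimal sketch of what a direct call to that endpoint could look like. It is not the code in this PR. The helper names `extract_scaffold_n50` and `fetch_scaffold_n50` are hypothetical, and the response layout (a `reports` list whose entries carry `assembly_stats.scaffold_n50`) is an assumption about the Datasets v2 JSON that should be checked against the API documentation.

```python
import json
import urllib.request

# Endpoint quoted in the commit message above.
API = "https://api.ncbi.nlm.nih.gov/datasets/v2/genome/taxon/{taxon_id}/dataset_report"


def extract_scaffold_n50(payload):
    """Collect scaffold N50 values from a dataset_report payload.

    Assumption: each entry in payload["reports"] may carry
    assembly_stats.scaffold_n50, as a number or a numeric string.
    Reports without that field are skipped.
    """
    values = []
    for report in payload.get("reports", []):
        n50 = report.get("assembly_stats", {}).get("scaffold_n50")
        if n50 is not None:
            values.append(int(n50))
    return values


def fetch_scaffold_n50(taxon_id, timeout=30):
    """Fetch the first page of reports for a taxon and return its N50 values."""
    url = API.format(taxon_id=taxon_id)
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return extract_scaffold_n50(json.load(resp))
```

Keeping the parsing separate from the network call means the fixture value can be checked against a saved payload without network access.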
Add a function to query the BV-BRC Data API for genome records by taxon ID. It filters on `taxon_lineage_ids` (not `taxon_id`) so that subspecies and strain-level descendants are included. It supports optional filtering by `genome_status` and `genome_quality`, with automatic pagination.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
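The query described above can be sketched roughly as follows. This is an illustration only, not the function added in this PR. The names `build_query`, `fetch_page`, and `query_genomes` are hypothetical, and the base URL, the RQL-style `eq(...)` and `limit(count,start)` operators, and the JSON list response are assumptions about the BV-BRC Data API.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for the BV-BRC genome collection.
BASE = "https://www.bv-brc.org/api/genome/"


def build_query(taxon_id, genome_status=None, genome_quality=None,
                limit=25, offset=0):
    """Build an RQL-style query URL filtered on taxon_lineage_ids."""
    clauses = [f"eq(taxon_lineage_ids,{taxon_id})"]
    if genome_status:
        clauses.append(f"eq(genome_status,{urllib.parse.quote(genome_status)})")
    if genome_quality:
        clauses.append(f"eq(genome_quality,{urllib.parse.quote(genome_quality)})")
    clauses.append(f"limit({limit},{offset})")
    return BASE + "?" + "&".join(clauses)


def fetch_page(url, timeout=60):
    """Fetch one page of results as a list of records."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def query_genomes(taxon_id, *, page_size=25, fetch=fetch_page, **filters):
    """Page through results until a short (or empty) page is returned."""
    records = []
    offset = 0
    while True:
        url = build_query(taxon_id, limit=page_size, offset=offset, **filters)
        page = fetch(url)
        records.extend(page)
        if len(page) < page_size:
            return records
        offset += page_size
```

Passing `fetch` as a parameter lets the pagination loop be exercised with a stub instead of a live request.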
BV-BRC's FTP server now requires SSL/TLS on the control channel, causing all genome downloads via urllib FTP to fail silently. Switch `download_genomes_bvbrc()` to use HTTPS Data API endpoints with proper content-type negotiation.

Also fix a stale loop variable bug in the bad_genomes cleanup code (it was using `genome` instead of `bad_genome`).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
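To illustrate the stale loop variable fix: the actual cleanup code is not shown here, so the following is a hypothetical reconstruction. It assumes `genomes` is a dict keyed by genome ID and that `remove_bad_genomes` is a stand-in name for the cleanup step.

```python
def remove_bad_genomes(genomes, bad_genomes):
    """Remove each ID listed in bad_genomes from genomes; return the IDs removed."""
    removed = []
    for bad_genome in bad_genomes:
        # Use this loop's own variable. A leftover name such as `genome`
        # from an earlier loop would still hold that loop's last item,
        # so the same entry would be targeted on every iteration.
        if genomes.pop(bad_genome, None) is not None:
            removed.append(bad_genome)
    return removed
```

With the stale name, the cleanup would repeatedly act on one unrelated entry and leave the genuinely bad genomes in place, which is easy to miss because no exception is raised.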