Remove biopython dependency, use .fai indexing with pyfaidx #24

Open
mdshw5 wants to merge 5 commits into pachterlab:master from mdshw5:master

mdshw5 commented Dec 24, 2018

Thanks for sharing your analysis. I plan on using this in the new year, and noticed that running the pre-processing scripts on our shared HPC cluster resulted in some permissions issues for shared reference genome assemblies. The issue is that pyfasta creates its own index sidecar files, and fails if the user cannot write to the shared resource directory. I've swapped out pyfasta for my pyfaidx module, which reads an existing samtools .fai index file or creates one if it is missing. In a shared-reference scenario the .fai file is likely already present, so nothing needs to be written. This also allowed me to drop the biopython dependency, as pyfaidx has a FASTA wrapping function.
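To illustrate why an existing .fai file is enough for read-only access, here is a minimal standard-library sketch of random access through a samtools-style index. It is not pyfaidx's implementation and not the code in this PR; the helper names `read_fai` and `fetch` and the file name `ref.fa` are hypothetical. It relies only on the first five tab-separated .fai columns (name, length, offset, bases per line, bytes per line).

```python
import os
import tempfile


def read_fai(fai_path):
    """Parse a .fai index into {name: (length, offset, linebases, linewidth)}."""
    index = {}
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            name = fields[0]
            index[name] = tuple(int(value) for value in fields[1:5])
    return index


def fetch(fasta_path, index, name, start, end):
    """Return bases [start, end) of `name` (0-based, half-open).

    Hypothetical helper: the FASTA is opened read-only, so no write
    permission on the reference directory is needed.
    """
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""

    def byte_position(base):
        # Full lines before this base, plus the position within its line.
        return offset + (base // linebases) * linewidth + base % linebases

    first = byte_position(start)
    last = byte_position(end - 1)
    with open(fasta_path, "rb") as handle:
        handle.seek(first)
        raw = handle.read(last - first + 1)
    # Drop the line terminators that fall inside the requested span.
    return raw.decode("ascii").replace("\n", "").replace("\r", "")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        fasta = os.path.join(workdir, "ref.fa")  # hypothetical file name
        with open(fasta, "wb") as handle:
            handle.write(b">chr1\nACGTACGT\nACGT\n>chr2\nGGGGCCCC\nTT\n")
        # Index as `samtools faidx` would describe this file.
        with open(fasta + ".fai", "wb") as handle:
            handle.write(b"chr1\t12\t6\t8\t9\nchr2\t10\t26\t8\t9\n")
        index = read_fai(fasta + ".fai")
        print(fetch(fasta, index, "chr1", 6, 10))
```

Only the index lookup and a seek into the FASTA are involved, which is why a pre-built .fai next to a read-only reference avoids the write failure described above.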

The output on the example dataset is identical using the code in this PR vs the current master branch.
