Orchid FASTA File Parser with Biopython

This project demonstrates how to download and parse FASTA sequence data using Biopython. The dataset used here is the ls_orchid.fasta file from the Biopython documentation examples.

📂 Files in this Project

ls_orchid.fasta → FASTA file containing orchid DNA sequences (downloaded from Biopython GitHub examples).
parser.py → Python script to parse and store sequences using Biopython's SeqIO module.

▶️ Code Explanation

Step 1: Import Biopython's SeqIO

from Bio import SeqIO

The SeqIO module reads and writes sequence file formats such as FASTA and GenBank.
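As a minimal sketch of both directions, the snippet below parses two records from an in-memory FASTA string and writes them back out in FASTA format. The sample sequences and the names fasta_text, records, and out_handle are made up for illustration; only SeqIO.parse and SeqIO.write come from Biopython.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical two-record FASTA text, used instead of a file on disk.
fasta_text = ">seq1 first example\nACGTACGT\n>seq2 second example\nGGCCTTAA\n"

# SeqIO.parse accepts a file name or an open text handle.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

# SeqIO.write returns the number of records it wrote.
out_handle = StringIO()
count = SeqIO.write(records, out_handle, "fasta")

print(count)
print(out_handle.getvalue())
```

Using StringIO keeps the example independent of any file on disk; with a real file, the file name can be passed in place of the handle.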

Step 2: Initialize a List to Store Sequences

sequences = []

We create an empty list called sequences to store the DNA sequences extracted from the FASTA file.

Step 3: Parse the FASTA File

for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
    sequences.append(seq_record.seq)

SeqIO.parse() returns an iterator that reads the FASTA file one record at a time.
Each record (seq_record) is a SeqRecord object that contains:
- seq_record.id → Identifier of the sequence.
- seq_record.seq → The DNA sequence, stored as a Seq object.
We append only the sequence (seq_record.seq) to our sequences list.
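If the identifiers are needed later, one option is to keep each id next to its sequence instead of storing the sequence alone. A minimal sketch, using a made-up in-memory FASTA string in place of ls_orchid.fasta; the names fasta_text and seq_by_id are hypothetical:

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical stand-in for the contents of ls_orchid.fasta.
fasta_text = ">rec1 sample one\nACGTAC\n>rec2 sample two\nTTGGCCAA\n"

# Map each record's id to its sequence, converted to a plain string.
seq_by_id = {}
for seq_record in SeqIO.parse(StringIO(fasta_text), "fasta"):
    seq_by_id[seq_record.id] = str(seq_record.seq)

for record_id, sequence in seq_by_id.items():
    print(record_id, len(sequence))
```

Biopython also provides SeqIO.to_dict, which builds a dictionary of whole SeqRecord objects keyed by id.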

Step 4: Output

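After running the script, the list sequences holds every DNA sequence from the FASTA file as a Seq object. The script does not print anything by itself; add print(sequences) to see the contents.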

Example output of print(sequences) (format only; the sequence letters shown are illustrative):

[Seq('MATTYGGTTGGA...'), Seq('CTTAGGCTCCTG...'), ...]
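Printing a long list of sequences in full is hard to read, so a short per-sequence summary can help to check the result. The helper below is a hypothetical sketch (summarize is not part of Biopython). It relies only on len() and str(), which work on both plain strings and Seq objects; the sample list uses made-up plain strings.

```python
def summarize(sequences, preview=10):
    """Return one summary line per sequence: index, length, first bases."""
    lines = []
    for index, seq in enumerate(sequences):
        text = str(seq)
        # Mark the preview with "..." only when the sequence was cut short.
        suffix = "..." if len(text) > preview else ""
        lines.append(f"{index}: length={len(text)} start={text[:preview]}{suffix}")
    return lines


# Made-up sequences standing in for the parsed list.
sequences = ["ACGTACGTACGTAC", "GGCC"]
for line in summarize(sequences):
    print(line)
```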

⚡ Usage

Install Biopython:

pip install biopython

Download the FASTA file (Python version of wget):

import urllib.request

url = "https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta"
urllib.request.urlretrieve(url, "ls_orchid.fasta")
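Running the download on every execution repeats the network request. One possible variation, sketched here with a hypothetical helper named fetch_if_missing, skips the download when the file is already present:

```python
import os
import urllib.request

URL = "https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta"


def fetch_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(dest):
        return False
    urllib.request.urlretrieve(url, dest)
    return True


# Example call (performs a network request the first time it runs):
# fetch_if_missing(URL, "ls_orchid.fasta")
```

The existence check does not verify that the file is complete, so a partially downloaded file would need to be deleted by hand before retrying.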

Run the parser script to load the sequences: python parser.py

✅ Applications

DNA sequence analysis
Motif finding
Sequence alignment
Bioinformatics pipelines
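As a rough illustration of the first two items, the sketch below computes GC content and finds motif positions in a plain DNA string, such as one obtained with str(seq_record.seq). The function names gc_content and find_motif and the sample sequence are made up for this example; they are not Biopython functions.

```python
def gc_content(sequence):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def find_motif(sequence, motif):
    """Return 0-based start positions of motif, including overlapping hits."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return positions


dna = "ATGCGATATGCATG"  # made-up sequence
print(gc_content(dna))
print(find_motif(dna, "ATG"))
```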
