Orchid FASTA File Parser with Biopython

This project demonstrates how to download and parse FASTA sequence data using Biopython. The dataset used here is the ls_orchid.fasta file from the Biopython documentation examples.

📂 Files in this Project

ls_orchid.fasta → FASTA file containing orchid DNA sequences (downloaded from Biopython GitHub examples).
parser.py → Python script to parse and store sequences using Biopython's SeqIO module.

▶️ Code Explanation

Step 1: Import Biopython's SeqIO

from Bio import SeqIO

The SeqIO module reads and writes sequence file formats such as FASTA and GenBank through a single interface.
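As a small illustration of reading one format and writing another, the sketch below parses FASTA text and writes it back out in Biopython's tab-separated format. The two records are made up for the example and are read from an in-memory handle rather than from ls_orchid.fasta.

```python
from io import StringIO

from Bio import SeqIO

# Two made-up records, used here instead of a real file.
fasta_text = ">seq1\nACGT\n>seq2\nGGCC\n"

# Read the FASTA text into a list of SeqRecord objects.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

# Write the same records as tab-separated "id<TAB>sequence" lines.
tab_handle = StringIO()
written = SeqIO.write(records, tab_handle, "tab")
tab_text = tab_handle.getvalue()

print(written)   # number of records written
print(tab_text)
```

SeqIO.write() returns the number of records it wrote, which is a convenient sanity check after a conversion.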

Step 2: Initialize a List to Store Sequences

sequences = []

We create an empty list called sequences to store the DNA sequences extracted from the FASTA file.

Step 3: Parse the FASTA File

for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
    sequences.append(seq_record.seq)

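SeqIO.parse() returns an iterator that reads the FASTA file one record at a time, so the whole file is never held in memory by the parser itself.
Each record (seq_record) is a SeqRecord object that contains, among other attributes:
- seq_record.id → Identifier of the sequence (the text after ">" up to the first space).
- seq_record.seq → The DNA sequence, as a Seq object.
We append only the sequence (seq_record.seq) to our sequences list, so the identifiers are not kept.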
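If the identifiers are needed later, one option is to store each sequence under its id. The sketch below is a variation on the loop above, not the content of parser.py: the FASTA text is made up, and sequences_by_id is a hypothetical name chosen for this example.

```python
from io import StringIO

from Bio import SeqIO

# Made-up FASTA text standing in for ls_orchid.fasta.
fasta_text = ">rec1 first example\nACGTAC\n>rec2 second example\nGGCCTT\n"

# SeqIO.parse returns an iterator; records are produced on demand.
records = SeqIO.parse(StringIO(fasta_text), "fasta")
first = next(records)
print(first.id)           # text before the first space in the header
print(first.description)  # the full header line without ">"
print(len(first.seq))     # Seq objects support len()

# Keep the identifier with each sequence instead of the sequence alone.
sequences_by_id = {first.id: str(first.seq)}
for seq_record in records:  # continues with the remaining records
    sequences_by_id[seq_record.id] = str(seq_record.seq)

print(sequences_by_id)
```

A dictionary like this assumes the ids are unique within the file; duplicate ids would silently overwrite earlier entries.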

Step 4: Output

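After running the script, the list sequences holds every DNA sequence from the FASTA file as a Seq object. The steps above do not print anything; adding print(sequences) at the end displays the list.

Example output (truncated, first sequence shown):

[Seq('CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTGATGAGACCGTGG...CGC'), ...]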
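Printing the whole list is rarely the most readable check. A minimal sketch of a summary helper follows; summarize_sequences is a hypothetical name, not part of Biopython or parser.py, and plain strings stand in for Seq objects so the example runs on its own.

```python
def summarize_sequences(sequences, preview=3):
    """Return the number of sequences and the lengths of the first few."""
    lengths = [len(seq) for seq in sequences[:preview]]
    return len(sequences), lengths


# Plain strings stand in for Seq objects here; both support len().
example = ["ACGTAC", "GGCC", "TTTAAACC", "AC"]
count, first_lengths = summarize_sequences(example)
print(count, first_lengths)
```

Passing the sequences list built in Step 3 instead of example should work the same way, since the helper only relies on len() and list slicing.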

⚡ Usage

Install Biopython:

pip install biopython
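To confirm that the installation worked, a quick check is to import the package and print its version. This only assumes that the interpreter being run is the same one pip installed into.

```python
import Bio

# Prints the installed Biopython version string.
print(Bio.__version__)
```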

Download the FASTA file (a Python equivalent of wget):

import urllib.request

url = "https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta"
urllib.request.urlretrieve(url, "ls_orchid.fasta")
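When the script is run repeatedly, it can help to skip the download if the file is already present and to check that the result looks like FASTA. The sketch below is one way to do that; download_if_missing and looks_like_fasta are hypothetical helper names introduced for this example.

```python
import os
import urllib.request


def download_if_missing(url, path):
    """Download url to path unless the file already exists.

    Returns True if a download happened, False otherwise.
    """
    if os.path.exists(path):
        return False
    urllib.request.urlretrieve(url, path)
    return True


def looks_like_fasta(path):
    """Rough check: the first non-empty line starts with '>'."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                return line.startswith(">")
    return False
```

Calling download_if_missing(url, "ls_orchid.fasta") with the url defined above, followed by looks_like_fasta("ls_orchid.fasta"), would fetch the file once and flag an obviously wrong download such as an HTML error page. The check is deliberately rough and does not validate the sequences themselves.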

Run the parser script to load the sequences:

python parser.py

✅ Applications

DNA sequence analysis
Motif finding
Sequence alignment
Bioinformatics pipelines
