Introduction to Biopython’s SeqIO Module
- Biopython’s SeqIO module provides a powerful and flexible interface for reading and writing biological sequences.
- It supports various file formats commonly used in bioinformatics, including FASTA, GenBank, and more.
- SeqIO simplifies the handling of sequences, allowing easy access to sequence data and associated metadata.
Reading Sequences with SeqIO
- SeqIO provides methods to read sequences from files in different formats.
- SeqIO.read() reads a single sequence record from a file.
- SeqIO.parse() reads multiple sequence records from a file.
from Bio import SeqIO
file_path = "sequence.fasta"
record = SeqIO.read(file_path, "fasta")
print("Header:", record.id)
print("Sequence:", record.seq)
- SeqIO.read() reads a single sequence record from a file.
- The file_path specifies the path to the sequence file, and “fasta” indicates the file format.
- The returned record object contains the sequence and associated metadata.
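To make the "single record" behaviour concrete, here is a minimal sketch. It assumes Biopython is installed and uses hypothetical in-memory FASTA text (via io.StringIO) in place of a real file, so the record names and sequences are illustrative only. It shows a few attributes of the returned record and what happens when SeqIO.read() is given more than one record.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical in-memory FASTA data, used here instead of a file on disk.
single = StringIO(">seq1 example record\nATCGATCG\n")
record = SeqIO.read(single, "fasta")

print("ID:", record.id)                    # first word of the header line
print("Description:", record.description)  # header line without the leading ">"
print("Length:", len(record.seq))          # number of residues in the sequence

# SeqIO.read() expects exactly one record; two records raise ValueError.
double = StringIO(">seq1\nATCG\n>seq2\nGGCC\n")
try:
    SeqIO.read(double, "fasta")
    read_failed = False
except ValueError:
    read_failed = True

print("read() rejected multi-record input:", read_failed)
```

When a file may hold more than one record, SeqIO.parse() is usually the safer choice.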
Reading Multiple Sequences
from Bio import SeqIO
file_path = "sequences.fasta"
records = SeqIO.parse(file_path, "fasta")
for record in records:
    print("Header:", record.id)
    print("Sequence:", record.seq)
    print()
- SeqIO.parse() reads multiple sequence records from a file.
- The file_path specifies the path to the sequence file, and “fasta” indicates the file format.
- The returned records object is an iterator that can be looped over to access each sequence record.
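Because SeqIO.parse() returns an iterator, the records are consumed as they are read. The sketch below illustrates one consequence and one common workaround, assuming Biopython is installed. The FASTA text is hypothetical in-memory data, and the variable names are chosen only for illustration.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical two-record FASTA text, held in memory for the example.
fasta_text = ">seq1\nATCG\n>seq2\nGGCCTT\n"

records = SeqIO.parse(StringIO(fasta_text), "fasta")
first_pass = [rec.id for rec in records]
second_pass = [rec.id for rec in records]  # the iterator is already exhausted

print("First pass:", first_pass)
print("Second pass:", second_pass)

# For repeated or random access, load the records into a dictionary keyed by ID.
by_id = SeqIO.to_dict(SeqIO.parse(StringIO(fasta_text), "fasta"))
print("Length of seq2:", len(by_id["seq2"].seq))
```

Loading everything with SeqIO.to_dict() or list() keeps all records in memory, so for very large files looping over the iterator once is generally preferable.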
Writing Sequences with SeqIO
- SeqIO.write() is used to write sequences to a file in a specified format.
- The function takes one or more sequence records, the output file name or handle, and the name of the output format.
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
sequence = "ATCGATCGATCG"
record = SeqRecord(Seq(sequence), id="Seq1", description="Sample sequence")
output_file = "output.fasta"
SeqIO.write(record, output_file, "fasta")
- Import the necessary modules from Biopython: SeqIO, Seq, and SeqRecord.
- Define the DNA sequence as a string: sequence = "ATCGATCGATCG".
- Create a SeqRecord object using the sequence string, and provide an ID and description for the sequence.
- Specify the output file name: output_file = "output.fasta".
- Use the SeqIO.write() function to write the sequence record to the output file, using the “fasta” format specifier.
After the code runs, the output.fasta file will be created with the specified sequence in FASTA format.
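SeqIO.write() also accepts a list of records and returns the number of records it wrote. The following sketch, assuming Biopython is installed, writes two records to an in-memory buffer (io.StringIO) rather than to output.fasta, then parses the result back. The second record, Seq2, is a hypothetical addition for illustration.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Seq1 mirrors the example above; Seq2 is a made-up record for illustration.
records = [
    SeqRecord(Seq("ATCGATCGATCG"), id="Seq1", description="Sample sequence"),
    SeqRecord(Seq("GGCCTTAA"), id="Seq2", description="Second sample"),
]

# Write to an in-memory handle; a file name such as "output.fasta" works too.
buffer = StringIO()
count = SeqIO.write(records, buffer, "fasta")
print("Records written:", count)
print(buffer.getvalue())

# Parse the written text back to confirm both records survived the round trip.
round_trip = [rec.id for rec in SeqIO.parse(StringIO(buffer.getvalue()), "fasta")]
print("Round trip IDs:", round_trip)
```

Checking the returned count is a cheap way to confirm that every record reached the output.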