Biopython Fundamentals

Introduction to Sequence Feature Visualization

  • Sequence feature visualization draws the annotated regions of a sequence, such as genes and coding sequences, at their coordinates along that sequence.
  • Seeing features laid out this way makes their structure, function, and relationships to neighboring features easier to understand.

Importance of Sequence Feature Visualization

  • Visualization enhances the interpretation and analysis of complex genomic data.
  • It helps in identifying patterns, motifs, and variations within sequences.

Types of Sequence Feature Visualization

  1. Annotated Sequence Plots:

    • Display the annotated regions along with the sequence.
    • Use different colors or symbols to represent different feature types.
  2. Circular Plots:

    • Represent the sequence as a circular layout.
    • Show features as arcs or lines around the circle.
  3. Sequence Diagrams:

    • Visualize features using diagrams, such as bar charts or heatmaps.
    • Provide a compact representation of complex feature data.

Annotated Sequence Plots


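from Bio import SeqIO
from Bio.Graphics import GenomeDiagram

genbank_file = "sequence.gb"

# Load the single annotated record stored in the GenBank file.
record = SeqIO.read(genbank_file, "genbank")

# A diagram holds tracks, and each track holds one or more feature sets.
gd_diagram = GenomeDiagram.Diagram("Sequence Features")
gd_track = gd_diagram.new_track(1, name="Features")
gd_feature_set = gd_track.new_set()

# Color coding sequences blue and every other feature type red.
for feature in record.features:
    color = "blue" if feature.type == "CDS" else "red"
    gd_feature_set.add_feature(feature, color=color, label=True)

# Lay the sequence out on an A4 page, split into 4 stacked fragments.
gd_diagram.draw(format="linear", pagesize="A4", fragments=4, start=0, end=len(record))
# PNG output relies on ReportLab's bitmap renderer; "PDF" and "SVG" are also accepted.
gd_diagram.write("annotated_sequence.png", "PNG")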
  • Read a GenBank file using the SeqIO.read() function.
  • Create a GenomeDiagram object and add a track and feature set.
  • Iterate over each feature in the record.
  • Set the color based on the feature type.
  • Add the feature to the feature set with labels.
  • Draw the diagram and save it as an image file.
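
The two-color rule above extends naturally to a lookup table once a record carries more than CDS features. The following is a minimal sketch, not part of Biopython: `FEATURE_COLORS`, `DEFAULT_COLOR`, `SKIPPED_TYPES`, and `color_for` are hypothetical names, and the color choices are arbitrary. It assumes the color names are ones ReportLab recognizes, and that the `source` feature, which in a GenBank record typically spans the whole sequence, is better left out of the drawing.

```python
# Hypothetical lookup table: GenBank feature type -> color name.
FEATURE_COLORS = {
    "CDS": "blue",
    "gene": "green",
    "tRNA": "orange",
    "rRNA": "purple",
}
DEFAULT_COLOR = "red"       # fallback for any type not listed above
SKIPPED_TYPES = {"source"}  # assumption: hide the feature spanning the whole record


def color_for(feature_type):
    """Return a color name for a feature type, or None if it should be skipped."""
    if feature_type in SKIPPED_TYPES:
        return None
    return FEATURE_COLORS.get(feature_type, DEFAULT_COLOR)


if __name__ == "__main__":
    for ftype in ["source", "gene", "CDS", "misc_feature"]:
        print(ftype, "->", color_for(ftype))
```

Inside the loop over `record.features`, this would replace the conditional expression: call `color_for(feature.type)`, `continue` when it returns `None`, and otherwise pass the result as `color=` to `add_feature`.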

Circular Plots

from Bio import SeqIO
from Bio.Graphics import GenomeDiagram
from reportlab.lib.units import cm  # unit used for the page size below

genbank_file = "sequence.gb"

record = SeqIO.read(genbank_file, "genbank")

gd_diagram = GenomeDiagram.Diagram("Circular Plot")
gd_track = gd_diagram.new_track(1, name="Features")
gd_feature_set = gd_track.new_set()

for feature in record.features:
    color = "blue" if feature.type == "CDS" else "red"
    gd_feature_set.add_feature(feature, color=color, label=True)

# circular=True marks the sequence itself as circular (for example, a plasmid).
gd_diagram.draw(format="circular", circular=True, pagesize=(20 * cm, 20 * cm), start=0, end=len(record))
gd_diagram.write("circular_plot.png", "PNG")
  • Read a GenBank file using the SeqIO.read() function.
  • Create a GenomeDiagram object and add a track and feature set.
  • Iterate over each feature in the record.
  • Set the color based on the feature type.
  • Add the feature to the feature set with labels.
  • Draw the circular plot and save it as an image file.
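
To make the arc idea concrete: in a circular layout, a position along the sequence becomes an angle around the circle, so a feature becomes the arc between its start and end angles. The sketch below is plain Python and says nothing about how GenomeDiagram computes its layout internally. `position_to_angle` and `feature_arc` are hypothetical helpers, and measuring angles in degrees from position 0 is a convention chosen only for illustration.

```python
def position_to_angle(position, sequence_length):
    """Map a 0-based sequence position to degrees around the circle."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 360.0 * position / sequence_length


def feature_arc(start, end, sequence_length):
    """Return (start_angle, end_angle, sweep) in degrees for a feature.

    A feature whose end is smaller than its start is treated as wrapping
    past the origin of the circular sequence.
    """
    span = end - start
    if span < 0:
        span += sequence_length  # feature crosses the origin
    sweep = 360.0 * span / sequence_length
    return (
        position_to_angle(start, sequence_length),
        position_to_angle(end, sequence_length),
        sweep,
    )


if __name__ == "__main__":
    # Hypothetical 10,000 bp plasmid with a feature from 2,500 to 5,000.
    print(feature_arc(2500, 5000, 10000))
```

A feature covering one quarter of the sequence therefore sweeps a quarter of the circle, which is why long features dominate a circular plot and very short ones can shrink to thin lines.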

Sequence Diagrams

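from Bio import SeqIO
from Bio.Graphics import GenomeDiagram
from reportlab.lib.units import cm

genbank_file = "sequence.gb"

record = SeqIO.read(genbank_file, "genbank")

# GenomeDiagram also handles this compact view: one diagram, one track, one feature set.
diagram = GenomeDiagram.Diagram("Sequence Diagram")
feature_set = diagram.new_track(1, name="Features").new_set()

for feature in record.features:
    color = "blue" if feature.type == "CDS" else "red"
    feature_set.add_feature(feature, color=color)

# A single unbroken fragment on a small landscape page gives a compact overview strip.
diagram.draw(format="linear", orientation="landscape", pagesize=(10 * cm, 5 * cm), fragments=1, start=0, end=len(record))
diagram.write("sequence_diagram.png", "PNG")
  • Read a GenBank file using the SeqIO.read() function.
  • Create a GenomeDiagram object with one track and one feature set.
  • Iterate over each feature in the record.
  • Set the color based on the feature type.
  • Add the feature to the feature set, without labels to keep the diagram compact.
  • Draw the sequence diagram as a single fragment and save it as an image file.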
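
A heatmap-style view needs numbers rather than boxes. One possible approach is to split the sequence into fixed-size windows and measure how much of each window is covered by features. The sketch below does that in plain Python. `window_coverage` and `text_strip` are hypothetical helpers, features are given as `(start, end)` tuples standing in for `feature.location.start` and `feature.location.end`, and overlapping features are counted once per base.

```python
def window_coverage(features, sequence_length, window_size):
    """Return the fraction of each window covered by at least one feature.

    features: iterable of (start, end) pairs, 0-based, end-exclusive.
    Uses one flag per base, so it is meant for small sequences only.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    covered = [False] * sequence_length
    for start, end in features:
        for pos in range(max(0, start), min(end, sequence_length)):
            covered[pos] = True
    fractions = []
    for window_start in range(0, sequence_length, window_size):
        window = covered[window_start:window_start + window_size]
        fractions.append(sum(window) / len(window))
    return fractions


def text_strip(fractions, shades=" .:#"):
    """Render coverage fractions as a one-line, heatmap-like text strip."""
    top = len(shades) - 1
    # Bucket each fraction into one of the shade characters.
    return "".join(shades[min(int(f * len(shades)), top)] for f in fractions)


if __name__ == "__main__":
    # Hypothetical 100 bp sequence with two features.
    demo = window_coverage([(0, 30), (50, 60)], 100, 20)
    print(demo)
    print(text_strip(demo))
```

The list of fractions could equally be handed to a plotting library as the single row of a heatmap; the text strip is only there to keep the sketch free of extra dependencies.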

Summary

  • Visualizing sequence features enhances data interpretation and analysis.
  • Biopython's GenomeDiagram module (Bio.Graphics.GenomeDiagram) generates linear and circular diagrams of annotated sequences and writes them to image or PDF files.
  • Choose the appropriate visualization technique based on the nature of the data and analysis goals.