Objectives
- Understand the concept and significance of biological databases in bioinformatics.
- Explore different types of biological databases and their applications.
- Learn how to access and retrieve data from biological databases using Biopython.
Introduction to Biological Databases:
- Biological databases store and organize vast amounts of biological data, such as DNA sequences, protein structures, and gene annotations.
- They serve as valuable resources for researchers, providing access to a wide range of biological information.
Types of Biological Databases:
- Sequence Databases (e.g., GenBank, UniProt)
- Structure Databases (e.g., Protein Data Bank (PDB))
- Gene Expression Databases (e.g., GEO, ArrayExpress)
- Pathway Databases (e.g., KEGG, Reactome)
- Functional Annotation Databases (e.g., Gene Ontology (GO))
Importance of Biological Databases:
- Biological databases facilitate data storage, retrieval, and analysis, enabling researchers to make discoveries and gain insights.
- They support various bioinformatics tasks, including sequence alignment, protein structure prediction, and functional annotation.
Accessing Biological Databases with Biopython:
- Biopython provides modules and functions to access and retrieve data from various biological databases.
- Different modules cover different resources, for example Bio.Entrez for NCBI databases such as GenBank and Bio.PDB for the Protein Data Bank.
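For a sequence database, retrieval might look like the minimal sketch below. It assumes Biopython is installed and network access is available. The helper names efetch_params and fetch_genbank_record are hypothetical, and the accession and e-mail address are placeholders to be supplied by the caller. Bio.Entrez.efetch and Bio.SeqIO.read are existing Biopython functions.

```python
def efetch_params(accession, db="nucleotide", rettype="gb"):
    """Build the keyword arguments for an Entrez efetch request.

    Hypothetical helper: it only assembles a dictionary and does not
    contact NCBI.
    """
    return {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}


def fetch_genbank_record(accession, email):
    """Download one GenBank record and parse it into a SeqRecord.

    Assumes Biopython is installed and NCBI is reachable.
    """
    # Imported here so the helper above can be used without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks callers to identify themselves
    handle = Entrez.efetch(**efetch_params(accession))
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()
```

A call such as fetch_genbank_record(accession, "you@example.org") would return a SeqRecord whose id, description, and seq attributes hold the retrieved data.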
Using Biopython to Access Structure Databases
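from Bio.PDB import PDBList, PDBParser

# Download the entry in legacy PDB format into the current directory.
# retrieve_pdb_file() returns the path of the saved file.
# "1abc" is the example identifier; substitute the entry you need.
pdblist = PDBList()
pdb_file = pdblist.retrieve_pdb_file("1abc", pdir=".", file_format="pdb")

# Parse the file into a Structure object (QUIET=True hides parser warnings).
parser = PDBParser(QUIET=True)
structure = parser.get_structure("1abc", pdb_file)

# Walk the hierarchy: structure -> model -> chain -> residue -> atoms.
model = structure[0]
chain = model['A']  # assumes the entry contains a chain named "A"
residue = chain[1]  # assumes chain A has a residue numbered 1
atoms = residue.get_atoms()  # iterator over the residue's Atom objects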
- Create a PDBList object from Bio.PDB to access the Protein Data Bank (PDB).
- Use retrieve_pdb_file() to download a specific PDB entry (e.g., “1abc”).
- Locate the downloaded file for that identifier (e.g., “1abc”); its path is what the parser reads.
- Parse the PDB file using PDBParser() from Bio.PDB.
- Access and analyze the structure components (model, chain, residue, atoms).
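The last step can be made concrete with a short sketch that walks the whole hierarchy instead of picking a single residue. The function names count_components and load_structure are hypothetical. The sketch relies only on the fact that iterating a Bio.PDB structure yields models, a model yields chains, a chain yields residues, and a residue yields atoms.

```python
def count_components(structure):
    """Count the models, chains, residues, and atoms in a structure.

    Works on any nested iterable shaped like a Bio.PDB Structure
    (structure -> models -> chains -> residues -> atoms).
    """
    counts = {"models": 0, "chains": 0, "residues": 0, "atoms": 0}
    for model in structure:
        counts["models"] += 1
        for chain in model:
            counts["chains"] += 1
            for residue in chain:
                counts["residues"] += 1
                for _atom in residue:
                    counts["atoms"] += 1
    return counts


def load_structure(pdb_id, path):
    """Parse a PDB-format file; assumes Biopython is installed."""
    from Bio.PDB import PDBParser

    return PDBParser(QUIET=True).get_structure(pdb_id, path)
```

For example, count_components(load_structure("1abc", pdb_file)) would report how many of each component the downloaded entry contains.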
Summary
- Biological databases are crucial resources for storing and accessing biological data.
- Biopython provides modules and functions to query and retrieve data from various biological databases.
- Explore the functionality of Biopython to access sequence databases, structure databases, and other biological data resources.