For this study we want to find protein families with high similarity, then pick a pair of proteins and perform a structural alignment in PyMOL (MMTK might also be used). For a general reference on protein similarity networks, see: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3031030/.
For the first part of our study we run a BLAST alignment on the human proteome (note that BLAST computes local rather than global alignments). The peptide data is in your folder. Read how to do this here: https://www.biostars.org/p/6541/
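As a rough sketch of this step, the snippet below builds the two BLAST+ command lines for searching a peptide file against itself. It assumes the intended search is proteome-against-proteome and that the `makeblastdb` and `blastp` programs from BLAST+ are installed; the file names `human_peptides.fa`, `human_pep` and `hits.tsv` and the E-value cutoff are placeholders, not values from the assignment.

```python
import subprocess


def blast_commands(fasta, db_name, out_tsv, evalue="1e-5"):
    """Return the makeblastdb and blastp argument lists for a self-search."""
    # Build a protein database from the peptide FASTA file.
    make_db = ["makeblastdb", "-in", fasta, "-dbtype", "prot", "-out", db_name]
    # Search the same file against that database; -outfmt 6 gives tab-separated hits.
    search = ["blastp", "-query", fasta, "-db", db_name,
              "-evalue", evalue, "-outfmt", "6", "-out", out_tsv]
    return make_db, search


def run_blast(fasta, db_name, out_tsv):
    """Run both commands in order (requires BLAST+ on the PATH)."""
    for command in blast_commands(fasta, db_name, out_tsv):
        subprocess.run(command, check=True)


# Hypothetical file names, shown only to illustrate the resulting commands.
make_db, search = blast_commands("human_peptides.fa", "human_pep", "hits.tsv")
print(" ".join(make_db))
print(" ".join(search))
```

Calling `run_blast(...)` with your real peptide file would execute the search; on a whole proteome this can take a long time, so it may be worth testing on a small subset first.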
Then select a few high-scoring candidate pairs and choose one of them. Perform the structural alignment on that pair in Python/PyMOL.
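One possible way to pick candidate pairs is to rank the BLAST hits by bit score. The sketch below assumes the hits were written in the default tabular format (`-outfmt 6`, where column 12 is the bit score); the helper name `top_pairs` and the toy hit table are made up for illustration.

```python
import csv
import io


def top_pairs(tabular_text, n=5):
    """Return the n best-scoring distinct protein pairs from BLAST -outfmt 6 text."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query == subject:
            continue  # a protein always matches itself; skip self-hits
        # Treat A-vs-B and B-vs-A as the same pair and keep the higher score.
        pair = tuple(sorted((query, subject)))
        if bitscore > best.get(pair, 0.0):
            best[pair] = bitscore
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:n]


# Toy hit table with invented identifiers and scores.
example = "\n".join([
    "P1\tP1\t100.0\t70\t0\t0\t1\t70\t1\t70\t1e-50\t150.0",
    "P1\tP2\t95.0\t70\t3\t0\t1\t70\t1\t70\t1e-40\t130.0",
    "P2\tP1\t95.0\t70\t3\t0\t1\t70\t1\t70\t1e-40\t128.0",
    "P3\tP4\t60.0\t50\t20\t0\t1\t50\t1\t50\t1e-10\t55.5",
])
pairs = top_pairs(example, n=2)
print(pairs)
```

For a structural alignment both proteins of the chosen pair also need solved structures, so a high score alone is not enough to settle on a pair.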
One way to perform structural alignment in BioPython, shown below, is taken from the blog of a Danish computational chemist. It is neat, but I would prefer that you use PyMOL instead and also make a nice visualization of the aligned structures. The PyMOL align command (http://www.pymolwiki.org/index.php/Align) does the structural alignment. Alternatively, you can open the BioPython output in any visualization program, preferably from Python.
In [ ]:
import Bio.PDB
# Select which residue numbers you wish to align
# and put them in a range
start_id = 1
end_id = 70
atoms_to_be_aligned = range(start_id, end_id + 1)
# Start the parser
pdb_parser = Bio.PDB.PDBParser(QUIET = True)
# Get the structures
ref_structure = pdb_parser.get_structure("reference", "1D3Z.pdb")
sample_structure = pdb_parser.get_structure("sample", "1UBQ.pdb")
# Use the first model in the pdb-files for alignment
# Change the number 0 if you want to align to another model
ref_model = ref_structure[0]
sample_model = sample_structure[0]
# Make a list of the atoms (in the structures) you wish to align.
# In this case we use CA atoms whose index is in the specified range
ref_atoms = []
sample_atoms = []
# Iterate over all chains in the model in order to find all residues
for ref_chain in ref_model:
    # Iterate over all residues in each chain in order to find the proper atoms
    for ref_res in ref_chain:
        # Check if the residue number ( .get_id()[1] ) is in the range
        if ref_res.get_id()[1] in atoms_to_be_aligned:
            # Append the CA atom to the list
            ref_atoms.append(ref_res['CA'])
# Do the same for the sample structure
for sample_chain in sample_model:
    for sample_res in sample_chain:
        if sample_res.get_id()[1] in atoms_to_be_aligned:
            sample_atoms.append(sample_res['CA'])
# Now we initiate the superimposer:
super_imposer = Bio.PDB.Superimposer()
super_imposer.set_atoms(ref_atoms, sample_atoms)
super_imposer.apply(sample_model.get_atoms())
# Print RMSD:
print(super_imposer.rms)
# Save the aligned version of 1UBQ.pdb
io = Bio.PDB.PDBIO()
io.set_structure(sample_structure)
io.save("1UBQ_aligned.pdb")
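For the PyMOL route, a minimal sketch is to generate a short PyMOL command script that loads both structures, aligns them on their CA atoms with `align`, and renders an image. The function name `pymol_align_script`, the object names, the colors and the output file `aligned.png` are illustrative choices; the two PDB file names are the ones used in the BioPython example.

```python
def pymol_align_script(ref_pdb, sample_pdb, image_path):
    """Build a PyMOL command script that aligns the sample onto the reference."""
    lines = [
        "load {}, reference".format(ref_pdb),
        "load {}, sample".format(sample_pdb),
        # align takes the mobile selection first, then the target selection
        "align sample and name CA, reference and name CA",
        "hide everything",
        "show cartoon",
        "color blue, reference",
        "color red, sample",
        "orient",
        # ray=1 asks PyMOL to ray-trace the image before saving it
        "png {}, dpi=300, ray=1".format(image_path),
    ]
    return "\n".join(lines) + "\n"


script = pymol_align_script("1D3Z.pdb", "1UBQ.pdb", "aligned.png")
print(script)
```

Saving the printed text as, say, `align.pml` and running `pymol -cq align.pml` should execute it without opening the GUI, assuming PyMOL is installed; the same commands can also be typed directly into the PyMOL command line.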