The Molecule object has the following architecture:
Atom, Residue and Model objects are stored in a Python dict held by their parent container, and they can be accessed within a Molecule object as Python attributes. A Model object is normally assigned the key ‘m’ + number, such as ‘m1’ or ‘m100’. Given a Molecule object named ‘mol’, its 1st Model object is ‘mol.m1’. Residue objects within a model are assigned the key chain id + residue serial: the 1st residue of chain A has the key ‘A1’ and can be accessed as ‘mol.m1.A1’. The name of an atom in its 3D coordinate record is used as its key, so an Atom object named ‘CA’ in residue ‘A1’ of the 1st model can be accessed as ‘mol.m1.A1.CA’. However, some atom names end with a quote; in this case the quote is replaced by ‘_’ (underscore). E.g., the key for "C5'" is ‘C5_’.
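The access scheme described above can be illustrated with a small stand-alone sketch. This is not SBio's actual implementation: the `Container` class, its `add` method and the `children` dict are hypothetical names chosen for illustration. It only shows how dict-stored children can be exposed as attributes, and why a trailing quote has to be replaced by an underscore to form a valid attribute name.

```python
class Container:
    """Hypothetical stand-in for a Molecule, Model or Residue container."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # key -> child object

    def add(self, key, child):
        # Quotes are not allowed in Python identifiers, so replace them.
        safe_key = key.replace("'", "_").replace('"', "_")
        self.children[safe_key] = child
        return safe_key

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["children"][key]
        except KeyError:
            raise AttributeError(key) from None


mol = Container("mol")
m1 = Container("m1")
a1 = Container("A1")
mol.add("m1", m1)
m1.add("A1", a1)
a1.add("CA", "atom CA")
key = a1.add("C5'", "atom C5'")

print(key)            # C5_
print(mol.m1.A1.CA)   # atom CA
print(mol.m1.A1.C5_)  # atom C5'
```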
In [2]:
from SBio import *
mol = create_molecule('test.pdb')
mol
Out[2]:
Navigate through a Molecule object:
In [2]:
for m in mol.get_model():
    for r in m.get_residue():
        print(r)
"get_model", "get_atom" and "get_residue" are Python generators and can be used more conveniently like this:
In [3]:
atoms=mol.m1.get_atom()
residue=mol.m1.get_residue()
for r in residue:
    print(r)
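One practical consequence of these methods being generators: assuming they behave like ordinary Python generators, a stored result such as `residue` can be iterated only once. The sketch below uses a hypothetical stand-in function rather than SBio itself to show the behaviour.

```python
def get_residue(names):
    """Hypothetical generator standing in for Model.get_residue()."""
    for name in names:
        yield name


residues = get_residue(["A1", "A2", "A3"])
first_pass = [r for r in residues]
second_pass = [r for r in residues]  # the generator is already exhausted

# Call the method again, or keep a list, when several passes are needed.
fresh = list(get_residue(["A1", "A2", "A3"]))

print(first_pass)   # ['A1', 'A2', 'A3']
print(second_pass)  # []
print(fresh)        # ['A1', 'A2', 'A3']
```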
In [4]:
mol.write_pdb('mol_new.pdb')  # write all conformations into a single pdb file
i = 1
for m in mol.get_model():
    name = 'mol_m' + str(i) + '.pdb'
    m.write_pdb(name)  # write one conformation to its own pdb file
    i += 1
The 'Model' module provides several methods for extracting information about a molecule:
In [10]:
m1 = mol.m1
print(m1.get_atom_num())
print(m1.get_residue_list())
print(m1.get_sequence('A')) #the sequence of chain A
m1.write_fasta('A', 'test.fasta', comment='test')
m1.get_mw()
Out[10]:
In [6]:
a1 = mol.m1.A2.O4_ # the actual name for this atom is "O4'"
a2 = mol.m1.A2.C1_
a3 = mol.m1.A2.N9
a4 = mol.m1.A2.C4
get_distance(a1, a2)
Out[6]:
In [7]:
get_angle(a1, a2, a3)
Out[7]:
In [8]:
chi = get_torsion(a1,a2,a3,a4)
print('the CHI torsion angle is {}'.format(chi))
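For readers who want to see what these three quantities mean geometrically, here is a minimal pure-Python sketch that works on plain (x, y, z) tuples. It is not the SBio implementation: the function names are hypothetical, and returning angles in degrees is an assumption, so check which units `get_angle` and `get_torsion` actually return.

```python
import math


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(dot(u, u))


def distance(p1, p2):
    """Euclidean distance between two points."""
    return norm(sub(p1, p2))


def angle(p1, p2, p3):
    """Angle at p2 formed by p1-p2-p3, in degrees."""
    u, v = sub(p1, p2), sub(p3, p2)
    cos_t = dot(u, v) / (norm(u) * norm(v))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding errors
    return math.degrees(math.acos(cos_t))


def torsion(p1, p2, p3, p4):
    """Signed dihedral angle about the p2-p3 bond, in degrees."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    x = dot(n1, n2)
    y = norm(b2) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))


d = distance((0, 0, 0), (3, 4, 0))
ang = angle((1, 0, 0), (0, 0, 0), (0, 1, 0))
tor = torsion((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(d, ang, tor)
```

Using `atan2` for the torsion keeps the sign of the angle, which `acos` alone would lose.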
The 'Interaction' module provides several methods to check the interaction between atoms:
In [11]:
a5 = m1.A2.N6
a6 = m1.A2.H61
a7 = m1.A1.O2
print(get_hydrogen_bond(a5, a7, a6)) #arguments order: donor, acceptor, donor_H=None
print(get_polar_interaction(a5, a7))
In [3]:
m1 = mol.m1
m2 = mol.m2
molecule_list = [m1,m2]
residue_range = [[1,2],[1,2]]
sa = Structural_alignment(molecule_list, residue_range, update_coord=False)
sa.run()
print(sa.rmsd)
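As a reminder of what an RMSD value measures, the sketch below computes the root-mean-square deviation over paired coordinates. It deliberately omits the superposition step that a structural alignment performs first, and the `rmsd` function is a hypothetical helper, not part of SBio.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of paired (x, y, z) points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared = sum((a - b) ** 2
                  for p, q in zip(coords_a, coords_b)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(coords_a))


set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(set_a, set_b)
print(value)  # 1.0
```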
The 'Structural_alignment' module can process a multiple sequence alignment to map residues between different molecules, i.e., to get the residue serials of the residues conserved among them. The conserved residue serials can then be used in the structural alignment.
In [6]:
seq = 'D:\\python\\structural bioinformatics_in_python\\PPO-crystal.clustal'
alignment=Seq_align_parser(seq)
alignment.run()
con_res = []
con_res.append(alignment.align_res_mask['O24164|PPOM_TOBAC '])
con_res.append(alignment.align_res_mask['P56601|PPOX_MYXXA '])
print(con_res[0]) #conserved residue in 'PPOM_TOBAC'
print(con_res[1])
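The idea of mapping conserved alignment columns back to residue serials can be sketched without SBio. The function below is a hypothetical illustration, not how `Seq_align_parser` is implemented. It assumes two gapped sequences of equal length, '-' as the gap character, and residue serials that start at 1 for the first residue of each sequence.

```python
def conserved_serials(aligned_a, aligned_b, gap="-"):
    """Return (serial_a, serial_b) pairs for columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    serial_a = serial_b = 0
    pairs = []
    for res_a, res_b in zip(aligned_a, aligned_b):
        # A serial only advances when the column holds a real residue.
        if res_a != gap:
            serial_a += 1
        if res_b != gap:
            serial_b += 1
        if res_a != gap and res_a == res_b:
            pairs.append((serial_a, serial_b))
    return pairs


pairs = conserved_serials("MK-LV", "MKALI")
print(pairs)  # [(1, 1), (2, 2), (3, 4)]
```

The two serial lists obtained this way play the role of a `residue_range`-style selection: they name the same conserved positions in each molecule even when gaps shift the numbering.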
The 'Structural_analysis' and 'Plot' modules provide methods for simple structural analysis and visualization of nucleic acids and proteins. For example, the backbone torsion angles and the sugar pucker of nucleotides can be computed and plotted (see the examples for more detail).