We hereby demonstrate how to use the API to assign PB sequences.
In [1]:
from pprint import pprint
import urllib.request
import os
# print date & versions
import datetime
print("Date & time:",datetime.datetime.now())
import sys
print("Python version:", sys.version)
In [2]:
import pbxplore as pbx
print("PBxplore version:", pbx.__version__)
The pbxplore.chains_from_files()
function is the prefered way to read PDB and PDBx/mmCIF files using PBxplore. This function takes a list of file path as argument, and yield each chain it can read from these files. It provides a single interface to read PDB and PDBx/mmCIF files, to read single model and multimodel files, and to read a single file of a collection of files.
Here we want to read a single file with a single model and a single chain. Therefore, we need the first and only record that is yield by pbxplore.chains_from_files()
. This record contains a name for the chain, and the chain itself as a pbxplore.structure.structure.Chain
object. Note that, even if we want to read a single file, we need to provide it as a list to pbxplore.chains_from_files()
.
In [3]:
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/1BTA.pdb', '1BTA.pdb')
structure_reader = pbx.chains_from_files([pdb_name])
chain_name, chain = next(structure_reader)
print(chain_name)
print(chain)
Protein Blocks are assigned based on the dihedral angles of the backbone. So we need to calculate them. The pbxplore.structure.structure.Chain.get_phi_psi_angles()
methods calculate these angles and return them in a form that can be directly provided to the assignement function.
The dihedral angles are returned as a dictionnary. Each key of this dictionary is a residue number, and each value is a dictionary with the phi and psi angles.
In [4]:
dihedrals = chain.get_phi_psi_angles()
pprint(dihedrals)
The dihedral angles can be provided to the pbxplore.assign()
function that assigns a Protein Block to each residue, and that returns the PB sequence as a string. Note that the first and last two residues are assigned to the Z
jocker block as some dihedral angles cannot be calculated.
In [5]:
pb_seq = pbx.assign(dihedrals)
print(pb_seq)
A single PDB file can contain several models. Then, we do not want to read only the first chain. Instead, we want to iterate over all the chains.
In [6]:
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')
for chain_name, chain in pbx.chains_from_files([pdb_name]):
dihedrals = chain.get_phi_psi_angles()
pb_seq = pbx.assign(dihedrals)
print('* {}'.format(chain_name))
print(' {}'.format(pb_seq))
The pbxplore.chains_from_files()
function can also handle several chains from several files.
In [7]:
import glob
files = ['1BTA.pdb', '2LFU.pdb', '3ICH.pdb']
for pdb_name in files:
urllib.request.urlretrieve('https://files.rcsb.org/view/{0}'.format(pdb_name), pdb_name)
print('The following files will be used:')
pprint(files)
for chain_name, chain in pbx.chains_from_files(files):
dihedrals = chain.get_phi_psi_angles()
pb_seq = pbx.assign(dihedrals)
print('* {}'.format(chain_name))
print(' {}'.format(pb_seq))
PB sequences can be assigned from a trajectory. To do so, we use the pbxplore.chains_from_trajectory()
function that takes the path to a trajectory and the path to the corresponding topology as argument. Any file formats readable by MDAnalysis can be used. Except for its arguments, pbxplore.chains_from_trajectory()
works the same as pbxplore.chains_from_files()
.
In [8]:
topology, _ = urllib.request.urlretrieve('https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.gro',
'psi_md_traj.gro')
trajectory, _ = urllib.request.urlretrieve('https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.xtc',
'psi_md_traj.xtc')
for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):
dihedrals = chain.get_phi_psi_angles()
pb_seq = pbx.assign(dihedrals)
print('* {}'.format(chain_name))
print(' {}'.format(pb_seq))
Providing the dihedral angles can be formated as expected by pbxplore.assign()
, the source of these angles does not matter. For instance, other PDB parser can be used with PBxplore.
In [9]:
import Bio
import Bio.PDB
import math
print("BioPython version:", Bio.__version__)
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')
for model in Bio.PDB.PDBParser().get_structure("2LFU", pdb_name):
for chain in model:
polypeptides = Bio.PDB.PPBuilder().build_peptides(chain)
for poly_index, poly in enumerate(polypeptides):
dihedral_list = poly.get_phi_psi_list()
dihedrals = {}
for resid, (phi, psi) in enumerate(dihedral_list, start=1):
if not phi is None:
phi = 180 * phi / math.pi
if not psi is None:
psi = 180 * psi / math.pi
dihedrals[resid] = {'phi': phi, 'psi': psi}
print(model, chain)
pb_seq = pbx.assign(dihedrals)
print(pb_seq)