In [1]:
%autosave 0
from __future__ import print_function
We here show how to find base-pairs and stacking interactions in structures and simulations. The function
stackings, pairings, res = bb.annotate(pdb)
returns three lists:
stackings and pairings contain the list of interactions for the N frames in the PDB/trajectory file and it is organized in the following way: for a given frame $i=1..N$ there are $k=1..Q$ interactions between residues with index pairings[i][0][k][0] and pairings[i][0][k][1]. The type of interaction is specified at the element pairings[i][1][k].
But let's make an example by annotating a PDB file containing a sarcin-ricin motif:
In [2]:
import barnaba as bb
# annotate
pdb = "../test/data/SARCIN.pdb"
stackings, pairings, res = bb.annotate(pdb)
# list base pairings
print("BASE-PAIRS")
for p in range(len(pairings[0][0])):
res1 = res[pairings[0][0][p][0]]
res2 = res[pairings[0][0][p][1]]
interaction = pairings[0][1][p]
print("%10s %10s %4s" % (res1,res2,interaction))
# list base-stackings
print()
print("STACKING")
for p in range(len(stackings[0][0])):
res1 = res[stackings[0][0][p][0]]
res2 = res[stackings[0][0][p][1]]
interaction = stackings[0][1][p]
print("%10s %10s %4s" % (res1,res2,interaction))
Base-pairing are classified according to the Leontis-Westhof classification, where
WWc pairs between complementary bases are called WCc or GUc.
Stacking are classified according to the MCannotate classification:
First, we consider only bases that are "close" in space, i.e. $R_{ij} < 1.7$ and $R_{ji} < 1.7$.
$R_{ij} = (x_{ij}/5, y_{ij}/5, z_{ij}/3)$ is the SCALED position vector with components ${x,y,z}$ (in $\mathring{A}$) of base j constructed on base i.
The criteria for base-stacking are the following:
$( |z_{ij}| \; AND \; |z_{ji}| > 2 \mathring{A} ) \; AND \; (\rho_{ij} \; OR\; \rho_{ji} < 2.5 \mathring{A}) \; AND\; (|\theta_{ij}| < 40^{\circ} ) $
where
The criteria for base-pairing are the following:
non stacked AND $|\theta_{ij}| < 60^{\circ}$ AND (number of hydrogen bonds $> 0$)
The number of hydrogen bonds is calculated as the number of donor-acceptor pairs with distance $< 3.3 \mathring{A}$.
If bases are complementary and the number of hydrogen bonds is > 1 (AU/GU) or > 2 (GC), the pair is considered WCc (or GUc).
ATT!
From the list of base-pairing, we can obtain the dot-bracket annotation using the function
dotbracket = bb.dot_bracket(pairings,res)
this function returns a string for each frame in the PDB/simulation. Let's see this in action:
In [3]:
dotbr, seq = bb.dot_bracket(pairings,res)
print(">",seq)
for j in range(len(dotbr)):
print(dotbr[j])