DUD-E: A Database of Useful Decoys: Enhanced


In [1]:
from __future__ import print_function, division, unicode_literals

import oddt
from oddt.datasets import dude
print(oddt.__version__)


0.4.1-12-ga7d79bd

We'd like to read files from DUD-E.
You can download any number of targets; I used only these five: ampc, cxcr4, pur2, pygm and sahh.


In [2]:
%%bash
mkdir -p ./DUD-E_targets/
wget -qO- http://dude.docking.org/targets/ampc/ampc.tar.gz | tar xz -C ./DUD-E_targets/
wget -qO- http://dude.docking.org/targets/cxcr4/cxcr4.tar.gz | tar xz -C ./DUD-E_targets/
wget -qO- http://dude.docking.org/targets/pur2/pur2.tar.gz | tar xz -C ./DUD-E_targets/
wget -qO- http://dude.docking.org/targets/pygm/pygm.tar.gz | tar xz -C ./DUD-E_targets/
wget -qO- http://dude.docking.org/targets/sahh/sahh.tar.gz | tar xz -C ./DUD-E_targets/
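If wget is not available, the same download can be sketched in plain Python. This is a minimal sketch using only the standard library, not part of ODDT. The helper names (target_url, extract_archive, download_targets) are made up for illustration, and the URL pattern is simply copied from the wget calls above.

```python
import io
import os
import tarfile
from urllib.request import urlopen

# Same host and path pattern as the wget calls above.
BASE_URL = 'http://dude.docking.org/targets'
TARGETS = ['ampc', 'cxcr4', 'pur2', 'pygm', 'sahh']


def target_url(name):
    """Build the archive URL for one target (hypothetical helper)."""
    return '{0}/{1}/{1}.tar.gz'.format(BASE_URL, name)


def extract_archive(raw_bytes, dest):
    """Unpack a gzipped tarball held in memory into `dest`."""
    os.makedirs(dest, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(raw_bytes), mode='r:gz') as archive:
        archive.extractall(dest)


def download_targets(names, dest='./DUD-E_targets'):
    """Fetch and unpack each target archive (needs network access)."""
    for name in names:
        with urlopen(target_url(name)) as response:
            extract_archive(response.read(), dest)


# Uncomment to actually download the five targets:
# download_targets(TARGETS)
```

The download call is left commented out so that the cell can be run without touching the network.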

In [3]:
directory = './DUD-E_targets'

We will use the dude class from oddt.datasets and point it at the directory that holds the downloaded targets.


In [4]:
dude_database = dude(home=directory)

Now we can get one target or iterate over all targets in our directory.

Let's choose one target.


In [5]:
target = dude_database['cxcr4']

target has four properties:
protein - the protein molecule
ligand - the ligand molecule
actives - a generator that yields the actives
decoys - a generator that yields the decoys
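Because actives and decoys are generators, molecules are produced one at a time, and a given generator object can be walked only once. The sketch below shows this with a plain Python generator. fake_molecules and count_items are hypothetical stand-ins, not ODDT functions, and whether ODDT hands back a fresh generator on each attribute access is not shown here.

```python
def count_items(iterable):
    """Count items one at a time without keeping them in memory."""
    return sum(1 for _ in iterable)


def fake_molecules(n):
    """Hypothetical stand-in for a generator of molecules."""
    for i in range(n):
        yield 'mol_{}'.format(i)


gen = fake_molecules(3)
first_pass = count_items(gen)   # consumes the generator
second_pass = count_items(gen)  # nothing is left to iterate over
print(first_pass, second_pass)  # prints "3 0"
```

Counting this way avoids building a full list when only the number of molecules is needed.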


In [6]:
target.ligand


Out[6]:

Let's see which target has the most actives and decoys.


In [7]:
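# Note: this loop rebinds `target`, replacing the cxcr4 target selected above.
for target in dude_database:
    # Turn both generators into lists so that len() can be used on them.
    actives = list(target.actives)
    decoys = list(target.decoys)
    print('Target: ' + target.dude_id,
          'Number of actives: ' + str(len(actives)),
          'Number of decoys: ' + str(len(decoys)),
          sep='\t\t')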


Target: pygm		Number of actives: 114		Number of decoys: 4045
Target: sahh		Number of actives: 190		Number of decoys: 3483
Target: cxcr4		Number of actives: 122		Number of decoys: 3414
Target: ampc		Number of actives: 62		Number of decoys: 2902
Target: pur2		Number of actives: 201		Number of decoys: 2725
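To answer the question directly, the counts can be compared in code. This is a small sketch that copies the numbers from the output above into a list by hand; it does not call ODDT, and the variable names are chosen for illustration only.

```python
# Counts copied from the output above: (target, actives, decoys)
counts = [
    ('pygm', 114, 4045),
    ('sahh', 190, 3483),
    ('cxcr4', 122, 3414),
    ('ampc', 62, 2902),
    ('pur2', 201, 2725),
]

most_actives = max(counts, key=lambda row: row[1])
most_decoys = max(counts, key=lambda row: row[2])

# Decoys per active for each target.
ratios = {name: decoys / actives for name, actives, decoys in counts}

print('Most actives:', most_actives[0])  # pur2
print('Most decoys:', most_decoys[0])    # pygm
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print('{}\t{:.1f} decoys per active'.format(name, ratio))
```

Among these five targets, pur2 has the most actives and pygm the most decoys, so no single target leads on both counts.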