This notebook prepares small molecule structure files for the SAMPL6 host-guest challenge from source files already present in the source directories. OpenEye toolkits are required, as well as openbabel for the case of oxaliplatin.
The procedure basically is as follows:
In [1]:
from openeye.oechem import *
from openeye.oeomega import *
from openeye.oequacpac import *
from openeye.oedocking import *
import os
In [2]:
# Specify source directory
source_dir = '.'
# Load host files
ifile = oemolistream(os.path.join(source_dir, 'OctaAcidsAndGuests', 'OA.pdb'))
OA = OEMol()
OEReadMolecule( ifile, OA)
ifile.close()
ifile = oemolistream(os.path.join(source_dir, 'OctaAcidsAndGuests', 'TEMOA.pdb'))
TEMOA = OEMol()
OEReadMolecule( ifile, TEMOA)
ifile.close()
ifile = oemolistream(os.path.join(source_dir, 'CB8AndGuests', 'CB8.sdf'))
CB8 = OEMol()
OEReadMolecule( ifile, CB8)
ifile.close()
# Assign partial charges; this step will take a while and isn't necessary to generate basic files
# (I ran this for about 25minutes and didn't get even the first one to converge on my laptop, so I'm skipping for now)
#for mol in [OA, TEMOA, CB8]:
# OEAssignCharges(mol, OEAM1BCCCharges())
# Write out host files to alternate formats (mol2, sdf)
for fmt in ['.mol2', '.sdf']:
ofile = oemolostream(os.path.join(source_dir, 'OctaAcidsAndGuests', 'OA'+fmt))
OEWriteMolecule( ofile, OA)
ofile.close()
ofile = oemolostream(os.path.join(source_dir, 'OctaAcidsAndGuests', 'TEMOA'+fmt))
OEWriteMolecule( ofile, TEMOA)
ofile.close()
# Add write of CB8
for fmt in ['.pdb', '.mol2']:
ofile = oemolostream(os.path.join(source_dir, 'CB8AndGuests', 'CB8'+fmt))
OEWriteMolecule( ofile, CB8)
ofile.close()
In [3]:
# Make receptors of the hosts for use in docking; takes about 4 minutes on my Mac
receptors = {}
for (hostname, hostmol) in [('OA', OA),('TEMOA', TEMOA), ('CB8', CB8)]:
# Start by getting center of mass to use as a hint for where to dock
com = OEFloatArray(3)
OEGetCenterOfMass(hostmol, com) #Try octa acid for now
# Create receptor, as per https://docs.eyesopen.com/toolkits/python/dockingtk/receptor.html#creating-a-receptor
receptor = OEGraphMol()
OEMakeReceptor(receptor, hostmol, com[0], com[1], com[2])
receptors[hostname]=receptor
We load each guest from SMILES (assigning protonation states roughly appropriate for neutral pH), use Omega to generate conformers, assign AM1-BCC charges, then dock to the relevant host and write out. We dock to both OA and TEMOA separately (though they use the same frame of reference) and write out "OA-GN" and "TEMOA-GN" files where N is a number from 0 to 7, corresponding to the guest. The molecules contained in these files are the same in both cases, but conformers/poses will be different.
Disclaimer: We have no idea whether these are likely binding modes, good starting poses, etc. We are providing these as a potentially reasonable set of geometries, but use at your own risk.
In [4]:
# Load initial octa acid molecule files, extract smiles
smifilename = os.path.join(source_dir, 'Gibb_SAMPL6_guests.smi')
smifile = open(smifilename, 'r')
text = smifile.readlines()
# Loop over OA, TEMOA and dock guests to each
for hostname in ['OA', 'TEMOA']:
#Get SMILES, build names
smiles = [ line.split()[0] for line in text ]
guest_names = [ ('%s-' % hostname)+'G%s' % i for i in range(len(smiles))]
#initialize omega
omega = OEOmega()
omega.SetMaxConfs(100) #Generate up to 100 conformers since we'll use for docking
omega.SetIncludeInput(False)
omega.SetStrictStereo(True) #Refuse to generate conformers if stereochemistry not provided
#Initialize charge generation
chargeEngine = OEAM1BCCCharges()
# Initialize docking
dock = OEDock()
dock.Initialize(receptors[hostname])
# Build OEMols from SMILES
oemols = []
for (smi, name) in zip(smiles, guest_names):
# Generate new OEMol and parse SMILES
mol = OEMol()
OEParseSmiles( mol, smi)
OESetNeutralpHModel(mol)
# Generate conformers with Omega; keep only best conformer
status = omega(mol)
if not status:
print("Error generating conformers for %s, %s." % (smi, name))
#print(smi, name, mol.NumAtoms()) #Print debug info -- make sure we're getting protons added as we should
# Assign AM1-BCC charges
OEAssignCharges(mol, chargeEngine)
# Dock to hosts
dockedMol = OEGraphMol()
status = dock.DockMultiConformerMolecule(dockedMol, mol) #By default returns only top scoring pose
sdtag = OEDockMethodGetName(OEDockMethod_Chemgauss4)
OESetSDScore(dockedMol, dock, sdtag)
dock.AnnotatePose(dockedMol)
# Write out docked pose if docking successful, otherwise write out first generated conformer
if status == OEDockingReturnCode_Success:
outmol = dockedMol
else:
print("Docking failed for %s; storing input pose." % name)
# Delete excess conformers -- we want to write only one
for k, conf in enumerate( mol.GetConfs() ):
if k > 0:
mol.DeleteConf(conf)
outmol = mol
# Write out
tripos_mol2_filename = os.path.join(source_dir, 'OctaAcidsAndGuests', name+'.mol2')
ofile = oemolostream( tripos_mol2_filename )
OEWriteMolecule( ofile, outmol)
ofile.close()
# Add write of SDF
sdf_filename = os.path.join(source_dir, 'OctaAcidsAndGuests', name+'.sdf')
ofile = oemolostream( sdf_filename )
OEWriteMolecule( ofile, outmol)
ofile.close()
# Save to oemols
oemols.append(OEMol(outmol))
# Clean up residue names in mol2 files that are tleap-incompatible: replace substructure names with valid text.
infile = open( tripos_mol2_filename, 'r')
lines = infile.readlines()
infile.close()
newlines = [line.replace('<0>', name) for line in lines]
outfile = open(tripos_mol2_filename, 'w')
outfile.writelines(newlines)
outfile.close()
We load each guest from SMILES (assigning protonation states roughly appropriate for neutral pH), use Omega to generate conformers, assign AM1-BCC charges, then dock to CB8 and write out. Conformer generation for oxaliplatin (a bonus case) to CB8 fails because of the platinum, so a separate procedure is used in an additional cell below for this case.
Disclaimer: We have no idea whether these are likely binding modes, good starting poses, etc. We are providing these as a potentially reasonable set of geometries, but use at your own risk.
In [5]:
# Load initial CB8 molecule files, extract smiles
smifilename = os.path.join(source_dir, 'Isaacs_SAMPL6_guests.smi')
smifile = open(smifilename, 'r')
text = smifile.readlines()
#Get SMILES, build names
smiles = [ line.split()[0] for line in text ]
guest_names = [ 'CB8-G%s' % i for i in range(len(smiles))]
#initialize omega
omega = OEOmega()
omega.SetMaxConfs(100) #Generate up to 100 conformers since we'll use for docking
omega.SetIncludeInput(False)
omega.SetStrictStereo(True) #Refuse to generate conformers if stereochemistry not provided
#Initialize charge generation
chargeEngine = OEAM1BCCCharges()
# Initialize docking
dock = OEDock()
dock.Initialize(receptors['CB8'])
# Build OEMols from SMILES
oemols = []
for (smi, name) in zip(smiles, guest_names):
# Generate new OEMol and parse SMILES
mol = OEMol()
OEParseSmiles( mol, smi)
OESetNeutralpHModel(mol)
# Generate conformers with Omega; keep only best conformer
status = omega(mol)
if not status:
print("Error generating conformers for %s, %s." % (smi, name))
#print(smi, name, mol.NumAtoms()) #Print debug info -- make sure we're getting protons added as we should
# Assign AM1-BCC charges
OEAssignCharges(mol, chargeEngine)
# Dock to hosts
dockedMol = OEGraphMol()
status = dock.DockMultiConformerMolecule(dockedMol, mol) #By default returns only top scoring pose
sdtag = OEDockMethodGetName(OEDockMethod_Chemgauss4)
OESetSDScore(dockedMol, dock, sdtag)
dock.AnnotatePose(dockedMol)
# Write out docked pose if docking successful, otherwise write out first generated conformer
if status == OEDockingReturnCode_Success:
outmol = dockedMol
else:
print("Docking failed for %s; storing input pose." % name)
# Delete excess conformers -- we want to write only one
for k, conf in enumerate( mol.GetConfs() ):
if k > 0:
mol.DeleteConf(conf)
outmol = mol
# Write out
tripos_mol2_filename = os.path.join(source_dir, 'CB8AndGuests', name+'.mol2')
ofile = oemolostream( tripos_mol2_filename )
OEWriteMolecule( ofile, outmol)
ofile.close()
# Add write of SDF
sdf_filename = os.path.join(source_dir, 'CB8AndGuests', name+'.sdf')
ofile = oemolostream( sdf_filename )
OEWriteMolecule( ofile, outmol)
ofile.close()
# Save to oemols
oemols.append(OEMol(outmol))
# Clean up residue names in mol2 files that are tleap-incompatible: replace substructure names with valid text.
infile = open( tripos_mol2_filename, 'r')
lines = infile.readlines()
infile.close()
newlines = [line.replace('<0>', name) for line in lines]
outfile = open(tripos_mol2_filename, 'w')
outfile.writelines(newlines)
outfile.close()
OpenBabel appears to be able to handle conformer generation for oxaliplatin, so generate its conformer and then use the same docking procedure (though this time skipping charging of the guest).
I am told that the resulting geometry of oxaliplatin is incorrect, which is perhaps not surprising since presumably the platinum poses problems for most of our modeling tools. If participants would like to study this compounds, they are encouraged to carefully examine and prepare a suitable 3D conformation before beginning their studies.
In [6]:
# Oxaliplatin fails, so use openbabel to at least generate a 3D structure of it
import os
file = open('oxaliplatin.smi', 'w')
file.write(smiles[-1]) #Assumes it is last!
file.close()
os.system('obabel -ismi oxaliplatin.smi -omol2 -O tmp.mol2 --gen3d')
istream = oemolistream('tmp.mol2')
mol = OEMol()
OEReadMolecule(istream, mol)
istream.close()
os.system('rm oxaliplatin.smi')
os.system('rm tmp.mol2')
# Can't charge this (no BCCs for Pt) so skip
# Dock to at least place in right region
dockedMol = OEGraphMol()
status = dock.DockMultiConformerMolecule(dockedMol, mol) #By default returns only top scoring pose
sdtag = OEDockMethodGetName(OEDockMethod_Chemgauss4)
OESetSDScore(dockedMol, dock, sdtag)
dock.AnnotatePose(dockedMol)
outmol = dockedMol
# Write out
tripos_mol2_filename = os.path.join(source_dir, 'CB8AndGuests', guest_names[-1]+'.mol2')
ofile = oemolostream( tripos_mol2_filename )
OEWriteMolecule( ofile, outmol)
ofile.close()
# Add write of SDF
sdf_filename = os.path.join(source_dir, 'CB8AndGuests', guest_names[-1]+'.sdf')
ofile = oemolostream( sdf_filename )
OEWriteMolecule( ofile, outmol)
ofile.close()
We now should have final docked poses for the Octa Acid (OA and TEMOA) and CB8 guests, organized into the relevant directories. I have manually inspected all of the files I generated in the hosts and at least verified that the guests reside roughly inside the hosts without any overlapping atoms. Whether you use these or not is entirely up to you, and the usual disclaimers apply (don't assume we have the right protonation states, etc.).
In [ ]: