This is to help with https://github.com/pandegroup/openmm/issues/1443 -- we need to determine whether valid OpenEye molecules can be created in general (irrespective of charge) from elements, connectivity, and bond order. The intention is to update OpenMM's Topology to include bond order, and we need to make sure we have the necessary info in general.
John Chodera currently has a hacked tool for doing something like this at https://github.com/choderalab/openmoltools/blob/master/openmoltools/forcefield_generators.py#L80-L179 which uses antechamber to infer bond orders with the assumption that the molecule is neutral.
Here we will attempt a more general solution.
We need to ensure that this works for various bond orders including aromatics, and that it works for molecules of various charges including unusual protonation states.
In [33]:
# NBVAL_SKIP
from openeye.oechem import *
mol = OEMol()
#smiles = 'CC(c1ccccc1)C(=O)[O-]'
smiles = 'C[NH2+]CC(c1ccccc1)C(=O)[O-]'
OEParseSmiles(mol, smiles)
OEAddExplicitHydrogens(mol)
OEAssignAromaticFlags(mol)
OETriposAtomNames(mol)
In [34]:
def get_atom_properties(mol):
"""Return atomic numbers, names, connectivity, and bond orders involved in an OE molecule.
Arguments:
----------
mol : OEMol
OpenEye molecule to process
Returns
-------
atomic_numbers : list of ints
List of integer atomic numbers
names : list of strings
List of strings of atom names
bonds : list of tuples
List of tuples of atom names involved in bonds in the format [ (bond1_name1, bond1_name2), (bond2_name1, bond2_name2), ...]
bond_orders : list of ints
List of integer bond orders corresponding to the bonds in the prior list
"""
atomic_numbers = []
names = []
bonds = []
bond_orders = []
for atom in mol.GetAtoms():
atomic_numbers.append( atom.GetAtomicNum())
names.append( atom.GetName())
for bond in mol.GetBonds():
bgn = bond.GetBgn().GetName()
end = bond.GetEnd().GetName()
bonds.append( (bgn, end) )
bond_orders.append( bond.GetOrder() )
return atomic_numbers, names, bonds, bond_orders
def make_mol_from_properties( atomic_numbers, names, bonds, bond_orders):
"""Use atomic numbers, names, connectivity, and bond orders (i.e. as output by get_atom_properties) to construct and return an OEMol.
Arguments:
----------
atomic_numbers : list of ints
List of integer atomic numbers
names : list of strings
List of strings of atom names
bonds : list of tuples
List of tuples of atom names involved in bonds in the format [ (bond1_name1, bond1_name2), (bond2_name1, bond2_name2), ...]
bond_orders : list of ints
List of integer bond orders corresponding to the bonds in the prior list
Returns
-------
mol : OEMol
OpenEye molecule created
Notes: Draws loosely on John Chodera's code from https://github.com/choderalab/openmoltools/blob/master/openmoltools/forcefield_generators.py#L80-L179
"""
mol = OEMol()
for atnr, name in zip(atomic_numbers, names):
oeatom = mol.NewAtom( atnr )
oeatom.SetName(name)
oeatoms = { oeatom.GetName() : oeatom for oeatom in mol.GetAtoms() }
for bond, bond_order in zip(bonds, bond_orders):
mol.NewBond(oeatoms[bond[0]], oeatoms[bond[1]], bond_order)
# Perceive aromaticity
OEAssignAromaticFlags(mol)
# Assign formal charges
OEAssignFormalCharges(mol)
return mol
In [36]:
# NBVAL_SKIP
atomic_numbers, names, bonds, bond_orders = get_atom_properties(mol)
mol = make_mol_from_properties( atomic_numbers, names, bonds, bond_orders)
smiles = OECreateIsoSmiString(mol)
print(smiles)
In [ ]: