Test whether impropers are properly implemented in SMIRNOFF

(Material from this will be migrated to a standard test once I work through it). By D. Mobley, 9/19/16.

Initial testing on benzene and o-xylene. Add CC(C)(C)c1ccccc1C(C)(C)C, 1,2-ditert-butylbenzene, for a case where steric strain should induce significant energy in the improper


In [1]:
# NBVAL_SKIP
# Import stuff we need
from openforcefield.typing.engines.smirnoff import forcefield, generateTopologyFromOEMol
import openeye.oechem as oechem
import openeye.oeiupac as oeiupac
import openeye.oeomega as oeomega
import copy
from simtk import openmm
from simtk.openmm import app
import numpy as np
import tleap_tools
# chose if you want things done in a verbose fashion:
verbose = False

In [2]:
# NBVAL_SKIP
# Load our forcefield 
#ffxml = 'test_forcefields/smirnoff99Frosst.offxml'
ffxml = 'smirnoff99Frosst_modified.offxml'
ff = forcefield.ForceField(ffxml)

In [3]:
# NBVAL_SKIP
# Initialize o-xylene and benzene as a test molecules
molnames = ['benzene', 'o-xylene','1,2-ditert-butylbenzene'] 
oemols = []
topologies = []
systems = []
for molname in molnames:
    mol = oechem.OEMol()
    oeiupac.OEParseIUPACName(mol, molname)
    omega = oeomega.OEOmega()
    omega(mol)
    # Generate coordinates as we'll need them later for energy evaluations
    omega = oeomega.OEOmega()
    omega(mol)
    oechem.OETriposAtomNames(mol) #Assign Tripos names
    oechem.OETriposAtomTypes(mol) #Assign tripos numeric types
    oechem.OETriposTypeNames(mol) #Assign tripos text types
    # Generate topology and system
    topology = forcefield.generateTopologyFromOEMol(mol)
    topologies.append(topology)
    print("Creating system for %s..." % molname)
    system = ff.createSystem(topology, [mol], chargeMethod = 'OECharges_AM1BCCSym', verbose = verbose)
    systems.append(system)
    
    # Energy minimize molecules with SMIRNOFF because otherwise the strained molecule will still be planar
    # due to how omega generates conformers
    coordinates = mol.GetCoords()
    natoms=len(coordinates)
    positions = np.zeros([natoms,3], np.float32)
    for index in range(natoms):
        (x,y,z) = coordinates[index]
        positions[index,0] = x
        positions[index,1] = y
        positions[index,2] = z
    positions = unit.Quantity(positions, unit.angstroms)
    integrator = openmm.VerletIntegrator(1.0 * unit.femtoseconds)
    simulation = app.Simulation(topology, system, integrator)
    simulation.context.setPositions(positions)
    state = simulation.context.getState(getEnergy=True)
    energy = state.getPotentialEnergy() / unit.kilocalories_per_mole
    #print(energy)
    simulation.minimizeEnergy()
    state = simulation.context.getState(getEnergy=True, getPositions=True)
    energy = state.getPotentialEnergy() / unit.kilocalories_per_mole
    #print(energy)
    
    #Swap coordinates ack into molecule
    positions = state.getPositions(asNumpy=True)
    pos_unitless = positions/unit.angstroms
    coordlist = []
    for idx in range(len(pos_unitless)):
        coordlist.append( pos_unitless[idx][0])
        coordlist.append( pos_unitless[idx][1])
        coordlist.append( pos_unitless[idx][2])
    oecoords = oechem.OEFloatArray(coordlist)
    mol.SetCoords(oecoords)
    
    # Generate topology and system (again to make sure we get the right coords)
    oemols.append(oechem.OEMol(mol))
    topology = generateTopologyFromOEMol(mol)
    topologies.append(topology)
    print("Creating system for %s..." % molname)
    system = ff.createSystem(topology, [mol], chargeMethod = 'OECharges_AM1BCCSym', verbose = verbose)
    systems.append(system)


Creating system for benzene...
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-01f5143b0dcf> in <module>()
     32         positions[index,1] = y
     33         positions[index,2] = z
---> 34     positions = unit.Quantity(positions, unit.angstroms)
     35     integrator = openmm.VerletIntegrator(1.0 * unit.femtoseconds)
     36     simulation = app.Simulation(topology, system, integrator)

NameError: name 'unit' is not defined

Now generate mol2 files for our molecules with parm99 types so we can parameterize with AMBER

We'll just go through and modify benzene and toluene to get parm99 types and then write them out.


In [ ]:
# NBVAL_SKIP
# Next step -- need AMBER prmtop/crd files for o-xylene and/or benzene for comparison
# I should be able to get these by hand-typing them with parm@frosst types and then generating from the frcmod, etc.

# Benzene uses CA for carbons, HA for hydrogens. 
# o-xylene would be the same for the ring atoms, and then HC for the methyl hydrogen and CT for the carbon. 
# Superficially that looks consistent with SMIRKS patterns in SMIRNOFF99Frosst

oemols_tripos = [] # Store with original types
# Create benzene and o-xylene and swap atom names before writing right here! 
for idx, oemol in enumerate(oemols):
    oemols_tripos.append(oechem.OEMol(oemol))
    for atom in oemol.GetAtoms():
        typename= atom.GetType()
        #print(typename)
        if typename=='H':
            # What is it connected to?
            c_ar = False
            for bond in atom.GetBonds():
                nbr = bond.GetNbr(atom)
                #print('Atom type %s has neighbor type %s' % (typename, nbr.GetType()))
                if nbr.GetType=='C.ar' or nbr.GetType()=='CA':
                    c_ar = True
            if c_ar:
                atom.SetType('HA')
            else: 
                atom.SetType('HC')
        elif typename=='C.ar':
            atom.SetType('CA')
        elif typename=='C.3':
            atom.SetType('CT')
        #print("   New type %s..." % atom.GetType())

    # Write resulting molecule
    ostream = oechem.oemolostream(molnames[idx]+'.mol2')
    ostream.SetFlavor(oechem.OEFormat_MOL2, oechem.OEOFlavor_MOL2_Forcefield) # Use forcefield flavor to preserve types
    oechem.OEWriteMolecule( ostream, oemol)
    ostream.close()

Generate AMBER prmtop and crd files for these


In [ ]:
# NBVAL_SKIP
frcmod = 'parm_Frosst.frcmod'
for (idx, molname) in enumerate(molnames):
    prmtop = molname+'.prmtop'
    mol2file = molname+'.mol2'
    crd = molname+'.crd'
    tleap_tools.run_tleap(molname, mol2file, frcmod, prmtop_filename=prmtop, inpcrd_filename=crd, verbose = False)

Compare energies of molecules and their components


In [ ]:
# NBVAL_SKIP
from openforcefield.typing.engines.smirnoff import forcefield_utils as ff_utils
import numpy as np
import simtk.unit as unit
print(oemols_tripos)
for idx, molname in enumerate(molnames):
    print("Examining %s..." % molname)
    oemol = oemols_tripos[idx]
    prmtop = molname+'.prmtop'
    crd = molname+'.crd'
    frosst_component, smirnoff_component, frosst_energy, smirnoff_energy = ff_utils.compare_molecule_energies(prmtop, crd, ff, oemol, skip_assert = True, verbose = verbose)
    for comp in frosst_component:
        cdiff = np.abs( frosst_component[comp] - smirnoff_component[comp])
        #print cdiff, comp
        if cdiff > 1e-5*unit.kilocalorie_per_mole:
            print("   Component %s differs by %s..." % (comp, cdiff.in_units_of(unit.kilocalorie_per_mole)))
    print ("SMIRNOFF energy %s, AMBER energy %s" % (smirnoff_energy, frosst_energy))
    
print("Processed.")

Without energy minimization, the ditert-butylbenzene case has nb/angle energies which differ by a bit, but it's basically at the limits of single precision (1 part in 10^8 or so) which is roughly what we ought to expect. It's just that the energy for this guy is pretty huge. With energy minimization, this goes away.

Note that some discrepancy in the angle term is expected for ditert-butylbenzene -- I carefully manipulated the SMIRNOFF so the parameter values would be exactly the same for the angles relevant to o-xylene and benzene but this hasn't been done for ditert-butylbenzene so presumably there is an angle parameter which is a bit different.

In any case, this example tests the improper code and it does now appear to be working correctly.

Compare actual parameters

Note that this will only work up to torsions for o-xylene and benzene (but not ditert-butylbenzene as noted above); the impropers have a different number of terms by design in SMIRNOFF so this will fail for impropers. (Update: Modified code to allow skipping parameter comparison on impropers.)


In [ ]:
# NBVAL_SKIP
# We can use the same code as the above but just switch the assertion about checking terms to on
for idx, molname in enumerate(molnames):
    print("Examining %s..." % molname)
    oemol = oemols_tripos[idx]
    prmtop = molname+'.prmtop'
    crd = molname+'.crd'
    frosst_component, smirnoff_component, frosst_energy, smirnoff_energy = ff_utils.compare_molecule_energies(prmtop, crd, ff, oemol, skip_assert = False, verbose = verbose, skip_improper = True)

    
print("Processed.")

In [ ]: