Example 4. HIV Protease bound to an inhibitor

About      Issues      Tutorials      Documentation </span>

Example 4: The Dynamics of HIV Protease bound to a small molecule

This notebook prepares a co-crystallized protein / small molecule ligand structure from the PDB database and prepares it for molecular dynamics simulation.

  • Author: Aaron Virshup, Autodesk Research
  • Created on: August 9, 2016
  • Tags: HIV Protease, small molecule, ligand, drug, PDB, MD

In [ ]:
import moldesign as mdt
import moldesign.units as u

I. The crystal structure

First, we'll download and investigate the 3AID crystal structure.

A. Download and visualize

In [ ]:
protease = mdt.from_pdb('3AID')

In [ ]:

B. Try assigning a forcefield

This structure is not ready for MD - this command will raise a ParameterizationError Exception. After running this calculation, click on the Errors/Warnings tab to see why.

In [ ]:
amber_ff = mdt.forcefields.DefaultAmber()
newmol = amber_ff.create_prepped_molecule(protease)

You should see 3 errors:

  1. The residue name ARQ not recognized
  2. Atom HD1 in residue HIS69, chain A was not recognized
  3. Atom HD1 in residue HIS69, chain B was not recognized

(There's also a warning about bond distances, but these can be generally be fixed with an energy minimization before running dynamics)

We'll start by tackling the small molecule "ARQ".

II. Parameterizing a small molecule

We'll use the GAFF (generalized Amber force field) to create force field parameters for the small ligand.

A. Isolate the ligand

Click on the ligand to select it, then we'll use that selection to create a new molecule.

In [ ]:
sel = mdt.widgets.ResidueSelector(protease)

In [ ]:
drugres = mdt.Molecule(sel.selected_residues[0])
drugres.draw2d(width=700, show_hydrogens=True)

B. Assign bond orders and hydrogens

A PDB file provides only limited information; they often don't provide indicate bond orders, hydrogen locations, or formal charges. These can be added, however, with the add_missing_pdb_data tool:

In [ ]:
drugmol = mdt.tools.set_hybridization_and_saturate(drugres)

In [ ]:
drugmol.draw2d(width=700, show_hydrogens=True)

C. Generate forcefield parameters

We'll next generate forcefield parameters using this ready-to-simulate structure.

NOTE: for computational speed, we use the gasteiger charge model. This is not advisable for production work! am1-bcc or esp are far likelier to produce sensible results.

In [ ]:
drug_parameters = mdt.create_ff_parameters(drugmol, charges='gasteiger')

III. Prepping the protein

Section II. dealt with getting forcefield parameters for an unknown small molecule. Next, we'll prep the other part of the structure.

A. Strip waters

Waters in crystal structures are usually stripped from a simulation as artifacts of the crystallization process. Here, we'll remove the waters from the protein structure.

In [ ]:
dehydrated = mdt.Molecule([atom for atom in protease.atoms if atom.residue.type != 'water'])

B. Histidine

Histidine is notoriously tricky, because it exists in no less than three different protonation states at biological pH (7.4) - the "delta-protonated" form, referred to with residue name HID; the "epsilon-protonated" form aka HIE; and the doubly-protonated form HIP, which has a +1 charge. Unfortunately, crystallography isn't usually able to resolve the difference between these three.

Luckily, these histidines are pretty far from the ligand binding site, so their protonation is unlikely to affect the dynamics. We'll therefore use the guess_histidine_states function to assign a reasonable starting guess.

In [ ]:

IV. Prep for dynamics

With these problems fixed, we can succesfully assigne a forcefield and set up the simulation.

A. Assign the forcefield

Now that we have parameters for the drug and have dealt with histidine, the forcefield assignment will succeed:

In [ ]:
amber_ff = mdt.forcefields.DefaultAmber()
sim_mol = amber_ff.create_prepped_molecule(dehydrated)

B. Attach and configure simulation methods

Armed with the forcefield parameters, we can connect an energy model to compute energies and forces, and an integrator to create trajectories:

In [ ]:
sim_mol.set_energy_model(mdt.models.OpenMMPotential, implicit_solvent='obc', cutoff=8.0*u.angstrom)
sim_mol.set_integrator(mdt.integrators.OpenMMLangevin, timestep=2.0*u.fs)

C. Equilibrate the protein

The next series of cells first minimize the crystal structure to remove clashes, then heats the system to 300K.

In [ ]:
mintraj = sim_mol.minimize()

In [ ]:
traj = sim_mol.run(20*u.ps)

In [ ]:
viewer = traj.draw(display=True)

In [ ]: