AdaptiveMD

Example 1 - Setup

0. Imports


In [1]:
import sys, os, time

RADICAL-Pilot (RP) reports a lot of information by default. To keep this example quiet we set an environment variable that tells RP to report only errors. If you want to see what RP reports, change it to REPORT.


In [2]:
# verbose = os.environ.get('RADICAL_PILOT_VERBOSE', 'REPORT')
os.environ['RADICAL_PILOT_VERBOSE'] = 'ERROR'

We will import the appropriate parts of AdaptiveMD as we go along, so it is clear what is needed at which stage. Usually you would place all imports at the beginning of your script or notebook, as suggested in PEP 8.


In [3]:
from adaptivemd import Project


/Users/jan-hendrikprinz/anaconda/lib/python2.7/site-packages/radical/utils/atfork/stdlib_fixer.py:58: UserWarning: logging module already imported before fixup.
  warnings.warn('logging module already imported before fixup.')

In [4]:
from adaptivemd.engine.openmm import OpenMMEngine
from adaptivemd.analysis.pyemma import PyEMMAAnalysis

from adaptivemd import File, Directory, WorkerScheduler

Let's open a project with a UNIQUE name. This will be the name used in the DB, so make sure it is new and not too short. Opening a project will always create a project that does not yet exist and reopen an existing one. You cannot choose between opening modes as you would with a file; this is a precaution against accidentally deleting your project.
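The open-or-create behavior can be sketched with a plain-Python analogy (this is not the AdaptiveMD implementation; the `Registry` class and its `open` method are hypothetical):

```python
# Analogy for Project's open-or-create semantics (hypothetical class,
# not AdaptiveMD code): reopen a name if it exists, create it otherwise,
# and never silently replace an existing entry.
class Registry(object):
    _projects = {}

    @classmethod
    def open(cls, name):
        # setdefault only creates an entry when the name is unknown
        return cls._projects.setdefault(name, {'name': name})
```

Opening the same name twice hands back the same entry, which mirrors why you cannot accidentally overwrite a project by reopening it.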


In [5]:
# Project.delete('test')

In [6]:
project = Project('testcase-worker')

Now we have a handle for our project. First thing is to set it up to work on a resource.


In [98]:
print len([t for t in project.trajectories if t.exists])


264

In [99]:
for w in project.workers:
    print w.hostname, w.state


Stevie.fritz.box down
Stevie.fritz.box running
Stevie.fritz.box down
Stevie.fritz.box running

In [14]:
from uuid import UUID

In [15]:
import datetime
datetime.datetime.fromtimestamp(modeller.__time__).strftime("%Y-%m-%d %H:%M:%S")


Out[15]:
'2017-03-06 22:46:29'

In [16]:
project.generators.add(engine)
project.generators.add(modeller)


Added file <adaptivemd.engine.openmm.openmm.OpenMMEngine object at 0x10cbd2c10>
Added file <adaptivemd.analysis.pyemma.emma.PyEMMAAnalysis object at 0x10cf1f2d0>

In [17]:
sc = WorkerScheduler(project.resource)
sc.enter(project)


Changed to booting
Changed to running

In [18]:
t = engine.task_run_trajectory(project.new_trajectory(pdb_file, 100, restart=True)).extend(50).extend(100)

In [54]:
sc(t)


Out[54]:
[<adaptivemd.engine.engine.TrajectoryGenerationTask at 0x10cf3e390>]

In [63]:
sc.advance()

In [ ]:


In [64]:
print project.generators


<StoredBundle with 2 file(s) @ 0x10cbb1190>

In [17]:
t1 = engine.task_run_trajectory(project.new_trajectory(pdb_file, 100, restart=True))
t2 = t1.extend(100)

In [18]:
t2.trajectory.restart


Out[18]:
RestartFile(00000000.dcd.restart)

In [19]:
project.tasks.add(t2)


Added file <adaptivemd.engine.engine.TrajectoryGenerationTask object at 0x10ce1e650>

In [24]:
for f in project.trajectories:
    print f.drive, f.basename, len(f), f.created, f.__time__, f.exists, hex(f.__uuid__)


sandbox 00000000.dcd 100 1488816219.52 1488816205 True 0x60e23714028611e78222000000000068L
sandbox 00000000.dcd 200 None 1488816205 False 0x60e23714028611e7822200000000008eL

In [27]:
for f in project.files:
    print f.drive, f.path, f.created, f.__time__, f.exists, hex(f.__uuid__)


file /Users/jan-hendrikprinz/Studium/git/adaptive-sampling/package/adaptivemd/scripts/_run_.py None 1488816183 False 0x60e23714028611e78222000000000002L
file /Users/jan-hendrikprinz/Studium/git/adaptive-sampling/package/adaptivemd/engine/openmm/openmmrun.py None 1488816184 False 0x60e23714028611e78222000000000004L
file /Users/jan-hendrikprinz/Studium/git/adaptive-sampling/package/examples/files/alanine/alanine.pdb None 1488816200 False 0x60e23714028611e7822200000000003aL
file /Users/jan-hendrikprinz/Studium/git/adaptive-sampling/package/examples/files/alanine/system.xml None 1488816200 False 0x60e23714028611e7822200000000003cL
file /Users/jan-hendrikprinz/Studium/git/adaptive-sampling/package/examples/files/alanine/integrator.xml None 1488816200 False 0x60e23714028611e7822200000000003eL
sandbox /projects/test/trajs/00000000.dcd 1488816219.52 1488816205 True 0x60e23714028611e78222000000000068L
unit 00000000.restart 1488816219.52 1488816213 True 0x60e23714028611e7822200000000006aL
sandbox /projects/test/trajs/00000000.dcd None 1488816205 False 0x60e23714028611e7822200000000008eL

In [22]:
w = project.workers.last
print w.state
print w.command


running
None

In [23]:
for t in project.tasks:
    print t.state, t.worker.hostname if t.worker else 'None'


failed Stevie.fritz.box
success Stevie.fritz.box

In [17]:


In [18]:


In [19]:


In [20]:



Out[20]:
[<adaptivemd.engine.engine.TrajectoryGenerationTask at 0x10cec5610>]

In [24]:
sc.advance()

In [33]:
t1 = engine.task_run_trajectory(project.new_trajectory(pdb_file, 100))
t2 = t1.extend(100)

In [34]:
project.tasks.add(t2)


Added file <adaptivemd.engine.engine.TrajectoryGenerationTask object at 0x10cd93a10>

In [35]:
# from adaptivemd.engine import Trajectory
# t3 = engine.task_run_trajectory(Trajectory('staging:///trajs/0.dcd', pdb_file, 100)).extend(100)
# t3.dependencies = []

# def get_created_files(t, s):
#     if t.is_done():
#         print 'done', s
#         return s - set(t.added_files)
#     else:
#         adds = set(t.added_files)
#         rems = set(s.required[0] for s in t._pre_stage)
#         print '+', adds
#         print '-', rems
#         q = set(s) - adds | rems 
        
#         if t.dependencies is not None:
#             for d in t.dependencies:                
#                 q = get_created_files(d, q)

#         return q
    
# get_created_files(t3, {})

In [42]:
for w in project.workers:
    print w.hostname, w.state


Stevie.fritz.box releaseunfinished

In [43]:
w = project.workers.last
print w.state
print w.command


releaseunfinished
None

In [39]:
w.command = 'shutdown'

In [38]:
for t in project.tasks:
    print t.state, t.worker.hostname if t.worker else 'None'


success Stevie.fritz.box
success Stevie.fritz.box
queued Stevie.fritz.box
success Stevie.fritz.box
created None
created None
created None
created None
created None
created None

In [112]:
for f in project.trajectories:
    print f.drive, f.basename, len(f), f.created, f.__time__, f.exists, hex(f.__uuid__)


sandbox 00000000.dcd 100 1488811288.83 1488811284 True 0xf6345b8a027a11e7a1bb00000000007cL
sandbox 00000001.dcd 100 1488811302.4 1488811290 True 0xf6345b8a027a11e7a1bb0000000000a0L
sandbox 00000001.dcd 200 1488811310.45 1488811290 True 0xf6345b8a027a11e7a1bb0000000000c2L
sandbox 00000002.dcd 100 1488811602.49 1488811552 True 0xf6345b8a027a11e7a1bb000000000140L
sandbox 00000002.dcd 200 1488811610.53 1488811552 True 0xf6345b8a027a11e7a1bb000000000162L
sandbox 00000003.dcd 100 None 1488811553 False 0xf6345b8a027a11e7a1bb000000000192L
sandbox 00000003.dcd 200 None 1488811553 False 0xf6345b8a027a11e7a1bb0000000001b4L

In [138]:
project.trajectories.one[0]


Out[138]:
Frame(00000000.dcd[0])

In [139]:
t = engine.task_run_trajectory(project.new_trajectory(project.trajectories.one[0], 100))

In [140]:
project.tasks.add(t)


Added file <adaptivemd.engine.engine.TrajectoryGenerationTask object at 0x10c8e9490>

In [141]:
print project.files
print project.tasks


<StoredBundle with 21 file(s) @ 0x10ccb4e50>
<StoredBundle with 15 file(s) @ 0x10ccb4e90>

In [30]:
t = modeller.execute(list(project.trajectories))

In [32]:
project.tasks.add(t)

In [34]:
from uuid import UUID

In [ ]:
# incomplete query sketch -- the value for the inner '_dict' key was never filled in
# project.storage.tasks._document.find_one({'_dict': {'generator': {'_dict': }}})

In [56]:
genlist = ['openmm']


Out[56]:
<adaptivemd.task.Task at 0x10594d550>

In [51]:
scheduler = sc
prefetch = 1

while True:
    scheduler.advance()
    if scheduler.is_idle:
        for _ in range(prefetch):
            tasklist = scheduler(project.storage.tasks.consume_one())

        if len(tasklist) == 0:
            break

    time.sleep(2.0)


['ln -s ../staging_area/alanine.pdb initial.pdb', 'ln -s ../staging_area/system.xml system.xml', 'ln -s ../staging_area/integrator.xml integrator.xml', 'ln -s ../staging_area/openmmrun.py openmmrun.py', 'ln -s ../staging_area/trajs/00000000.dcd input.dcd', 'hostname', 'mdconvert -o input.pdb -i 0 -t initial.pdb input.dcd', 'python "openmmrun.py" "-r" "--report-interval" "1" "-p" "CPU" "--store-interval" "1" "-t" "input.pdb" "--length" "100" "output.dcd"', 'mv output.dcd ../staging_area/trajs/00000004.dcd']
task succeeded. State: success
Added file sandbox:///workers/staging_area/trajs/00000004.dcd

Note that you cannot add the same engine twice. But if you create a new engine, it will be considered different and hence can be stored again.
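A minimal sketch of why this works (a plain-Python analogy, not AdaptiveMD internals; the `add` helper is hypothetical): storage deduplicates by object identity, so the same instance is stored once while a freshly constructed instance counts as new.

```python
# Hypothetical analogy: deduplicate generators by object identity.
stored = []

def add(generator):
    # re-adding the exact same object is a no-op
    if not any(g is generator for g in stored):
        stored.append(generator)

engine_a = object()
add(engine_a)       # stored
add(engine_a)       # same object: ignored
add(object())       # new instance: stored as well
```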

Create one initial trajectory

Finally we are ready to run a first trajectory, which we will store as a point of reference in the project. It is also a nice way to see how things work in general.

1. Open a scheduler

A scheduler is a job on the cluster that executes tasks for you.

The .get_scheduler function delegates to the resource and uses the get_scheduler functions defined there. This is merely a convenience, since a Scheduler has the responsibility of opening queues on the resource for you.

You have the same options the queue offers in the resource. These are often the number of cores and the walltime, but additional ones are possible, too.

Let's open the default queue and use a single core for it since we only want to run one simulation.


In [15]:
scheduler = project.get_scheduler(cores=1)

Next we create the parameters for the engine to run the simulation. Since it seemed appropriate, we use a Trajectory object (a special File with an initial frame and a length) as the input. You could of course pass these things separately, but this way we can actually reference the not-yet-existing trajectory and do things with it.

A Trajectory should have a unique name, and there is a project function to get you one. It uses numbers and makes sure that a number has not been used yet in the project.
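Judging by the generated names like 00000000.dcd, these are zero-padded counters. Such a numbering scheme can be sketched as follows (the `next_name` helper is hypothetical, not the AdaptiveMD API; the width and extension are inferred from the example output):

```python
# Hypothetical sketch of counter-based unique naming (not the actual
# AdaptiveMD routine): find the smallest unused zero-padded number.
def next_name(existing, width=8, ext='.dcd'):
    n = 0
    while '%0*d%s' % (width, n, ext) in existing:
        n += 1
    return '%0*d%s' % (width, n, ext)
```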


In [16]:
trajectory = project.new_trajectory(engine['pdb_file'], 100)
trajectory


Out[16]:
Trajectory('alanine.pdb' >> 00000000.dcd[0..100])

This says: start from alanine.pdb, run for 100 frames, and name the output 00000000.dcd.

Now we want this trajectory to actually exist, so we have to make it (on the cluster, which is waiting for things to do). For that we need a Task object that runs a simulation. Since Task objects are very flexible, there are helper functions to make them do what you want, like the ones we already created just before. Let's use the OpenMM engine to create an OpenMM task.


In [17]:
task = engine.task_run_trajectory(trajectory)

That's it: we took a trajectory description and turned it into a task that contains the shell commands, needed files, etc.

The last step is to actually run the task. You can use a scheduler as a function or call its .submit() method.


In [18]:
scheduler(task)


Out[18]:
[<adaptivemd.task.Task at 0x1214190d0>]

Now we have to wait. To see if we are done, you can check whether the scheduler is still running tasks.


In [26]:
scheduler.is_idle


* unit.000000  state Failed (None), out/err:  / 
task did not complete
Out[26]:
True

In [35]:
print scheduler.generators


<StoredBundle with 0 file(s) @ 0x11f942810>

Or you can wait until the scheduler becomes idle using .wait().


In [27]:
# scheduler.wait()
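What .wait() amounts to can be sketched as a simple polling loop (an assumption based on the is_idle check used above; the helper name and interval are hypothetical, not the AdaptiveMD implementation):

```python
import time

# Sketch of a wait loop: poll the scheduler until it reports idle.
# Assumes only that the scheduler exposes an is_idle attribute.
def wait_until_idle(scheduler, interval=0.01):
    while not scheduler.is_idle:
        time.sleep(interval)
```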

If all went as expected we will now have our first trajectory.


In [28]:
print project.files
print project.trajectories


<StoredBundle with 0 file(s) @ 0x11f942850>
<ViewBundle with 0 file(s) @ 0x11f942890>

Excellent. Now we clean up and close our queue


In [67]:
scheduler.exit()

and close the project.


In [68]:
project.close()

The final project.close() will also shut down all open schedulers for you, so the exit command would not have been necessary here. It is relevant if you want to exit the queue as soon as possible to save walltime.

Summary

You have now created an AdaptiveMD project and run your first trajectory. Since the project exists, it is much easier to run more trajectories from here on.


In [ ]: