In [1]:
import sys, os
In [2]:
from adaptivemd import (
    Project,
    Event, FunctionalEvent,
    File
)
# These classes must be imported as well: only objects of known classes can be
# restored. Once they are imported, you can load these objects from the project.
from adaptivemd.engine.openmm import OpenMMEngine
from adaptivemd.analysis.pyemma import PyEMMAAnalysis
Let's open our test project by its name. If you completed the first examples, this should all work out of the box.
In [3]:
project = Project('tutorial')
Open all connections to the MongoDB and Session so we can get started.
Note that because we use a database in the back end, data is synced between notebooks. To see how this works, run some tasks in the last example, then come back here and check how the contents of the project have changed.
Let's see where we are. These numbers will depend on whether you run this notebook for the first time or just continue again. Unless you delete your project it will accumulate models and files over time, as is our ultimate goal.
In [4]:
print project.files
print project.generators
print project.models
Now restore our old ways to generate tasks by loading the previously used generators.
In [5]:
engine = project.generators['openmm']
modeller = project.generators['pyemma']
pdb_file = project.files['initial_pdb']
You are free to conduct your simulations from a notebook, but normally you will use a script. The main point of adaptivity is to make decisions about tasks along the way.
To make this happen we need Conditions, which are functions that evaluate to True or False. Once a condition is True, it cannot change back to False, like a one-time on switch.
Conditions are used to describe the occurrence of an event. We will now deal with some types of events.
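The one-way switch behaviour described above can be illustrated with a small sketch. `LatchedCondition` is a hypothetical helper written for this illustration only; it is not part of adaptivemd and says nothing about how the library implements its conditions.

```python
class LatchedCondition(object):
    """Hypothetical helper: wraps a check and stays True once it has fired."""

    def __init__(self, check):
        self._check = check      # zero-argument callable returning a bool
        self._fired = False

    def __call__(self):
        # Evaluate the wrapped check only until it succeeds for the first time.
        if not self._fired and self._check():
            self._fired = True
        return self._fired


# Illustration with a plain counter instead of real project data.
state = {'n': 0}
enough = LatchedCondition(lambda: state['n'] >= 3)

initially = enough()   # the check fails, the condition is still False
state['n'] = 3
first = enough()       # the check passes, the condition latches
state['n'] = 0
second = enough()      # the check would fail now, but the latch holds
```

Here `initially` is False, while `first` and `second` are both True, even though the underlying check no longer holds on the last call.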
First, we want to look at a way to run Python code asynchronously in the project. For this, we write a function that should be executed. Inside it you create tasks and submit them.
If the function should pause, write yield {condition_to_continue}. This will interrupt your script until the function you yielded returns True when called. An example:
In [6]:
def strategy(loops=10, trajs_per_loop=4, length=100):
    for loop in range(loops):
        # submit some trajectory tasks
        trajectories = project.new_ml_trajectory(length, trajs_per_loop)
        # build a real list so the tasks can be queued and then iterated again
        tasks = list(map(engine.task_run_trajectory, trajectories))
        project.queue(tasks)

        # continue once ALL of the tasks are done (they may have failed)
        yield [task.is_done for task in tasks]

        # submit a model job
        task = modeller.execute(list(project.trajectories))
        project.queue(task)

        # when it is done, start the next loop
        yield task.is_done
Now add the event to the project (these cannot be stored yet!).
In [7]:
project.add_event(strategy(loops=2))
Out[7]:
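To make the yield mechanism more concrete, here is a minimal sketch of how a generator-based strategy could be driven. `ToyEvent` and `is_ready` are hypothetical names invented for this illustration; the sketch only assumes, as stated above, that a yielded condition is a callable (or a list of callables) that returns True when the function may continue. It is not adaptivemd's actual event implementation.

```python
def is_ready(condition):
    """A condition is a callable, or a list of callables that must all hold."""
    if isinstance(condition, (list, tuple)):
        return all(c() for c in condition)
    return condition()


class ToyEvent(object):
    """Hypothetical stand-in for an event that drives a strategy generator."""

    def __init__(self, generator):
        self._gen = generator
        self._waiting_for = None
        self.done = False
        self._advance()          # adding the event runs the first part

    def _advance(self):
        # Run the generator up to its next yield and remember the condition.
        try:
            self._waiting_for = next(self._gen)
        except StopIteration:
            self.done = True

    def trigger(self):
        # Resume the generator as long as its pending condition is satisfied.
        while not self.done and is_ready(self._waiting_for):
            self._advance()


# Toy strategy: flags stand in for task.is_done of real tasks.
log = []
flags = {'sim': False, 'model': False}

def toy_strategy():
    log.append('submit trajectories')
    yield [lambda: flags['sim']]
    log.append('submit model')
    yield lambda: flags['model']
    log.append('finished')

event = ToyEvent(toy_strategy())
event.trigger()            # nothing is done yet, so nothing happens
flags['sim'] = True
event.trigger()            # first condition holds, runs up to the next yield
flags['model'] = True
event.trigger()            # last condition holds, generator runs to the end
```

The sketch shows why something has to call `trigger()` repeatedly: the generator only moves forward when its pending condition is checked and found to be True.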
What is missing now? Adding the event triggered the first part of the code, but rechecking whether we should continue has to be done manually.
RP has threads in the background, and these can call the trigger whenever something has changed or finished.
Still, that is no problem: we can do that easily and watch what is happening.
Let's see how our project is growing. TODO: Add threading.Timer to auto trigger.
In [8]:
import time
from IPython.display import clear_output
In [ ]:
try:
    while project._events:
        clear_output(wait=True)
        print '# of files  %8d : %s' % (len(project.trajectories), '#' * len(project.trajectories))
        print '# of models %8d : %s' % (len(project.models), '#' * len(project.models))
        sys.stdout.flush()
        time.sleep(2)
        project.trigger()

except KeyboardInterrupt:
    pass
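The TODO above mentions threading.Timer as a way to trigger automatically. The following is one possible sketch of that idea using only the standard library. `PeriodicTrigger` is a hypothetical helper, and `fake_trigger` stands in for `project.trigger`; whether `project.trigger` is safe to call from a background thread is an assumption that would need to be checked.

```python
import threading


class PeriodicTrigger(object):
    """Hypothetical helper that calls `callback` every `interval` seconds."""

    def __init__(self, callback, interval=2.0):
        self._callback = callback
        self._interval = interval
        self._timer = None
        self._stopped = threading.Event()

    def _run(self):
        if self._stopped.is_set():
            return
        self._callback()
        self._schedule()         # a Timer fires once, so re-arm it each time

    def _schedule(self):
        if self._stopped.is_set():
            return
        self._timer = threading.Timer(self._interval, self._run)
        self._timer.daemon = True   # do not keep the interpreter alive
        self._timer.start()

    def start(self):
        self._stopped.clear()
        self._schedule()

    def stop(self):
        self._stopped.set()
        if self._timer is not None:
            self._timer.cancel()


# Demonstration with a stand-in callback instead of project.trigger.
calls = []
fired_three_times = threading.Event()

def fake_trigger():
    calls.append(1)
    if len(calls) >= 3:
        fired_three_times.set()

auto = PeriodicTrigger(fake_trigger, interval=0.01)
auto.start()
fired_three_times.wait(5.0)   # wait at most 5 s for three calls
auto.stop()
```

Under the thread-safety assumption, `PeriodicTrigger(project.trigger, interval=2.0)` could play the role of the manual loop, with `stop()` called once all events are finished.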
Let's do another round with two more loops.
In [10]:
project.add_event(strategy(loops=2))
Out[10]:
And now some analysis (there might be better functions for that).
In [11]:
# find which frames from which trajectories have been chosen
trajs = project.trajectories
q = {}
ins = {}
for f in trajs:
    # a trajectory starts either from a plain File or from a frame of another trajectory
    source = f.frame if isinstance(f.frame, File) else f.frame.trajectory
    ind = 0 if isinstance(f.frame, File) else f.frame.index
    ins[source] = ins.get(source, []) + [ind]

for a, b in ins.iteritems():
    print a.short, ':', b
And do this with multiple events in parallel.
In [12]:
def strategy2():
    for loop in range(10):
        num = len(project.trajectories)
        task = modeller.execute(list(project.trajectories))
        project.queue(task)
        yield task.is_done
        # continue only when there are at least 2 more trajectories
        yield project.on_ntraj(num + 2)
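A condition such as the one returned by `project.on_ntraj(num + 2)` can be pictured as a callable that compares a live count against a threshold. The sketch below uses a hypothetical factory `at_least` and a plain list as a stand-in for `project.trajectories`; it only illustrates the idea and is not how adaptivemd builds this condition.

```python
def at_least(get_count, n):
    """Hypothetical factory: the returned condition holds once get_count() >= n."""
    def condition():
        return get_count() >= n
    return condition


trajectories = ['t1', 't2', 't3']      # stand-in for project.trajectories
num = len(trajectories)
two_more = at_least(lambda: len(trajectories), num + 2)

before = two_more()                    # only 3 trajectories so far
trajectories.extend(['t4', 't5'])
after = two_more()                     # 5 trajectories, threshold reached
```

Because the count is read each time the condition is called, `before` is False and `after` is True without the condition having to be rebuilt.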
In [13]:
project.add_event(strategy(loops=10, trajs_per_loop=2))
project.add_event(strategy2())
Out[13]:
And now wait until all events are finished.
In [6]:
project.wait_until(project.events_done)
Note that we reused our strategy again.
In [18]:
project.close()