This example shows how to create an Activity record, linking inputs, outputs, and related files.
In [ ]:
import ovation as ov
import ovation.activities as activities
import ovation.download as download
from ovation.session import connect
from tqdm import tqdm_notebook
from pprint import pprint
In [3]:
s = connect(input('Email: '), org=int(input("Organization (enter for default): ") or 0))
When creating an activity, you can specify the inputs, outputs and related files at the time of creation. Don't worry if you don't know all of them yet. You can also add and remove inputs, outputs, and related files later.
Activity inputs are specified as an array of UUIDs or entity dicts and can be either Revisions or Sources. You can also use the local path of a file. Local files will be uploaded (creating the associated File and Revision records).
In [ ]:
inputs = ['181c9eb7-8450-4d59-9b5a-ede7fb984b51','181c9eb7-8450-4d59-9b5a-ede7fb984b51']
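Because an inputs or outputs list can mix UUIDs, entity dicts, and local paths, it can help to check each entry before creating the activity. The following is a minimal sketch using only the Python standard library. The helper name classify_entry and the example dict are hypothetical and are not part of the ovation package; the check only looks at the shape of each entry and does not confirm that a UUID exists on the server.

```python
import os
import uuid


def classify_entry(entry):
    """Return 'entity', 'uuid', 'path', or 'unknown' for one list entry.

    Hypothetical pre-flight helper; not part of the ovation package.
    """
    if isinstance(entry, dict):
        # Entity dicts are passed through as-is.
        return "entity"
    if isinstance(entry, str):
        try:
            uuid.UUID(entry)  # raises ValueError if not a well-formed UUID
            return "uuid"
        except ValueError:
            pass
        if os.path.isfile(entry):
            # An existing local file, which would be uploaded.
            return "path"
    return "unknown"


candidates = [
    "181c9eb7-8450-4d59-9b5a-ede7fb984b51",  # UUID from the cell above
    {"type": "Revision"},                      # hypothetical entity dict
    "no/such/file.csv",                        # neither a UUID nor an existing file
]
kinds = [classify_entry(c) for c in candidates]
```

Entries that come back as 'unknown' are usually mistyped paths or truncated UUIDs and are worth fixing before the activity is created.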
Activity outputs are specified as an array of UUIDs or entity dicts of Revisions. You can use the local path of a file as well. Local files will be uploaded (creating the associated File and Revision records).
In [ ]:
outputs = ['9744f67f-7daa-43a5-901f-8f63b5b956d4']
create_activity creates the activity and associates it with the given Project.
In [ ]:
project_id = input('Project UUID: ')
activity_name = input('Activity name: ')
activity = activities.create_activity(s,
                                      project_id,
                                      activity_name,
                                      inputs=inputs,
                                      outputs=outputs)
After creation, you can modify the inputs, outputs, and related files of an Activity. Of course, it's a good idea to do this carefully if downstream results depend on the results of this Activity. Inputs, outputs, and related files are added and removed in the same way (using add_inputs, remove_inputs; add_outputs, remove_outputs; add_related, and remove_related). This example shows how to add a new output to the activity:
In [ ]:
activities.add_outputs(s, activity, outputs=['local/file/analysis_result.csv'])
To remove an output, use remove_outputs:
In [ ]:
activities.remove_outputs(s, activity, outputs=['181c9eb7-8450-4d59-9b5a-ede7fb984b51'])
It's common to create an activity from existing inputs, download those inputs, run an analysis, and then upload the results as outputs of the activity. This example shows that workflow:
In [ ]:
# Collect information
project_id = input('Project UUID: ')
activity_name = input('Activity name: ')
# Create the activity
activity = activities.create_activity(s,
                                      project_id,
                                      activity_name,
                                      inputs=inputs)
# Download inputs to the working directory.
# For simplicity, we use a for loop. For faster downloads,
# consider using a multiprocessing.Pool to map over the inputs
inputs = s.get(activity.relationships.inputs.related)
for revision in inputs:
    download.download_revision(s, revision, progress=tqdm_notebook)
# DO SOME ANALYSIS
# This part's all on you. Fortunately, you're a world expert. Go get 'em!
# Upload outputs
# In this example, analysis_result.csv is the output
activities.add_outputs(s, activity, outputs=['./analysis_result.csv'])
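The comment in the workflow above suggests mapping over the inputs with a pool for faster downloads. Here is a minimal sketch of that idea. It uses multiprocessing.pool.ThreadPool, a thread-based pool with the same map interface, rather than a process pool, on the assumption that downloads are I/O bound; threads also avoid having to pickle the session object. The names download_all and fake_download are hypothetical, and fake_download stands in for a real download call so the sketch runs without a session.

```python
from multiprocessing.pool import ThreadPool


def download_all(revisions, download_one, workers=4):
    """Call download_one on every revision using a pool of worker threads.

    Results are returned in the same order as the input revisions.
    Hypothetical helper; not part of the ovation package.
    """
    revisions = list(revisions)
    if not revisions:
        return []
    # Never start more threads than there are revisions to fetch.
    with ThreadPool(processes=min(workers, len(revisions))) as pool:
        return pool.map(download_one, revisions)


def fake_download(revision):
    """Stand-in for a real download so this sketch runs offline."""
    return "downloaded " + revision["id"]


results = download_all(
    [{"id": "a"}, {"id": "b"}, {"id": "c"}],
    fake_download,
    workers=2,
)
```

In the notebook, download_one could be functools.partial(download.download_revision, s), which matches the positional arguments used in the loop above. Progress bars drawn from several threads at once may interleave, so it is probably simplest to leave the progress argument out when downloading in parallel.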