In [1]:
from dmpipe.pipeline import Pipeline
First we build a pipeline object to analyze the dSphs, and point it at the config file for the dSphs analysis.
In [2]:
pipe = Pipeline(linkname='dSphs')
In [3]:
configfile = 'config/master_dSphs.yaml'
We have to 'preconfigure' the pipeline because it needs to build up the list of targets in order to correctly set up its later stages.
In [4]:
pipe.preconfigure(configfile)
We then tell the pipeline to update the arguments for all of the steps that comprise it.
In [5]:
pipe.update_args(dict(config=configfile))
Now we can look around a bit and drill down into the links that make up the pipeline. In this case the pipeline consists of a couple of preparation jobs followed by 5 sub-pipelines ('data', 'sim_xxx', 'random'). The 'data' sub-pipeline analyzes the real data, the 'sim_xxx' sub-pipelines analyze different simulation cases, and the 'random' sub-pipeline analyzes random sky directions.
At this point the pipeline is fully configured; let's look around.
In [6]:
pipe.linknames
Out[6]:
In [7]:
pipe['data']
Out[7]:
In [8]:
pipe['data'].linknames
Out[8]:
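As a convenience, we can also summarize the structure in one pass. The sketch below only uses the attributes shown above (linknames and item access); it is an illustrative snippet, not part of the dmpipe API, and it tests for the linknames attribute since simple links do not have one.
In [ ]:
# Sketch: summarize the pipeline structure using the attributes shown above.
# Sub-pipelines expose 'linknames'; simple links do not, so we test for it.
for name in pipe.linknames:
    link = pipe[name]
    if hasattr(link, 'linknames'):
        print("%s -> %s" % (name, list(link.linknames)))
    else:
        print(name)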
We can print the status of the various links. Using recurse=True will also drill down to the status of the links in the sub-pipelines.
In [9]:
pipe.print_status()
In [10]:
pipe.print_status(recurse=True)
We can access a particular Link in the pipeline.
We can ask:
- what the default options for the link are ('_options')
- what the current set of options is ('args')
In [11]:
pipe['data']['analyze-roi']
Out[11]:
In [12]:
pipe['data']['analyze-roi']._options
Out[12]:
In [13]:
pipe['data']['analyze-roi'].args
Out[13]:
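As a small illustration, we can compare the two. The sketch below assumes that each entry of _options is a tuple whose first element is the default value; verify that against the dump above and adjust the unpacking if the structure differs.
In [ ]:
# Sketch: list options whose current value differs from the default.
# Assumes each _options entry is a (default, help, type)-style tuple;
# check the _options dump above and adjust if the structure differs.
link = pipe['data']['analyze-roi']
for key, opt in link._options.items():
    default = opt[0]
    current = link.args.get(key)
    if current != default:
        print(key, ':', default, '->', current)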
We can ask what jobs get run by this link. In this particular case the link runs one job that in turn dispatches several other jobs to the batch farm.
In [14]:
pipe['data']['analyze-roi'].jobs
Out[14]:
Here we are talking to the link that represents any one of the dispatched jobs. The command_template() function tells us how we would run this job from the UNIX command line.
In [15]:
pipe['data']['analyze-roi'].scatter_link
Out[15]:
In [16]:
pipe['data']['analyze-roi'].scatter_link.command_template()
Out[16]:
Here we ask what jobs will be dispatched. Note that there are two jobs with slightly different names.
In [17]:
pipe['data']['analyze-roi'].scatter_link.jobs
Out[17]:
Here we ask for information about the first of those two jobs, in particular the specific options used for this instance of the job. You can merge the job_config with the command_template() to get the exact command-line syntax for this instance of the command (see the sketch below).
In [18]:
pipe['data']['analyze-roi'].scatter_link.jobs['draco@dSphs.data.analyze-roi']
Out[18]:
In [19]:
pipe['data']['analyze-roi'].scatter_link.jobs['draco@dSphs.data.analyze-roi'].job_config
Out[19]:
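Putting the two pieces together, here is a sketch of how you might build the full command line for each dispatched job by filling the command template with that job's configuration. It assumes the template uses str.format-style '{key}' placeholders that match the job_config keys; check the command_template() and job_config outputs above before relying on it.
In [ ]:
# Sketch: expand the command template with each job's configuration.
# Assumes '{key}'-style placeholders matching the job_config keys
# (verify against the command_template() and job_config outputs above).
scatter = pipe['data']['analyze-roi'].scatter_link
template = scatter.command_template()
for key, job in scatter.jobs.items():
    print(key)
    print(' ', template.format(**job.job_config))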
Here we run a single link. We use run_with_log() so that the log file will be in the correct place and the pipeline will know that the job has completed.
In [20]:
pipe['spec-table'].run_with_log()
In [21]:
pipe.print_status()
Here we run the 'data' sub-pipeline.
In [22]:
pipe['data'].run(resubmit_failed=True)
In [23]:
pipe.print_status(recurse=True)
We can again drill down to the configuration of one specific dispatched job, this time for the 'convert-castro' link.
In [24]:
pipe['data']['convert-castro'].scatter_link.jobs['draco:ack2016_point:lgauss@dSphs.data.convert-castro'].job_config
Out[24]:
Finally, we run the full pipeline, resubmitting any jobs that failed.
In [25]:
pipe.run(resubmit_failed=True)