As you've seen from our documentation, Boutiques is a flexible way to represent command line executables and distribute them across compute ecosystems consistently. A Boutiques tool descriptor is a JSON file that fully describes the input and output parameters and files for a given command line call (or calls, as you can include pipes (|) and ampersands (&)). There are several ways Boutiques helps you build a tool descriptor for your tool:
To help you in this process, we will walk through making a tool descriptor for FSL's BET (finished product found here).
The first step in creating a tool descriptor for your command line call is creating a fully descriptive list of your command line options. If your tool was written in Python and you use the argparse library, then this is already done for you in large part. For many tools (bash, Python, or otherwise) this list can be obtained by executing the tool with the -h flag. In the case of FSL's BET, we get the following:
%%bash
bet -h
Looking at all of these flags, we see a list of options which can be summarized by:
bet [INPUT_FILE] [MASK] [FRACTIONAL_INTENSITY] [VERTICAL_GRADIENT] [CENTER_OF_GRAVITY] [OVERLAY_FLAG] [BINARY_MASK_FLAG] [APPROX_SKULL_FLAG] [NO_SEG_OUTPUT_FLAG] [VTK_VIEW_FLAG] [HEAD_RADIUS] [THRESHOLDING_FLAG] [ROBUST_ITERS_FLAG] [RES_OPTIC_CLEANUP_FLAG] [REDUCE_BIAS_FLAG] [SLICE_PADDING_FLAG] [MASK_WHOLE_SET_FLAG] [ADD_SURFACES_FLAG] [ADD_SURFACES_T2] [VERBOSE_FLAG] [DEBUG_FLAG]
Now that we have summarized all command line options for our tool - some of which describe inputs and others, outputs - we can begin to craft our JSON Boutiques tool descriptor.
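As a quick sanity check on a summary like the one above, the bracketed placeholders can be listed programmatically. The following is a minimal Python sketch and not part of Boutiques itself: the function name find_placeholders is hypothetical, the pattern assumes placeholders are upper-case names in square brackets, and the command line is shortened for readability.

```python
import re

# Shortened version of the summarized command line, for illustration only.
command_line = "bet [INPUT_FILE] [MASK] [FRACTIONAL_INTENSITY] [VERTICAL_GRADIENT]"


def find_placeholders(template):
    """Return the bracketed placeholders in the order they appear.

    Assumption: placeholders consist of upper-case letters, digits and
    underscores wrapped in square brackets, as in the BET example.
    """
    return re.findall(r"\[[A-Z0-9_]+\]", template)


placeholders = find_placeholders(command_line)
print(placeholders)
```

Each placeholder found this way should eventually be matched by exactly one input or output entry in the descriptor.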
For those unfamiliar with JSON, we recommend following this 3 minute JSON tutorial to get you up to speed. In short, a JSON file is a dictionary object which contains keys and associated values. A key informs us what is being described, and a value is the description (which, importantly, can be of any type). The Boutiques tool descriptor is a JSON file which requires the following keys, or properties:
- name
- description
- schema-version
- command-line
- inputs
- output-files

Some additional, optional properties that a Boutiques file will recognize are:
- groups
- tool-version
- suggested-resources
- container-image:
  - type
  - image
  - index

In the case of BET, we will of course populate the required elements, but will also include tool-version and groups.
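The required properties listed above lend themselves to a simple completeness check while a descriptor is still being drafted. This is only a sketch under stated assumptions: missing_properties is a hypothetical helper, it checks for the presence of keys only, and it is no substitute for the Boutiques validator.

```python
# Required top-level properties, as listed in this tutorial.
REQUIRED_PROPERTIES = (
    "name",
    "description",
    "schema-version",
    "command-line",
    "inputs",
    "output-files",
)


def missing_properties(descriptor):
    """Return the required properties that are absent from the descriptor."""
    return [key for key in REQUIRED_PROPERTIES if key not in descriptor]


# A partially written draft (hypothetical state of work in progress).
draft = {
    "name": "fsl_bet",
    "description": "Automated brain extraction tool for FSL",
}

print(missing_properties(draft))
```

An empty list means every required key is present; it says nothing about whether the values are valid.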
We will break up populating the tool descriptor into two sections: adding meta-parameters (such as name, description, schema-version, command-line, tool-version, and container-image, if we were to include it) and i/o-parameters (such as inputs, output-files, and groups).
Currently, before adding any details, our tool descriptor should look like this:
{
"name" : TODO,
"tool-version": TODO,
"description": TODO,
"command-line": TODO,
"schema-version": TODO,
"inputs": TODO,
"output-files": TODO,
"groups": TODO
}
Many of the meta-parameters will be obvious to you if you're familiar with the tool, or extractable from the message received earlier when you passed the -h flag into your program. We can update our JSON to be the following:
{
"name" : "fsl_bet",
"tool-version" : "1.0.0",
"description" : "Automated brain extraction tool for FSL",
"command-line" : "bet [INPUT_FILE] [MASK] [FRACTIONAL_INTENSITY] [VERTICAL_GRADIENT] [CENTER_OF_GRAVITY] [OVERLAY_FLAG] [BINARY_MASK_FLAG] [APPROX_SKULL_FLAG] [NO_SEG_OUTPUT_FLAG] [VTK_VIEW_FLAG] [HEAD_RADIUS] [THRESHOLDING_FLAG] [ROBUST_ITERS_FLAG] [RES_OPTIC_CLEANUP_FLAG] [REDUCE_BIAS_FLAG] [SLICE_PADDING_FLAG] [MASK_WHOLE_SET_FLAG] [ADD_SURFACES_FLAG] [ADD_SURFACES_T2] [VERBOSE_FLAG] [DEBUG_FLAG]",
"schema-version" : "0.4",
"inputs": TODO,
"output-files": TODO,
"groups": TODO
}
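One way to avoid JSON syntax slips such as stray trailing commas is to assemble the meta-parameters as a Python dictionary and let the standard json module serialize it. The sketch below assumes a shortened command line and uses empty lists as stand-ins for the entries that are filled in later; it is one possible workflow, not a step that Boutiques requires.

```python
import json

# Meta-parameters from the tutorial; the i/o sections are empty stand-ins.
descriptor = {
    "name": "fsl_bet",
    "tool-version": "1.0.0",
    "description": "Automated brain extraction tool for FSL",
    "command-line": "bet [INPUT_FILE] [MASK]",  # shortened for this sketch
    "schema-version": "0.4",
    "inputs": [],
    "output-files": [],
    "groups": [],
}

# json.dumps always emits syntactically valid JSON for plain dicts and lists.
text = json.dumps(descriptor, indent=4)

# Round-trip to confirm nothing was lost in serialization.
assert json.loads(text) == descriptor
print(text)
```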
Inputs and outputs of many applications are complicated - outputs can be dependent upon input flags, flags can be mutually exclusive or require at least one option, etc. The way Boutiques handles this is with a detailed schema which consists of options for inputs and outputs, as well as optionally specifying groups of inputs which may add additional layers of input complexity.
As you have surely noted, a tool has only a single name and version, but may have many input and output parameters. This means that inputs, outputs, and groups will each be described as a list. Each element of these lists will be a dictionary following the input, output, or group schema, respectively. This means that our JSON actually looks more like this:
{
"name" : "fsl_bet",
"tool-version" : "1.0.0",
"description" : "Automated brain extraction tool for FSL",
"command-line" : "bet [INPUT_FILE] [MASK] [FRACTIONAL_INTENSITY] [VERTICAL_GRADIENT] [CENTER_OF_GRAVITY] [OVERLAY_FLAG] [BINARY_MASK_FLAG] [APPROX_SKULL_FLAG] [NO_SEG_OUTPUT_FLAG] [VTK_VIEW_FLAG] [HEAD_RADIUS] [THRESHOLDING_FLAG] [ROBUST_ITERS_FLAG] [RES_OPTIC_CLEANUP_FLAG] [REDUCE_BIAS_FLAG] [SLICE_PADDING_FLAG] [MASK_WHOLE_SET_FLAG] [ADD_SURFACES_FLAG] [ADD_SURFACES_T2] [VERBOSE_FLAG] [DEBUG_FLAG]",
"schema-version" : "0.4",
"inputs": [
{TODO},
{TODO},
...
],
"output-files": [
{TODO},
{TODO},
...
],
"groups": [
{TODO},
...
]
}
As the file is beginning to grow considerably in number of lines, we will no longer show you the full JSON at each step but will simply show you the dictionaries responsible for output, input, and group entries.
The input schema contains several options, many of which can be ignored in this first example with the exception of id, name, and type. For BET, there are several input values we can choose to demonstrate this for you. We have chosen three with considerably different functionality and therefore schemas. In particular:
- [INPUT_FILE]
- [FRACTIONAL_INTENSITY]
- [CENTER_OF_GRAVITY]

[INPUT_FILE]
The simplest of these is [INPUT_FILE], a required parameter that simply expects a qualified path to a file. The dictionary entry is:
{
"id" : "infile",
"name" : "Input file",
"type" : "File",
"description" : "Input image (e.g. img.nii.gz)",
"optional": false,
"value-key" : "[INPUT_FILE]"
}
[FRACTIONAL_INTENSITY]
This parameter documents an optional flag that can be passed to the executable. When the flag is passed, it is accompanied by a floating point value that can range from 0 to 1. Boutiques can validate whether a valid value was passed, so that jobs which would fail are flagged when inputs are validated rather than being submitted to the execution engine. This dictionary is:
{
"id" : "fractional_intensity",
"name" : "Fractional intensity threshold",
"type" : "Number",
"description" : "Fractional intensity threshold (0->1); default=0.5; smaller values give larger brain outline estimates",
"command-line-flag": "-f",
"optional": true,
"value-key" : "[FRACTIONAL_INTENSITY]",
"integer" : false,
"minimum" : 0,
"maximum" : 1
}
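To make the range check concrete, here is a small Python sketch of the kind of test that the integer, minimum, and maximum properties describe. It is an illustration only: check_number is a hypothetical helper, it treats both bounds as inclusive (an assumption), and the real validation is performed by Boutiques itself.

```python
def check_number(spec, value):
    """Return a list of problems found with value for a Number input."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return [f"{spec['id']}: expected a number, got {value!r}"]
    problems = []
    if spec.get("integer", False) and not float(value).is_integer():
        problems.append(f"{spec['id']}: {value} is not an integer")
    # Assumption: bounds are inclusive.
    if "minimum" in spec and value < spec["minimum"]:
        problems.append(f"{spec['id']}: {value} is below {spec['minimum']}")
    if "maximum" in spec and value > spec["maximum"]:
        problems.append(f"{spec['id']}: {value} is above {spec['maximum']}")
    return problems


# The validation-related subset of the entry shown above.
fractional_intensity = {
    "id": "fractional_intensity",
    "type": "Number",
    "integer": False,
    "minimum": 0,
    "maximum": 1,
}

print(check_number(fractional_intensity, 0.5))  # no problems reported
print(check_number(fractional_intensity, 1.5))  # reports a value above the maximum
```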
[CENTER_OF_GRAVITY]
The center of gravity value expects a triple (i.e. an [X, Y, Z] position) if the flag is specified. Here we can require that the list received after this flag has exactly 3 entries, by specifying that the input is a list with both a minimum and a maximum length of 3.
{
"id" : "center_of_gravity",
"name" : "Center of gravity vector",
"type" : "Number",
"description" : "The xyz coordinates of the center of gravity (voxels, not mm) of initial mesh surface. Must have exactly three numerical entries in the list (3-vector).",
"command-line-flag": "-c",
"optional": true,
"value-key" : "[CENTER_OF_GRAVITY]",
"list" : true,
"min-list-entries" : 3,
"max-list-entries" : 3
}
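The sketch below illustrates how a list input with a flag might be checked and turned into command line text. It rests on several assumptions that are not taken from the Boutiques implementation: render_argument is a hypothetical helper, list entries are joined with single spaces, an unset optional input is replaced by an empty string, and the coordinates are made-up values.

```python
def render_argument(spec, value):
    """Return the text that replaces the input's value-key ("" if unset)."""
    if value is None:
        if not spec.get("optional", False):
            raise ValueError(f"{spec['id']} is required")
        return ""
    is_list = spec.get("list", False)
    values = list(value) if is_list else [value]
    if is_list:
        count = len(values)
        minimum = spec.get("min-list-entries", 0)
        maximum = spec.get("max-list-entries", count)
        if not minimum <= count <= maximum:
            raise ValueError(
                f"{spec['id']}: got {count} entries, "
                f"expected between {minimum} and {maximum}"
            )
    # Assumption: entries are separated by single spaces.
    text = " ".join(str(v) for v in values)
    flag = spec.get("command-line-flag")
    return f"{flag} {text}" if flag else text


# The relevant subset of the entry shown above.
center_of_gravity = {
    "id": "center_of_gravity",
    "type": "Number",
    "command-line-flag": "-c",
    "optional": True,
    "value-key": "[CENTER_OF_GRAVITY]",
    "list": True,
    "min-list-entries": 3,
    "max-list-entries": 3,
}

template = "bet [INPUT_FILE] [MASK] [CENTER_OF_GRAVITY]"  # shortened
argument = render_argument(center_of_gravity, [90, 110, 75])
print(template.replace(center_of_gravity["value-key"], argument))
```

Passing a list with two or four entries raises a ValueError in this sketch, which mirrors the idea of rejecting an invalid invocation before any job is submitted.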
For further examples of different types of inputs, feel free to explore more examples.
The output schema also contains several options, with the only mandatory ones being id, name, and path-template. We again demonstrate an example from BET:
- outfile

outfile
All of the output parameters in BET are similarly structured, and exploit the same core functionality: the output file name, described by path-template, is a function of an input value on the command line, here given by [MASK]. The optional flag describes whether or not a derivative should always be produced, and therefore whether Boutiques should indicate an error if the file isn't found. The output descriptor is thus:
{
"id" : "outfile",
"name" : "Output mask file",
"description" : "Main default mask output of BET",
"path-template" : "[MASK].nii.gz",
"optional" : true
}
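The substitution that path-template describes can be pictured with a few lines of Python. This is a simplified sketch rather than Boutiques code: resolve_path_template is a hypothetical helper, and the mask name sub-01_brain is a made-up value.

```python
def resolve_path_template(path_template, values_by_key):
    """Replace each value-key in the template with the supplied value."""
    path = path_template
    for value_key, value in values_by_key.items():
        path = path.replace(value_key, str(value))
    return path


# Hypothetical value given for [MASK] on the command line.
values = {"[MASK]": "sub-01_brain"}

output_path = resolve_path_template("[MASK].nii.gz", values)
print(output_path)
```

With these values the expected output file is sub-01_brain.nii.gz, which is the path that would be checked for existence once the tool has finished.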
An extension of this feature of naming outputs based on inputs exists in versions of the schema newer than the one this example was originally developed for, and enables stripping the extension from the input values used as well. An example of this can be seen here.
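The idea of stripping extensions before substitution can be sketched as follows. The helper names and the way the extensions are passed in are assumptions made for illustration; consult the schema for the actual property that carries the list of extensions. The template and file name are made-up examples.

```python
def strip_known_extension(value, extensions):
    """Remove the longest matching extension from value, if any."""
    # Try longer extensions first so ".nii.gz" wins over ".gz".
    for extension in sorted(extensions, key=len, reverse=True):
        if extension and value.endswith(extension):
            return value[: -len(extension)]
    return value


def resolve_stripped(path_template, value_key, value, extensions):
    """Substitute value into the template after stripping its extension."""
    return path_template.replace(
        value_key, strip_known_extension(value, extensions)
    )


# Hypothetical template that derives the output name from the input file.
stripped_path = resolve_stripped(
    "[INPUT_FILE]_brain.nii.gz",
    "[INPUT_FILE]",
    "img.nii.gz",
    [".nii", ".nii.gz"],
)
print(stripped_path)
```

Without the stripping step the same template would yield img.nii.gz_brain.nii.gz, which is why this feature is useful when outputs are named after input files.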
The group schema provides an additional layer of complexity when considering the relationships between inputs. For instance, if multiple inputs within a set are mutually exclusive, they may be grouped and a flag set indicating that only one can be selected. Alternatively, if at least one option within a group must be specified, the user can also set a flag indicating such. The following group from the BET implementation is used to illustrate this:
- variational_params_group

variational_params_group
Many flags exist in BET, and each of them is represented in the command line we specified earlier. However, as you may have noticed when reading the output of bet -h, several of these options are mutually exclusive. To again prevent jobs from being submitted to a scheduler and failing there, Boutiques enables grouping inputs and enforcing such mutual exclusivity, so that invalid inputs are flagged in the validation stage. This group dictionary is:
{
"id" : "variational_params_group",
"name" : "Variations on Default Functionality",
"description" : "Mutually exclusive options that specify variations on how BET should be run.",
"members" : ["robust_iters_flag", "residual_optic_cleanup_flag", "reduce_bias_flag", "slice_padding_flag", "whole_set_mask_flag", "additional_surfaces_flag", "additional_surfaces_t2"],
"mutually-exclusive" : true
}
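The check implied by mutually-exclusive can be sketched in Python as follows. This is an illustration under assumptions, not the Boutiques validator: check_mutually_exclusive is a hypothetical helper, and the invocation is assumed to be a dictionary mapping input ids to values, where an absent key, None, or False means the input was not selected.

```python
def check_mutually_exclusive(group, invocation):
    """Return a list of problems if more than one group member is set."""
    if not group.get("mutually-exclusive", False):
        return []
    selected = [
        member
        for member in group["members"]
        if invocation.get(member) is not None
        and invocation.get(member) is not False
    ]
    if len(selected) > 1:
        return [f"{group['id']}: only one of {selected} may be set"]
    return []


# The group from the BET descriptor shown above.
group = {
    "id": "variational_params_group",
    "members": [
        "robust_iters_flag",
        "residual_optic_cleanup_flag",
        "reduce_bias_flag",
        "slice_padding_flag",
        "whole_set_mask_flag",
        "additional_surfaces_flag",
        "additional_surfaces_t2",
    ],
    "mutually-exclusive": True,
}

# Hypothetical invocations: the first is valid, the second is not.
valid = {"infile": "img.nii.gz", "robust_iters_flag": True}
invalid = {"infile": "img.nii.gz", "robust_iters_flag": True, "reduce_bias_flag": True}

print(check_mutually_exclusive(group, valid))
print(check_mutually_exclusive(group, invalid))
```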
Though an example of one-is-required input groups is not available in our BET example, you can investigate a validated tool descriptor here to see how it is implemented.
Now that the basic implementation of this tool has been done, you can check out the schema to explore deeper functionality of Boutiques. For example, if you have created a Docker or Singularity container, you can associate an image with your tool descriptor and any compute resource with Docker or Singularity installed will launch the executable through them (an example of using Docker can be found here).
Once you've completed your Boutiques tool descriptor, you should run the validator to ensure that you have created it correctly. The README.md here describes how to install and use the validator and remainder of the Boutiques shell (bosh) tools on your tool descriptor.
Once the tool descriptor has been validated, your tool is now ready to be integrated in a platform that supports Boutiques. You can use the localExec.py tool described here to launch your container locally for preliminary testing. Once you feel comfortable with your tool, you can contact your system administrator and have them integrate it into their compute resources so you can test and use it to process your data.