Data Output

Data output is just as important as data input. A data output module lets you restructure and rename computed output, and keep the relevant output files separate from the temporary intermediate files in the working directory. Nipype provides the following modules to handle data stream output:

DataSink
JSONFileSink
MySQLSink
SQLiteSink
XNATSink

This tutorial covers only DataSink. For the rest, see the section interfaces.io on the official homepage.

Preparation

Before we can use DataSink we first need to run a workflow. For this purpose, let's create a very short preprocessing workflow that realigns and smooths one functional image of one subject.

First, let's create a SelectFiles node. For an explanation about this step, see the Data Input tutorial.


In [ ]:
from nipype import SelectFiles, Node

# Create SelectFiles node
templates = {'func': '{subject}/{session}/func/{subject}_{session}_task-fingerfootlips_bold.nii.gz'}
sf = Node(SelectFiles(templates),
          name='selectfiles')
sf.inputs.base_directory = '/data/ds000114'
sf.inputs.subject = 'sub-01'
sf.inputs.session = 'ses-test'

Second, let's create the motion correction and smoothing nodes. For an explanation about this step, see the Nodes and Interfaces tutorial.


In [ ]:
from nipype.interfaces.fsl import MCFLIRT, IsotropicSmooth

# Create Motion Correction Node
mcflirt = Node(MCFLIRT(mean_vol=True,
                       save_plots=True),
               name='mcflirt')

# Create Smoothing node
smooth = Node(IsotropicSmooth(fwhm=4),
              name='smooth')

Third, let's create the workflow that will contain those three nodes. For an explanation about this step, see the Workflow tutorial.


In [ ]:
from nipype import Workflow

# Create a preprocessing workflow
wf = Workflow(name="preprocWF")
wf.base_dir = '/output/working_dir'

# Connect the three nodes to each other
wf.connect([(sf, mcflirt, [("func", "in_file")]),
            (mcflirt, smooth, [("out_file", "in_file")])])

Now that everything is set up, let's run the preprocessing workflow.


In [ ]:
wf.run()


170904-05:38:15,44 workflow INFO:
	 Workflow preprocWF settings: ['check', 'execution', 'logging']
170904-05:38:15,71 workflow INFO:
	 Running serially.
170904-05:38:15,72 workflow INFO:
	 Executing node selectfiles in dir: /output/working_dir/preprocWF/selectfiles
170904-05:38:15,120 workflow INFO:
	 Executing node mcflirt in dir: /output/working_dir/preprocWF/mcflirt
170904-05:38:15,125 workflow INFO:
	 Running: mcflirt -in /data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz -meanvol -out /output/working_dir/preprocWF/mcflirt/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz -plots
170904-05:39:33,271 workflow INFO:
	 Executing node smooth in dir: /output/working_dir/preprocWF/smooth
170904-05:39:33,278 workflow INFO:
	 Running: fslmaths /output/working_dir/preprocWF/mcflirt/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz -s 1.69864 /output/working_dir/preprocWF/smooth/sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz
Out[ ]:
<networkx.classes.digraph.DiGraph at 0x7f43a65f8a20>

After the execution of the workflow we have all the data hidden in the working directory 'working_dir'. Let's take a closer look at the content of this folder:


In [ ]:
! tree /output/working_dir/preprocWF


/output/working_dir/preprocWF
├── d3.js
├── graph1.json
├── graph.json
├── index.html
├── mcflirt
│   ├── _0x653c043f1574f1c0240b0f1bfb464acc.json
│   ├── command.txt
│   ├── _inputs.pklz
│   ├── _node.pklz
│   ├── _report
│   │   └── report.rst
│   ├── result_mcflirt.pklz
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz
├── selectfiles
│   ├── _0x8be4cb43842af73f06e36ceafabda572.json
│   ├── _inputs.pklz
│   ├── _node.pklz
│   ├── _report
│   │   └── report.rst
│   └── result_selectfiles.pklz
└── smooth
    ├── _0x5b200af0570d488947d2c5801048ec3d.json
    ├── command.txt
    ├── _inputs.pklz
    ├── _node.pklz
    ├── _report
    │   └── report.rst
    ├── result_smooth.pklz
    └── sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz

6 directories, 23 files

As we can see, the working directory holds far more content than we really care about. To relocate and rename the files that are relevant to you, you can use DataSink.
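To make the contrast concrete, here is a minimal sketch that walks a directory and keeps only the files that look like results. The helper name `find_result_files` and the list of "interesting" file endings are assumptions made for this illustration; Nipype itself defines neither.

```python
import os

# File endings treated as "results" here. This choice is an assumption
# made for illustration, not something Nipype defines.
RESULT_SUFFIXES = ('.nii.gz', '.par')


def find_result_files(root, suffixes=RESULT_SUFFIXES):
    """Return the paths under `root` (relative to it) that end in one of `suffixes`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple and checks every entry
            if name.endswith(suffixes):
                full_path = os.path.join(dirpath, name)
                matches.append(os.path.relpath(full_path, root))
    return sorted(matches)
```

Calling `find_result_files('/output/working_dir/preprocWF')` would skip the `.pklz`, `.json`, `.rst` and `.txt` bookkeeping files. It only lists files, though; collecting and renaming them is the job DataSink takes over.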

DataSink

DataSink is Nipype's standard output module to restructure your output files. It allows you to relocate and rename files that you deem relevant.

Based on the preprocessing pipeline above, let's say we want to keep the smoothed functional images as well as the motion correction parameters. To do this, we first need to create the DataSink object.


In [ ]:
from nipype.interfaces.io import DataSink

# Create DataSink object
sinker = Node(DataSink(), name='sinker')

# Name of the output folder
sinker.inputs.base_directory = '/output/working_dir/preprocWF_output'

# Connect DataSink with the relevant nodes
wf.connect([(smooth, sinker, [('out_file', 'in_file')]),
            (mcflirt, sinker, [('mean_img', 'mean_img'),
                               ('par_file', 'par_file')]),
            ])
wf.run()


170904-05:39:41,710 workflow INFO:
	 Workflow preprocWF settings: ['check', 'execution', 'logging']
170904-05:39:41,715 workflow INFO:
	 Running serially.
170904-05:39:41,716 workflow INFO:
	 Executing node selectfiles in dir: /output/working_dir/preprocWF/selectfiles
170904-05:39:41,722 workflow INFO:
	 Executing node mcflirt in dir: /output/working_dir/preprocWF/mcflirt
170904-05:39:41,729 workflow INFO:
	 Running: mcflirt -in /data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz -meanvol -out /output/working_dir/preprocWF/mcflirt/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz -plots
170904-05:40:59,869 workflow INFO:
	 Executing node smooth in dir: /output/working_dir/preprocWF/smooth
170904-05:40:59,881 workflow INFO:
	 Running: fslmaths /output/working_dir/preprocWF/mcflirt/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz -s 1.69864 /output/working_dir/preprocWF/smooth/sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz
170904-05:41:08,60 workflow INFO:
	 Executing node sinker in dir: /output/working_dir/preprocWF/sinker
Out[ ]:
<networkx.classes.digraph.DiGraph at 0x7f43a65b5400>

Let's take a look at the output folder:


In [ ]:
! tree /output/working_dir/preprocWF_output


/output/working_dir/preprocWF_output
├── in_file
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz
├── mean_img
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz
└── par_file
    └── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par

3 directories, 3 files

This looks nice. It is what we asked for: each file ends up in a folder named after the DataSink input field it was connected to. But having a separate output folder for each individual output file might be suboptimal. So let's change the code above to save the output in one folder, which we will call 'preproc'.

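For this we can use the same code as above. We only have to change the target field names in the connection part. Because wf.connect adds to the connections that already exist, the three folders from the previous run will still be written as well: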


In [ ]:
wf.connect([(smooth, sinker, [('out_file', 'preproc.@in_file')]),
            (mcflirt, sinker, [('mean_img', 'preproc.@mean_img'),
                               ('par_file', 'preproc.@par_file')]),
            ])
wf.run()


170904-05:41:08,355 workflow INFO:
	 Workflow preprocWF settings: ['check', 'execution', 'logging']
170904-05:41:08,368 workflow INFO:
	 Running serially.
170904-05:41:08,370 workflow INFO:
	 Executing node selectfiles in dir: /output/working_dir/preprocWF/selectfiles
170904-05:41:08,383 workflow INFO:
	 Executing node mcflirt in dir: /output/working_dir/preprocWF/mcflirt
170904-05:41:08,398 workflow INFO:
	 Running: mcflirt -in /data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz -meanvol -out /output/working_dir/preprocWF/mcflirt/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz -plots
170904-05:42:26,35 workflow INFO:
	 Executing node smooth in dir: /output/working_dir/preprocWF/smooth
170904-05:42:26,40 workflow INFO:
	 Running: fslmaths /output/working_dir/preprocWF/mcflirt/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz -s 1.69864 /output/working_dir/preprocWF/smooth/sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz
170904-05:42:33,712 workflow INFO:
	 Executing node sinker in dir: /output/working_dir/preprocWF/sinker
Out[ ]:
<networkx.classes.digraph.DiGraph at 0x7f43e034cf60>

Let's take a look at the new output folder structure:


In [ ]:
! tree /output/working_dir/preprocWF_output


/output/working_dir/preprocWF_output
├── in_file
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz
├── mean_img
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz
├── par_file
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par
└── preproc
    ├── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz
    ├── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par
    └── sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz

4 directories, 6 files
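The two listings suggest a simple rule for how a DataSink input field name turns into a folder: the name is split at the dots, each part becomes a subfolder, and parts that start with '@' are skipped. The '@' parts are still useful, because they keep the three field names distinct while pointing them all at the same folder. The function below is a simplified sketch of that rule in plain Python. The name `sink_folder` is hypothetical and this is not DataSink's actual code.

```python
import posixpath


def sink_folder(base_directory, field):
    """Folder that a DataSink-style input field maps to (simplified sketch)."""
    folder = base_directory
    for part in field.split('.'):
        # Parts starting with '@' only make the field name unique;
        # they do not create a subfolder.
        if part.startswith('@'):
            continue
        folder = posixpath.join(folder, part)
    return folder


base = '/output/working_dir/preprocWF_output'
print(sink_folder(base, 'in_file'))
# /output/working_dir/preprocWF_output/in_file
print(sink_folder(base, 'preproc.@in_file'))
# /output/working_dir/preprocWF_output/preproc
print(sink_folder(base, 'preproc.@par_file'))
# /output/working_dir/preprocWF_output/preproc
```

Under the same rule, a field such as 'preproc.func.@in_file' would map to a nested 'preproc/func' folder.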

This is already much better. But what if you want to rename the output files to something more readable? For this, DataSink has the substitutions input field.

For example, let's assume we want to get rid of the strings 'task-fingerfootlips', 'ses-test' and 'bold_mcf', rename the mean file, and adapt the file ending of the motion parameter file:


In [ ]:
# Define substitution strings
substitutions = [('_task-fingerfootlips', ''),
                 ("_ses-test", ""),
                 ('_bold_mcf', ''),
                 ('.nii.gz_mean_reg', '_mean'),
                 ('.nii.gz.par', '.par')]

# Feed the substitution strings to the DataSink node
sinker.inputs.substitutions = substitutions

# Run the workflow again with the substitutions in place
wf.run()


170904-05:42:34,597 workflow INFO:
	 Workflow preprocWF settings: ['check', 'execution', 'logging']
170904-05:42:35,119 workflow INFO:
	 Running serially.
170904-05:42:35,122 workflow INFO:
	 Executing node selectfiles in dir: /output/working_dir/preprocWF/selectfiles
170904-05:42:35,133 workflow INFO:
	 Executing node mcflirt in dir: /output/working_dir/preprocWF/mcflirt
170904-05:42:35,135 workflow INFO:
	 Collecting precomputed outputs
170904-05:42:35,138 workflow INFO:
	 Executing node smooth in dir: /output/working_dir/preprocWF/smooth
170904-05:42:35,139 workflow INFO:
	 Collecting precomputed outputs
170904-05:42:35,144 workflow INFO:
	 Executing node sinker in dir: /output/working_dir/preprocWF/sinker
170904-05:42:35,151 interface INFO:
	 sub: /output/working_dir/preprocWF_output/in_file/sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz -> /output/working_dir/preprocWF_output/in_file/sub-01_smooth.nii.gz
170904-05:42:35,152 interface INFO:
	 sub: /output/working_dir/preprocWF_output/preproc/sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz -> /output/working_dir/preprocWF_output/preproc/sub-01_smooth.nii.gz
170904-05:42:35,153 interface INFO:
	 sub: /output/working_dir/preprocWF_output/mean_img/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz -> /output/working_dir/preprocWF_output/mean_img/sub-01_mean.nii.gz
170904-05:42:35,155 interface INFO:
	 sub: /output/working_dir/preprocWF_output/par_file/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par -> /output/working_dir/preprocWF_output/par_file/sub-01.par
170904-05:42:35,156 interface INFO:
	 sub: /output/working_dir/preprocWF_output/preproc/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz -> /output/working_dir/preprocWF_output/preproc/sub-01_mean.nii.gz
170904-05:42:35,158 interface INFO:
	 sub: /output/working_dir/preprocWF_output/preproc/sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par -> /output/working_dir/preprocWF_output/preproc/sub-01.par
Out[ ]:
<networkx.classes.digraph.DiGraph at 0x7f43a5d14860>
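The renames in the log above are consistent with plain string replacement, applied pair by pair in the order given. The following sketch reproduces them in pure Python, without Nipype. The helper name `apply_substitutions` is made up for this example and is not part of the DataSink API.

```python
def apply_substitutions(path, substitutions):
    """Apply each (old, new) pair in order, as plain string replacement."""
    for old, new in substitutions:
        path = path.replace(old, new)
    return path


substitutions = [('_task-fingerfootlips', ''),
                 ('_ses-test', ''),
                 ('_bold_mcf', ''),
                 ('.nii.gz_mean_reg', '_mean'),
                 ('.nii.gz.par', '.par')]

names = ['sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz',
         'sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz',
         'sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par']

for name in names:
    print(apply_substitutions(name, substitutions))
# sub-01_smooth.nii.gz
# sub-01_mean.nii.gz
# sub-01.par
```

Since the log shows whole paths being rewritten, not just file names, it seems prudent to pick patterns that cannot match a folder name by accident.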

Now, let's take a final look at the output folder:


In [ ]:
! tree /output/working_dir/preprocWF_output


/output/working_dir/preprocWF_output
├── in_file
│   ├── sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz
│   └── sub-01_smooth.nii.gz
├── mean_img
│   ├── sub-01_mean.nii.gz
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz
├── par_file
│   ├── sub-01.par
│   └── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par
└── preproc
    ├── sub-01_mean.nii.gz
    ├── sub-01.par
    ├── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz_mean_reg.nii.gz
    ├── sub-01_ses-test_task-fingerfootlips_bold_mcf.nii.gz.par
    ├── sub-01_ses-test_task-fingerfootlips_bold_mcf_smooth.nii.gz
    └── sub-01_smooth.nii.gz

4 directories, 12 files

Cool, much clearer!
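The files with the long original names are still sitting next to the renamed ones because every run added files to the output folder and nothing cleared what earlier runs had written. One way to start from a clean slate, assuming you are happy to delete everything in the folder, is to empty it before running the workflow again. The helper name `reset_output_dir` is hypothetical; it only uses Python's standard library.

```python
import os
import shutil


def reset_output_dir(path):
    """Delete `path` with everything in it, then recreate it as an empty folder."""
    if os.path.isdir(path):
        shutil.rmtree(path)  # destructive: removes all files below `path`
    os.makedirs(path)
```

For this tutorial the call would be `reset_output_dir('/output/working_dir/preprocWF_output')`. This removes the stale files only. The 'in_file', 'mean_img' and 'par_file' folders would come back on the next run, because the first set of connections is still part of the workflow; dropping them means building the workflow with the 'preproc' connections only.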