This notebook explains how to use sourcetracking in QIIME 2.

Note: For a more detailed description of the theory behind microbial sourcetracking see the notebook Sourcetracking using a Gibbs Sampler.

In order to perform microbial sourcetracking we will need two files (imported into QIIME 2):

  • A count table of the type FeatureTable[RelativeFrequency] (example: data/tiny-test/otu_table.qza)
  • A metadata file in QIIME format (example: data/tiny-test/map.txt)

For downstream visualization we will also need:

  • A taxonomy file of the type FeatureData[Taxonomy] (example: data/tiny-test/taxonomy.qza)

In the metadata we will need to identify two columns.

First, we need to identify the column classifying samples as sources or sinks and the values associated with each. This can be given as:

--p-source-sink-column SourceSink

The values for sources and sinks can each be given as a comma-separated list with no spaces:

--p-source-column-value Source1,Source2
--p-sink-column-value Sink1,Sink2

Note: In our case there are only two values, source and sink.

Second, we will need to specify the environment each source represents.

--p-source-category-column Env
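
To make the roles of the two columns concrete, here is a minimal Python sketch, assuming a tab-separated metadata file with the SourceSink and Env columns used in this notebook. The sample IDs and rows are made up for illustration (they are not the contents of data/tiny-test/map.txt), and summarize_design is a hypothetical helper, not part of QIIME 2 or sourcetracker2.

```python
import csv
import io

# Hypothetical metadata for illustration only; the real file is
# data/tiny-test/map.txt and its rows may differ.
EXAMPLE_METADATA = (
    "#SampleID\tSourceSink\tEnv\n"
    "a1\tsource\tsewage\n"
    "a2\tsource\tseawater\n"
    "a3\tsource\tdrainwater\n"
    "b1\tsink\tunknown\n"
)


def summarize_design(handle, source_sink_column, source_value, sink_value,
                     category_column):
    """Return (source environment -> sample IDs, list of sink sample IDs)."""
    reader = csv.DictReader(handle, delimiter="\t")
    id_column = reader.fieldnames[0]  # first column holds the sample IDs
    for column in (source_sink_column, category_column):
        if column not in reader.fieldnames:
            raise ValueError(f"metadata has no column named {column!r}")
    sources, sinks = {}, []
    for row in reader:
        role = row[source_sink_column]
        if role == source_value:
            # Group source samples by the environment they represent.
            sources.setdefault(row[category_column], []).append(row[id_column])
        elif role == sink_value:
            sinks.append(row[id_column])
    return sources, sinks


sources, sinks = summarize_design(
    io.StringIO(EXAMPLE_METADATA),
    source_sink_column="SourceSink",  # value given to --p-source-sink-column
    source_value="source",            # value given to --p-source-column-value
    sink_value="sink",                # value given to --p-sink-column-value
    category_column="Env",            # value given to --p-source-category-column
)
print(sources)
print(sinks)
```

Running a check like this on the real metadata file before sourcetracking is a quick way to confirm that every source sample has an environment label and that at least one sink is present.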

Note: Many other parameters of the command can help tune the sourcetracking. For an in-depth description of how to choose them, see the other notebooks.

One parameter worth discussing here is --p-loo / --p-no-loo. The --p-loo flag produces per-source feature assignments, while --p-no-loo produces per-sink feature assignments. Below we run both.


In [11]:
!qiime dev refresh-cache


QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment.

In [5]:
!qiime sourcetracker2 gibbs \
    --i-feature-table ../data/tiny-test/otu_table.qza \
    --m-sample-metadata-file ../data/tiny-test/map.txt \
    --p-source-sink-column SourceSink \
    --p-loo \
    --p-source-column-value source \
    --p-sink-column-value sink \
    --p-source-category-column Env \
    --output-dir qiime2-results/sourcetracker-loo


Saved FeatureTable[RelativeFrequency] to: qiime2-results/sourcetracker-loo/mixing_proportions.qza
Saved FeatureTable[RelativeFrequency] to: qiime2-results/sourcetracker-loo/mixing_proportion_stds.qza
Saved FeatureTable[RelativeFrequency] to: qiime2-results/sourcetracker-loo/per_sink_assignments.qza
Saved SampleData[SinkSourceMap] to: qiime2-results/sourcetracker-loo/per_sink_assignments_map.qza

Now we can generate a composite barplot of the results using the built-in barplot command.


In [6]:
!qiime sourcetracker2 barplot \
    --i-proportions qiime2-results/sourcetracker-loo/mixing_proportions.qza \
    --m-sample-metadata-file ../data/tiny-test/map.txt \
    --o-visualization qiime2-results/sourcetracker-loo/proportions-barplot.qzv


Saved Visualization to: qiime2-results/sourcetracker-loo/proportions-barplot.qzv

This command should produce an interactive barplot that can be viewed at https://view.qiime2.org/. It will look something like the following screenshot.

Now we can track the contribution of each feature (i.e. gene/taxon/ASV) to the sources. To visualize this we will need to specify each source individually.


In [54]:
!qiime sourcetracker2 assignment-barplot \
    --i-feature-assignments qiime2-results/sourcetracker-loo/per_sink_assignments.qza \
    --i-feature-metadata ../data/tiny-test/taxonomy.qza \
    --i-assignments-map qiime2-results/sourcetracker-loo/per_sink_assignments_map.qza \
    --p-per-value drainwater \
    --o-visualization qiime2-results/sourcetracker-loo/drainwater-sink-assignments-barplot.qzv

!qiime sourcetracker2 assignment-barplot \
    --i-feature-assignments qiime2-results/sourcetracker-loo/per_sink_assignments.qza \
    --i-feature-metadata ../data/tiny-test/taxonomy.qza \
    --i-assignments-map qiime2-results/sourcetracker-loo/per_sink_assignments_map.qza \
    --p-per-value seawater \
    --o-visualization qiime2-results/sourcetracker-loo/seawater-sink-assignments-barplot.qzv

!qiime sourcetracker2 assignment-barplot \
    --i-feature-assignments qiime2-results/sourcetracker-loo/per_sink_assignments.qza \
    --i-feature-metadata ../data/tiny-test/taxonomy.qza \
    --i-assignments-map qiime2-results/sourcetracker-loo/per_sink_assignments_map.qza \
    --p-per-value sewage \
    --o-visualization qiime2-results/sourcetracker-loo/sewage-sink-assignments-barplot.qzv


Saved Visualization to: qiime2-results/sourcetracker-loo/drainwater-sink-assignments-barplot.qzv
Saved Visualization to: qiime2-results/sourcetracker-loo/seawater-sink-assignments-barplot.qzv
Saved Visualization to: qiime2-results/sourcetracker-loo/sewage-sink-assignments-barplot.qzv
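
The three calls above differ only in the --p-per-value argument and the output file name, so they can be generated in a loop. The following is a minimal Python sketch under that assumption. assignment_barplot_command is a hypothetical helper (not part of sourcetracker2); it reuses only the flags and paths shown above, and it prints the commands rather than running them.

```python
import shlex


def assignment_barplot_command(results_dir, taxonomy, per_value):
    """Build one `qiime sourcetracker2 assignment-barplot` call as an argument list."""
    return [
        "qiime", "sourcetracker2", "assignment-barplot",
        "--i-feature-assignments", f"{results_dir}/per_sink_assignments.qza",
        "--i-feature-metadata", taxonomy,
        "--i-assignments-map", f"{results_dir}/per_sink_assignments_map.qza",
        "--p-per-value", per_value,
        "--o-visualization",
        f"{results_dir}/{per_value}-sink-assignments-barplot.qzv",
    ]


# One command per source environment used in this notebook.
commands = [
    assignment_barplot_command(
        "qiime2-results/sourcetracker-loo",
        "../data/tiny-test/taxonomy.qza",
        environment,
    )
    for environment in ("drainwater", "seawater", "sewage")
]

for command in commands:
    # Print only; to execute, pass `command` to subprocess.run(command, check=True).
    print(shlex.join(command))
```

Building each call as an argument list avoids shell quoting problems if a value ever contains spaces or special characters.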

As an example, we can take a look at the sewage plot (i.e. sewage-sink-assignments-barplot.qzv).

To look at the feature contributions to each sink instead, we can repeat the same protocol with --p-no-loo.


In [55]:
!qiime sourcetracker2 gibbs \
    --i-feature-table ../data/tiny-test/otu_table.qza \
    --m-sample-metadata-file ../data/tiny-test/map.txt \
    --p-source-sink-column SourceSink \
    --p-no-loo \
    --p-source-column-value source \
    --p-sink-column-value sink \
    --p-source-category-column Env \
    --output-dir qiime2-results/sourcetracker-noloo


Saved FeatureTable[RelativeFrequency] to: qiime2-results/sourcetracker-noloo/mixing_proportions.qza
Saved FeatureTable[RelativeFrequency] to: qiime2-results/sourcetracker-noloo/mixing_proportion_stds.qza
Saved FeatureTable[RelativeFrequency] to: qiime2-results/sourcetracker-noloo/per_sink_assignments.qza
Saved SampleData[SinkSourceMap] to: qiime2-results/sourcetracker-noloo/per_sink_assignments_map.qza

We can visualize the contributions the same as before but with a new x-axis.


In [56]:
!qiime sourcetracker2 barplot \
    --i-proportions qiime2-results/sourcetracker-noloo/mixing_proportions.qza \
    --m-sample-metadata-file ../data/tiny-test/map.txt \
    --o-visualization qiime2-results/sourcetracker-noloo/proportions-barplot.qzv


Saved Visualization to: qiime2-results/sourcetracker-noloo/proportions-barplot.qzv

Finally, we can visualize the feature contributions to each sink. We can see what options we have by tabulating the metadata.


In [60]:
!qiime metadata tabulate \
    --m-input-file qiime2-results/sourcetracker-noloo/per_sink_assignments_map.qza \
    --o-visualization qiime2-results/sourcetracker-noloo/per_sink_assignments_map.qzv


Saved Visualization to: qiime2-results/sourcetracker-noloo/per_sink_assignments_map.qzv

The output of tabulate will look something like the following.

From this table we can choose any of the values in the first column. As an example, we will take a look at sink s0.

Note that the visualization command must be run once for each sink.


In [58]:
!qiime sourcetracker2 assignment-barplot \
    --i-feature-assignments qiime2-results/sourcetracker-noloo/per_sink_assignments.qza \
    --i-feature-metadata ../data/tiny-test/taxonomy.qza \
    --i-assignments-map qiime2-results/sourcetracker-noloo/per_sink_assignments_map.qza \
    --p-per-value s0 \
    --o-visualization qiime2-results/sourcetracker-noloo/s0-sink-assignments-barplot.qzv


Saved Visualization to: qiime2-results/sourcetracker-noloo/s0-sink-assignments-barplot.qzv