This part of the tutorial shows how to run ARIBA on a large number of samples. To save time, this notebook does not actually run any of the commands, because each run of ARIBA takes a few minutes.
First we need a reference database. You will already have one if you followed the instructions in the previous part of this tutorial (How to use custom reference data with ARIBA). Alternatively, you can use one of the public datasets that ARIBA supports. For example, to use CARD, run these two commands to make an ARIBA database directory called ariba_db:
ariba getref card out.card
ariba prepareref -f out.card.fa -m out.card.tsv ariba_db
ARIBA needs the database directory, which we will call Ngo_ARIBAdb to be consistent with the previous section of the tutorial, and two sequencing reads files, reads.1.fastq.gz and reads.2.fastq.gz. The command to run ARIBA is:
ariba run Ngo_ARIBAdb reads.1.fastq.gz reads.2.fastq.gz outdir
The above command will make a new directory called outdir that contains the results.
The N. gonorrhoeae dataset consists of 1517 samples, and we need to run ARIBA on each sample, which can be done with a "for" loop. We assume that the reads files are named like this:
ERR1067813.1.fq.gz ERR1067813.2.fq.gz
ERR1067814.1.fq.gz ERR1067814.2.fq.gz
ERR1067815.1.fq.gz ERR1067815.2.fq.gz
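Before launching many runs, it can be worth checking that every sample has both reads files. The following is a minimal sketch, assuming the naming scheme shown above; the function name list_paired_samples is made up for this illustration and is not part of ARIBA.

```shell
# List sample names that have both reads files in a directory.
# Assumes files are named <sample>.1.fq.gz and <sample>.2.fq.gz.
# list_paired_samples is a hypothetical helper, not an ARIBA command.
list_paired_samples() {
    dir=${1:-.}
    for reads1 in "$dir"/*.1.fq.gz
    do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$reads1" ] || continue
        sample=$(basename "$reads1" .1.fq.gz)
        if [ -e "$dir/$sample.2.fq.gz" ]
        then
            echo "$sample"
        else
            # Report samples with a missing mate file on stderr.
            echo "Warning: no mate file for $sample" >&2
        fi
    done
}
```

Calling list_paired_samples with no argument checks the current directory and prints one sample name per line.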
Then we can run ARIBA on all samples like this (you may need to edit this command depending on how your own files are named):
for sample in $(ls *.1.fq.gz | sed 's/\.1\.fq\.gz$//')
do
    ariba run Ngo_ARIBAdb "$sample.1.fq.gz" "$sample.2.fq.gz" "$sample.ariba"
done
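With 1517 samples, a loop like this may be interrupted partway through. One possible way to resume is to print commands only for samples that do not yet have a report, and then run those. This is a sketch under two assumptions: a finished run leaves a report.tsv file in its output directory, and sample names contain no spaces. The function name print_pending_ariba_commands is hypothetical.

```shell
# Print one "ariba run" command per sample that has no report yet.
# Nothing is executed here; pipe the output to sh to run the commands.
# Assumes sample names contain no spaces or shell metacharacters.
# print_pending_ariba_commands is a hypothetical helper, not part of ARIBA.
print_pending_ariba_commands() {
    db=${1:-Ngo_ARIBAdb}
    for reads1 in *.1.fq.gz
    do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$reads1" ] || continue
        sample=${reads1%.1.fq.gz}
        # Treat a sample as done if its report already exists (assumption).
        [ -e "$sample.ariba/report.tsv" ] && continue
        echo "ariba run $db $sample.1.fq.gz $sample.2.fq.gz $sample.ariba"
    done
}
```

Running print_pending_ariba_commands on its own shows what would be run, and print_pending_ariba_commands | sh runs it. ARIBA makes the output directory itself, so a partial directory left by an interrupted run may need to be removed before that sample is rerun.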
For Sanger pathogens users only: use LSF to run all the jobs.
for sample in $(ls *.1.fq.gz | sed 's/\.1\.fq\.gz$//')
do
    bsub.py 1 "$sample.ariba" ariba run Ngo_ARIBAdb \
        "$sample.1.fq.gz" "$sample.2.fq.gz" "$sample.ariba"
done
The output directory of each sample is called $sample.ariba. For example, ERR1067813.ariba is the output directory for sample ERR1067813.
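Once the runs have finished, a quick check of the output directories can show whether any sample failed. The sketch below assumes that output directories are named <sample>.ariba and that a successful run leaves a report.tsv file in its output directory; check_ariba_outputs is a hypothetical name used only here.

```shell
# Summarise which ARIBA output directories contain a report.
# Assumes directories are named <sample>.ariba and that a finished
# run leaves report.tsv in its output directory.
# check_ariba_outputs is a hypothetical helper, not part of ARIBA.
check_ariba_outputs() {
    finished=0
    unfinished=0
    for outdir in *.ariba
    do
        # Skip the unexpanded pattern and anything that is not a directory.
        [ -d "$outdir" ] || continue
        if [ -e "$outdir/report.tsv" ]
        then
            finished=$((finished + 1))
        else
            unfinished=$((unfinished + 1))
            echo "No report: ${outdir%.ariba}"
        fi
    done
    echo "finished=$finished unfinished=$unfinished"
}
```

Run check_ariba_outputs from the directory that holds the output directories. Any sample listed as having no report is a candidate for rerunning.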
The output files are described here.
Now go to the next part of the tutorial where we use Phandango to view the results.