Serotype detection using SeroBA

Introduction

SeroBA is a software tool that predicts the serotype of a sample from Illumina sequencing reads. This tutorial walks you through using SeroBA to serotype Streptococcus pneumoniae samples.

For more in-depth information about SeroBA, please refer to the paper:

SeroBA: rapid high-throughput serotyping of Streptococcus pneumoniae from whole genome sequence data. Epping L, van Tonder AJ, Gladstone RA, GPS Consortium, Bentley SD, Page AJ, Keane JA. bioRxiv preprint, 2017 Sep. doi: 10.1101/179465

Learning outcomes

By the end of this tutorial you can expect to be able to:

  • Understand serotyping, why it is important and what it can be used for
  • Run SeroBA on several samples to predict their serotype
  • Summarise the SeroBA results for several samples
  • Interpret the detailed output of SeroBA
  • Download and prepare the S. pneumoniae databases from PneumoCAT for use with SeroBA

Tutorial sections

This tutorial comprises the following sections:

  1. What is serotyping?
  2. Preparation of databases before running SeroBA
  3. Running SeroBA
  4. Interpreting the results of SeroBA

Authors

This tutorial was created by Sara Sjunnebo.

Running the commands from this tutorial

You can run the commands in this tutorial either directly from the Jupyter notebook (if using Jupyter), or by typing the commands in your terminal window.

Running commands on Jupyter

If you are using Jupyter, command cells (like the ones below) can be run by selecting the cell and clicking Cell -> Run from the menu above, or by pressing Ctrl+Enter. Let's give this a try by printing our working directory using the pwd command and listing the files within it. Run the commands in the two cells below.


In [ ]:
pwd

In [ ]:
ls -l

Running commands in the terminal

You can also follow this tutorial by typing all the commands you see into a terminal window. This is similar to the "Command Prompt" window on MS Windows systems, which allows the user to type DOS commands to manage files.

To get started, select the cell below with the mouse and then either press Ctrl+Enter or choose Cell -> Run in the menu at the top of the page.


In [ ]:
echo cd $PWD

Now open a new terminal on your computer, type the command printed by the previous cell, and press Enter. The command will look similar to this:

cd /home/manager/pathogen-informatics-training/Notebooks/SEROBA/

Now you can follow the instructions in the tutorial from here.

Let’s get started!

This tutorial assumes that you have SeroBA installed on your computer. For download and installation instructions, please see the SeroBA GitHub page.

To check that you have installed the software correctly, you can run the following command:


In [ ]:
seroba --help

This should return the following help message:

usage: seroba <command> <options>

optional arguments:
  -h, --help     show this help message and exit

Available commands:

    getPneumocat
                 downloads genetic information from PneumoCat
    createDBs    creates Databases for kmc and ariba
    runSerotyping
                 indetify serotype of your input data
    summary      output folder has to contain all folders with prediction
                 results
    version      Get versions and exit
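
If the help message does not appear, it can be useful to check whether the seroba executable is on your PATH at all. The following is a minimal sketch, assuming a POSIX-compatible shell such as bash; check_tool is a hypothetical helper written for this example and is not part of SeroBA. The version command is the one listed in the help message above.

```shell
# Hypothetical helper (not part of SeroBA): report whether a command is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Only ask SeroBA for its version if the executable was found.
if check_tool seroba; then
    seroba version
fi
```

If check_tool reports that seroba was not found, revisit the installation instructions before continuing with the tutorial.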

To get started with the tutorial, head to the first section: What is serotyping? The answers to all questions in the tutorial can be found here.