In this tutorial, you will investigate IL-22RA1 signalling at the intestinal epithelium. The dataset you will use forms part of the following publication:
Epithelial IL-22RA1-Mediated Fucosylation Promotes Intestinal Colonization Resistance to an Opportunistic Pathogen
Pham TA, Clare S, Goulding D, Arasteh JM et al.
Cell Host Microbe. 2014 Oct 8;16(4):504-16. doi: 10.1016/j.chom.2014.08.017.
PMID: 25263220
For Sanger pathogen users, you can find this dataset under study id 2319 using the pf commands.
Cytokines are small, secreted proteins which affect the behaviour of other cells. Due to their crucial role in cell signalling, they are often targets of RNA-Seq studies. In this study, the authors were interested in interleukin 22 (IL-22), an important mediator of host mucosal defence, and its receptor, interleukin 22 receptor subunit alpha 1 (IL-22RA1). IL-22 targets receptors on the surface of the cells that line the intestines, also known as the intestinal epithelium. It can stimulate these cells to multiply, produce antimicrobial peptides and shed, providing a local defence against colonisation by bacterial and fungal pathogens.
However, the relationship between IL-22 and host defence is complex. While it may be involved in preventing colonisation, in some situations it has been shown to promote colonisation and in others to play no obvious role in susceptibility. So, in this study, the authors generated organoids (small, 3D tissue cultures which mimic the larger organ they represent) from wild type (WT) mice and from IL-22RA1 knockout (KO) mice (i.e. mice which don't express IL-22RA1). To investigate IL-22RA1 signalling in the intestinal epithelium, they then compared gene expression in WT and KO organoids stimulated with IL22.
In this tutorial, you will be analysing 32 RNA samples, each of which has been sequenced on an Illumina HiSeq sequencing machine. There are four conditions: wild type cells with no treatment (WT_Ctrl), wild type cells stimulated with IL22 (WT_IL22), IL-22RA1 knockout cells with no treatment (KO_Ctrl) and IL-22RA1 knockout cells stimulated with IL22 (KO_IL22). There are four biological replicates and two technical replicates for each condition.
Condition | Cell type | Treatment | Biological Replicates | Technical Replicates |
---|---|---|---|---|
WT_Ctrl | Wild type | Control | 4 | 2 |
WT_IL22 | Wild type | IL22 | 4 | 2 |
KO_Ctrl | IL-22RA1 knockout | Control | 4 | 2 |
KO_IL22 | IL-22RA1 knockout | IL22 | 4 | 2 |
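Note: this is a two-factor study (cell type and treatment), but we must reduce it to a single-factor study to run DEAGO. The four condition labels in the table above each combine one cell type with one treatment and serve as that single factor.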
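The reduction from two factors to one can be sketched in the shell. This is only an illustration of how the four condition labels arise from the two factors, not a DEAGO command; `combine_factors` is a hypothetical helper name introduced here.

```shell
# Hypothetical helper (not part of DEAGO): join one level of each factor
# into a single condition label, e.g. WT + IL22 -> WT_IL22.
combine_factors() {
    printf '%s_%s\n' "$1" "$2"
}

# Enumerate every cell type / treatment combination.
for cell_type in WT KO; do
    for treatment in Ctrl IL22; do
        combine_factors "$cell_type" "$treatment"
    done
done
```

Each printed label corresponds to one row of the table above.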
Move into the directory containing the tutorial data files.
In [ ]:
cd data
List the files and folders in the directory.
In [ ]:
ls -l
You should see a directory called counts. This contains the gene count files (the number of reads assigned to each gene, a proxy for gene abundance), one file per sample. Let's count them; with 32 samples, there should be 32 files.
In [ ]:
ls counts | wc -l
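If you would rather have the shell compare the count against the experimental design, here is a minimal sketch. It assumes you are in the data directory and that counts holds nothing but the per-sample count files; `count_entries` is a hypothetical helper name.

```shell
# Expected number of samples: 4 conditions x 4 biological x 2 technical replicates.
expected=$((4 * 4 * 2))

# Hypothetical helper: number of entries in a directory.
count_entries() {
    ls "$1" | wc -l | tr -d ' '
}

# Only run the comparison if the counts directory is present.
if [ -d counts ]; then
    found=$(count_entries counts)
    if [ "$found" -eq "$expected" ]; then
        echo "OK: found $found count files"
    else
        echo "Expected $expected count files but found $found"
    fi
fi
```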
You will also see a file called targets.txt which tells us the relationship between sample and experimental condition.
In [ ]:
head targets.txt
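To see how the samples are spread across conditions, the targets file can be tallied by condition. The sketch below assumes that targets.txt is tab-delimited and has a header row with a column named `condition`; check this against the `head` output before relying on it. `count_per_condition` is a hypothetical helper name.

```shell
# Hypothetical helper: tally the rows of a tab-delimited file by the value
# in its "condition" column (the column name is an assumption).
count_per_condition() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "condition") col = i; next }
        col     { n[$col]++ }
        END     { for (c in n) print c "\t" n[c] }
    ' "$1" | sort
}

# Only run if the targets file is present in the current directory.
if [ -f targets.txt ]; then
    count_per_condition targets.txt
fi
```

If every sample has its own row, each of the four conditions should appear eight times (four biological replicates, each with two technical replicates). If no `condition` column is found, the function prints nothing.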
There are two files with similar names, ensembl_mm10.tsv and ensembl_mm10_deago_formatted.tsv, which are the gene annotations. The first contains the annotations as downloaded from Ensembl BioMart, one line per annotation.
In [ ]:
head ensembl_mm10.tsv
The second contains those annotations converted for use with DEAGO, one line per gene.
In [ ]:
head ensembl_mm10_deago_formatted.tsv
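One rough way to see the difference between "one line per annotation" and "one line per gene" is to compare the number of distinct identifiers with the number of lines. The sketch below assumes both files are tab-delimited with the gene identifier in the first column, and it counts any header line as well; check the `head` output above first. `unique_first_column` is a hypothetical helper name.

```shell
# Hypothetical helper: number of distinct values in the first
# tab-delimited column of a file (assumed here to be the gene identifier).
unique_first_column() {
    cut -f1 "$1" | sort -u | wc -l | tr -d ' '
}

# Only run if the annotation files are present in the current directory.
for f in ensembl_mm10.tsv ensembl_mm10_deago_formatted.tsv; do
    if [ -f "$f" ]; then
        echo "$f: $(wc -l < "$f" | tr -d ' ') lines, $(unique_first_column "$f") unique first-column values"
    fi
done
```

Under those assumptions, the two numbers should roughly match for the DEAGO-formatted file, while the BioMart download may have more lines than unique identifiers.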
These are the input files for DEAGO. We'll take a closer look in the next section of this tutorial.
For a quick recap of what the tutorial covers and the software you will need, head back to the Introduction.
Otherwise, let's take a closer look at how to prepare input data.