Integrative Genomics Viewer (IGV) allows you to visualise genomic datasets. This quick start guide gives you a brief overview of IGV: how to load data, navigate the genome and visualise your data. The IGV user guide is really useful and covers many more features than we have the chance to go through in this quick start guide.
Integrative Genomics Viewer (IGV)
Broad Institute and the Regents of the University of California
Download: https://software.broadinstitute.org/software/igv/download
User guide: http://software.broadinstitute.org/software/igv/UserGuide
This tutorial was written by Victoria Offord.
This guide assumes that you have the following software or packages and their dependencies installed on your computer. The software or packages used in this guide may be updated from time to time, so we have also given the version that was used when writing the guide.
Package name | Link for download/installation instructions | Version |
---|---|---|
samtools | https://github.com/samtools/samtools | 1.6 |
IGV | https://software.broadinstitute.org/software/igv/ | 2.3.90 |
Before you begin, make sure you have an index for the reference genome, which IGV will use to traverse the genome. You can create one using samtools.
samtools faidx <your_genome_file.fa>
The resulting index file will have the extension .fai and must be in the same directory as the reference genome.
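As a minimal sketch, the indexing step can be wrapped so that the index is only built when it is missing. The helper names `fai_path` and `ensure_fai` and the file name are placeholders invented for this example; only `samtools faidx` itself comes from samtools.

```shell
# Hypothetical helpers (not part of samtools).

# Print the path of the index that samtools faidx writes: <fasta>.fai
fai_path() {
    printf '%s.fai\n' "$1"
}

# Build the index only if it is missing or empty, then print its path.
ensure_fai() {
    if [ ! -s "$(fai_path "$1")" ]; then
        samtools faidx "$1" || return 1
    fi
    fai_path "$1"
}

# Example (placeholder file name):
#   ensure_fai your_genome_file.fa
```

Because the index path is simply the FASTA path with `.fai` appended, the index ends up in the same directory as the reference genome.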
When you start IGV, it will open a main window. At the top of this window you have a toolbar and genome ruler for navigation. The largest area in the main window is the data viewer where your alignments, annotations and other data will be displayed. To do this, IGV uses horizontal rows called tracks. Finally, at the bottom, there is a sequence viewer which contains the base level information for your reference genome.
IGV provides several genomes which can be selected with the "Genome drop-down box" on the toolbar. However, your reference genome may not always be on this list. When your reference is not available, you will need to load it from a FASTA file.
To load a reference genome from file, go to "Genomes -> Load Genome from File…".
Select the FASTA file containing the reference genome and click "Open".
Once the genome has loaded, the chromosomes will be shown on the genome ruler with their names/numbers above. When a region is selected, a red box will appear. This represents the visible region of the genome.
Above the genome ruler is the toolbar which has a variety of controls for navigating the genome:
There are several other buttons which can be used to control the visible portion of the genome.
The sequence viewer shows the genome at the single nucleotide level. You won't be able to see the sequence until you are zoomed in. As you start to zoom in (+), you will see that each nucleotide is represented by a coloured bar (red=T, yellow=G, blue=C and green=A). This makes it easier to spot repetitive regions in the genome. Carry on zooming in (+) and you will see the individual nucleotides.
If you right-click on "Sequence" at the left-hand side of the sequence viewer and click "Show translation", you will also see the amino acid sequence for the forward three reading frames.
You can also see the reverse three reading frames by right-clicking on the track and selecting "Flip strand".
Note: the bottom right of the main window shows how much memory is available to IGV and how much of it is currently in use. Always keep a wary eye on this!
In addition to your genome, you will probably want to load an annotation file that contains information such as gene locations and gene structures (e.g. introns/exons/CDS).
To load a GFF file containing annotations, go to "File -> Load from File…".
Select the annotation file and click "Open".
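If the annotation track loads but appears empty, one thing worth checking is whether the sequence names in the first column of the GFF file match the names in your reference. The sketch below is one way to compare them against the `.fai` index. The function name is invented for this example, and it assumes a tab-delimited GFF with nine columns.

```shell
# Hypothetical helper: list sequence names used in a GFF file that are
# absent from the reference index (.fai). Prints nothing when all match.
gff_names_missing_from_fai() {
    gff="$1"
    fai="$2"
    # Column 1 of each feature line; stop at an embedded ##FASTA section.
    awk -F'\t' '/^##FASTA/ { exit } !/^#/ && NF >= 9 { print $1 }' "$gff" |
        sort -u |
        while IFS= read -r name; do
            # Column 1 of the .fai holds the reference sequence names.
            if ! cut -f1 "$fai" | grep -qxF -- "$name"; then
                printf '%s\n' "$name"
            fi
        done
}

# Example (placeholder file names):
#   gff_names_missing_from_fai annotation.gff your_genome_file.fa.fai
```

Any name printed by this check is used in the annotation file but not defined in the reference index.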
This will load the annotation track. At the genome level, you will see this shown as a density track for the associated annotation. On the left you will see the track label which is the name of the file you just loaded. You can change this label to something more recognisable by right clicking on the label and selecting "Rename Track".
As you zoom in (+), you will start to be able to see the individual genes (shown in blue).
Genes are represented in blue as boxes (exonic regions) and lines (intronic regions). The arrows indicate the strand, i.e. the direction in which the gene will be transcribed. The box height indicates whether the region is a coding sequence (taller) or an untranslated region (thinner).
For a clearer view of the gene structure, right click on the annotation track and click "Expanded".
Now you will see the annotated isoforms and can more clearly see the arrows that indicate which strand the gene is on. If you zoom in further, you will also see the amino acid sequence superimposed onto the exons.
IGV can be used to visualise many different types of data, including read alignments. Each time you load an alignment file it will be added to the data viewer as a new major track.
To load a read alignment file, go to "File -> Load from File…".
Select a sorted BAM file and click "Open".
Note: BAM files and their corresponding index files must be in the same directory for IGV to load them properly.
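A minimal sketch of preparing a BAM file for loading is shown below. It assumes the index is named either `<file>.bam.bai` (the name `samtools index` writes by default) or `<file>.bai`. The helper names and file names are placeholders invented for this example.

```shell
# Hypothetical helper: succeed if a non-empty index sits beside the BAM file.
# Assumes the index is named <file>.bam.bai or <file>.bai.
has_bam_index() {
    [ -s "$1.bai" ] || [ -s "${1%.bam}.bai" ]
}

# Sort by coordinate and index; skip the work if the output is already indexed.
prepare_bam() {
    in_bam="$1"
    out_bam="$2"
    if ! has_bam_index "$out_bam"; then
        samtools sort -o "$out_bam" "$in_bam" || return 1
        samtools index "$out_bam" || return 1
    fi
}

# Example (placeholder file names):
#   prepare_bam sample.bam sample.sorted.bam
```

Since `samtools index` writes the index next to the BAM file, the two files stay in the same directory unless you move one of them.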
For each read alignment file, a major track will appear containing two minor tracks for that sample: coverage statistics and read alignments. For the total number of visible tracks, see the bottom left of the main window.
At the genome level, there will be no coverage plot or read alignments visible. At the chromosome level, two messages are displayed: "Zoom in to see coverage" and "Zoom in to see alignments". Finally, once you have zoomed in (+), you will see a density plot in the coverage track and your read alignments.
You can open more than one alignment file. Each alignment file will be loaded into a new track with its coverage statistics and read alignments. However, make sure you keep an eye on the memory usage in the bottom right corner or IGV may crash!
When zoomed in to view a region, you can get alignment information for each position in the genome by hovering over the coverage track. This will open a yellow box which tells you the total number of reads mapped at that position, a breakdown of the mapped nucleotide frequencies and the number of reads mapping in a forward/reverse orientation. In our example, 95 reads mapped, 50 forward and 45 reverse, all of which called A at position 202,768 on chromosome PccAS_05_v3.
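The numbers in the pop-up can be roughly cross-checked on the command line. The sketch below builds a single-base region string and works out the forward-strand percentage for the example above (50 of 95 reads). The commented `samtools view -c` commands count reads overlapping the position on each strand. The helper names and BAM file name are placeholders, and the counts may not match IGV exactly because IGV applies its own display filters.

```shell
# Hypothetical helpers; the BAM file name below is a placeholder.

# Build a single-base region string, e.g. PccAS_05_v3:202768-202768
region_for() {
    printf '%s:%s-%s\n' "$1" "$2" "$2"
}

# Integer percentage of forward reads, rounded down.
forward_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Count mapped reads overlapping the position on each strand.
# Flag 4 marks unmapped reads; flag 16 marks reverse-strand alignments.
#   region=$(region_for PccAS_05_v3 202768)
#   samtools view -c -F 20 sample.sorted.bam "$region"        # forward
#   samtools view -c -F 4 -f 16 sample.sorted.bam "$region"   # reverse

forward_percent 50 95   # prints 52
```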
Reads are represented by grey or transparent/white bars which are stacked together where they align to the reference genome. Reads are pointed to indicate the orientation in which they mapped, i.e. on the forward or reverse strand. Hovering over an individual read will display information about its alignment.
Mismatches occur where the nucleotide in the aligned read is not the same as the nucleotide in that position on the reference genome. A mismatch is indicated by a coloured bar at the relevant position on the read. The colour of the bar represents the mismatched base in the read (red=T, yellow=G, blue=C and green=A).
For more information about how reads are coloured and what this means, see the IGV user guide.
In IGV you can navigate through different levels of visualisation, from the whole genome, all the way down to a base level resolution.
To return to the whole genome view:
In the chromosome view, the annotation track has changed from a density plot to showing individual genes, and the genome ruler now shows co-ordinates instead of chromosome names/numbers. More importantly, the alignment tracks now say that we need to zoom in further to see coverage or read alignments.
Note: the visible region of the chromosome is indicated by the red box on the genome ruler.
Alternatively, if you know the name of the gene you want to view and you have loaded an annotation file, you can enter the gene name into the "Search" box and click "Go". For example, to view PCHAS_0100100, you would enter PCHAS_0100100 in the search box.
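The same load-and-navigate steps can also be written down as an IGV batch script, which is handy when you return to the same region often. The sketch below writes such a script using the IGV batch commands `new`, `genome`, `load` and `goto`. The function name, file names and locus are placeholders, and it assumes your IGV installation can run batch scripts (for example with the launcher's `-b` option).

```shell
# Hypothetical helper: write an IGV batch script that loads a reference,
# an annotation file and a BAM file, then jumps to a locus.
write_igv_batch() {
    out="$1"; genome="$2"; gff="$3"; bam="$4"; locus="$5"
    {
        printf 'new\n'
        printf 'genome %s\n' "$genome"
        printf 'load %s\n' "$gff"
        printf 'load %s\n' "$bam"
        printf 'goto %s\n' "$locus"
    } > "$out"
}

# Example (placeholder file names and locus):
#   write_igv_batch view.batch your_genome_file.fa annotation.gff \
#       sample.sorted.bam PccAS_05_v3:202700-202850
#   igv.sh -b view.batch   # assumes igv.sh is on your PATH
```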
Note: the search box will try to help you by listing options to autocomplete the search box.
You can zoom in and out from each view by using the "+" and "-" buttons on the zoom control at the right-hand side of the toolbar. This will also work with the "+" and "-" keys on your keyboard.
There are several ways you can move around the view: