BLAST

The Basic Local Alignment Search Tool (BLAST) is a powerful tool for comparing sequences and identifying those that share similarity. This can be useful for several reasons:

  • Identifying an unknown sequence by finding annotated (or known) sequences which are similar
  • Finding similar sequences in other species (e.g. orthologs)
  • Predicting function by identifying similar regions in other sequences which already have a known function

In this tutorial, we are going to use a version of BLAST called BLAST+ which can be downloaded from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/.
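
Once BLAST+ has been downloaded and unpacked, it can help to confirm that its programs are visible to your shell before going further. The snippet below is a minimal sketch of one way to do that; `check_blast` is a helper name made up for this example, and it only reports whether a program is on your PATH, not whether the installation is complete.

```shell
# Hypothetical helper: report whether a given program is available on the PATH.
check_blast() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found: $(command -v "$1")"
    else
        echo "$1 not found on PATH"
    fi
}

# Two programs that ship with BLAST+.
check_blast blastn
check_blast makeblastdb
```

If a program is reported as not found, the directory containing the BLAST+ executables probably still needs to be added to your PATH.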

BLAST+ is split into several applications. Which one you use depends on the type of query sequence you provide and on the type of sequences in the database being searched. There are three things you will need each time you want to run a BLAST search:

  • A query sequence (can be nucleotide or protein)
  • A sequence database (can be nucleotide or protein)
  • A BLAST application (this will depend on your query sequence and database - more on this later!)
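
As a rough sketch of how these three ingredients fit together, the snippet below picks a BLAST+ application from the query and database types and prints the command it would run. The helper `choose_blast_app` and the names `query.fasta`, `my_database` and `results.txt` are placeholders invented for this illustration. The mapping covers only the most common pairing for each combination of types.

```shell
# Hypothetical helper: pick a BLAST+ application from the sequence types.
# Arguments: query type, then database type; each is "nucl" or "prot".
choose_blast_app() {
    case "$1:$2" in
        nucl:nucl) echo "blastn" ;;   # nucleotide query, nucleotide database
        prot:prot) echo "blastp" ;;   # protein query, protein database
        nucl:prot) echo "blastx" ;;   # translated nucleotide query, protein database
        prot:nucl) echo "tblastn" ;;  # protein query, translated nucleotide database
        *) echo "unknown"; return 1 ;;
    esac
}

query="query.fasta"      # placeholder query file
database="my_database"   # placeholder database name
app=$(choose_blast_app nucl nucl)

# Print the command rather than running it, since the files are placeholders.
echo "$app -query $query -db $database -out results.txt"
```

The command is only printed here because the query file and database do not exist yet. With a real query and database, the printed line could be run as it stands.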

Why do you need this tutorial? Running BLAST+ is like running a lab experiment: to get meaningful results, you must first optimize the conditions you are using. After this tutorial you will be able not only to run BLAST, but also to tailor your search to your specific biological question.

Learning outcomes

By the end of this tutorial you will be able to:

  • Create a BLAST database from your own sequences
  • Understand the difference between BLAST programs and when to use them
  • Run BLAST locally
  • Use the parameters to generate tailored BLAST output files

Tutorial sections

This tutorial is split into two sections:

Answers to the exercises from both sections can be found here

Running commands from this tutorial

You may be viewing this tutorial interactively via Jupyter or as a handout (e.g. PDF). If you are using Jupyter, command cells (like the ones below) can be run by selecting the cell and clicking Cell -> Run from the menu above, or by pressing Ctrl+Enter. Don't worry if you are using the PDF: you can still run all the commands by copying or typing them into your terminal.

Let's give this a try by printing our working directory using the pwd command and listing the files within it. Run the commands in the two cells below (Jupyter) or on your terminal (PDF).


In [ ]:
pwd

In [ ]:
ls -l

Let's get started!

For the first part of this tutorial, we are going to look at how to create a BLAST database from a file containing your own sequences. Answers to all of the questions can be found here. Click here to continue.