This workshop is designed to introduce you to the two core tools that you will be using for the other workshops.
Before you begin this workshop you should know some very basic Unix commands. These are covered in chapters 1-10 of the interactive guide. Unless you are already familiar with Unix, it is essential that you read those chapters before you start (about 30 minutes).
At the end of this workshop you should;
The document you are reading is a Jupyter notebook.
It consists of a series of cells that contain either text or computer code.
Jupyter notebooks are very useful for bioinformatics because they allow text to be mixed together with code for manipulating data, running programs and creating plots.
The cell you are reading is a text cell. Click on it to make it the currently active/selected cell. The active cell will have a thin coloured border around it with a thicker border on the left. If the border is blue the cell is not editable.
Double click on this cell to make it editable.
You should see that its border turns green. You should also see that its content changes to plain text in Markdown format. Markdown is a way of writing documentation that is very simple but still allows some basic styling (headers, links, images, code, bold, italics, equations, quotes).
The text you type into code cells should consist of valid commands that can be interpreted by the notebook's kernel. A notebook's kernel is the engine it uses to evaluate code cells. This notebook is running the Bash kernel. This means that when you run code cells they will be interpreted as if you typed the same text at the unix command prompt.
Jupyter notebooks support many types of kernels, including Python, R and Bash, which are particularly useful for bioinformatics.
Note: You can tell which kernel a notebook is running by looking at the kernel indicator in the top right corner.
The notebook will not actually run your cells until you tell it to. You can do this by first selecting the cell and then using the menu to select Cell -> Run Cells.
The cell immediately below this one is a code cell.
The ls command in this cell should be familiar to you. Try running it.
Try double-clicking on a text cell to set it into edit mode. Then run the text cell. When text cells are run they aren't evaluated by the kernel but are rendered for display in your web browser.
In [1]:
ls
Run the Setup Code
In order for this notebook to work properly you need to run the cell below before doing anything else. This will load custom functions and settings required to make the self assessment exercises work.
If you restart your kernel you will also need to rerun the setup code.
Don't use the cd command
The answers to all self assessment exercises assume that you don't change your directory from the default. You shouldn't ever need to use the cd command to answer an exercise.
In [2]:
# Essential Setup Code : Must be run first.
wget "https://www.dropbox.com/s/zqgacjshllprdcc/setup.sh?dl=0" -O setup.sh
source ./setup.sh
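The setup cell uses source rather than running the script as a separate program. The sketch below illustrates why that matters, using a made-up script (demo_setup.sh and greet are hypothetical names and are not part of the real setup.sh): a sourced file runs inside the current shell, so the functions it defines remain available afterwards.

```shell
# Write a tiny script into a scratch directory.
# demo_setup.sh and greet are invented for this illustration only.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/demo_setup.sh" <<'EOF'
greet() {
    echo "hello from a sourced function"
}
EOF

# Sourcing runs the file in the current shell, so greet stays defined.
source "$demo_dir/demo_setup.sh"
greet

# The function lives in the shell session, not in the file,
# so it still works after the file has been deleted.
rm -r "$demo_dir"
greet
```

Functions loaded this way disappear when the shell exits, which is consistent with the note above about rerunning the setup code after a kernel restart.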
Your task: Write a command to list the contents of the current directory
This is deliberately easy (the answer is ls) so that you can focus on understanding the self-assessment mechanism.
Follow these steps for every exercise:
In [3]:
e1_answer(){
### BEGIN SOLUTION
ls
### END SOLUTION
}
In [4]:
test_e1
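If you are curious how an answer function and a test command can fit together, here is a minimal sketch of the general pattern. It is not the real test_e1 (that is defined by setup.sh, whose contents are not shown here); demo_answer and check_demo are hypothetical names chosen for this illustration.

```shell
# A stand-in "answer": a function that wraps a single command.
# (Hypothetical; plays the same role as e1_answer above.)
demo_answer() {
    printf 'hello\n'
}

# A stand-in "test": run the answer, capture its output and
# compare it with the output that was expected.
# (Hypothetical; the real tests may work differently.)
check_demo() {
    local expected='hello'
    local actual
    actual="$(demo_answer)"
    if [ "$actual" = "$expected" ]; then
        echo "PASS"
    else
        echo "FAIL"
    fi
}

check_demo
```

Wrapping the answer in a function means the command is not run when the cell is defined; it only runs when something calls the function.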
In [5]:
# This code cell is for you to experiment with the ls command (see exercise below)
ls -aFl
The ls command
Use the code cell above and try various optional arguments to the ls command, e.g.
ls -F
ls -1
ls -a
ls -R
ls -S
Now try printing the help text for the ls command:
ls --help
Search through the help and look for each of the options in the commands above. Use the description for each option to understand the output you see when you run each command.
Note: Another way to bring up the help is the man command, but unfortunately this doesn't work well in a Jupyter notebook.
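To see what some of these options do on a directory whose contents you already know, you could try something like the sketch below. It builds a small scratch directory (the file names are invented for this example) and passes its path to ls, so there is no need to use cd.

```shell
# Build a small scratch directory so the workshop files are untouched.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/subdir"
touch "$demo_dir/notes.txt" "$demo_dir/.hidden" "$demo_dir/subdir/inner.txt"

# -1 : one entry per line
one_per_line="$(ls -1 "$demo_dir")"
echo "$one_per_line"

# -a : also show entries whose names start with a dot
with_hidden="$(ls -1 -a "$demo_dir")"
echo "$with_hidden"

# -F : append an indicator character, such as / for directories
with_markers="$(ls -1 -F "$demo_dir")"
echo "$with_markers"

# -R : list subdirectories recursively
ls -R "$demo_dir"

# Tidy up the scratch directory.
rm -r "$demo_dir"
```

Compare what each command prints with the matching description in the help text.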
In [6]:
e2_answer(){
### BEGIN SOLUTION
ls -r
### END SOLUTION
}
In [7]:
test_e2
In [8]:
e3_answer(){
### BEGIN SOLUTION
ls E2
### END SOLUTION
}
In [9]:
test_e3
In [10]:
e4_answer(){
### BEGIN SOLUTION
ls -1 -Sr E2
### END SOLUTION
}
In [11]:
test_e4
Your task: Write a command to list the contents of the E2 directory, one item per line, so that the letters in the file names spell HELLO. Your output should look like the text below:
E2/5_H.txt
E2/2_E.txt
E2/3_L.txt
E2/4_L.txt
E2/1_O.txt
Hint 1: You will need to use the wildcard character *. See chapter 13 of the guide for examples.
Hint 2: Look at the sizes of files using ls -l
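The two hints can be combined, as in the sketch below, which uses invented files in a scratch directory rather than the E2 directory. It assumes your ls supports the -S (sort by size) and -r (reverse order) options, as the GNU version does.

```shell
# Create three files of different sizes, plus one that the
# wildcard pattern should not match. All names are invented.
demo_dir="$(mktemp -d)"
printf 'a'          > "$demo_dir/small.txt"
printf 'aaaaa'      > "$demo_dir/medium.txt"
printf 'aaaaaaaaaa' > "$demo_dir/large.txt"
touch "$demo_dir/ignored.log"

# The shell expands *.txt to the matching paths before ls runs,
# so ls prints each path including its directory.
# -S sorts by size (largest first) and -r reverses that order.
smallest_first="$(ls -1 -Sr "$demo_dir"/*.txt)"
echo "$smallest_first"

# Tidy up the scratch directory.
rm -r "$demo_dir"
```

Because the wildcard is expanded by the shell, each line of output includes the directory name, just as in the expected output above.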
In [12]:
e5_answer(){
### BEGIN SOLUTION
ls -1 -Sr E2/*.txt
### END SOLUTION
}
In [13]:
test_e5
In [14]:
ls --help
Playing with the command line is the best way to learn.
Try the fortune command.
fortune
Run it a few times.
Try the cowsay command like this:
cowsay "keyboard good, mouse bad"
cowsay -f sheep "keyboard good, mouse bad"
fortune | cowsay
This introduces a new concept, the pipe operator, |. A pipe allows the output of one command to be used as input for another. We will cover pipes in more detail in workshop 2.
The other characters that cowsay can draw are stored in /usr/share/cowsay/cows.
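If fortune and cowsay are not installed on your system, the pipe operator can be tried with standard tools instead. The sketch below is only an illustration; the words being sorted are arbitrary.

```shell
# printf writes three lines; sort reads them from the pipe
# and prints them in alphabetical order.
sorted="$(printf 'pear\napple\nfig\n' | sort)"
echo "$sorted"

# Pipes can be chained: here wc -l counts the lines that sort produced,
# and tr strips any padding spaces that some versions of wc add.
line_count="$(printf 'pear\napple\nfig\n' | sort | wc -l | tr -d ' ')"
echo "$line_count"
```

Each command in the chain only sees the output of the command immediately to its left.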