The command line or 'terminal' is an interface you can use to run programs and analyse your data. If this is your first time using one it will seem pretty daunting at first but, with just a few commands, you'll start to see how it helps you to get things done much more quickly. You're probably more familiar with software which uses a graphical user interface, also known as a GUI; unfortunately most of the best bioinformatics software has not been programmed with this capability.
In [ ]:echo "cd $PWD"
It should say something like cd /home/manager/pathogen-informatics-training/Notebooks/Unix/basic. Type whatever it said into your terminal and press Enter.
Then continue through the course, entering any commands that you encounter into your terminal window.
However, before getting started there are some general points to remember that will make your life easier:
Unix is case sensitive: typing ls is not the same as typing LS.
Directories are the Unix equivalent of folders on a PC or Mac. They are organised in a hierarchy, so directories can have sub-directories and so on. Directories are very useful for organising your work and keeping your account tidy - for example, if you have more than one project, you can organise the files for each project into different directories to keep them separate. You can think of directories as rooms in a house. You can only be in one room (directory) at a time. When you are in a room you can see everything in that room easily. To see things in other rooms, you have to go to the appropriate door and crane your head around. Unix works in a similar manner, moving from directory to directory to access files. The location or directory that you are in is referred to as the current working directory.
If there is a file called
genome.seq in the
dna directory its location or full pathname can be expressed as
pwd stands for print working directory. A command (also known as a program) is something which tells the computer to do something. Commands are therefore often the first thing that you type into the terminal (although we'll show you some advanced exceptions to this rule later).
As described above, directories are arranged in a hierarchical structure. To determine where you are in the hierarchy you can use the pwd command to display the name of the current working directory. The current working directory may be thought of as the directory you are in, i.e. your current position in the file-system tree.
To find out where you are, type this into your terminal.
In [ ]:pwd
Remember that Unix is case sensitive: PWD is not the same as pwd.
pwd will list each of the folders you would need to navigate through to get from the root of the file system to your current directory. This is sometimes referred to as your 'absolute path', because it gives a complete route, as opposed to a 'relative path', which tells you how to get from one folder to another. More on that shortly.
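The difference between an absolute and a relative path can be sketched in a throwaway directory. The directory names below (project, results) are hypothetical and are not part of the course data; mktemp is used only to get a scratch location that is safe to delete.

```shell
# Build a small scratch directory tree (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/project/results"

# Move in using an absolute path: it starts with / and works from anywhere.
cd "$demo/project"
here=$(pwd)
echo "$here"

# Move in using a relative path: it is interpreted from the current directory.
cd results
there=$(pwd)
echo "$there"

# Tidy up the scratch directory.
cd /
rm -rf "$demo"
```

The second path printed is the first one with /results added, which shows that the relative path results was resolved against the current working directory.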
In [ ]:ls
You should see that there are 4 items in this directory.
To list the contents of a directory with extra information about the items type:
In [ ]:ls -l
Instead of printing out a simple list, this should have printed out additional information about each file. Note that there is a space between the command ls and the -l. There is no space between the dash and the letter l.
-l is our first example of an option. Many commands have options which change their behaviour but are not always required.
What does each of the columns represent?
To list all contents of a directory including hidden files and directories type:
In [ ]:ls -a -l
This is an example of a command which can take multiple options at the same time. Different commands take different options and sometimes (unhelpfully) use the same letter to do different things.
How many hidden files and directories are there?
Try the same command but with the -h option:
In [ ]:ls -alh
You'll also notice that we've combined -a -l -h into what appears to be a single -alh option. It's almost always OK to do this for options which are made up of a single dash followed by a single letter.
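As a quick check that separate and combined options behave the same way, the sketch below runs both forms on a scratch directory and compares the results. The file names are hypothetical, and the scratch directory is created with mktemp so nothing in your course folders is touched.

```shell
# Create a scratch directory with one visible and one hidden file.
demo=$(mktemp -d)
touch "$demo/visible.txt" "$demo/.hidden.txt"

# Capture the listing from the separate and the combined option forms.
separate=$(ls -a -l -h "$demo")
combined=$(ls -alh "$demo")

# Compare the two listings as text.
if [ "$separate" = "$combined" ]; then
    echo "Both forms give the same listing"
fi

# Tidy up the scratch directory.
rm -rf "$demo"
```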
What does the -h option do?
To list the contents of the directory called Pfalciparum with extra information type:
In [ ]:ls -l Pfalciparum/
In this case we gave ls an argument describing the relative path to the directory Pfalciparum from our current working directory. Arguments are very similar to options (and I often use the terms interchangeably) but they often refer to things which are not prefixed with dashes.
How many files are there in this directory?
Typing out file names is really boring and you're likely to make typos which will at best make your command fail with a strange error and at worst overwrite some of your carefully crafted analysis. Tab completion is a trick which normally reduces this risk significantly.
Instead of typing out ls Pfalciparum/, try typing ls P and then press the tab key (instead of Enter). The rest of the folder name should just appear. If you have two folders with similar names (e.g. my_awesome_results/ and a second folder whose name also begins with my_awesome_), then you might need to give your terminal a bit of a hand to work out which one you want. In this case you would type ls -l m; when you press tab the terminal would read ls -l my_awesome_. You could then type s followed by another tab and it would complete the name of the folder beginning with my_awesome_s.
Every file and directory has a set of permissions which restrict what can be done with it.
The first set of permissions (characters 2,3,4) refers to what the owner of the file can do, the second set of permissions (5,6,7) refers to what members of the Unix group can do and the third set of permissions (8,9,10) refers to what everyone else can do.
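To see the three sets of permissions in practice, the sketch below creates a scratch file, gives it a known set of permissions, and cuts the permission string into its parts. The chmod command is not covered in this section and is used here only as an assumption to make the result predictable; the file name is hypothetical.

```shell
# Create a scratch file with known permissions.
demo=$(mktemp -d)
touch "$demo/example.txt"
# 640: owner can read and write, group can read, everyone else has no access.
chmod 640 "$demo/example.txt"

# Keep only the first 10 characters of the long listing.
perms=$(ls -l "$demo/example.txt" | cut -c1-10)
echo "$perms"                 # -rw-r-----

# Split the string into the three sets described above.
owner=$(echo "$perms" | cut -c2-4)
group=$(echo "$perms" | cut -c5-7)
other=$(echo "$perms" | cut -c8-10)
echo "owner: $owner"          # owner: rw-
echo "group: $group"          # group: r--
echo "other: $other"          # other: ---

# Tidy up the scratch directory.
rm -rf "$demo"
```

The first character (a dash here) is not a permission; it shows the type of the item, and would be d for a directory.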
cd stands for change directory.
The cd command will change the current working directory to another; in other words, it allows you to move up or down in the directory hierarchy.
To move into the Styphi directory type the following. Note, you'll remember this more easily if you type this into the terminal rather than copying and pasting. Also remember that you can use tab completion to save typing all of it.
In [ ]:cd Styphi/
Now use the pwd command to check your location in the directory hierarchy and the ls command to list the contents of this directory.
In [ ]:pwd
ls
You should see that there are 3 files called:
In [ ]:ls .
In [ ]:ls ..
In [ ]:ls ~
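The three shortcuts used in the cells above are a single dot for the current directory, two dots for the directory above, and a tilde for your home directory. A minimal sketch in a scratch directory shows what each one refers to; the directory and file names are hypothetical.

```shell
# Build a scratch tree: parent/ contains child/ and a file.
demo=$(mktemp -d)
mkdir -p "$demo/parent/child"
touch "$demo/parent/sibling.txt" "$demo/parent/child/inside.txt"
cd "$demo/parent/child"

# . is the current directory (child), so this lists inside.txt.
current=$(ls .)
echo "$current"

# .. is the directory above (parent), so this lists child and sibling.txt.
above=$(ls ..)
echo "$above"

# ~ is your home directory; its contents depend on your account.
ls ~

# Tidy up the scratch directory.
cd /
rm -rf "$demo"
```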
Try moving between directories a few times. Can you get into the
Pfalciparum/ and then back into
To copy the file Styphi.gff to a new file called StyphiCT18.gff type:
In [ ]:cp Styphi.gff StyphiCT18.gff
Use ls to check the contents of the current directory for the copied file:
In [ ]:ls
The mv command stands for move.
The mv command will move a file from one location to another. This moves the file rather than copying it, therefore you end up with only one file rather than two. When using the command, the path or pathname is used to tell Unix where to find the file. You refer to files in other directories by using the list of hierarchical names separated by slashes. For example, the file called bases in the directory genome has the path genome/bases. If no path is specified, Unix assumes that the file is in the current working directory.
To move the file StyphiCT18.gff from the current directory to the directory above type:
In [ ]:mv StyphiCT18.gff ..
Use the ls command to check the contents of the current directory and the directory above to see that StyphiCT18.gff has been moved.
In [ ]:ls
In [ ]:cd ..
ls
In [ ]:rm StyphiCT18.gff
Use the ls command to check the contents of the current directory to see that the file StyphiCT18.gff has been removed.
In [ ]:ls
Unfortunately there is no "recycle bin" on the command line to recover the file from, so you have to be careful.
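If you would like to practise copying, moving and removing without risking your course files, the whole cycle can be rehearsed in a scratch directory. This is a minimal sketch; the file and directory names are hypothetical, and mktemp is used only to create a location that is safe to delete.

```shell
# Create a scratch working directory containing one file.
demo=$(mktemp -d)
mkdir "$demo/work"
cd "$demo/work"
echo "example line" > original.txt

# cp leaves the original in place, so there are now two files.
cp original.txt copy.txt
after_cp=$(ls)
echo "$after_cp"

# mv relocates copy.txt to the directory above, so only one file remains here.
mv copy.txt ..
after_mv=$(ls)
echo "$after_mv"

# rm deletes the moved file permanently; there is no way to undo this.
rm ../copy.txt
after_rm=$(ls ..)
echo "$after_rm"

# Tidy up the scratch directory.
cd /
rm -rf "$demo"
```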
To find all files in the current directory and all its subdirectories that end with the suffix gff:
In [ ]:find . -name "*.gff"
How many gff files did you find?
To find all the subdirectories contained in the current directory type:
In [ ]:find . -type d
How many subdirectories did you find?
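Counting the results by eye gets tedious when there are many of them. One possible approach, sketched below on a scratch directory with hypothetical names, is to pass the output of find to wc -l, which counts lines. The pipe character and the wc command are not covered in this section, so treat this as an optional extra.

```shell
# Build a scratch tree with two gff files and one other file.
demo=$(mktemp -d)
mkdir -p "$demo/projectA" "$demo/projectB/annotation"
touch "$demo/projectA/a.gff" "$demo/projectB/annotation/b.gff" "$demo/projectB/c.fa"
cd "$demo"

# Count the gff files; tr removes padding that some versions of wc add.
gff_count=$(find . -name "*.gff" | wc -l | tr -d ' ')
echo "gff files: $gff_count"

# Count the directories; note that find also reports . itself.
dir_count=$(find . -type d | wc -l | tr -d ' ')
echo "directories (including .): $dir_count"

# Tidy up the scratch directory.
cd /
rm -rf "$demo"
```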
These are just two basic examples of the find command but it is possible to use the following find options to search in many other ways:
-mtime: search files by when they were last modified
-atime: search files by when they were last accessed
-size: search files by file size
-user: search files by the user who owns them
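A short sketch of how these options might be used is shown below. It works on a scratch directory with hypothetical file names; the thresholds (7 days, 2 kilobytes) are arbitrary values chosen for illustration, and dd and id are used only to set up the example.

```shell
# Build a scratch tree with one tiny file, one 4 KB file and one empty file.
demo=$(mktemp -d)
mkdir "$demo/annotation"
printf 'small\n' > "$demo/annotation/small.gff"
dd if=/dev/zero of="$demo/annotation/large.gff" bs=1024 count=4 2>/dev/null
touch "$demo/notes.txt"
cd "$demo"

# -mtime -7: files modified less than 7 days ago (all three, as they are new).
recent=$(find . -type f -mtime -7 | sort)
echo "$recent"

# -size +2k: files larger than 2 kilobytes (only large.gff).
large=$(find . -type f -size +2k)
echo "$large"

# -user: gff files owned by the current user, given here as a numeric user ID.
mine=$(find . -name "*.gff" -user "$(id -u)" | sort)
echo "$mine"

# Tidy up the scratch directory.
cd /
rm -rf "$demo"
```

Options can be combined, as in the last command, in which case find only reports items that satisfy all of them.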
Many people panic when they are confronted with a Unix prompt! Don't! All the commands you need to solve these exercises are provided above, so don't be afraid to make a mistake. If you get lost, ask a demonstrator. If you are already skilled at Unix, be patient; this is only a short exercise.
To begin, open a terminal window and navigate to the basic directory in the Unix directory (remember to use the Unix command cd) and then complete the exercise below.
Use the ls command to show the contents of the
Pfalciparum directory into the
Use the find command to find all gff files in the Unix directory. How many files did you find?
Use the find command to find all the fasta files in the Unix directory. How many files did you find?