This lesson is adapted from the Software Carpentry lesson of the same name.
It can be done interactively with the students, with this notebook distributed for reference.
The Unix shell is older than most of the people who use it. It has survived so long because it is one of the most productive programming environments ever created—maybe even the most productive. Its syntax may be cryptic, but people who have mastered it can experiment with different commands interactively, then use what they have learned to automate their work.
The shell is an interactive interpreter: it reads commands, finds the corresponding programs, runs them, and displays output.
Output can be redirected using `>` and `<`.
Commands can be combined using pipelines.
The `history` command can be used to view and repeat previous operations, while tab completion can be used to save re-typing.
Directories (or folders) are nested to organize information hierarchically.
Use `grep` to find things in files, and `find` to find files themselves.
Programs can be paused, run in the background, or run on remote machines.
The shell has variables like any other program, and these can be used to control how it behaves.
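Several of the points above can be tried in a few lines. This is only a sketch: the scratch directory and the file names `animals.txt` and `sorted.txt` are made up for illustration, and `printf`, `sort`, and `head` are standard tools that this lesson does not otherwise cover.

```shell
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch"

# '>' sends a command's output to a file instead of the screen.
printf 'zebra\nant\nmoose\n' > animals.txt

# '<' feeds a file to a command as its input.
sort < animals.txt > sorted.txt

# A pipeline passes one command's output to the next command.
first=$(sort animals.txt | head -n 1)
echo "First animal alphabetically: $first"

# grep finds lines inside files; find locates the files themselves.
grep 'o' animals.txt
find . -name '*.txt'
```

Each command does one small job; redirection and pipes are what let us chain them together.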
At a high level, computers really do four things: run programs, store data, communicate with each other, and interact with us.
How we interact with the computer is varied and evolving (mouse, keyboard, touchscreen, etc.). The oldest method of interaction is called CLUI, or command-line user interface, to distinguish it from the GUIs, or graphical user interfaces, that most of us are now used to.
Workflow at the command line:
In between the user and the computer is another program called the command shell.
What the user types goes into the shell, which figures out what commands to run and orders the computer to execute them. The computer then sends the output of those programs back to the shell, which takes care of displaying things to the user.
A shell is just a program like any other; the only thing that's different about it is that its job is to run other programs, rather than to do calculations itself.
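We can ask the shell where it finds the program behind a command name. The sketch below uses `command -v`, a standard shell built-in, and the `PATH` variable, which lists the directories the shell searches; neither is introduced elsewhere in this lesson, and the directories printed will differ from machine to machine.

```shell
# Ask the shell which program it would run for the name "ls".
ls_path=$(command -v ls)
echo "ls is the program at: $ls_path"

# PATH lists the directories the shell searches, in order, separated by colons.
# Print the first few, one per line.
echo "$PATH" | tr ':' '\n' | head -n 3
```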
The most popular Unix shell is `bash`, the Bourne again shell. (It's called that because it's derived from a shell written by Stephen Bourne. This is what passes for wit among programmers.) Bash is the default shell on most modern implementations of Unix.
Using it, or any other shell, feels a lot more like programming than like using windows and mice.
Commands are terse—often only a couple of characters long—and their names are often cryptic.
`pwd`, `ls`, and `cd`
Some of the commands we will use most often are ones related to storing data on disk.
The subsystem responsible for this is called the file system.
It organizes our data into files, which hold information, and directories, which hold files or other directories.
How can we use the shell to run other programs that will show us what's in the file system?
Once we have logged in, we'll see a shell prompt, which is usually just a dollar sign (but which may show extra information, like our user ID).
The shell prompt signals that the shell is waiting for us to type something in.
Type `whoami`, then press Enter. This command prints out the ID of the current user, i.e., it shows us who the shell thinks we are.
When we enter it, the shell finds a program called `whoami`, runs it, displays its output, and then displays a new prompt, telling us that it's ready for more commands.
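Because `whoami` is an ordinary program, its output can also be captured rather than just displayed. A small sketch, assuming the standard `id` utility is available for comparison (it is not part of this lesson):

```shell
# Capture the output of whoami in a variable instead of printing it directly.
current_user=$(whoami)
echo "The shell thinks we are: $current_user"

# id -un is another standard way to print the current user name.
if [ "$current_user" = "$(id -un)" ]; then
    echo "whoami and id -un agree"
fi
```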
Now that we know who we are, we can find out where we are using `pwd`, which stands for "print working directory".
This is our current default directory, i.e., the directory the computer assumes we want to use unless we specify something else explicitly.
The computer's response is `/Users/jklay`. To understand what this means, let's have a look at how the file system as a whole is organized.
Tour of filesystem:
At the very top of the file system is a directory called the root directory that holds everything else the computer is storing. It is written as a single slash: `/`.
Inside that directory (or underneath it, if you're drawing a tree) are several other directories, such as `bin` (which is where some built-in programs are stored), `data`, `Users`, `tmp`, and so on.
We know that our current working directory, `/Users/jklay`, is stored inside `/Users` because `/Users` is the first part of its name. Similarly, we know that `/Users` is stored inside the root directory `/` because its name begins with `/`.
Underneath `/Users`, we find one directory for each user with an account on this machine.
Notice, by the way, that there are two meanings for the / character. When it appears at the front of a file or directory name, it refers to the root directory. When it appears inside a name, it's just a separator.
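The way a path encodes its own location can be seen by taking it apart. This sketch uses the standard `dirname` and `basename` utilities, which are not covered in this lesson; they work on the text of the path, so `/Users/jklay` does not need to exist on the machine running it.

```shell
# An absolute path starts at the root directory "/".
path=/Users/jklay

# dirname strips the last component, leaving the containing directory.
parent=$(dirname "$path")
# basename keeps only the last component.
name=$(basename "$path")

echo "$name is stored inside $parent"
echo "$parent is stored inside $(dirname "$parent")"
```

The second line of output ends with a lone `/`: the leading slash that names the root directory, as opposed to the separator slashes inside the path.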
`ls`, which stands for "listing", prints the names of all the files and directories in the current directory in alphabetical order, arranged neatly into columns.
To make its output more comprehensible, we can give it the argument, or flag, `-F`:
$ ls -F
This tells ls to add a trailing / to the names of directories.
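To see the difference without depending on what is in your own directory, here is a sketch that builds a small scratch directory first. The names `data` and `notes.txt` are invented for the example, and `touch` (which creates an empty file) is a standard command not covered in this lesson.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir data          # a directory
touch notes.txt     # an empty regular file

# Plain ls prints bare names; ls -F marks the directory with a trailing slash.
plain=$(ls)
marked=$(ls -F)
echo "$plain"
echo "$marked"
```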
By convention, the second part of a filename, called the filename extension, indicates what type of data the file holds.
For example, `.txt` signals a plain text file, `.pdf` indicates a PDF document, `.cfg` is a configuration file full of parameters for some program or other, and so on. However, this is only a convention, and not a guarantee. Files contain bytes, nothing more; it's up to us and our programs to interpret those bytes according to the rules for PDF documents, images, and so on.
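A quick way to convince ourselves that the extension is only a label: give a text file a `.pdf` name and look at its first bytes. Real PDF files begin with the characters `%PDF`. The file name `report.pdf` is made up, and `head -c` is a widely available option that this lesson does not cover.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Give a plain text file a misleading .pdf extension.
printf 'just some text\n' > report.pdf

# Read the first four bytes and compare them with the PDF signature.
first_bytes=$(head -c 4 report.pdf)
if [ "$first_bytes" = "%PDF" ]; then
    echo "report.pdf looks like a PDF"
else
    echo "report.pdf is not a PDF, whatever its name says"
fi
```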
Relative paths vs Absolute paths
We can use `cd` followed by a directory name to change our working directory.
`cd` stands for "change directory", which is a bit misleading: the command doesn't change the directory itself; it changes the shell's idea of what directory we are in.
`cd` doesn't print anything, but if we run `pwd` after it, we can see that we are now "in" a new directory.
If we run `ls` again, it will show us the listing of files and directories where we are now.
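The idea that `cd` only changes the shell's own notion of "where we are" can be made concrete. In this sketch, the command substitution `$(...)` runs its commands in a subshell, a child copy of the shell with its own working directory; that feature and the directory name `project` are assumptions for illustration, not part of the lesson.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/project"

cd "$scratch"
before=$(pwd)

# The cd below happens inside a subshell, so only the subshell moves.
inside=$(cd project && pwd)

# Back in the original shell, the working directory is unchanged.
after=$(pwd)

echo "before: $before"
echo "inside: $inside"
echo "after:  $after"
```

Nothing on disk changed between the three lines of output; only each shell's record of its current directory did.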
To go up in the directory tree, use
$ cd ..
`..` is a special directory name meaning "the directory containing this one" or, more succinctly, the parent of the current directory.
The special directory `..` doesn't usually show up when we run `ls`.
`ls -a` shows all entries, including `..` and another special directory that's just called `.`, which is the directory we're currently in. It may seem redundant to have a name for where we are, but we'll see some uses for it later.
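A short sketch of both special names in action, run in a freshly made scratch directory (the name `child` is invented for the example):

```shell
scratch=$(mktemp -d)
mkdir "$scratch/child"
cd "$scratch/child"

# In an empty directory, ls -a still lists the two special entries.
all_entries=$(ls -a)
echo "$all_entries"

# "." is the current directory, so this leaves us where we are.
cd .
here=$(pwd)

# ".." is the parent, so this moves us up one level.
cd ..
parent=$(pwd)

echo "$here -> $parent"
```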
`mkdir` and a simple editor called `nano`
When we type in commands like `ls` or `pwd`, the shell finds the corresponding programs, runs them on our behalf, and shows us their output. But how do we create files and directories for it to show us?
Let's create a new directory called `tmp` with the command `mkdir tmp`.
$ mkdir tmp
As you might guess from its name, `mkdir` means "make directory".
Since `tmp` is a relative path (without a leading slash), the new directory is made below the current one.
If you run `ls` again, you will see it in the list. However, there's nothing below it yet: `tmp` is empty, which we can tell if we try to list its contents:
$ ls tmp
The command doesn't print any output, indicating that `tmp` is empty. If we use `ls -a tmp` to show entries whose names start with `.`, though, we see that `.` and `..` are there, as they always are.
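The same steps can be checked in a script. This sketch works in a scratch directory made with `mktemp -d`, and uses `wc -l` (count lines) and `tr` to turn the empty listing into a number; those tools are assumptions beyond what this lesson covers.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A relative path (no leading slash) is resolved against the working directory.
mkdir tmp

# So the new directory can also be reached through its absolute path.
ls -a "$scratch/tmp"

# Plain ls hides "." and "..", so an empty directory yields zero lines.
visible=$(ls tmp | wc -l | tr -d ' ')
echo "tmp contains $visible visible entries"
```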
Let's change our working directory to `tmp` using `cd`, then run the command `nano junk`.
$ cd tmp
$ nano junk
`nano` is a simple text editor. It can only work with plain character data, not tables, images, or any other human-friendly media.
You can start typing and your text will appear in the window starting at the cursor location.
When you are done typing, you can save the file to disk by holding down the Control key and pressing `o`. By convention, Unix uses the caret `^` followed by a letter to mean "control plus that letter", so this shortcut is written `^O`.
Once the file is saved, we can use Control-X (`^X`) to quit the editor and return to the shell.
`nano` doesn't leave any output on the screen after it exits, but `ls` now shows that we have created a file called `junk`.
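An editor needs someone at the keyboard, so when a file has to be created from a script, output redirection with `>` is a common substitute. This is a sketch of that alternative, not a replacement for the `nano` step; the text written to the file is invented, and `cat` (print a file's contents) is a standard command not covered in this lesson.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/tmp"
cd "$scratch/tmp"

# Redirecting output with > creates the file, much as saving in nano would.
printf 'some text typed by hand\n' > junk

ls          # the new file shows up in the listing
cat junk    # print the file's contents to check what was saved
```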
Let's tidy up by running `rm junk`. `rm` stands for "remove": this command deletes files.
It's important to remember that there is no undelete. Unix doesn't move things to a trash bin: it unhooks them from the file system so that their storage space on disk can be recycled. Tools for finding and recovering deleted files do exist, but there's no guarantee they'll work in any particular situation, since the computer may reclaim the file's disk space right away.
If we now run `ls`, its output is empty once again, which tells us that our file is gone.
Now recreate the file by opening `nano` again with `nano junk`, then `cd` up one directory.
$ nano junk
$ cd ../
If we try to remove the `tmp` directory using `rm tmp`, we get an error message: `rm` only works on files, not directories.
$ rm tmp
rm: cannot remove `tmp': Is a directory
The right command is `rmdir`, which stands for "remove directory".
$ rmdir tmp
rmdir: failed to remove `tmp': Directory not empty
It doesn't work yet either, though, because the directory we're trying to remove isn't empty.
If we want to get rid of `tmp`, we must first delete the file `junk`.
$ rm tmp/junk
$ rmdir tmp
The directory is now empty, so `rmdir` deletes it.
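The whole sequence can be replayed in a scratch directory. In this sketch the failing commands are wrapped in `if !` so the script can react to their non-zero exit status, and their error messages are discarded with `2>/dev/null`; both idioms are assumptions beyond what this lesson covers.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir tmp
printf 'placeholder\n' > tmp/junk

# rm refuses to delete a directory, and rmdir refuses a non-empty one.
# Both report the failure through a non-zero exit status.
if ! rm tmp 2>/dev/null; then
    echo "rm tmp failed: tmp is a directory"
fi
if ! rmdir tmp 2>/dev/null; then
    echo "rmdir tmp failed: tmp is not empty"
fi

# Empty the directory first, then remove it.
rm tmp/junk
rmdir tmp

if [ ! -e tmp ]; then
    echo "tmp is gone"
fi
```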
Create that directory and file one more time.
$ mkdir tmp
$ nano tmp/junk
$ ls tmp
junk
$
`junk` isn't a particularly informative name, so let's change the file's name using `mv`.
$ mv tmp/junk tmp/quotes.txt
`mv` is short for "move": we use it to move a file from one place to another.
It also works on directories: there is no separate `mvdir` command.
The first argument tells `mv` what we're moving. The second tells it where to move it to.
The general form of the command is
$ mv <source> <destination>
where both the source and the destination can be either a file or a directory.
In this case, we're moving `tmp/junk` to `tmp/quotes.txt`, which has the same effect as renaming the file.
Sure enough, `ls` shows us that `tmp` now contains one file called `quotes.txt`.
$ ls tmp
quotes.txt
We can bring that file into the current working directory by using the `mv` command, but this time the second argument is a directory.
$ mv tmp/quotes.txt .
The effect is to move the file from the directory it was in to a different directory. You can use `ls` to see that `tmp` is now empty and the file has been moved to the current working directory.
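Both uses of `mv` can be run back to back in a scratch directory. In this sketch the file is created with `printf` and `>` instead of `nano` so that it runs without a keyboard; the file contents are invented.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir tmp
printf 'a quotation\n' > tmp/junk

# Destination is a new file name: the file is renamed in place.
mv tmp/junk tmp/quotes.txt

# Destination is an existing directory ("." here): the file keeps its name
# and moves into that directory.
mv tmp/quotes.txt .

ls          # lists quotes.txt and tmp
ls tmp      # prints nothing: tmp is empty again
```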
The `cp` command works very much like `mv`, except it copies a file instead of moving it.
$ cp <source> <destination>
Here the source must be a file, while the destination can be either a file or a directory.
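A sketch of `cp` with each kind of destination, again in a scratch directory with invented file names. The last command uses the `-r` (recursive) flag, which is what `cp` needs in order to copy a directory; that flag is an addition not covered in this lesson.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir tmp
printf 'a quotation\n' > quotes.txt

# Destination is a file name: make a copy under a new name.
cp quotes.txt quotes-copy.txt

# Destination is a directory: the copy keeps the original name.
cp quotes.txt tmp

# Copying a directory and its contents needs the -r flag.
cp -r tmp tmp-copy

# Unlike mv, cp leaves the original in place.
ls . tmp tmp-copy
```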
To summarize, here are the commands we've seen so far, along with the two special directory names.
| Command or name | Meaning |
| --- | --- |
| `pwd` | print working directory |
| `cd` | change working directory |
| `ls` | listing |
| `.` | current directory |
| `..` | parent directory |
| `mkdir` | make a directory |
| `nano` | text editor |
| `rm` | remove (delete) a file |
| `rmdir` | remove (delete) a directory |
| `mv` | move (rename) a file or directory |
| `cp` | copy a file |
With this information, you will be able to do most of what we need to do at the command line in the terminal. As you get more comfortable working with the command line, we'll learn new commands that expand what you can do with the filesystem.