Very general question time: What is it that computers do? Computers can interact with us, store information for us, run programs, etc. Computers can help us do science! We, as scientists, interact with computers in a couple of basic ways, through the command line interface (CLI) or a graphical user interface (GUI). What are some examples of GUIs? Well, almost everything you're used to using is a GUI! Word, Excel, etc. GUIs are great because they require no memorization of syntax or knowledge of programming--you can simply use menus and icons to move, delete, or open a file. So why bother with CLI at all? Because it can be very convenient and powerful, and we will learn how!
COMMAND SHELL: The command shell is a program that helps you communicate with your computer. You type something into the terminal, and the shell figures out which commands the computer needs to run and then tells the computer to run them. A commonly used shell for Unix is the Bash shell. Let's start off with some very basic shell commands. NOTE: Because I'm in this Python notebook environment, I need to include a first line with percentage signs (%%bash) in my code cells. It is called a magic function, and it just helps me simulate being in the shell environment. You DO NOT need to type this line.
Before we begin the lesson, we need to do a little bit of set-up. Do not worry about what this command means for now, just know that it is helping you to correctly set up things on your computer so that things like printing, running this notebook, and using python will work. Simply type the following on the command line of your terminal:
~premapta/setup
To make sure this has worked, try a couple of things. First type the following and you should see an interactive environment for python come up.
ipython
To exit this session use Ctrl+D.
Next use the commands below and you should see a web browser pop up from which you can access this notebook.
cd Lectures/UnixIntro
ipython notebook UnixIntro.ipynb
To kill the ipython notebook server use Ctrl+C.
We can ask the computer who the current user is using whoami
or we can print the working directory (where we are) using pwd
In both cases the shell finds the program (either whoami
or pwd
), runs that program, and then displays the output for us.
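If you are curious where the shell actually finds a program, here is a small sketch. It uses command -v, a standard shell built-in that is my addition here and not part of the lesson itself; the exact paths it prints will differ from machine to machine.

```shell
# command -v asks the shell where it would find a given command
# (an addition for illustration; the lesson itself does not use it)
whoami_path=$(command -v whoami)
ls_path=$(command -v ls)

echo "whoami lives at: $whoami_path"
echo "ls lives at:     $ls_path"
```

On many Linux machines both paths will be somewhere under /usr/bin or /bin, but treat that as typical rather than guaranteed.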
Type the following into your terminal window and press enter:
whoami
also try:
pwd
which stands for "print the working directory".
From the pwd
command, we can see which directory we are currently working in.
/astro/users/<username>
refers to your so-called home directory. This is like the top-level directory for your own files on this computer (remember: folder = directory).
We need to be working in a specific directory to do some of the examples in this lesson. Type the following into your command line and press enter.
cd ~/Lectures/UnixIntro
We'll worry about what this command means later. Let's investigate what is in our current working directory with the ls
command:
ls
The ls
command lists what is in the current working directory. You'll see something like this:
bar.txt
baz.txt
bio.txt
foo.txt
helloworld.py
wordcount.txt
This is showing you the same things that you'd see in a file browser.
This is not very exciting right now, because all I have in here is this notebook and a few other files. Let's move to another directory to check out more stuff. You can either run this command as written (which changes to my home directory) or substitute your own uwnetid for my username:
In [4]:
%%bash
cd /astro/users/premapta
ls -F
So now we see a lot more stuff! I also added this trailing "-F" flag to ls
to make it clearer which things are directories (it adds a trailing "/" to directories) and which things are files. We can usually tell what type each file is from its extension (e.g. names ending in ".pdf" or the like). How did I change to this directory? I used the cd
command to change directories, followed by the PATH to the new destination directory. The cd
command can be followed by a variety of other characters to move around our file system. Let's look at a couple:
In [5]:
%%bash
pwd
cd .
pwd
cd ..
pwd
cd ~
pwd
So what did we do? We printed our working directory (/astro/users/premapta/Lectures/UnixIntro), then changed to the "." directory and printed the working directory again (still UnixIntro), then changed to the ".." directory and printed it (Lectures), then changed to the home directory (/astro/users/premapta) and printed it. So what are the ".", "..", and "~" directories? These characters stand for the current directory, the parent directory, and the home directory, respectively. The first two show up as entries in a listing when we use a special flag with our ls
command:
In [6]:
%%bash
ls -a
This stands for "list all." Now I want everyone to take a moment to do some exploring. If you type cd
without a path to any directory, what happens? If you use the "-s" flag with ls
what output do you get?
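If you would like to practice these shortcuts without touching your real files, here is a minimal sketch that works in a throwaway directory. The scratch directory (made with mktemp -d) and the folder names inside it are stand-ins I made up for illustration; they just mirror the Lectures/UnixIntro layout used above.

```shell
# Hypothetical scratch directory so nothing in your real files is touched
scratch=$(mktemp -d)
mkdir -p "$scratch/Lectures/UnixIntro"
cd "$scratch/Lectures/UnixIntro"

cd .                        # "." is the current directory, so nothing changes
here=$(basename "$PWD")     # UnixIntro
cd ..                       # ".." is the parent directory
parent=$(basename "$PWD")   # Lectures
echo "started in $here, moved up to $parent"

ls -F                       # the subdirectory is listed with a trailing "/"

# Even an empty directory lists "." and ".." when you use -a
all_entries=$(ls -a UnixIntro | wc -l | tr -d ' ')
echo "ls -a shows $all_entries entries in the empty directory"
```

The basename command just strips the leading part of the path so that only the last directory name is printed.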
Now that we've seen how to navigate through the directory structure and see what is inside directories, let's see how to create things. Navigate back to your home directory--cd /astro/users/your_uwnetid_here
--so that you will be able to create directories and write files. If we want to make a new directory we will use the following command:
In [7]:
%%bash
mkdir bio
The mkdir
command stands for "make directory" so with this we have created a new directory that I named bio. If you move into this directory and try to see what is in it you will see that it is empty. Go ahead and do that now. The next thing we'll want to do is create a file in this new directory. Let's make this a draft file. To create a file we will use the touch
command followed by the name of the file we want to create. (More on creating files with text editors later on).
In [8]:
%%bash
cd bio
touch draft.txt
ls
So we can see that we have created a "draft.txt" file in our bio directory, but this file is empty. Let's say we want to make a copy of this file and move it to another directory. To make copies of files we use the cp
command. To move files we use the mv
command. NOTE: I am only doing the cd bio
command again because this notebook automatically reverts me back to the "/astro/users/premapta/Lectures/UnixIntro" directory when I start a new cell. In the terminal you would NOT need to do this.
In [9]:
%%bash
cd bio
cp draft.txt draftcopy.txt
ls
mv draftcopy.txt ..
ls
So let's see what we have done with this series of commands. We have copied "draft.txt" into a new file called "draftcopy.txt." By using the ls
command I can see that both these files exist in the directory. Next I use the mv
command followed by the file name and the destination (recall ".." from earlier) to move the copy one directory up. Now if I do ls
again I see that ONLY "draft.txt" remains in this directory. Now let's say that I only want "draftcopy.txt" and so I decided to remove the whole bio directory. The command to remove something is rm
(Note: be careful since removing is PERMANENT). Try it on the bio directory with rm bio. Does this command work? Why or why not?
You should find that it does not work: rm on its own refuses to remove a directory, whether or not it is empty. There are several things you could do here. You could go into the "bio" directory and remove the file using rm draft.txt
and then subsequently delete the empty directory, or you could use one of the following commands:
In [10]:
%%bash
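# Option 1: delete bio and everything inside it
rm -r bio
# Option 2: delete bio only if it is already empty (use this instead of Option 1, not after it)
rmdir bio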
The first command tells it to delete everything in the directory and then delete the directory itself ("r" stands for "recursive"), while the second command is the equivalent of rm
for a directory, but it only works if the directory is already empty (so you would first have to remove draft.txt). Again, be VERY careful when doing this. Deleting is permanent.
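Here is a sketch of the whole create, copy, move, and remove sequence, run in a throwaway directory so that nothing real gets deleted. The scratch directory is a stand-in I made up with mktemp -d; the file and directory names follow the ones used above. It also shows that rmdir refuses to remove a directory that still has something in it.

```shell
# Hypothetical scratch directory so nothing in your real files is touched
scratch=$(mktemp -d)
cd "$scratch"

mkdir bio                            # make the directory
touch bio/draft.txt                  # create an empty file inside it
cp bio/draft.txt bio/draftcopy.txt   # copy the file
mv bio/draftcopy.txt .               # move the copy up into the scratch directory

# rmdir only removes EMPTY directories, so this first attempt is refused
if rmdir bio 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi
echo "first rmdir attempt: $rmdir_result"

rm bio/draft.txt                     # empty the directory...
rmdir bio                            # ...and now rmdir succeeds
ls                                   # only draftcopy.txt is left
```

Using rm -r bio in place of the last rm and rmdir pair would remove the directory and its contents in a single step.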
The next thing to look at is how we can combine existing programs (commands) to do powerful things from the command line. Let's imagine that I have a very crowded directory with lots of different files, how will I find just the files of a certain type? We can use something called the wildcard to accomplish this. For example, let's say I only want to look at ".txt" files. Here's how we might accomplish that:
In [11]:
%%bash
ls *.txt
The "*" wildcard matches zero or more characters. The "?" wildcard matches exactly one character. You can use these in combination with one another to get at more specific file names. For example, what would ls b*.txt
output versus ls ?a*
versus ls ba?.txt
Try these out to see what you get! Note that the wildcard can be used with any other shell command, for example rm *
. DON'T DO THIS. It's a bad idea to delete everything.
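To see the difference between the two wildcards, here is a small sketch using made-up file names in a throwaway directory (both the scratch directory and the file names are my own stand-ins, not files from the lesson). The last line shows one cautious habit, which is my suggestion rather than part of the lesson: put echo in front of a command to preview what the wildcard expands to before running it for real.

```shell
# Hypothetical scratch directory and made-up file names
scratch=$(mktemp -d)
cd "$scratch"
touch data.csv data1.csv data2.csv data10.csv notes.txt

ls data*.csv     # "*" matches zero or more characters: all four data files
ls data?.csv     # "?" matches exactly one character: only data1.csv and data2.csv

echo rm *.txt    # prints "rm notes.txt" without deleting anything
```

Note that data.csv matches data*.csv because "*" is allowed to match nothing at all.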
Let's say I want to write some of my output from the shell to a file. Let's use the wc
command to count the number of lines, words, and characters in each file and then redirect that to a new file called "wordcount.txt."
In [12]:
%%bash
wc *.txt > wordcount.txt
The greater-than symbol (>) tells the shell to send the output of the command to a file instead of printing it to the screen. Be aware that if the file already exists, its old contents are replaced. If we want to see what is in this new file "wordcount.txt," we can use the cat
command, which stands for "concatenate" and instructs the computer to print the contents of the file.
In [13]:
%%bash
cat wordcount.txt
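Here is a self-contained sketch of redirecting output and then reading it back, using two small made-up files in a throwaway directory. The names a.txt, b.txt, and counts.out are hypothetical. I deliberately gave the output file a name that does not end in ".txt" so that it is not itself picked up by the *.txt wildcard, and I used the "-l" flag so that wc reports only line counts.

```shell
# Hypothetical scratch directory and made-up files
scratch=$(mktemp -d)
cd "$scratch"
printf 'one\ntwo\nthree\n' > a.txt   # a three-line file
printf 'four\nfive\n' > b.txt        # a two-line file

wc -l *.txt > counts.out   # nothing is printed; the output goes into counts.out
cat counts.out             # now print what was saved: one line per file plus a total
```

The exact spacing in the wc output varies between systems, but the last line should report a total of 5 lines.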
Imagine that I had to do this for a REALLY large file. Would I want all the output printed directly to my screen? Probably not. Imagine I only care about that summary line at the end. In order to get at just the first or last few lines of a file we can use the head
or tail
commands as follows:
In [14]:
%%bash
head -1 wordcount.txt
tail -1 wordcount.txt
The "-1" asks for a single line: the first line for head, and the last line for tail. We could change this number to "-3" to get the first three lines or the last three lines, for example. Let's look at how we could search for instances of specific words in lines in files:
In [15]:
%%bash
cat *.txt | grep -n "file"
The vertical bar is referred to as a "pipe". It tells the shell to take the output of the command on the left as the input to the command on the right. grep
is a command that finds lines in files that match a particular pattern. It is a contraction of "global/regular expression/print." The "-n" flag means to print the line number where the expression we are searching for occurs (here, that is the line number within the combined output of cat, not within each individual file). There are many other flag options to go along with grep,
which you can find by doing man grep
(which stands for manual). There are also lots of other ways to search for specific "regular expressions" using grep
. As one example, let's look at what the following does:
In [16]:
%%bash
cat *.txt | grep '^Here'
The caret (^) anchors the pattern to the start of a line, so this prints only those lines that begin with "Here."
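To make the pipe and grep behavior concrete, here is a minimal sketch on a made-up three-line file (notes.txt and its contents are hypothetical, as is the scratch directory). It also shows that pipes can be chained, with the output of grep feeding into head.

```shell
# Hypothetical scratch directory and a made-up three-line file
scratch=$(mktemp -d)
cd "$scratch"
printf 'Here is a file.\nNothing to see.\nThe file is Here too.\n' > notes.txt

cat notes.txt | grep -n "file"               # lines 1 and 3, each prefixed by its line number
cat notes.txt | grep '^Here'                 # only line 1, since only it BEGINS with "Here"
cat notes.txt | grep -n "file" | head -n 1   # chained pipes: keep just the first match
```

Line 3 contains "Here" but does not start with it, which is why the second command leaves it out.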
We can also use the sort
command on files. Here I will show an example of doing an alphabetical sort, but note that it is possible to do a numerical sort as well (you will try this later for yourselves). What if I tried this with sort -k 2 alpha.dat
instead?
In [17]:
%%bash
sort -k 1 alpha.dat
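Here is a sketch of how the sort key and the numerical option interact, using a small made-up data file (demo.dat and its contents are hypothetical; they are not the alpha.dat file from the lesson). The "-n" flag asks sort to compare values as numbers rather than as text.

```shell
# Hypothetical scratch directory and a made-up two-column file
scratch=$(mktemp -d)
cd "$scratch"
printf 'delta 3\nalpha 10\ncharlie 2\nbravo 1\n' > demo.dat

sort -k 1 demo.dat      # alphabetical by the first column
sort -k 2 demo.dat      # second column compared as TEXT: 10 lands before 2
sort -n -k 2 demo.dat   # second column compared as NUMBERS: 10 lands last
```

The difference between the last two commands is why a numerical sort matters whenever a column holds numbers with different numbers of digits.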
There are even more Unix commands (not covered here) that are listed on the Unix Cheat Sheet that you have. This includes things like how to download a file from the internet (given a web address of the file), how to create a tarball, how to copy things between machines, etc. Now that we've gone over these basic commands today I'll show you one last thing, which is how to remotely log into your machines. This command is also covered on the Unix Cheat Sheet, but requires a little set-up.
The first thing you need is the correct "config" file in the .ssh folder in your home directory. You only need to create this ONCE. First, check to see that this folder exists by doing the following command:
cd ~/.ssh
If this works, then the directory already exists. If you get an error, then you need to create this directory first, which you would do with mkdir ~/.ssh
. Next, you need to create the config file. The command to do that is as follows:
touch ~/.ssh/config
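If you would rather not check by hand whether the directory exists, here is a sketch of the same set-up using the "-p" flag of mkdir, which does not complain when the directory is already there. To keep the example harmless, it works inside a made-up stand-in for your home directory (demo_home is hypothetical); on your own machine you would use ~ in its place.

```shell
# Hypothetical stand-in for your real home directory
demo_home=$(mktemp -d)

mkdir -p "$demo_home/.ssh"       # creates .ssh if needed; no error if it already exists
mkdir -p "$demo_home/.ssh"       # so running it a second time is harmless
touch "$demo_home/.ssh/config"   # creates an empty config file (an existing one keeps its contents)

ls -a "$demo_home/.ssh"          # the listing should now include config
```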
Now open up this config file with your favorite text editor (something like emacs ~/.ssh/config
) and edit it to include the following lines:
Host gateway
User UWNETID
Hostname gateway.phys.washington.edu
Host astrolabXX
User UWNETID
Hostname astrolabXX
ProxyCommand ssh -q -W %h:%p gateway
Remember that to SAVE the file in emacs you press Ctrl+X followed by Ctrl+S,
and to EXIT emacs you press Ctrl+X followed by Ctrl+C.
In this file change UWNETID to your login username, and XX to the default lab computer you would like to log in to. Once you have this file all set up on your laptop, you should be able to remotely log into an astrolab computer with the following command (also on the Unix Cheat Sheet):
ssh astrolabXX
In order to log out of a remote session you just need to type exit
in the terminal window. The next time you want to login all you have to do is type ssh astrolabXX
again and it should prompt you for your password twice as before (once for gateway and once for the astrolab machine).
If your laptop runs Windows, use the following steps instead. First, download putty (use putty.exe) and then download Xming. For convenience, drag both of these to your desktop to make icons so that you don't have to go searching for them later. Make sure both have installed properly. Make sure Xming is running (just double-click it; nothing will show up on the screen) before opening putty. Once this is done, you can open up putty. It should look like this:
Where it says "Host Name (or IP address)" type gateway.phys.washington.edu
. On the left hand side click the "+" next to "SSH." More options should come up. Next click on X11 and on the new screen check the box that says "enable X11 forwarding." Now go back to the first screen (click "Session" on the left hand side). Under "Saved Sessions" type something like "astrolab" and hit "Save." This saves the settings you just input under the name astrolab. Next click "Open." An Xming terminal should pop up and prompt you for your username. Enter your uwnetid and your password. If this works, you have tunneled through gateway! Only one more step. In the terminal type the following:
ssh -l UWNETID astrolabXX.astro.washington.edu
And once again enter your password when prompted. Note that the "-l" is a lowercase L (not a 1), and that you should put in your username and preferred astrolab computer where it says UWNETID and XX. Now you should be logged into your astrolab computer! Try doing a pwd
to make sure you are in your home directory on the astrolab machine, or do an ls
to see what is there. You will need to enter exit
once to log out of your astrolab machine, and then exit
once more to log out of gateway. Once you've done this your Xming terminal window should disappear.
The next time you want to do a remote login just make sure that Xming is open first and then open putty. Then you should be able to click on "astrolab" in your "Saved Sessions" and just hit "Open." An Xming terminal will again open and prompt you for your uwnetid and password. Then you type the ssh -l UWNETID astrolabXX.astro.washington.edu
command above and you will be logged in again!
Next time we will cover a brief bit of astro background, talk about using text editors (to more easily create and edit files), and then get started on our first Unix assignment!