Lesson 1: Introduction to Python

How to use this notebook
Introduction
Writing and running Python code
The print statement
Variables and data types
Basic math
Commenting code
Test your understanding: practice set 1

1. How to use this notebook

This is an interactive Jupyter notebook. Throughout the lesson you will see blocks of Python code with "In [ ]" to the left of them. If you click on these blocks and press Shift + Enter, the code will be run and the output will be printed below the block.

In order to get credit for this pre-lab, you must run each code box.

Try it out with this code block here:



In [0]:

    
print "I am Python code! Press Shift+Enter to run me!"

Many of the code blocks in this notebook will start off in the "un-run" state, so make sure you run them to see what the output is!

You can also edit these code blocks yourself and then re-run them to see what changes. This is a great way to learn more about programming, so feel free to do this to any of the code blocks in the lesson.

As you go through these exercises, you might run into issues where pressing "Shift+Enter" doesn't work correctly and you can't get your code to run. This can happen with the platform we are using, the IPython notebook. The solution is to go to the menu bar underneath the 'Jupyter' logo, select 'Kernel', and then select 'Restart.' In our experience, this typically will do the trick to get the notebook running again!

2. Introduction

Why learn Python?

It's particularly simple and easy for beginners to learn
It's widely used by the scientific community

Languages such as Perl and R are also quite popular among scientists, and are worth learning. Luckily, once you learn your first programming language, it's usually much easier to pick up additional languages!

3. Writing and running Python code

A note about Python versions

For all of the Python exercises in this course, we are using Python 2, the second major version of the Python language. Python 3 has since been released, and most of the same principles apply to it. It is important to know which version of a programming language you are using, especially in the case of Python, because some of the syntax (such as how print is written) changed between these two versions!
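If you're ever unsure which version you're running, one quick way to check is to ask Python itself. This is a small sketch using the standard sys module, which this lesson doesn't otherwise cover; the variable name majorVersion is just for illustration. The print call is written with parentheses around a single value so that it behaves the same in Python 2 and Python 3.

```python
import sys

# sys.version_info holds the version of Python that is running this code.
# Its first item is the major version: 2 for Python 2, 3 for Python 3.
majorVersion = sys.version_info[0]
print(majorVersion)
```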

There are several different ways you can write and run Python code. Here are the two ways we'll cover in this course:

Programs and scripts: This is the most common method. Code is written in a .py file and then executed on the command line. We'll talk more about this later.

Jupyter / IPython notebooks: That's what you're using right now! These notebooks allow you to write small (or large) blocks of code and immediately see the result. They also let you intersperse text and images with your code, which makes them very useful as research notebooks or as vignettes to share with others.

We'll start off just writing code in Jupyter notebooks. For a quick introduction to how they work, click Help in the menu above, and then User interface tour.

Try this now

In the code block below, write the following and then run it:

print "Hello world!"



In [0]:

^^^ Write your code here and press Shift+Enter to run! ^^^

4. The `print` statement

In the exercise above, we typed the following line of code:

print "Hello world!"

This is called a print statement. Its purpose is to print data to the terminal screen (or notebook output block, if applicable).

If the data that we want to print is a line of text (called a string in the programming world), we must enclose the text in quotes. If we forget to do this, Python will likely give an error (more on why this occurs in the next section!). Note that the quotes around the string will not actually be printed -- they are not considered part of the string, they just demarcate where the string starts and ends.

If the data we want to print is a number, we do not have to use quotes. Python recognizes numbers as a distinct type of data from strings. In fact, if you enclose your numbers in quotes, Python will start treating them like strings rather than numbers (which is sometimes what we want, but usually not).

Let's look at a few examples:



In [0]:

    
print "I am a string. I am enclosed in quotes."



In [0]:

    
print 123



In [0]:

    
print 2934.454

With numbers, we can also do basic math operations. For example:



In [0]:

    
print 1 + 2



In [0]:

    
print 2 * 3

Here's where it becomes very important to be aware of whether you are using a string or a number. Let's see what happens when we try to add two numbers enclosed in quotes:



In [0]:

    
print "1" + "2"

What happened here? Basically, whenever you try to "add" two strings, Python does something called concatenation -- merging them into one string. Python doesn't check whether the string holds a number, so it doesn't even try to "add" them in the traditional sense.

Concatenation works for any strings, and is good for when we want to combine multiple strings into one:



In [0]:

    
print "Hello" + "world"



In [0]:

    
print "Space    " + "added"

What happens if we try to combine a string and a number?



In [0]:

    
print "5" + 5

This is not allowed! Remember, Python doesn't consider "5" a number when it's enclosed in quotes -- it's just any old string. Python doesn't know how to combine a string and a number, so it gives an error message.

If we want to print multiple types of data in the same line, we can use a comma instead of a plus sign. This tells Python, "don't try to combine these data -- just print them all separately to the screen!". For example:



In [0]:

    
print "1 + 2 = ", 1 + 2, "!"

This print statement is an example of how the syntax changed between Python 2 and 3. In Python 3, print is a function, so everything you want to print is enclosed in parentheses, and this command would look like print("1 + 2 = ", 1 + 2, "!"). If you try to run that in Python 2, the items in parentheses are treated as a single group (a tuple), so your output will include the parentheses, commas, and quotes. This isn't so important to remember for this class since we are always using Python 2, but if you ever use Python 3, this is worth remembering!
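If you want a line that prints the same thing under both versions, one option (a sketch, not the only approach) is to build the whole message as a single string first and then print just that one value. This uses str(), which converts a number to a string and is covered in the next section; the variable name message is made up for this example.

```python
# Build one string first; str() turns the number 3 into the string "3"
message = "1 + 2 = " + str(1 + 2) + "!"

# Parentheses around a single value are harmless in Python 2
# and required in Python 3, so this line works in both
print(message)
```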

[ Check yourself! ] Print practice

Think you got it? Using the code block below, write code to print your name.



In [0]:

5. Variables and data types

What is a variable?

You can think of variables as little boxes that we put data in. You name each box so that you can refer to it and use it in your code. This gives your code flexibility, for reasons you will see soon.

Creating a variable is sometimes called "declaring" or "defining" a variable. This basically just requires giving the variable a name and assigning it an initial value. A variable usually needs to be defined in this way before you can use it elsewhere in your code (although there are a few exceptions we'll go over later).

Here's an example of a variable definition:

geneID = "Fmr1"

Here, geneID is the variable name and "Fmr1" is the piece of data that is being stored in the variable. The = is what we call an assignment operator. In English, we might read this line of code as "store the string 'Fmr1' in the variable 'geneID'".

Important to note:

You should try not to think of = in Python as the same kind of equals sign you use in math. In math, a = implies equality of the information on either side. In Python, = simply means "assign the value on the right to the variable on the left". It may help to think of it more like an arrow pointing to the left, e.g. geneID <- "Fmr1".

What should I name my variables?

You can name your variables almost anything, but there are a few important rules and conventions to keep in mind.

Rules (if you break these, you will get errors):

only letters, numbers, and underscores can be used in a variable name
the variable name cannot begin with a number
you cannot use any of the Python reserved words as a variable name
the capitalization of your variables matters. For example, geneID and geneid would be considered different variables.
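To see the capitalization rule in action, here is a short sketch with two variables whose names differ only in case. The values are made up for illustration, and the print calls use parentheses around a single value so they behave the same in Python 2 and 3.

```python
geneID = "Fmr1"
geneid = "Actb"

# Python treats these as two separate boxes, so neither overwrote the other
print(geneID)
print(geneid)
```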

Conventions (recommended, but not required):

begin a variable name with a lower case letter
use a name that is descriptive of the info stored in the variable
if your variable name is more than one word squished together, use camelCase or under_scores to make it easier to read.

Some examples:

Good	Bad
`geneID`	`3rdColumn` (illegal)
`personCount`	`sdfsxwcnq` (gibberish)
`input_file`	`person#` (illegal)
`avgGeneCount`	`class` (reserved word)

If you have proper syntax highlighting in your text editor, it should be obvious when you accidentally use a reserved word because it will be a different color than all your other variables! If not, a full list of reserved words can be found here: PyDocs
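If your editor doesn't highlight reserved words, you can also ask Python directly. This sketch uses the standard keyword module, which this lesson doesn't otherwise cover; the variable names are just for illustration.

```python
import keyword

# iskeyword() answers True if the name is reserved by Python
classIsReserved = keyword.iskeyword("class")
geneIDIsReserved = keyword.iskeyword("geneID")

print(classIsReserved)
print(geneIDIsReserved)
```

The first line of output should be True (class is reserved) and the second False (geneID is safe to use).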

Variables in action

Below are several examples of code using variables. For each code block, first try to guess what the output will be, and then run the block to see the answer. A short explanation of what happened in each code block follows.



In [0]:

    
geneID = "Fmr1"
print geneID

When we print a variable, Python knows that what we're really interested in is what's stored in the variable, not the variable itself. Therefore, Python automatically prints the contents of the variable when we do this kind of print statement.



In [0]:

    
apples = 5
oranges = 10
fruit = apples + oranges
print fruit

When two variables contain numbers, we can add them together as if they were the numbers themselves. Another important thing to note here is that in an assignment statement, everything on the right of the = will happen first. So in line 3, the apples + oranges part happens first, and the result is stored in fruit.



In [0]:

    
apples = 5
oranges = 10
print apples + oranges

Note that the addition is done before printing, and only the result is printed (much like adding literal numbers, as we saw before).



In [0]:

    
apples = 5
oranges = 10
print apples, oranges

As we saw before, the comma allows us to list multiple things we would like to print at once, without trying to add or concatenate them. A space is automatically inserted between each item.



In [0]:

    
apples = 5
oranges = 10
print "I have", apples, "apples"

As above, we can mix and match different data types in our print statements when we use a comma.



In [0]:

    
people = 3
people = people + 1
print people

This is an important one to understand. What we did here was overwrite the value of people with the value of people + 1. The important thing to remember is that the right side of the = sign is evaluated first. So first Python figures out what people + 1 is, which is 3 + 1 or 4. Once that is completely done, it takes that result and stores it in the variable on the left. In this case, this overwrites the value that was already in people.

Later on, this is how we will create counters, i.e. variables that we increment by 1 every time something happens.
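As a small preview, here is what a counter might look like if we bump it by hand three times. The variable name readsSeen is hypothetical, and the print call uses parentheses around a single value so it behaves the same in Python 2 and 3.

```python
readsSeen = 0                # start the counter at zero
readsSeen = readsSeen + 1    # first event: 0 + 1 is stored back, giving 1
readsSeen = readsSeen + 1    # second event: now 2
readsSeen = readsSeen + 1    # third event: now 3
print(readsSeen)
```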



In [0]:

    
people = 3
animals = 4
people = animals
print people
print animals

Another example of overwriting a variable. Note that the value of animals is unchanged.



In [0]:

    
name = "Joe Shmo"
age = 20
print name,"will be",(age + 1),"next year"

Here we just did some simple math within the print statement.



In [0]:

    
yourAge = "16"
print "You will be", (yourAge + 1), "next year"

What happened? We put the number 16 in quotes--this makes it a string instead of a number! As we discussed above, Python can't add a string to a number, so it gives an error message. (You may also notice that it starts to print the message, but fails when we try to do the addition! Sometimes this can help you track down where errors are occurring.)

[ Check yourself! ] Variable practice

Think you got it? In the code block below, there are two variables. Write one additional line of code to print both variables on the same line.



In [0]:

    
geneName = "Actb"
readCount = 10375

Data types

Data comes in many types: numbers, words, letters, etc.

In Python, certain types of data are treated differently. There are four main "data types" we'll be working with:

String - a string is just another word for text. You can think of it as "a string of letters/characters". Strings are enclosed in double or single quotes to distinguish them from variables and commands (ex: "This is a string!" 'So is this!')
Integer ("int") - this refers to whole numbers (same as in real life). In programming, integers are handled differently than non-integers, which is why we make this distinction.
Floating point numbers ("float") - numbers with decimals.
Booleans – True or False (1 or 0). We'll talk more about this later.
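If you're not sure what type a piece of data is, Python's built-in type() function will tell you. It isn't used elsewhere in this lesson, and the variable names and values below are made up for illustration. The exact wording of the printed output differs a little between Python 2 and Python 3, but the type name (str, int, float, or bool) appears in both.

```python
geneName = "Actb"       # string
readCount = 10375       # integer
foldChange = 2.5        # float
isExpressed = True      # boolean

print(type(geneName))
print(type(readCount))
print(type(foldChange))
print(type(isExpressed))
```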

As we've seen, different types of data are treated differently by Python:



In [0]:

    
print 1 + 1



In [0]:

    
print "1" + "1"



In [0]:

    
print 1 + "1"

Sometimes, like in the last example above, we'd actually like to convert one data type into another. Fortunately, Python provides simple built-in functions that allow us to do this in certain cases.

Here's a partial list of the most useful conversion functions:

str() - converts a variable or piece of data to a string. Works on integers, floats, and booleans (and others).
int() - converts a variable or piece of data to an integer. Works on strings made up only of digits (e.g. "123"), floats (decimal part will be truncated), and booleans (True converts to 1, False to 0).
float() - converts a variable or piece of data to a float (decimal). Works on strings that look like a number (e.g. "123.45"), integers (a .0 will be added), and booleans (True converts to 1.0, False to 0.0).

Let's look at a few examples:



In [0]:

    
print 1 + int("1")



In [0]:

    
print "1" + str(1)



In [0]:

    
print float("1")

A variable takes on the "type" of whatever data it is currently storing. So a variable holding a string has the string type, and a variable holding an integer has the integer type. Thus, we can apply the type conversion functions to variables as well:



In [0]:

    
age = "16"
print float(age)



In [0]:

    
age = "16"
print int(age) + 1
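The examples above all start from strings. Here's a short sketch of the other conversions described in the list of functions: floats to integers, booleans to numbers, and numbers to strings. The variable names are made up for illustration, and each print call uses a single value in parentheses so it behaves the same in Python 2 and 3.

```python
truncated = int(3.9)        # decimal part is dropped, not rounded: 3
fromBoolean = int(True)     # True converts to 1
asFloat = float(7)          # a .0 is added: 7.0
asText = str(2.5) + " kb"   # the number becomes text, so + concatenates

print(truncated)
print(fromBoolean)
print(asFloat)
print(asText)
```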

6. Basic math

Math in Python uses many of the symbols and conventions you're already used to from traditional mathematics:



In [0]:

    
print 2 + 2



In [0]:

    
print 5 - 10



In [0]:

    
print 5 * 5



In [0]:

    
print 10 / 2

Order of operations (P.E.M.D.A.S.) is maintained:



In [0]:

    
print 2 + 5 * 5



In [0]:

    
print (2 + 5) * 5

There are also a few notations that you might not be familiar with.

Exponents:



In [0]:

    
print 5 ** 2

Remainder (aka modulus or "mod"):



In [0]:

    
print 5 % 2

However, the most important difference to watch out for is integer division. Run the following examples:



In [0]:

    
print 6 / 2



In [0]:

    
print 5 / 2



In [0]:

    
print 5 / 3



In [0]:

    
print 5 / 4

As you can see, only 6 / 2 gave the answer you would expect -- all the other answers appear as if they've been rounded down. Why is this? It is a somewhat odd rule in Python 2 that whenever you divide two integers, Python always returns an integer answer. If the answer should have a decimal component, the result is simply rounded down to the nearest whole number (e.g. 2.5 becomes 2).

To get a proper answer, at least one of the numbers being divided must be a float:



In [0]:

    
print 5.0 / 2



In [0]:

    
print float(5) / 2



In [0]:

    
print 5 / float(2)

This is a very common source of errors, so always keep it in mind when you divide! When in doubt, convert one number with float().
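Sometimes the whole-number answer is exactly what you want, for example when splitting items into equal groups. As a sketch going slightly beyond this lesson, Python also has a // operator that always gives the rounded-down result, and it pairs naturally with % from earlier. The variable names are made up for illustration, and this code gives the same results in Python 2 and Python 3.

```python
samples = 5
perPlate = 2

fullPlates = samples // perPlate         # whole-number division: 2
leftOver = samples % perPlate            # remainder: 1
exactRatio = float(samples) / perPlate   # division with a float: 2.5

print(fullPlates)
print(leftOver)
print(exactRatio)
```

Writing // makes it clear to a reader that the rounding is intentional, while converting with float() makes it clear that you want the decimal answer.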

7. Commenting code

Our final topic for the day is a simple but important one: commenting your code.

A "comment" is a line of code that Python ignores when it executes your code. We mark these lines by starting them with a hash/pound sign (#). Here's an example:



In [0]:

    
# This is a comment line! It won't be printed!
print "Hello, friend"

# Use comments to leave notes on what your code is doing
print "Nice day we're having, isn't it?"

# ...or to temporarily prevent certain lines of code from executing
# print 1 + "2" + holy illegal operations, batman! + 42.8g$

# You can put them almost anywhere!
print "Well, goodbye then" # even here!

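You can also write notes that span multiple lines using triple quotes ("""). Strictly speaking, this creates a string that is never used rather than a true comment, but it is commonly used the same way: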



In [0]:

    
""" 
This here is a multi-line comment.
Make sure to end it with matching quotes!
"""
print "Aye aye!"

When should I use comments?

Comments are meant to improve the understandability of your code to another person (and possibly yourself in the future).
Use them whenever you think a piece of code might be particularly confusing to a reader.
You can also use them to "section" your code. One strategy is to write comments first, before the code, and use them as an outline for the structure of the code.
One mistake beginners make is to actually comment too much. You don't need to comment things that are standard and obvious, just the parts that are most likely to be confusing.

Most importantly: always keep your comments up to date! Inaccurate comments are worse than no comments at all, because they mislead the reader and can cause false assumptions. If you make major changes to your code, always check your comments to make sure they have not become inaccurate.
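As a sketch of the "comments first" strategy mentioned above, the three comment lines below were written first as an outline, and the code was then filled in under each one. The gene names come from earlier examples, but the variable names and read counts are made up for illustration.

```python
# Step 1: store the read counts for each gene
fmr1Reads = 120
actbReads = 10375

# Step 2: add the counts together
totalReads = fmr1Reads + actbReads

# Step 3: report the result
print(totalReads)
```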

8. Test your understanding: practice set 1

For the following blocks of code, first try to guess what the output will be, and then run the code yourself. These examples may introduce some ideas and common pitfalls that were not explicitly covered in the text above, so be sure to complete this section.



In [0]:

    
print "what's", "up"



In [0]:

    
print "what's" + "up"



In [0]:

    
print "I have", 5, "cats"



In [0]:

    
print "I have" + 5 + "cats"



In [0]:

    
print 9 - 6 * 2



In [0]:

    
print (9 - 6) * 2



In [0]:

    
print 24 % 6



In [0]:

    
print 24 % 7



In [0]:

    
print -3 ** 2



In [0]:

    
print 9 / 2



In [0]:

    
print 9.0 / 2



In [0]:

    
print 9 / float(2)



In [0]:

    
x = 5
print x * 3



In [0]:

    
x = "5"
print x * 3



In [0]:

    
x = "5"
print int(x) * 3



In [0]:

    
x = "cat"
y = x
print y



In [0]:

    
x = "cat"
y = "dog"
x = y
y = x
print y



In [0]:

    
x = "cat"
y = "dog"
print x + y



In [0]:

    
x = 5
x = 1
print x



In [0]:

    
x = 5
x + 1
print x



In [0]:

    
x = 2
y = 4
print (x * y) ** x



In [0]:

    
x = 16
print x ** 0.5