A crash course in Python

**Authors**: Thierry D.G.A Mondeel, Stefania Astrologo, Ewelina Weglarz-Tomczak & Hans V. Westerhoff
University of Amsterdam
2017

Inspired in part by material collected from various notebooks on the web. Some links to the originals are provided below:

Note: The goal here is not to give you a complete introduction to Python. There are separate university courses for that, and they fill a semester. We just want you to be familiar enough with Python to interact with prewritten code for the FBA calculations later on in the tutorial.

Some motivation

Python in Computational Biology: or why you should care about Python

  1. It's what we chose to use in this tutorial
  2. Python is a modern programming language developed in the early 1990s by Guido van Rossum -> A Dutch guy! (https://en.wikipedia.org/wiki/Guido_van_Rossum)
  3. Beginner Friendly
  4. Easy to understand and read
  5. It's free
  6. It is pervasive in computational biology

All you need to know

Some words of wisdom by Tim Peters, i.e. the Zen of Python.


In [2]:
# ignore the next two lines: they allow the notebook to show multiple outputs per cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all" 

import this


The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

**Assignment (lifelong):** Which of these (do not) apply to science as well? Or even, dare we say, to life in general?

Printing text in Python


In [ ]:
print("The classic view of the central dogma of biology states that \
'the coded genetic information hard-wired into DNA is transcribed into \
individual transportable cassettes, composed of messenger RNA (mRNA); \
each mRNA cassette contains the program for synthesis of a particular \
protein (or small number of proteins).'")

**Assignment (1 min):** print the name of your favorite protein below.


In [ ]:

Congratulations! You are now a Python programmer.

Numbers matter in biology

Luckily we can use Python as a calculator.

Below we show some examples. Do you understand what each one does?

TIP (Division): In Python 3, 3 / 2 returns 1.5 (floating-point division). Use 3 // 2 for integer division, which returns 1.


In [3]:
2 * 2
3/2
3//2
6%2
8%3
2**5


Out[3]:
4
Out[3]:
1.5
Out[3]:
1
Out[3]:
0
Out[3]:
2
Out[3]:
32

Python's built-in mathematical operators include +, -, *, ** for exponentiation, / for division, // for integer division (which discards the remainder), and % for the remainder of a division.
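The link between // and % can be checked directly: the two results always recombine into the original number. A small sketch with made-up numbers (the variable names and values are ours, chosen only for illustration):

```python
# Hypothetical numbers: 17 minutes split into 5-minute intervals.
total_minutes = 17
interval = 5

full_intervals = total_minutes // interval   # integer division gives 3
leftover = total_minutes % interval          # remainder gives 2

print(full_intervals, leftover)
# Quotient times divisor plus remainder gives back the original number.
print(full_intervals * interval + leftover == total_minutes)
```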

**Assignment (1 min):** Experiment with these a bit


In [ ]:

See: http://bionumbers.hms.harvard.edu/KeyNumbers.aspx

According to bionumbers (see link) E. coli has a volume of up to $5~\mu m^3$ and yeast has a volume of up to $160~\mu m^3$.

**Assignment (3 min):** Calculate in the cell below how many times bigger the volume of yeast is compared to E. coli.


In [ ]:

**Assignment (3 min):** Using the bionumbers page find the "water molecule diameter" and the diameter of a yeast cell. Then calculate how many water molecules you could theoretically lay side by side in a yeast cell.


In [ ]:

Variables

Variables are a fundamental concept of any programming language. A variable has a name and a value. Here is how to create a new variable in Python:


In [ ]:
avogadro = 6e23
print('How many molecules are contained in a mole? Answer:',avogadro,'molecules.')

And here is how to use an existing variable:


In [ ]:
print('How many molecules in 2 moles? Answer:',2*avogadro,'molecules')

**Assignment (1 min):** Calculate and print the number of molecules in one litre of a $0.33~\mu M$ solution (i.e. $0.33~\mu mol/L$)


In [ ]:

Getting help in the notebook

When you want to know what a command or function does, you can type a question mark in front of the command.


In [5]:
import math # load some common mathematical operations
?math.sqrt

When a command requires input arguments, as for math.sqrt(put a number here), a handy keyboard shortcut is Shift-Tab. A tooltip will light up showing you the various arguments you can pass to the command.

**Assignment (2 min):** Ask for help on the "math.sqrt" command using the Shift-Tab method described above. Start by typing math.sqrt below and then press Shift-Tab.

What kind of methods does math contain?

When we import a library like "math", which we did in the code cell above, this library will contain many different functions like sqrt. If you want to find out which ones, type "math." (notice the dot!) and then press "TAB".

**Assignment (3 min):** Use the TAB key to find some other math functions and play around with them. You may also want to look at their help text.
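If the keyboard shortcuts do not work in your setup, the built-in functions dir() and help() give similar information in plain Python. A minimal sketch (the variable name `names` is ours):

```python
import math

# dir() returns the names defined in a module, as a list of strings.
names = dir(math)
print(names)

# help() prints the documentation of a function.
help(math.sqrt)
```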


In [ ]:

There are different types of variables. So far we have used numbers: whole numbers such as 2 are integers, while numbers with a decimal point or an exponent, such as avogadro = 6e23, are floating-point numbers, which represent real numbers. Other important types include strings to represent text and booleans to represent True/False values. Here are a few examples:


In [7]:
somefloat = 3.1415
sometext = 'pi is about'  # You can also use double quotes.
print(sometext, somefloat)  # Display several variables.
I_am_true = False
I_am_true


pi is about 3.1415
Out[7]:
False

Note how we used the # character to write comments. Whereas Python discards the comments completely, adding comments in the code is important when the code is to be read by other humans (including yourself in the future).
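To check which type a variable has, you can use the built-in type() function. A short sketch; the variable names and values below are made up for illustration:

```python
# Made-up example values, one for each basic type.
n_cells = 100             # int: a whole number
growth_rate = 0.35        # float: a number with a decimal point
organism = 'yeast'        # str: a piece of text
is_growing = True         # bool: True or False

print(type(n_cells))
print(type(growth_rate))
print(type(organism))
print(type(is_growing))
```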

**Assignment (3 min):** Make your own piece of text, e.g. your name and age, and print it to the screen. Do not just write a single string of text; make your age a variable like somefloat in the example above.


In [ ]:

Murphy's law: (In a tutorial) Anything that can go wrong will go wrong

Python only understands certain code. When you write something Python doesn't understand, it throws an exception and tries to explain what went wrong, but it can only speak in a broken Pythonesque English. Let's see some examples by running these code blocks. This is helpful later on because you will likely encounter (and produce) some errors.


In [ ]:
gibberish

In [ ]:
*adsflf_

In [ ]:
print('Hello'

In [ ]:
1v34

In [ ]:
2000 / 0

Python tries to tell you where it stopped understanding, but in the above examples, each program is only 1 line long.

It also tries to show you where on the line the problem happened with a caret ("^").

Finally it tells you the type of thing that went wrong (NameError, SyntaxError, ZeroDivisionError) and a bit more information like "name 'gibberish' is not defined" or "unexpected EOF while parsing" (newer Python versions phrase the latter as "'(' was never closed").

Unfortunately you might not find "unexpected EOF while parsing" too helpful. EOF stands for End of File, but what file? What is parsing? Python does its best, but it does take a bit of time to develop a knack for what these messages mean. If you run into an error you don't understand, please ask a tutor.
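One way to look at just the error type and its message, without the full traceback, is to catch the exception yourself. This goes slightly beyond what you need for this tutorial and is only a sketch; try/except is standard Python, while the variable names are our own:

```python
# Catch a division by zero and keep the name of the error type.
try:
    2000 / 0
except ZeroDivisionError as error:
    error_type = type(error).__name__
    print(error_type, '->', error)

# Catch the use of a name that was never defined.
try:
    gibberish
except NameError as error:
    name_error_message = str(error)
    print(type(error).__name__, '->', name_error_message)
```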

Types of variables in Python

When you define a variable in Python it has a type. Above we dealt with numbers which are of type ... Below we briefly introduce the other types you might see in the rest of the tutorial.

Simply put, there is text, i.e. strings, and there are two kinds of containers, namely lists and dictionaries.

All together now

An example for each of the most important types in Python


In [ ]:
x = 'One definition of systems biology: the study of the interactions between the components of biological systems, \
and how these interactions give rise to the function and behavior of that system (for example, the enzymes and \
metabolites in a metabolic pathway or the heart beats)' # string
print(type(x))
print(x)
print()

x = [0.5,2,24] # list
print(type(x))
print(x)
print('The (rough) cell cycle time of e.coli =',x[0],'hrs, of yeast =',x[1],'hrs, of a human cell',x[2],'hrs')
print()

x = {'e. coli':5,'yeast':12,'human':2.9e3} # dictionary
print(type(x))
print(x)
print('The genome size of e.coli =',x['e. coli'],'Mbp, of yeast =',x['yeast'],'Mbp, of a human cell',x['human'],'Mbp')

The Written Word, i.e. strings

Numbers are great... but most of our day-to-day computing needs involve text, from emails to tweets to documents. Or in biology: DNA sequences, chemical formulas, hyperlinks between databases etc.

We have already seen a couple of strings in Python. Programmers call text strings because they are weird like that. From now on we will only refer to strings, but we just mean pieces of text inside our code.


In [8]:
"Hello, World!"


Out[8]:
'Hello, World!'

Strings are surrounded by quotes. Without the quotes Hello by itself would be viewed as a variable name.

You can use either double quotes (") or single quotes (') for text/strings. As we saw before we can also save text in variables.

Let's use strings with variables!


In [12]:
your_name = "James Watson"
print("Hello,",your_name)


Hello, James Watson

Strings in Python are a bit more complicated because the operations on them aren't just + and * (though those are valid operations).
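As a quick illustration of + and * on strings: + glues two strings together and * repeats a string. A minimal sketch with a made-up sequence:

```python
codon = 'ATG'              # made-up example sequence
tail = 'A' * 10            # repetition: ten copies of 'A'
sequence = codon + tail    # concatenation: one string after the other

print(sequence)
print(len(sequence))       # len() also works on strings: 13 characters
```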

Dot notation and object oriented programming

Python, like many programming languages, supports Object Oriented Programming, or OOP for short. In this paradigm, we approach ideas as Objects much as we do in the real world. Each Object is an instance of a Class, i.e. a type of object. Such an object may have certain properties, or functions that can be applied to it.

So what does all this have to do with dot notation? Dot notation allows us to tell an instance of a class to use one of the functions inside that class. That is why we access the sqrt function from the math module with the dot, and why we access the 'upper' function of a string as my_string.upper().
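A few concrete uses of the dot notation, as a sketch. The sequence is made up; upper, count and replace are standard string functions and sqrt is part of the math module:

```python
import math

dna = 'atgcatgc'                  # a made-up sequence

print(dna.upper())                # 'ATGCATGC'
print(dna.count('g'))             # number of times 'g' occurs: 2
print(dna.replace('t', 'u'))      # 'augcaugc'
print(math.sqrt(16))              # 4.0
```

Note that these string functions return a new string; the original dna variable is left unchanged.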

**Assignment (2 min):** Below, after your_string, type a dot and press Tab. You will get a list of functions you can apply to the string. Pick two commands that seem interesting to you.

Note that you have to define a string first to be able to get help. So first execute the cell so that the variable your_string is known. Then using the Tab key find one of your choice.

Note: Don't worry about the ones that are difficult to understand. This is just to get you comfortable with the dot notation and finding functions and properties of objects.


In [14]:
your_string = 'something'
your_string


Out[14]:
'something'

Lists

A list contains a sequence of items. You can concisely instruct Python to perform repeated actions on the elements of a list. Let's first create a list of numbers:


In [ ]:
items = [1, 3, 0, 4, 1]

Note the syntax we used to create the list: square brackets [], and commas , to separate the items.

The built-in function len() returns the number of elements in a list:


In [ ]:
len(items)

Now, let's compute the sum of all elements in the list. Python provides a built-in function for this:


In [ ]:
sum(items)

We can also access individual elements in the list, using the following syntax:


In [ ]:
items[0]

items[-1]

Note that indexing starts at 0 in Python: the first element of the list is indexed by 0, the second by 1, and so on. Also, -1 refers to the last element, -2 to the penultimate element, and so on.

The same syntax can be used to alter elements in the list:


In [ ]:
items[1] = 9
items

We can access sublists with the following syntax:


In [ ]:
items[1:3]

Here, 1:3 represents a slice going from element 1 included (this is the second element of the list) to element 3 excluded. Thus, we get a sublist with the second and third element of the original list. The first-included/last-excluded asymmetry leads to an intuitive treatment of overlaps between consecutive slices. Also, note that slicing a list creates a new list (a copy of those elements), not a dynamic view of the original; changing elements in the sublist does not change the original list.
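Slicing a Python list produces a new list, which you can verify yourself. A minimal check, using a separate list called numbers (our own name) so the items list is left alone:

```python
numbers = [1, 9, 0, 4, 1]
sublist = numbers[1:3]     # a new list holding the second and third element

sublist[0] = 100           # change the first element of the sublist
print(sublist)             # [100, 0]
print(numbers)             # [1, 9, 0, 4, 1], the original list is untouched
```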

**Assignment (5 min):** In the code cell below make a list of the first 10 prime numbers: https://en.wikipedia.org/wiki/Prime_number. Use the sum() function to figure out the sum of the first 10 prime numbers.


In [ ]:

**Assignment (1 min):** Print the second-to-last prime number from your list to the screen.


In [ ]:

Dictionaries

Dictionaries contain key-value pairs. They are extremely useful and common. They allow you to map, or point, keys to values. In the example below the letters a,b,c are now pointing to the numbers 1,2,3.

For a flux balance analysis application of a dictionary, you can think of a dictionary that points each of the reactions in a network to its flux in the FBA solution.

You can access the value a certain key points to with square bracket notation:


In [ ]:
my_dict = {'a': 1, 'b': 2, 'c': 3}
print('a:', my_dict['a'])

In [19]:
list(my_dict.keys())


Out[19]:
['a', 'b', 'c']

The keys in a dictionary do not have to be strings; any immutable value, including numbers, can be a key:


In [15]:
my_dict = {18: 1, 23: 2, 0: 3}
my_dict[18]


Out[15]:
1
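Returning to the flux balance analysis idea mentioned earlier, a dictionary that maps reactions to fluxes could look like the sketch below. The reaction names and flux values are hypothetical and only serve to show how values are read, changed and added:

```python
# Hypothetical reaction names and flux values, for illustration only.
fluxes = {'glucose_uptake': 10.0, 'glycolysis': 8.5, 'biomass': 0.9}

print(fluxes['glycolysis'])      # read the value a key points to

fluxes['biomass'] = 1.0          # change the value of an existing key
fluxes['oxygen_uptake'] = 15.0   # add a new key-value pair

print(len(fluxes))               # number of key-value pairs: 4
print('biomass' in fluxes)       # check whether a key exists: True
```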

**Assignment (1 min):** Make your own dictionary. For each person in your group add a 'key' to the dictionary, the name of the person, and a value, the age of the person. Then print the dictionary to the screen.


In [ ]:

for loops

We can run through all elements of a list using a for loop:


In [16]:
a_list = [1,2,3,4,5,6]
for number in a_list:
    number


Out[16]:
1
Out[16]:
2
Out[16]:
3
Out[16]:
4
Out[16]:
5
Out[16]:
6
  • Note that the for loop steps in sequence through the numbers in the list.
  • At every iteration, the variable number is assigned the value of the next number in the list.
  • You may call the variable number whatever you wish, as long as you also change it in the third line.

As a more complex example, we can also loop over the keys of a dictionary.


In [17]:
genome_sizes = {'e. coli':5,'yeast':12,'human':2.9e3} # dictionary

print('A list of genome-sizes in #Mbp:')

for organism in genome_sizes.keys():
    print(organism,genome_sizes[organism])


A list of genome-sizes in #Mbp:
human 2900.0
e. coli 5
yeast 12

There are several things to note here:

  • The for organism in genome_sizes.keys() syntax means that the variable organism is assigned a new value at every iteration. It holds each key of the dictionary, one at a time.
  • Note the colon : at the end of the for statement. Forgetting it will lead to a syntax error!
  • The print statement will be executed once for every key in the dictionary.
  • Note the four spaces before print: this is called the indentation. Python uses the indentation to decide which lines belong to the body of the loop.
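An alternative way to write the same loop is to ask the dictionary for its key-value pairs with items(), which is a standard dictionary function. A minimal sketch; the name size for the second loop variable is our own choice:

```python
genome_sizes = {'e. coli': 5, 'yeast': 12, 'human': 2.9e3}

# items() hands over each key together with its value.
for organism, size in genome_sizes.items():
    print(organism, size, 'Mbp')
```

This avoids looking up genome_sizes[organism] inside the loop, but both versions print the same information.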

**Assignment (3 min):** Write your own for loop that prints each element of your list of 10 prime numbers divided by 2 separately to the screen.

list comprehensions

Python supports a concise syntax to perform a given operation on all elements of a list using for loops:


In [21]:
items = [1,2,3,4,5,6]
squares = [item * item for item in items]
squares


Out[21]:
[1, 4, 9, 16, 25, 36]

This is called a list comprehension. A new list is created here; it contains the squares of all numbers in the list. This concise syntax leads to highly readable and Pythonic code.
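To see what the list comprehension does, it can help to compare it with an ordinary for loop that builds the same list step by step. A sketch (append is the standard list function for adding an element at the end; the names squares_loop and squares_comprehension are ours):

```python
items = [1, 2, 3, 4, 5, 6]

# The long way: start with an empty list and add one square at a time.
squares_loop = []
for item in items:
    squares_loop.append(item * item)

# The short way: a list comprehension.
squares_comprehension = [item * item for item in items]

print(squares_loop == squares_comprehension)   # True: both lists are equal
```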

**Assignment (3 min):** Write a list comprehension that calculates the square of each of the first 10 prime numbers. Start from the example above but loop over the list of prime numbers you defined earlier.


In [ ]:

Python supports a concise syntax to select all elements in a list that satisfy certain properties. Here is how to create a sublist with only even numbers:


In [22]:
even = [item for item in items if item % 2 == 0]
even


Out[22]:
[2, 4, 6]

This is also a form of list comprehension.
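The same filtering idea also works when looping over the keys of a dictionary. A small sketch that reuses the genome sizes from before; the cut-off of 10 Mbp and the name large_genomes are arbitrary choices for illustration:

```python
genome_sizes = {'e. coli': 5, 'yeast': 12, 'human': 2.9e3}

# Keep only the organisms whose genome is larger than 10 Mbp.
large_genomes = [organism for organism in genome_sizes
                 if genome_sizes[organism] > 10]
print(large_genomes)
```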

**Assignment (3 min):** As a first sanity check on your prime numbers, write a list comprehension that lists the numbers in your prime numbers list that are divisible by 2, except of course 2 itself. The result should be an empty list.


In [ ]:

Now that you are a Python genius we can move to flux balance analysis! :)


In [ ]: