Procedural programming in Python

Topics

  • Tuples, lists and dictionaries
  • Flow control, part 1
    • If
    • For
      • range() function
  • Some hacky hack time
  • Flow control, part 2
    • Functions

Tuples

Let's begin by creating a tuple called my_tuple that contains three elements.


In [1]:
my_tuple = ('I', 'like', 'cake')
my_tuple


Out[1]:
('I', 'like', 'cake')

Tuples are simple containers for data. They are ordered, meaning the order of the elements when the tuple is created is preserved. We can get values from our tuple by indexing, similar to what we were doing with pandas.


In [2]:
my_tuple[0]


Out[2]:
'I'

Recall that Python indexes start at 0, so the first element of a tuple is at index 0 and the last is at index length - 1. You can also count from the end toward the front by using negative (-) indexes, e.g.


In [3]:
my_tuple[-1]


Out[3]:
'cake'

You can also access a range of elements, e.g. the first two, the first three, by using the : to expand a range. This is called slicing.


In [4]:
my_tuple[0:2]


Out[4]:
('I', 'like')

In [5]:
my_tuple[0:3]


Out[5]:
('I', 'like', 'cake')

What do you notice about how the upper bound is referenced?

If you omit one end, the slice runs to that end of the tuple; if you omit both, the : expands to the entire tuple.


In [6]:
my_tuple[1:]


Out[6]:
('like', 'cake')

In [9]:
my_tuple[:-1]


Out[9]:
('I', 'like')

In [10]:
my_tuple[:]


Out[10]:
('I', 'like', 'cake')
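
A small sketch of the rule the upper-bound question is pointing at: the start index is included and the stop index is excluded, so a slice [a:b] holds b - a elements. The tuple words is a stand-in defined only for this example.

```python
# Hypothetical example tuple, separate from my_tuple above.
words = ('I', 'like', 'cake')

first_two = words[0:2]      # indexes 0 and 1; index 2 is excluded
from_second = words[1:]     # omitted stop: runs to the end
all_but_last = words[:-1]   # a negative stop counts from the end
copy_of_all = words[:]      # both ends omitted: every element

print(first_two, len(first_two))
print(words[:2] + words[2:] == words)  # the two halves rejoin into the original
```

Because the stop index is excluded, words[:n] and words[n:] never overlap and never leave a gap.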

Tuples have a key feature that distinguishes them from other types of object containers in Python. They are immutable. This means that once the values are set, they cannot change.


In [8]:
my_tuple[2]


Out[8]:
'cake'

So what happens if I decide that I really prefer pie over cake?


In [12]:
my_tuple[2] = 'pie'


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-ccb3e9a1b519> in <module>()
----> 1 my_tuple[2] = 'pie'

TypeError: 'tuple' object does not support item assignment

Facts about tuples:

  • You can't add elements to a tuple. Tuples have no append or extend method.
  • You can't remove elements from a tuple. Tuples have no remove or pop method.
  • You can also use the in operator to check if an element exists in the tuple.

So then, what are the use cases of tuples?

  • Speed
  • Write-protects data that other pieces of code should not alter

You can change which tuple a variable refers to by assigning a new tuple to it, but you can't modify the tuple itself.


In [13]:
my_tuple


Out[13]:
('I', 'like', 'cake')

In [14]:
my_tuple = ('I', 'love', 'pie')
my_tuple


Out[14]:
('I', 'love', 'pie')

There is a really handy operator, in, that can be used with tuples. It returns True if an element is present in a tuple and False otherwise.


In [15]:
'love' in my_tuple


Out[15]:
True

Finally, tuples can contain different types of data, not just strings.


In [ ]:


In [26]:
import math
my_second_tuple = (42, 'Elephants', 'ate', math.pi)
my_second_tuple


Out[26]:
(42, 'Elephants', 'ate', 3.141592653589793)

Numerical operators work... Sort of. What happens when you add?


In [27]:
my_second_tuple + 'plus'


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-5dc4624ac1e5> in <module>()
----> 1 my_second_tuple + 'plus'

TypeError: can only concatenate tuple (not "str") to tuple

Not what you expected? What about adding two tuples?


In [28]:
my_second_tuple + my_tuple


Out[28]:
(42, 'Elephants', 'ate', 3.141592653589793, 'I', 'love', 'pie')

Other operators: -, /, *
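
As a quick check of those operators, assuming standard Python 3 behavior: * with an integer repeats a tuple, while - and / are not defined for tuples and raise a TypeError. The tuple pair is made up for this sketch.

```python
pair = ('la', 'di')   # hypothetical example tuple

repeated = pair * 2   # repetition: ('la', 'di', 'la', 'di')
print(repeated)

# Subtraction is not defined for tuples.
minus_failed = False
try:
    pair - pair
except TypeError as err:
    minus_failed = True
    print('- is not supported:', err)

# Neither is division.
divide_failed = False
try:
    pair / 2
except TypeError as err:
    divide_failed = True
    print('/ is not supported:', err)
```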


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Questions about tuples before we move on?


Lists

Let's begin by creating a list called my_list that contains three elements.


In [69]:
my_list = ['I', 'like', 'cake']
my_list


Out[69]:
['I', 'like', 'cake']

At first glance, tuples and lists look pretty similar. Notice that lists use '[' and ']' instead of '(' and ')'. Indexing works the same way, though: the first entry is at 0 and the last is at -1.


In [70]:
my_list[0]


Out[70]:
'I'

In [71]:
my_list[-1]


Out[71]:
'cake'

In [72]:
my_list[0:3]


Out[72]:
['I', 'like', 'cake']

Lists, however, unlike tuples, are mutable.


In [73]:
my_list[2] = 'pie'
my_list


Out[73]:
['I', 'like', 'pie']

Multiple elements in the list can even be changed at once!


In [74]:
my_list[1:] = ['love', 'puppies']
my_list


Out[74]:
['I', 'love', 'puppies']

You can still use the in operator.


In [75]:
'puppies' in my_list


Out[75]:
True

In [76]:
'kittens' in my_list


Out[76]:
False

So when to use a tuple and when to use a list?

  • Use a list when you will modify it after it is created.

How can you modify a list? You have already seen assignment by index. Let's start with an empty list.


In [60]:
my_new_list = []
my_new_list


Out[60]:
[]

We can add to the list using the append method on it.


In [61]:
my_new_list.append('Now')
my_new_list


Out[61]:
['Now']

We can use the + operator to create a longer list from the contents of two lists. The result is a new list; neither of the original lists is changed.


In [62]:
my_new_list + my_list


Out[62]:
['Now', 'I', 'love', 'puppies']

One useful thing to know about a list is how many elements are in it. This can be found with the len function.


In [63]:
len(my_list)


Out[63]:
3

Some other handy functions with lists:

  • max
  • min
  • cmp (Python 2 only; it was removed in Python 3)

In [ ]:

Sometimes you have a tuple and you need to make it a list. You can cast the tuple to a list with list(my_tuple)


In [64]:
list(my_tuple)


Out[64]:
['I', 'love', 'pie']

What in the above told us it was a list?

You can also use the type function to figure out the type.


In [65]:
type(my_tuple)


Out[65]:
tuple

In [66]:
type(list(my_tuple))


Out[66]:
list

There are other useful methods on lists, including:

method                              description
list.append(obj)                    Appends object obj to the list
list.count(obj)                     Returns how many times obj occurs in the list
list.extend(seq)                    Appends the contents of seq to the list
list.index(obj)                     Returns the lowest index at which obj appears
list.insert(index, obj)             Inserts object obj into the list at offset index
list.pop([index])                   Removes and returns the item at index (the last item by default)
list.remove(obj)                    Removes the first occurrence of obj from the list
list.reverse()                      Reverses the objects of the list in place
list.sort(key=None, reverse=False)  Sorts the list in place, optionally using a key function
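
Here is a short sketch that exercises several of these methods on a throwaway list (the name pets and its contents are invented for illustration). Methods such as append and sort change the list in place and return None rather than a new list.

```python
pets = ['cat', 'dog']          # hypothetical example list

pets.append('fish')            # add one element at the end
pets.extend(['dog', 'newt'])   # add every element of another sequence
pets.insert(1, 'ant')          # insert before index 1
# pets is now ['cat', 'ant', 'dog', 'fish', 'dog', 'newt']

dog_count = pets.count('dog')  # how many times 'dog' occurs
first_dog = pets.index('dog')  # lowest index holding 'dog'

last = pets.pop()              # removes and returns the last element
pets.remove('dog')             # removes only the first 'dog'
pets.sort()                    # sorts in place, alphabetically

print(pets, dog_count, first_dog, last)
```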

Try some of them now.

my_list.count('I')
my_list

my_list.append('I')
my_list

my_list.count('I')
my_list

#my_list.index(42)

my_list.index('puppies')
my_list

my_list.insert(my_list.index('puppies'), 'furry')
my_list

In [82]:
my_list.index('I')


Out[82]:
0

In [86]:
my_list.reverse()

In [87]:
my_list


Out[87]:
['I', 'I', 'puppies', 'love', 'I']

In [ ]:

Any questions about lists before we move on?


Dictionaries

Dictionaries are similar to tuples and lists in that they hold a collection of objects. Dictionaries, however, allow an additional indexing mode: keys. Think of a real dictionary where the elements in it are the definitions of the words and the keys to retrieve the entries are the words themselves.

word         definition
tuple        An immutable collection of ordered objects
list         A mutable collection of ordered objects
dictionary   A mutable collection of named objects

Let's create this data structure now. Dictionaries, like tuples and lists, have their own delimiters: '{' and its evil twin '}'.


In [88]:
my_dict = { 'tuple' : 'An immutable collection of ordered objects',
            'list' : 'A mutable collection of ordered objects',
            'dictionary' : 'A mutable collection of objects' }
my_dict


Out[88]:
{'dictionary': 'A mutable collection of objects',
 'list': 'A mutable collection of ordered objects',
 'tuple': 'An immutable collection of ordered objects'}

We access items in the dictionary by name, e.g.


In [89]:
my_dict['dictionary']


Out[89]:
'A mutable collection of objects'

Since the dictionary is mutable, you can change the entries.


In [90]:
my_dict['dictionary'] = 'A mutable collection of named objects'
my_dict


Out[90]:
{'dictionary': 'A mutable collection of named objects',
 'list': 'A mutable collection of ordered objects',
 'tuple': 'An immutable collection of ordered objects'}

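Notice that the displayed order is not the order in which we entered the items! Since Python 3.7, dictionaries do remember insertion order, but the display above shows the keys sorted. Either way, look entries up by key rather than by position.

And we can add new items to the dictionary.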


In [91]:
my_dict['cabbage'] = 'Green leafy plant in the Brassica family'
my_dict


Out[91]:
{'cabbage': 'Green leafy plant in the Brassica family',
 'dictionary': 'A mutable collection of named objects',
 'list': 'A mutable collection of ordered objects',
 'tuple': 'An immutable collection of ordered objects'}

To delete an entry, we can't just set it to None. The key stays in the dictionary.


In [100]:
my_dict['cabbage'] = None
my_dict


Out[100]:
{'cabbage': None,
 'dictionary': 'A mutable collection of named objects',
 'list': 'A mutable collection of ordered objects',
 'tuple': 'An immutable collection of ordered objects'}

To delete it properly, we need to pop that specific entry.


In [101]:
my_dict.pop('cabbage', None)
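
A brief sketch of the difference between setting a value to None and removing the key, using a throwaway dictionary (veggies is invented for this example). pop returns the removed value, and its second argument is a fallback that is returned when the key is missing; del is the statement form of the same removal.

```python
veggies = {'cabbage': 'leafy', 'carrot': 'root', 'pea': 'legume'}

veggies['cabbage'] = None
still_there = 'cabbage' in veggies            # the key survives with value None

removed = veggies.pop('cabbage', None)        # removes the key, returns its value
missing = veggies.pop('turnip', 'not found')  # key absent: returns the fallback
del veggies['pea']                            # del also removes a key (KeyError if absent)

print(still_there, removed, missing)
print(veggies)
```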

You can use other objects as keys, but that is a topic for another time. You can also mix and match key types, e.g.


In [102]:
my_new_dict = {}
my_new_dict[1] = 'One'
my_new_dict['42'] = 42
my_new_dict


Out[102]:
{1: 'One', '42': 42}

You can get a list of keys in the dictionary by using the keys method.


In [103]:
my_dict.keys()


Out[103]:
dict_keys(['tuple', 'list', 'dictionary'])

Similarly, you can get the contents of the dictionary with the items method.


In [104]:
my_dict.items()


Out[104]:
dict_items([('tuple', 'An immutable collection of ordered objects'), ('list', 'A mutable collection of ordered objects'), ('dictionary', 'A mutable collection of named objects')])

We can use the keys for fun stuff, e.g. with the in operator.


In [105]:
'dictionary' in my_dict.keys()


Out[105]:
True

This is equivalent to using in on the dictionary itself.


In [106]:
'dictionary' in my_dict


Out[106]:
True

Notice that in only checks the keys; it doesn't work for values.


In [107]:
'A mutable collection of ordered objects' in my_dict


Out[107]:
False
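
To test whether something appears among the values rather than the keys, one option is the values method. A minimal sketch with a stand-in dictionary (definitions is made up here):

```python
definitions = {'list': 'A mutable collection of ordered objects'}

target = 'A mutable collection of ordered objects'
in_keys = target in definitions             # in on a dict checks keys only
in_values = target in definitions.values()  # search the values instead

print(in_keys, in_values)
```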

Other dictionary methods:

method                              description
dict.clear()                        Removes all elements from dict
dict.get(key, default=None)         Returns the value for key, or default if key doesn't exist in dict
dict.items()                        Returns a view of dict's (key, value) tuple pairs
dict.keys()                         Returns a view of the dictionary's keys
dict.setdefault(key, default=None)  Similar to get, but also sets the value of key to default if key doesn't exist in dict
dict.update(dict2)                  Adds the key / value pairs in dict2 to dict
dict.values()                       Returns a view of the dictionary's values
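
The difference between get and setdefault is easy to miss, so here is a small sketch (the dictionary stock is hypothetical). get only reads, while setdefault also stores the default when the key is missing.

```python
stock = {'apples': 3}                   # hypothetical example dictionary

pears = stock.get('pears', 0)           # returns the default; stock is unchanged
size_after_get = len(stock)

plums = stock.setdefault('plums', 0)    # returns the default and stores it
stock.update({'apples': 5, 'figs': 2})  # overwrite one key, add another

print(pears, size_after_get, plums)
print(stock)
print(list(stock.values()))
```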

Feel free to experiment...


In [122]:
my_dict.setdefault('dictionar', "None")


Out[122]:
'None'

In [123]:
my_dict


Out[123]:
{'dictionar': 'None',
 'dictionary': 'A mutable collection of named objects',
 'list': 'A mutable collection of ordered objects',
 'tuple': 'An immutable collection of ordered objects'}

In [129]:
dict2 = {'r':1,'a':2}
my_dict.update(dict2)
my_dict


Out[129]:
{'a': 2,
 'dictionar': 'None',
 'dictionary': 'A mutable collection of named objects',
 'list': 'A mutable collection of ordered objects',
 'r': 1,
 'tuple': 'An immutable collection of ordered objects'}


Flow control

Flow control figure

Flow control refers to how programs handle loops, conditional execution, and the order of operations. Let's start with conditionals, or the venerable if statement.

Let's start with a simple list of instructors for these classes.


In [131]:
instructors = ['Dave', 'Jim', 'Dorkus the Clown']
instructors


Out[131]:
['Dave', 'Jim', 'Dorkus the Clown']

If

If statements can be used to execute a line or block of code when a particular condition is satisfied. E.g. let's print something based on the entries in the list.


In [132]:
if 'Dorkus the Clown' in instructors:
    print('#fakeinstructor')


#fakeinstructor

Usually we want conditional logic on both sides of a binary condition, e.g. some action when True and some when False


In [133]:
if 'Dorkus the Clown' in instructors:
    print('There are fake names for class instructors in your list!')
else:
    print("Nothing to see here")


There are fake names for class instructors in your list!

There is a special do-nothing keyword, pass, that lets you leave an arm of a conditional empty, e.g.


In [134]:
if 'Jim' in instructors:
    print("Congratulations!  Jim is teaching, your class won't stink!")
else:
    pass


Congratulations!  Jim is teaching, your class won't stink!

Note: what have you noticed in this session about quotes? What is the difference between ' and "?

Another simple example:


In [135]:
if True is False:
    print("I'm so confused")
else:
    print("Everything is right with the world")


Everything is right with the world

It is always good practice to handle all cases explicitly. Conditional fall-through is a common source of bugs.

Sometimes we wish to test multiple conditions. Use if, elif, and else.


In [136]:
my_favorite = 'pie'

if my_favorite == 'cake':  # use == to compare values; is tests object identity
    print("He likes cake!  I'll start making a double chocolate velvet cake right now!")
elif my_favorite == 'pie':
    print("He likes pie!  I'll start making a cherry pie right now!")
else:
    print("He likes " + my_favorite + ".  I don't know how to make that.")


He likes pie!  I'll start making a cherry pie right now!

Conditions can be combined with and, or, and not. E.g.


In [137]:
my_favorite = 'pie'

if my_favorite == 'cake' or my_favorite == 'pie':
    print(my_favorite + " : I have a recipe for that!")
else:
    print("Ew!  Who eats that?")


pie : I have a recipe for that!
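
A sketch that combines and with not, and uses the in operator from earlier as a shorter way to write a chain of or comparisons. The names is_hungry and recipes are invented for this example.

```python
my_favorite = 'pie'
is_hungry = True              # hypothetical extra condition
recipes = ('cake', 'pie')     # hypothetical tuple of known recipes

messages = []
if my_favorite in recipes and is_hungry:   # both conditions must hold
    messages.append('Baking ' + my_favorite)
if not my_favorite == 'cake':              # not flips True and False
    messages.append('No cake today')

print(messages)
```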

For

For loops are the standard loop, though while is also common. A for loop has the general form:

for items in list:
    do stuff

For loops and collections like tuples, lists and dictionaries are natural friends.


In [138]:
for instructor in instructors:
    print(instructor)


Dave
Jim
Dorkus the Clown

You can combine loops and conditionals:


In [139]:
for instructor in instructors:
    if instructor.endswith('Clown'):
        print(instructor + " doesn't sound like a real instructor name!")
    else:
        print(instructor + " is so smart... all those gooey brains!")


Dave is so smart... all those gooey brains!
Jim is so smart... all those gooey brains!
Dorkus the Clown doesn't sound like a real instructor name!

To iterate over a dictionary, you can use the keys method.


In [160]:
my_dict


Out[160]:
{'a': 2,
 'dictionar': 'None',
 'dictionary': 'A mutable collection of named objects',
 'list': 'A mutable collection of ordered objects',
 'r': 1,
 'tuple': 'An immutable collection of ordered objects'}

In [ ]:


In [161]:
for key in my_dict.keys():
    if len(key) < 5:
        print(my_dict[key])


A mutable collection of ordered objects
1
2
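
When a loop needs both the key and the value, iterating over items() avoids the extra lookup. A minimal sketch with a stand-in dictionary (ages is invented):

```python
ages = {'Ada': 36, 'Bob': 7, 'Cy': 41}   # hypothetical example data

adults = []
for name, age in ages.items():   # each item is a (key, value) tuple
    if age >= 18:
        adults.append(name)

print(adults)
```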

range()

Since for operates over lists, it is common to want to do something like:

NOTE: C-like
for (i = 0; i < 3; ++i) {
    print(i);
}

The Python equivalent is:

for i in [0, 1, 2]:
    do something with i

What happens when the range you want to sample is big, e.g.

NOTE: C-like
for (i = 0; i < 1000000000; ++i) {
    print(i);
}

That would be a real pain in the rear to have to write out the entire list from 0 to 999999999.

Enter the range() function. E.g. range(3) covers 0, 1, 2.


In [164]:
range(3)


Out[164]:
range(0, 3)

Notice that Python 3 has a dedicated range type. It saves memory and time compared with an explicit list, because the numbers are produced only as they are needed. To show the contents as a list, we can use a type cast, as we did with the tuple above.

Sometimes, in older Python docs, you will see xrange. In Python 2, xrange was the memory-saving version and range returned an actual list. Beware of this!


In [163]:
list(range(3))


Out[163]:
[0, 1, 2]
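
range also accepts a start and a step, following the same exclusive upper bound. A quick sketch, including the billion-element case from above, which stays cheap because no list is built:

```python
one_to_three = list(range(1, 4))     # start at 1, stop before 4
every_third = list(range(0, 10, 3))  # step of 3
countdown = list(range(3, 0, -1))    # negative step counts down

big = range(1000000000)              # no billion-element list is created
print(one_to_three, every_third, countdown)
print(len(big), big[-1])
```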

Remember earlier with slicing, the syntax :3 meant [0, 1, 2]? Well, the same upper bound philosophy applies here.


In [165]:
for index in range(3):
    instructor = instructors[index]
    if instructor.endswith('Clown'):
        print(instructor + " doesn't sound like a real instructor name!")
    else:
        print(instructor + " is so smart... all those gooey brains!")


Dave is so smart... all those gooey brains!
Jim is so smart... all those gooey brains!
Dorkus the Clown doesn't sound like a real instructor name!

This would probably be better written as


In [166]:
for index in range(len(instructors)):
    instructor = instructors[index]
    if instructor.endswith('Clown'):
        print(instructor + " doesn't sound like a real instructor name!")
    else:
        print(instructor + " is so smart... all those gooey brains!")


Dave is so smart... all those gooey brains!
Jim is so smart... all those gooey brains!
Dorkus the Clown doesn't sound like a real instructor name!

But overall, it isn't very Pythonic to use indexes like that (unless you need the index for another reason in the loop); you would opt instead for the for instructor in instructors form.
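
If you do need the index alongside the item, the built-in enumerate gives you both without indexing by hand. A sketch, with the instructors list repeated so the block runs on its own:

```python
instructors = ['Dave', 'Jim', 'Dorkus the Clown']

labels = []
for index, instructor in enumerate(instructors):  # yields (index, item) pairs
    labels.append('%d: %s' % (index, instructor))

print(labels)
```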

More often, you are doing something with the numbers that requires them to be integers, e.g. math.


In [1]:
total = 0  # named total so the built-in sum function is not shadowed
for i in range(10):
    total += i
print(total)


45

For loops can be nested

Note: for more on formatting strings, see: https://pyformat.info


In [2]:
for i in range(1, 4):
    for j in range(1, 4):
        print('%d * %d = %d' % (i, j, i*j))  # Note string formatting here, %d means an integer


1 * 1 = 1
1 * 2 = 2
1 * 3 = 3
2 * 1 = 2
2 * 2 = 4
2 * 3 = 6
3 * 1 = 3
3 * 2 = 6
3 * 3 = 9

You can exit loops early if a condition is met:


In [3]:
for i in range(10):
    if i == 4:
        break
i


Out[3]:
4

You can skip the rest of an iteration with continue.


In [4]:
total = 0  # named total so the built-in sum function is not shadowed
for i in range(10):
    if i == 5:
        continue
    else:
        total += i
print(total)


40

There is an unusual language feature called for...else.


In [5]:
total = 0  # named total so the built-in sum function is not shadowed
for i in range(10):
    total += i
else:
    print('final i = %d, and sum = %d' % (i, total))


final i = 9, and sum = 45
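
The else block of a for loop runs only when the loop finishes without hitting break, which is why it ran in the cell above. A sketch showing both outcomes; the rosters list is made up for illustration.

```python
rosters = [['Dave', 'Jim', 'Dorkus the Clown'], ['Dave', 'Jim']]
results = []

for roster in rosters:
    for name in roster:
        if name.endswith('Clown'):
            results.append('found ' + name)
            break                         # leaving early skips the else block
    else:
        results.append('no clowns here')  # runs only if the inner loop never broke

print(results)
```

This makes for...else a compact way to write "search, and do something if nothing was found".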

You can iterate over letters in a string


In [6]:
my_string = "DIRECT"
for c in my_string:
    print(c)


D
I
R
E
C
T

In [9]:
tr = {'a':'b'}
tr


Out[9]:
{'a': 'b'}


Hacky Hack Time with Ifs, Fors, Lists, and imports!

Objective: Replace the bash magic bits for downloading the HCEPDB data and uncompressing it with Python code. Since the download is big, check if the zip file exists first before downloading it again. Then load it into a pandas dataframe.

Notes:

  • The os package has tools for checking if a file exists: os.path.exists
    import os
    filename = 'HCEPDB_moldata.zip'
    if os.path.exists(filename):
      print("wahoo!")
  • Use the requests package to get the file given a url (got this from the requests docs)
    import requests
    url = 'http://faculty.washington.edu/dacb/HCEPDB_moldata.zip'
    req = requests.get(url)
    assert req.status_code == 200 # if the download failed, this line will generate an error
    with open(filename, 'wb') as f:
      f.write(req.content)
  • Use the zipfile package to decompress the file while reading it into pandas
    import pandas as pd
    import zipfile
    csv_filename = 'HCEPDB_moldata.csv'
    zf = zipfile.ZipFile(filename)
    data = pd.read_csv(zf.open(csv_filename))

In [ ]:


In [ ]:


In [ ]:

Now, use your code from above for the following URLs and filenames

URL                                                          filename                   csv_filename
http://faculty.washington.edu/dacb/HCEPDB_moldata_set1.zip   HCEPDB_mol_data_set1.zip   HCEPDB_mol_data_set1.csv
http://faculty.washington.edu/dacb/HCEPDB_moldata_set2.zip   HCEPDB_mol_data_set2.zip   HCEPDB_mol_data_set2.csv
http://faculty.washington.edu/dacb/HCEPDB_moldata_set3.zip   HCEPDB_mol_data_set3.zip   HCEPDB_mol_data_set3.csv

What pieces of the data structures and flow control that we talked about earlier can you use?
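
One possible shape for this, sketched without the actual download so that it runs offline: store each row of the table as a tuple, collect the tuples in a list, and loop over them. The names base_url, datasets and to_download are made up for this sketch, and the download and read steps are left as comments that point back to the snippets in the notes.

```python
import os

base_url = 'http://faculty.washington.edu/dacb/'

# One (url, filename, csv_filename) tuple per table row, built with a loop.
datasets = []
for set_number in range(1, 4):
    url = base_url + 'HCEPDB_moldata_set%d.zip' % set_number
    filename = 'HCEPDB_mol_data_set%d.zip' % set_number
    csv_filename = 'HCEPDB_mol_data_set%d.csv' % set_number
    datasets.append((url, filename, csv_filename))

to_download = []
for url, filename, csv_filename in datasets:
    if os.path.exists(filename):
        print(filename, 'already exists, skipping download')
    else:
        to_download.append(url)  # here you would fetch url with requests
    # next: open filename with zipfile and read csv_filename with pandas

print(to_download)
```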


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:

How did you solve this problem?


Functions

For loops let you repeat some code for every item in a list. Functions are similar in that they run the same lines of code for new values of some variable. They are different in that functions are not limited to looping over items.

Functions are a critical part of writing easy to read, reusable code.

Create a function like:

def function_name (parameters):
    """
    optional docstring
    """
    function expressions
    return [variable]

Note: Sometimes I use the word argument in place of parameter.

Here is a simple example. It prints a string that was passed in and returns nothing.


In [17]:
def print_string(str):
    """This prints out a string passed as the parameter."""
    print(str)
    return

To call the function, use:

print_string("Dave is awesome!")

Note: The function has to be defined before you can call it!


In [18]:
print_string("Dave is awesome!")


Dave is awesome!

If you don't provide an argument or too many, you get an error.
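
A sketch of both mistakes, each of which raises a TypeError. The function is redefined here (with the parameter renamed to text) so the block runs on its own, and the errors are caught so that both can be shown.

```python
def print_string(text):
    """Print the string passed as the parameter."""
    print(text)

messages = []
try:
    print_string()               # missing the required argument
except TypeError as err:
    messages.append(str(err))

try:
    print_string('one', 'two')   # one argument too many
except TypeError as err:
    messages.append(str(err))

for message in messages:
    print(message)
```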


In [ ]:

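Parameters (or arguments) in Python are passed as references to objects. This means that if you modify a mutable object, such as a list, inside the function, the change is visible outside of the function. Assigning a new object to the parameter name inside the function does not affect the caller.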

See the following example:

In [19]:
def change_list(my_list):
    """Append an item to the list passed into this function."""
    my_list.append('four')
    print('list inside the function: ', my_list)
    return

my_list = [1, 2, 3]
print('list before the function: ', my_list)
change_list(my_list)
print('list after the function: ', my_list)


list before the function:  [1, 2, 3]
list inside the function:  [1, 2, 3, 'four']
list after the function:  [1, 2, 3, 'four']
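
A sketch of where that behavior stops: changing an object in place is visible to the caller, while assigning a new object to the parameter name is not. Both helper functions are invented for this example.

```python
def append_four(items):
    """Mutate the caller's list in place."""
    items.append('four')

def replace_list(items):
    """Rebind the local name only; the caller's list is untouched."""
    items = ['brand', 'new']
    return items

numbers = [1, 2, 3]
append_four(numbers)
print(numbers)    # the appended element is visible here

replaced = replace_list(numbers)
print(numbers)    # unchanged by the rebinding inside replace_list
print(replaced)
```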

Variables have scope: global and local

In a function, new variables that you create are not saved when the function returns; these are local variables. Variables defined outside of the function can be read inside it but not reassigned; these are global variables. Assigning to a global's name inside a function creates a new local variable instead, unless you declare the name with the global keyword. Generally, the use of global variables is discouraged; use parameters instead.

my_global_1 = 'bad idea'
my_global_2 = 'another bad one'
my_global_3 = 'better idea'

def my_function():
    print(my_global_1)
    my_global_2 = 'broke your global, man!'
    global my_global_3
    my_global_3 = 'still a better idea'
    return

my_function()
print(my_global_2)
print(my_global_3)

In [34]:
my_global = (1)
my_global_1 = 'bad idea'
my_global_2 = 'another bad one'
my_global_3 = 'better idea'

def my_function():
    print(my_global_1)
    #print(my_global_1 + my_global_2)
    my_global_2 = 'broke your global, man!'
    global my_global_3
    print(my_global_1 + my_global_3)
    my_global_3 = 'still a better idea'
    return
my_function()
print(my_global_2)
print(my_global_3)


bad idea
bad ideabetter idea
another bad one
still a better idea

In general, you want to use parameters to provide data to a function and return a result with the return statement. E.g.

def sum(x, y):
    my_sum = x + y
    return my_sum

If you are going to return multiple objects, what data structure that we talked about can be used? Give an example below.


In [25]:
def sum(x, y):
    my_sum = x + y
    return my_sum
sum(1,1)


Out[25]:
2
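
One answer to the question about returning multiple objects is a tuple: a function can return several values packed into a tuple, and the caller can unpack them. The function name sum_and_product is made up for this sketch.

```python
def sum_and_product(x, y):
    """Return both the sum and the product of x and y as a tuple."""
    return x + y, x * y      # the comma builds a tuple

result = sum_and_product(3, 4)
total, product = result      # tuple unpacking
print(result, total, product)
```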

Parameters have four different types:

type             behavior
required         positional, must be present or error, e.g. my_func(first_name, last_name)
keyword          position independent, e.g. my_func(first_name, last_name) can be called as my_func(first_name='Dave', last_name='Beck') or my_func(last_name='Beck', first_name='Dave')
default          keyword params that default to a value if not provided
variable-length  extra positional arguments collected into a tuple, e.g. the *k parameter of print_name

In [70]:
def print_name(first, last='the Clown', *k):
    print('Your name is %s %s' % (first, last), len(k))
    return

Play around with the above function.


In [71]:
print_name(3, 5, 7, 9, 10, 12)


Your name is 3 5 4
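
A few calls that exercise each kind of parameter. To make the results easy to inspect, this sketch uses a variant that returns the string instead of printing it; describe_name is a hypothetical stand-in for print_name.

```python
def describe_name(first, last='the Clown', *extras):
    """Build the greeting and count the extra positional arguments."""
    return 'Your name is %s %s (%d extras)' % (first, last, len(extras))

only_first = describe_name('Dorkus')                  # default used for last
by_position = describe_name('Dave', 'Beck')           # both given by position
by_keyword = describe_name(last='Beck', first='Dave') # keywords, in any order
with_extras = describe_name('Dave', 'Beck', 'PhD', 42)  # two extras land in the tuple

for line in (only_first, by_position, by_keyword, with_extras):
    print(line)
```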

In [ ]:


In [ ]:

Functions can contain any code that you put anywhere else including:

  • if...elif...else
  • for...else
  • while
  • other function calls

In [73]:
def print_name_age(first, last, age):
    print_name(first, last)
    print('Your age is %d' % (age))
    if age > 35:
        print('You are really old.')
    return

In [74]:
print_name_age(age=40, last='Beck', first='Dave')


Your name is Dave Beck 0
Your age is 40
You are really old.

In [ ]:

How would you functionalize the above code for downloading, unzipping, and making a dataframe?


In [ ]:


In [ ]:

Once you have some code that is functionalized and not going to change, you can move it to a file that ends in .py, check it into version control, import it into your notebook and use it!


In [ ]:


In [ ]:

Homework: Save your functions to hcepdb_utils.py. Import the functions and use them to rewrite HW1. This will be laid out in the homework repo for HW2. Check the website.


In [ ]: