In [1]:
instructors = ['Dave', 'Joe', 'Bernease', 'Dorkus the Clown']
instructors
Out[1]:
In [2]:
if 'Dorkus the Clown' in instructors:
    print('#fakeinstructor')
Usually we want conditional logic on both sides of a binary condition, e.g. one action when the condition is True and another when it is False.
In [3]:
if 'Dorkus the Clown' in instructors:
    print('There are fake names for class instructors in your list!')
else:
    print("Nothing to see here")
There is a special do-nothing keyword, pass, that skips over some arm of a conditional, e.g.
In [4]:
if 'Joe' in instructors:
    print("Congratulations! Joe is teaching, your class won't stink!")
else:
    pass
Note: what have you noticed in this session about quotes? What is the difference between ' and "?
Another simple example:
In [5]:
if True is False:
    print("I'm so confused")
else:
    print("Everything is right with the world")
It is always good practice to handle all cases explicitly. Conditional fall-through is a common source of bugs.
Sometimes we wish to test multiple conditions. Use if, elif, and else.
In [6]:
my_favorite = 'pie'
# Compare string values with ==; 'is' tests object identity and is unreliable for strings
if my_favorite == 'cake':
    print("He likes cake! I'll start making a double chocolate velvet cake right now!")
elif my_favorite == 'pie':
    print("He likes pie! I'll start making a cherry pie right now!")
else:
    print("He likes " + my_favorite + ". I don't know how to make that.")
Conditionals can take and, or, and not. E.g.
In [7]:
my_favorite = 'pie'
# Use == to compare values; 'is' checks identity, not equality
if my_favorite == 'cake' or my_favorite == 'pie':
    print(my_favorite + " : I have a recipe for that!")
else:
    print("Ew! Who eats that?")
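The example above only exercises or. Here is a minimal sketch of and and not working together. The have_oven flag is a hypothetical variable added purely for illustration; it is not part of the original example.

```python
my_favorite = 'pie'
have_oven = False  # hypothetical flag, for illustration only

# 'and' needs both sides to be True; 'not' flips a boolean
if my_favorite == 'pie' and not have_oven:
    message = "I have a recipe, but nothing to bake it in."
elif my_favorite == 'pie' and have_oven:
    message = "Pie is on the way!"
else:
    message = "Ew! Who eats that?"

print(message)
```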
In [8]:
for instructor in instructors:
    print(instructor)
You can combine loops and conditionals:
In [9]:
for instructor in instructors:
    if instructor.endswith('Clown'):
        print(instructor + " doesn't sound like a real instructor name!")
    else:
        print(instructor + " is so smart... all those gooey brains!")
Dictionaries can use the keys() method for iterating.
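A small sketch of looping over a dictionary. The offices dictionary and its values are made up for illustration.

```python
# Hypothetical dictionary mapping instructor names to office numbers
offices = {'Dave': 101, 'Joe': 102, 'Bernease': 103}

# keys() yields each key; looping over the dict itself does the same thing
for name in offices.keys():
    print(name, 'is in office', offices[name])

# items() yields (key, value) pairs, which avoids the extra lookup
for name, office in offices.items():
    print(name, office)

names = list(offices.keys())
```

Dictionaries keep insertion order in Python 3.7 and later, so the names come out in the order they were added.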
Since for operates over lists, it is common to want to do something like:
NOTE: C-like
for (i = 0; i < 3; ++i) {
    print(i);
}
The Python equivalent is:
for i in [0, 1, 2]:
    print(i)  # do something with i
What happens when the range you want to sample is big, e.g.
NOTE: C-like
for (i = 0; i < 1000000000; ++i) {
    print(i);
}
That would be a real pain in the rear to have to write out the entire list from 0 to 999999999.
Enter the range() function. E.g. range(3) produces 0, 1, 2.
In [11]:
range(3)
Out[11]:
Notice that Python (in the newest versions, e.g. 3+) has an object type that is a range. This saves memory and speeds up calculations vs. an explicit representation of a range as a list, and it can be converted to a list whenever you need one. To show the contents as a list we can use a type cast, as with the tuple above.
Sometimes, in older Python docs, you will see xrange. In Python 2, xrange produced this kind of lazy range object and range returned an actual list. Beware of this!
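To make the memory point concrete, here is a short sketch. It builds a range covering a billion values without ever creating a billion-item list.

```python
r = range(1000000000)

print(type(r))          # a range object, not a list
print(len(r))           # the length is known without storing every number
print(r[5])             # indexing works much like it does on a list
print(list(range(3)))   # explicit conversion to a list, fine for small ranges
```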
In [12]:
list(range(3))
Out[12]:
Remember earlier with slicing, the syntax :3 meant indexes 0, 1, 2? Well, the same upper bound philosophy applies here.
In [13]:
for index in range(3):
    instructor = instructors[index]
    if instructor.endswith('Clown'):
        print(instructor + " doesn't sound like a real instructor name!")
    else:
        print(instructor + " is so smart... all those gooey brains!")
This would probably be better written as
In [14]:
for index in range(len(instructors)):
    instructor = instructors[index]
    if instructor.endswith('Clown'):
        print(instructor + " doesn't sound like a real instructor name!")
    else:
        print(instructor + " is so smart... all those gooey brains!")
But in all, it isn't very Pythonic to use indexes like that (unless you have another reason in the loop), and you would opt instead for the for instructor in instructors form.
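If you do need the index as well as the item, the built-in enumerate() gives you both without indexing by hand. A minimal sketch, where the numbered list is only there to collect the output for illustration:

```python
instructors = ['Dave', 'Joe', 'Bernease', 'Dorkus the Clown']

# enumerate() yields (index, item) pairs
numbered = []
for index, instructor in enumerate(instructors):
    numbered.append('%d: %s' % (index, instructor))
    print(numbered[-1])
```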
More often, you are doing something with the numbers that requires them to be integers, e.g. math.
In [15]:
sum = 0
for i in range(10):
    sum += i
print(sum)
Note: for more on formatting strings, see: https://pyformat.info
In [16]:
for i in range(1, 4):
    for j in range(1, 4):
        print('%d * %d = %d' % (i, j, i*j))  # Note string formatting here, %d means an integer
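The % style used above is one of several ways to format strings. As a quick comparison, here is the same line written with str.format() and with an f-string (f-strings need Python 3.6 or later). The variable names are just for illustration.

```python
i, j = 2, 3

old_style = '%d * %d = %d' % (i, j, i * j)          # % formatting, as used above
format_style = '{} * {} = {}'.format(i, j, i * j)   # str.format()
f_string = f'{i} * {j} = {i * j}'                   # f-string

print(old_style)
print(format_style)
print(f_string)
```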
In [17]:
for i in range(10):
    if i == 4:
        break
i
Out[17]:
In [18]:
sum = 0
for i in range(10):
    if (i == 5):
        continue
    else:
        sum += i
print(sum)
In [19]:
sum = 0
for i in range(10):
    sum += i
else:
    print('final i = %d, and sum = %d' % (i, sum))
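The three cells above use break, continue, and an else clause on a for loop. break leaves the loop early, continue skips to the next item, and the else clause runs only when the loop finishes without hitting break. Here is a sketch that combines all three; find_clown is a hypothetical helper written for this illustration.

```python
def find_clown(names):
    """Return the first name ending in 'Clown', or None if there is none."""
    for name in names:
        if not name.endswith('Clown'):
            continue      # not a match, move on to the next name
        found = name
        break             # stop at the first match
    else:
        # Runs only if the loop ended without a break
        found = None
    return found

print(find_clown(['Dave', 'Joe', 'Dorkus the Clown']))
print(find_clown(['Dave', 'Joe']))
```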
In [20]:
my_string = "DIRECT"
for c in my_string:
    print(c)
Objective: Replace the bash magic bits for downloading the Pronto data and uncompressing it with Python code. Since the download is big, check whether the zip file already exists before downloading it again. Then load it into a pandas dataframe.
Notes:
The os package has tools for checking if a file exists: os.path.exists
import os
filename = 'pronto.csv'
if os.path.exists(filename):
    print("wahoo!")
The requests package gets the file given a url (got this from the requests docs)
import requests
url = 'https://s3.amazonaws.com/pronto-data/open_data_year_two.zip'
req = requests.get(url)
assert req.status_code == 200  # if the download failed, this line will generate an error
with open(filename, 'wb') as f:
    f.write(req.content)
The zipfile package decompresses the file while reading it into pandas
import pandas as pd
import zipfile
csv_filename = '2016_trip_data.csv'
zf = zipfile.ZipFile(filename)
data = pd.read_csv(zf.open(csv_filename))
URL | filename | csv_filename |
---|---|---|
https://github.com/UWSEDS/LectureNotes/blob/master/open_data_year_two_set1.zip?raw=true | open_data_year_two_set1.zip | 2016_trip_data_set1.csv |
https://github.com/UWSEDS/LectureNotes/blob/master/open_data_year_two_set2.zip?raw=true | open_data_year_two_set2.zip | 2016_trip_data_set2.csv |
https://github.com/UWSEDS/LectureNotes/blob/master/open_data_year_two_set3.zip?raw=true | open_data_year_two_set3.zip | 2016_trip_data_set3.csv |
What pieces of the data structures and flow control that we talked about earlier can you use?
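One possible shape for this, offered as a sketch and not the expected solution: keep the table rows in a list of tuples, loop over them, and download only when the file is missing. The names datasets, download, and fetch_if_missing are hypothetical, and the downloader argument exists only so the logic can be tried without network access.

```python
import os

# (url, filename, csv_filename) rows copied from the table above
datasets = [
    ('https://github.com/UWSEDS/LectureNotes/blob/master/open_data_year_two_set1.zip?raw=true',
     'open_data_year_two_set1.zip', '2016_trip_data_set1.csv'),
    ('https://github.com/UWSEDS/LectureNotes/blob/master/open_data_year_two_set2.zip?raw=true',
     'open_data_year_two_set2.zip', '2016_trip_data_set2.csv'),
    ('https://github.com/UWSEDS/LectureNotes/blob/master/open_data_year_two_set3.zip?raw=true',
     'open_data_year_two_set3.zip', '2016_trip_data_set3.csv'),
]


def download(url, filename):
    """Fetch url and save it to filename, using requests as in the notes above."""
    import requests  # imported here so the rest of the sketch runs without it
    req = requests.get(url)
    assert req.status_code == 200  # fail loudly if the download failed
    with open(filename, 'wb') as f:
        f.write(req.content)


def fetch_if_missing(url, filename, downloader=download):
    """Download only if filename is not on disk. Return True if a download happened."""
    if os.path.exists(filename):
        return False
    downloader(url, filename)
    return True


# Usage (commented out because it needs network access):
# for url, filename, csv_filename in datasets:
#     fetch_if_missing(url, filename)
```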
For loops let you repeat some code for every item in a list. Functions are similar in that they run the same lines of code for new values of some variable. They are different in that functions are not limited to looping over items.
Functions are a critical part of writing easy to read, reusable code.
Create a function like:
def function_name(parameters):
    """
    optional docstring
    """
    function expressions
    return [variable]
Note: Sometimes I use the word argument in place of parameter.
Here is a simple example. It prints a string that was passed in and returns nothing.
In [21]:
def print_string(str):
    """This prints out a string passed as the parameter."""
    print(str)
    return
To call the function, use:
print_string("Dave is awesome!")
Note: The function has to be defined before you can call it!
In [22]:
print_string("Dave is awesome!")
If you don't provide an argument, or you provide too many, you get an error.
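The error in both cases is a TypeError. A short sketch that triggers each one and catches it so the cell keeps running; the parameter is renamed to text here only to avoid shadowing the built-in str.

```python
def print_string(text):
    """Print the string passed as the parameter."""
    print(text)


try:
    print_string()            # no argument at all
except TypeError as err:
    too_few = str(err)
    print('Error:', too_few)

try:
    print_string('a', 'b')    # one argument too many
except TypeError as err:
    too_many = str(err)
    print('Error:', too_many)
```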
In [ ]:
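Parameters (or arguments) in Python are all passed by object reference. This means that if you modify a mutable parameter (such as a list) in place inside the function, the change is visible outside of the function. Assigning a new object to the parameter name inside the function does not affect the caller.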
See the following example:
In [23]:
def change_list(my_list):
    """This changes a list passed into this function"""
    my_list.append('four')
    print('list inside the function: ', my_list)
    return

my_list = [1, 2, 3]
print('list before the function: ', my_list)
change_list(my_list)
print('list after the function: ', my_list)
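It helps to contrast changing a list in place with assigning a new list to the parameter name. In this sketch, mutate and rebind are hypothetical names; only the first one affects the caller's list.

```python
def mutate(a_list):
    a_list.append('four')        # changes the caller's list in place


def rebind(a_list):
    a_list = ['brand', 'new']    # only the local name points at the new list


my_list = [1, 2, 3]
mutate(my_list)
print(my_list)   # 'four' has been appended

rebind(my_list)
print(my_list)   # unchanged by rebind
```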
Variables have scope: global and local.
In a function, new variables that you create are not saved when the function returns - these are local variables. Variables defined outside of the function can be accessed but not changed - these are global variables. Note there is a way to change them from inside a function with the global keyword. Generally, the use of global variables is not encouraged; instead use parameters.
my_global_1 = 'bad idea'
my_global_2 = 'another bad one'
my_global_3 = 'better idea'

def my_function():
    print(my_global_1)  # reading a global works
    my_global_2 = 'broke your global, man!'  # creates a local; the global is untouched
    global my_global_3
    my_global_3 = 'still a better idea'  # the global keyword lets us rebind the global
    return

my_function()
print(my_global_2)
print(my_global_3)
In [ ]:
In general, you want to use parameters to provide data to a function and return a result with the return statement. E.g.
def sum(x, y):
    my_sum = x + y
    return my_sum
If you are going to return multiple objects, what data structure that we talked about can be used? Give an example below.
In [ ]:
type | behavior |
---|---|
required | positional, must be present or error, e.g. my_func(first_name, last_name) |
keyword | position independent, e.g. my_func(first_name, last_name) can be called my_func(first_name='Dave', last_name='Beck') or my_func(last_name='Beck', first_name='Dave') |
default | keyword params that default to a value if not provided |
In [24]:
def print_name(first, last='the Clown'):
    print('Your name is %s %s' % (first, last))
    return
Play around with the above function.
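A few calls to try, as a starting point. This sketch uses format_name, a hypothetical variant of print_name that returns the text, so the three calling styles from the table are easy to compare.

```python
def format_name(first, last='the Clown'):
    """Hypothetical variant of print_name that returns the text."""
    return 'Your name is %s %s' % (first, last)


default_call = format_name('Dorkus')                     # last falls back to its default
positional_call = format_name('Dave', 'Beck')            # both given by position
keyword_call = format_name(last='Beck', first='Dave')    # keywords, in any order

print(default_call)
print(positional_call)
print(keyword_call)
```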
In [ ]:
In [ ]:
In [ ]:
Functions can contain any code that you put anywhere else including:
In [25]:
def print_name_age(first, last, age):
    print_name(first, last)
    print('Your age is %d' % (age))
    if age > 35:
        print('You are really old.')
    return
In [26]:
print_name_age(age=40, last='Beck', first='Dave')
In [ ]:
In [ ]:
In [ ]:
Once you have some code that is functionalized and not going to change, you can move it to a file that ends in .py, check it into version control, import it into your notebook and use it!
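A self-contained sketch of that workflow. It writes a tiny module to a temporary directory and imports it. The module name my_utils and its add function are hypothetical. In practice the .py file usually sits next to the notebook, so the sys.path step is not needed.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny module to a temporary directory (hypothetical name: my_utils.py)
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'my_utils.py'), 'w') as f:
    f.write("def add(x, y):\n    return x + y\n")

# Make that directory importable, then import the module like any other
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import my_utils

print(my_utils.add(2, 3))
```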
In [ ]:
In [ ]:
Homework:
Save your functions to pronto_utils.py. Import the functions and use them to rewrite HW1. This will be laid out in the homework repo for HW2. Check the website.
In [ ]: