High-level, interpreted programming language (like R or Matlab)
In [7]:
print("Hello world")
Advantages: Python code is readable and intuitive to work with, and it is fast to write and to iterate on in your projects
Disadvantages: it can be slow, but there are ways around this (e.g., multi-core processing)
I took a ~16 hour summer class, and was immediately using it for my own projects
So I'm also a novice! I may have to defer to Henry on some things!
Doing simple calculations from the command line
In [8]:
5 + 5
Out[8]:
In [13]:
3 + 2 ** -9
Out[13]:
In [14]:
'this' + '&' + 'that'
Out[14]:
In [1]:
print(5 + 5) # This is a comment
Basic data structure types: numerics, strings, lists, tuples, dictionaries… (we'll talk about each)
Fall into two classes: mutable and immutable
Mutable: Can be changed in place, i.e., the value changes without changing where it is stored in the computer's memory. Defining a structure a and then saying b = a points a and b to the same location in memory, so changing one in place changes the other.
Immutable: Cannot be changed in place.
Practical demonstration of this difference in a moment...
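Here is a minimal sketch of the difference, using a list (mutable) and an integer (immutable). The variable names are just for illustration.

```python
# Mutable: a list can be changed in place, so both names see the change.
a = [1, 2, 3]
b = a              # b points to the same list object as a
b.append(4)        # in-place change
print(a)           # [1, 2, 3, 4]
print(a is b)      # True: still one object

# Immutable: an integer cannot be changed in place.
c = 10
d = c              # d points to the same integer object as c
d += 1             # builds a new integer and rebinds d; c is untouched
print(c, d)        # 10 11
```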
Floats (numbers with decimals, roughly), integers. These are immutable.
In [2]:
x = 3 # x is an integer
In [ ]:
type(x)
In [23]:
y = 4.9
In [ ]:
type(y)
In [ ]:
y = x
In [ ]:
x += 1.2 #now x becomes a float
In [ ]:
y # what will the output be? Remember that numerics are immutable
In [ ]:
y = float(y) # this changes y's type to float
In [ ]:
type(y)
Sequences of characters (letters, numbers, punctuation, etc.). Also immutable.
In [32]:
'Hello world'
Out[32]:
In [34]:
mystring = 'Hello World'
In [ ]:
mystring += ', how are you today?'
In [ ]:
mystring
In [ ]:
mystring.lower() # object.method()
In [ ]:
mystring.split() # returns a list...more on those in a bit
In [35]:
mystring.isalpha()
Out[35]:
In [ ]:
mystring = 'HelloWorld'
In [ ]:
mystring.isalpha()
In [32]:
"The sum of 1 + 2 is {} and not {}".format(1+2,99)
Out[32]:
True and False are their own type: boolean. Equivalent to 1 and 0 (which can be extremely handy.)
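One place this comes in handy, as a rough sketch: because True counts as 1 and False as 0, you can add up a list of truth values to count how many items pass a test. The word list is made up for illustration.

```python
words = ['the', 'rain', 'in', 'spain']

# Each comparison yields True or False; sum() treats them as 1 and 0.
long_word_flags = [len(w) > 3 for w in words]
num_long = sum(long_word_flags)

print(long_word_flags)         # [False, True, False, True]
print(num_long)                # 2
print(True + True)             # 2
print(isinstance(True, int))   # True: bool is a kind of int
```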
Tuples & lists: ordered containers of any combination of data structures (strings, integers, variables, other tuples or lists). Tuples are immutable, lists mutable.
In [ ]:
z = ('our', 'first', 'tuple', 9, x) # put paren around items for tuple
In [ ]:
z = ['our', 'first', 'list', 3.4, [3,'hi']] # brackets for list
Some list methods
In [ ]:
z.append('eats') # append adds the object itself
In [ ]:
z
In [ ]:
z.extend('eats') # extend adds the pieces of the object, or iterable
In [ ]:
z
In [ ]:
z.index('eats')
In [ ]:
z.sort(key=str) # Python 3 can't compare strings with numbers or lists, so sort by each item's string form
In [ ]:
z
In [ ]:
z.pop(2)
In [ ]:
z
In [34]:
z = 'Russell'
In [ ]:
z[0] # indexes first element in iterable
0 is the first index in Python (unlike R and Matlab, which start at 1)!!!
How to think of it...
Why?
See:
http://en.wikipedia.org/wiki/Zero-based_numbering
http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF <-- charming hand-written note by Edsger W. Dijkstra, a giant in computer science
https://plus.google.com/115212051037621986145/posts/YTUxbXYZyfi <-- written by Guido van Rossum, inventor of Python
In [51]:
z[:3], z[3:] # slicing...3 is, in a manner of speaking, indexing the space between the two 's'
Out[51]:
In [9]:
y = ['my', 'list']
In [10]:
y[0]
Out[10]:
In [11]:
y[:1]
Out[11]:
In [ ]:
x = ('my', 'tuple')
In [ ]:
x[0]
In [ ]:
x[:1]
Can also index, slice, assign using negative numbers, which index from the back of the sequence
In [45]:
z[-1] # NOTE -- negative indexing does NOT start at 0!
Out[45]:
In [41]:
z[-3:]
Out[41]:
Assigning new value via indexing works for lists (which are mutable), but not strings or tuples (which are immutable)
In [ ]:
x[0] = 'your' # x is a tuple: raises TypeError
In [ ]:
z[0] = 'Z' # z is a string: raises TypeError
In [ ]:
y[0] = 'your' # y is a list: works
So why use a tuple instead of a list? Tuples are created faster, use less memory, and, like other immutables, can be used as keys in a dictionary.
Unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Sets support basic set operations like union, intersection, difference, etc. Sets are mutable.
Disclaimer: I haven't used these that much yet (though I probably should)...
In [42]:
engineers = {'John', 'Jane', 'Jack', 'Janice'}
In [44]:
programmers = {'Jack', 'Sam', 'Susan', 'Janice'}
In addition to set-literals using braces, you can make a set from any iterable. More on iterables next week, but for now, e.g. a list is an iterable.
In [45]:
managers = set(['Jane', 'Jack', 'Susan', 'Zack'])
In [46]:
employees = engineers | programmers | managers # union
In [47]:
engineering_management = engineers & managers # intersection
In [48]:
fulltime_management = managers - engineers - programmers # difference
In [49]:
engineers.add('Marvin') # add element
In [50]:
print(engineers)
In [51]:
employees.issuperset(engineers) # superset test
Out[51]:
In [52]:
employees.update(engineers)
In [53]:
employees.issuperset(engineers)
Out[53]:
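The two basic uses mentioned above, membership testing and eliminating duplicates, might look something like this. The list of names is a made-up example.

```python
names = ['Jack', 'Jane', 'Jack', 'Sam', 'Jane']   # hypothetical list with repeats

unique_names = set(names)               # duplicates are dropped
print(len(names), len(unique_names))    # 5 3

print('Sam' in unique_names)            # True
print('Zack' in unique_names)           # False

# Sets are unordered, so sort if you need a predictable order.
print(sorted(unique_names))             # ['Jack', 'Jane', 'Sam']
```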
Key-value pairs. Keys can be any immutable (e.g., tuples, strings, numerics, but not lists). Members of dictionary are indexed with key, cf. lists which are indexed with a range of numbers. Like sets, checking for membership is fast, O(1) in average case. Cf. lists, for which membership checking is O(n). Dictionaries are mutable.
In [3]:
mydict = {'the':10, 'of': 6}
In [ ]:
mydict['the'] #lookup with key
In [ ]:
mydict['the'] = 20
In [ ]:
mydict
Some useful methods...
In [4]:
mydict.get('cromulent',0) # (key, value to be returned if key not in dict)
Out[4]:
In [55]:
mydict.items()
Out[55]:
In [ ]:
mydict.values()
In [ ]:
mydict.keys()
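A short sketch of a few other things you will probably do with dictionaries: checking membership with in, looping over key-value pairs, and using a tuple as a key. The names word_counts and bigram_counts and their contents are made up for illustration.

```python
word_counts = {'the': 10, 'of': 6}

print('the' in word_counts)    # True: membership checks the keys
print(10 in word_counts)       # False: values are not searched

for word, count in word_counts.items():   # loop over (key, value) pairs
    print(word, count)

# Tuples are immutable, so they can serve as keys.
bigram_counts = {('of', 'the'): 3}
bigram_counts[('in', 'the')] = 1
print(bigram_counts[('of', 'the')])       # 3
```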
Do you just need an ordered sequence of items?
Another way to decide between tuples and lists (from Henry):
"By convention lists tend to be homogeneous. You don't know how many you'll end up with necessarily, but you have a bunch of the same thing. Tuples are not necessarily homogeneous, and the slots usually have some kind of predefined semantics, like ('Henry', 'Harrison', 27)."
In [56]:
x, y, z = 2, 'hi', ['my', 'list']
In [57]:
print(x,y,z)
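Putting Henry's convention together with the unpacking in the previous cell, one possible pattern is a list of tuples: the list holds any number of records, and each tuple has fixed slots. Only ('Henry', 'Harrison', 27) comes from the quote; the other record is invented.

```python
# Each tuple is one record with fixed slots: (first name, last name, age).
people = [('Henry', 'Harrison', 27)]
people.append(('Jane', 'Doe', 34))     # hypothetical record; the list can keep growing

for first, last, age in people:        # unpack each record's slots by position
    print(first, last, age)

first_person = people[0]
print(first_person[2])                 # 27
```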
In [ ]:
len(z)
In [ ]:
range(10)
Conditionals (if/thens), loops, list comprehension
In [ ]:
x = 3
if x == 3: # notice == for computing truth value
    print('yes')
In [ ]:
y = 'tested'
if len(y) > 6:
    print('yes')
elif len(y) == 6: # == tests equality; a single = is assignment and is a syntax error here
    print('maybe')
else:
    print('no')
For every item in some list, tuple, string, or other iterable, do something with that item
In [ ]:
x = list('Russell')
for letter in x:
    print(letter)
In [ ]:
for index, letter in enumerate(x): # enumerate returns an iterator that yields (index, item) pairs; more next time on iterators and generators
    print(letter, index)
In [ ]:
for index, letter in enumerate(x):
    if index == 0 or index == 1:
        print(index, letter)
While something is True, do something else.
In [141]:
list1 = ['the','rain','in','spain','falls','mainly','on','the','plain']
list2 = []
while len(list1): # len(list1) counts as True if list1 is not empty, i.e., if len(list1) is > 0
    list2.append(list1.pop(0))
list2
Out[141]:
A compressed, elegant way to construct lists with loops
In [ ]:
x = [y**2 for y in range(10)]
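To see what the comprehension is compressing, here is the same list built with an ordinary for loop, plus a version with a filtering condition. The variable names are just for illustration.

```python
# The long way: start with an empty list and append inside a loop.
squares_loop = []
for y in range(10):
    squares_loop.append(y ** 2)

# The comprehension does the same thing in one line.
squares_comp = [y ** 2 for y in range(10)]
print(squares_loop == squares_comp)   # True

# An optional condition at the end filters which items are used.
even_squares = [y ** 2 for y in range(10) if y % 2 == 0]
print(even_squares)                   # [0, 4, 16, 36, 64]
```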
In [65]:
my_name = 'Russell'
consonants = set('bcdfghjklmnpqrstvwxz')
my_consonants = [x for x in my_name if x.lower() in consonants]
In [67]:
''.join(my_consonants) #a string method...take an iterable containing strings, and join them by the string object ('') in the first part of the line
Out[67]:
Let's first make a little text file in textedit, textwrangler, etc.
In [103]:
myfile = open('test_file.txt','r+') # r+ enables reading and writing
In [104]:
myfile.read()
Out[104]:
Note: if we try to do this again, we get nothing, because the read method changes our position in the file object. We can read the file in again, or change the position with myfile.seek(0).
In [105]:
myfile.read()
Out[105]:
In [112]:
myfile.seek(0)
Out[112]:
In [107]:
for line in myfile:
    print(line)
In [108]:
firstline = myfile.readline()
In [109]:
print(firstline)
In [113]:
all_lines = myfile.readlines()
In [114]:
print(all_lines)
In [115]:
myfile.write('\nThis is another line\n')
Out[115]:
In [116]:
x = 'This is a line to be saved to the file'.split()
In [117]:
myfile.write(str(x))
Out[117]:
In [118]:
myfile.close() # to save some system resources...important if you have big files
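A common alternative to calling close() yourself is the with statement, which closes the file automatically when the indented block ends. This is a rough sketch; it writes to a throwaway temporary file (demo_file.txt is a made-up name) rather than to test_file.txt.

```python
import os
import tempfile

# A throwaway path so the sketch does not touch test_file.txt.
path = os.path.join(tempfile.mkdtemp(), 'demo_file.txt')

with open(path, 'w') as f:         # the file is closed when the block ends
    f.write('first line\n')
    f.write('second line\n')

with open(path) as f:              # default mode is 'r' (read only)
    lines = [line.rstrip('\n') for line in f]

print(lines)                       # ['first line', 'second line']
print(f.closed)                    # True
```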
All the above works fine if your data are simple types like numerics or strings. But if you want to save more complicated structures like dictionaries, and especially if you want to do it on the cheap, there are Python-specific formats for doing this (e.g., pickles or pandas).
In [121]:
import pickle
my_dict = dict([('jake', 4139), ('jack', 4127), ('john', 4098)])
with open("save.p", "wb") as f: # "wb" = write in binary mode; the file is closed when the block ends
    pickle.dump(my_dict, f)
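To get the dictionary back, you open the pickle file in binary read mode and call pickle.load. Here is a minimal round-trip sketch; it saves to a temporary folder so it does not depend on where save.p ended up.

```python
import os
import pickle
import tempfile

my_dict = {'jake': 4139, 'jack': 4127, 'john': 4098}
path = os.path.join(tempfile.mkdtemp(), 'save.p')   # throwaway location

with open(path, 'wb') as f:      # 'wb': write in binary mode
    pickle.dump(my_dict, f)

with open(path, 'rb') as f:      # 'rb': read in binary mode
    restored = pickle.load(f)

print(restored == my_dict)       # True
```

One caution: loading a pickle can run arbitrary code, so only unpickle files you created or trust.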
Many premade functions -- print, range, len, string.upper(), etc.
You can also make your own. Let's say we wanted a function that took a string and capitalized every other letter in it. So 'russell' would become 'RuSsElL'
In [130]:
def make_so_dope(string):
    listed_string = list(string)
    for index, letter in enumerate(listed_string):
        if index % 2 == 0: # x % y returns the remainder when you divide x by y. if index is even, then index % 2 returns 0
            listed_string[index] = letter.upper()
    completed_string = ''.join(listed_string)
    return completed_string
In [131]:
make_so_dope('Russell')
Out[131]:
Let's make a few functions to solve some simple problems (taken from codingbat.com and Google's online Python class). In doing so, think about what data structures we will need, and what kinds of operations or procedures.
Ex. 1 We want to make a package of goal kilos of chocolate. We have small bars (1 kilo each) and big bars (5 kilos each). Return the number of small bars to use, assuming we always use big bars before small bars. Return -1 if it can't be done.
Ex. 2 Given a list of numbers, return a list where all adjacent == elements have been reduced to a single element, so [1, 2, 2, 3] returns [1, 2, 3]. You may create a new list or modify the passed in list.
Ex. 3 Given two lists sorted in increasing order, create and return a merged list of all the elements in sorted order. You may modify the passed in lists. Ideally, the solution should work in "linear" time, making a single pass of both lists.
In [ ]:
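def make_chocolate(small, big, goal):
    big_used = min(big, goal // 5) # use as many big bars as fit, but no more than we have
    num_small = goal - 5 * big_used # kilos left over for the small bars to cover
    if num_small > small:
        return -1
    return num_small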
In [142]:
def remove_adjacent(nums):
    result = []
    for num in nums:
        if len(result) == 0 or num != result[-1]:
            result.append(num)
    return result
In [ ]:
def linear_merge(list1, list2):
    result = []
    while len(list1) and len(list2):
        if list1[0] < list2[0]:
            result.append(list1.pop(0))
        else:
            result.append(list2.pop(0))
    result.extend(list1)
    result.extend(list2)
    return result
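One caveat about the merge above: pop(0) has to shift every remaining element of the list, so each pop costs O(n) and the whole thing is not really linear. A possible variant, sketched here under the made-up name linear_merge_indexed, walks both lists with index counters instead and leaves the inputs unmodified.

```python
def linear_merge_indexed(list1, list2):
    """Merge two sorted lists without modifying either one."""
    result = []
    i, j = 0, 0                       # current position in each list
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    result.extend(list1[i:])          # at most one of these has anything left
    result.extend(list2[j:])
    return result

print(linear_merge_indexed([1, 3, 5], [2, 4, 6, 8]))   # [1, 2, 3, 4, 5, 6, 8]
```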
Some other handy built-in functions and methods
any() returns True if any element of iterable is True. all() returns True if all elements of an iterable are True
In [52]:
any([1,0,0,0,0])
Out[52]:
In [29]:
all([1,0,0,0,0])
Out[29]:
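In practice any() and all() are often paired with a list comprehension that tests each item. A small sketch with a made-up list of numbers:

```python
nums = [2, 4, 6, 7]

print(any([n % 2 == 1 for n in nums]))   # True: 7 is odd
print(all([n % 2 == 0 for n in nums]))   # False: 7 is not even
print(all([n > 0 for n in nums]))        # True: every number is positive

# Edge case: with an empty list, any() is False and all() is True.
print(any([]), all([]))                  # False True
```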
Some other string and list methods...
In [31]:
ourstring = "it was a dark and stormy night 3 days ago, when i went to boston and found my dog"
ourstring.capitalize()
Out[31]:
In [33]:
ourstring.replace('and','but')
Out[33]:
In [47]:
ourlist = ourstring.split()
print(ourlist)
In [48]:
ourlist.insert(3,'very')
print(ourlist)
In [49]:
ourlist.remove('very') # remove first matching value
print(ourlist)
In [50]:
del ourlist[7:10] # remove by index
print(ourlist)
In [ ]:
ourlist.index('dark')
zip() aggregates elements from each of the iterables it is passed.
In [55]:
words = ['the','dog','ate','my','homework']
tags = ['det','noun','verb','poss','noun']
freq = [1000,100,50,500,10]
words_and_tags_and_freqs = zip(words,tags,freq)
In [56]:
words_and_tags_and_freqs # apparently in Python 3 (cf. Python 2.x), zip is now an iterator (more on those next time), so calling it doesn't give us the list
Out[56]:
In [58]:
list(words_and_tags_and_freqs) # to get the list of zipped tuples, have to either use list(), next(), or loop through, as in for x in words_and_tags_and_freqs
Out[58]:
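Two things worth knowing about the zip iterator, sketched below with the same words and tags: it can only be consumed once, and its pairs can be handed straight to dict() to build a lookup table. The names pairs, first_pass, second_pass and tag_of are just for illustration.

```python
words = ['the', 'dog', 'ate', 'my', 'homework']
tags = ['det', 'noun', 'verb', 'poss', 'noun']

pairs = zip(words, tags)
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # nothing left the second time
print(first_pass[:2])        # [('the', 'det'), ('dog', 'noun')]
print(second_pass)           # []

tag_of = dict(zip(words, tags))   # each (word, tag) pair becomes key: value
print(tag_of['dog'])              # noun
```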