Functions


In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

What's wrong with this code?


In [2]:
max_iter = 100
h, w = 400, 400

img = np.zeros((h, w)).astype('int')
for i, real in enumerate(np.linspace(-1.5, 0.5, w)):
    for j, imag in enumerate(np.linspace(-1, 1, h)):
        c = complex(real, imag)
        z = 0 + 0j
        for k in range(max_iter):
            z = z*z + c
            if abs(z) > 2:
                break
        img[j, i] = k

plt.grid(False)
plt.imshow(img, cmap=plt.cm.jet)
pass


  • hard to understand
  • uses global variables
  • not re-usable except by copy and paste

Refactoring to use functions


In [3]:
def mandel(c, z=0, max_iter=100):
    """Return the iteration at which |z| first exceeds 2, else max_iter - 1."""
    for k in range(max_iter):
        z = z*z + c
        if abs(z) > 2:
            return k
    return k

In [4]:
def mandelbrot(w, h, xl=-1.5, xu=0.5, yl=-1, yu=1):
    img = np.zeros((h, w)).astype('int')
    for i, real in enumerate(np.linspace(xl, xu, w)):
        for j, imag in enumerate(np.linspace(yl, yu, h)):
            c = complex(real, imag)
            img[j, i] = mandel(c)
    return img

In [5]:
img = mandelbrot(w=400, h=400)
plt.grid(False)
plt.imshow(img, cmap=plt.cm.jet)
pass
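A further benefit of the refactoring is that the inner function can be checked on its own, without drawing anything. The sketch below repeats the definition of mandel so that it runs stand-alone; the two test points are chosen here for illustration only.

```python
def mandel(c, z=0, max_iter=100):
    """Return the iteration at which |z| first exceeds 2, else max_iter - 1."""
    for k in range(max_iter):
        z = z*z + c
        if abs(z) > 2:
            return k
    return k

# c = 0 never escapes, so every iteration is used up
inside = mandel(0j)

# c = 3 is already outside the radius-2 disc after the first update
outside = mandel(3 + 0j)

print(inside, outside)
```

Points that never escape get the largest count, which is why the interior of the set shows up as a single colour in the image.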


Function is re-usable


In [6]:
img = mandelbrot(w=400, h=400, xl=-0.75, xu=-0.73, yl=0.1, yu=0.12)
plt.grid(False)
plt.imshow(img, cmap=plt.cm.jet)
pass


Anonymous functions (lambdas)


In [7]:
def square(x):
    return x*x

In [8]:
square(3)


Out[8]:
9

In [9]:
square2 = lambda x: x*x

In [10]:
square2(3)


Out[10]:
9
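Lambdas are most useful when a small function is needed only once, for instance as the key argument to sorted. A minimal sketch, using a made-up list of (name, score) pairs:

```python
# Hypothetical data for illustration: (name, score) pairs
scores = [("ann", 72), ("bob", 95), ("cat", 88)]

# A throwaway key function written inline as a lambda
by_score = sorted(scores, key=lambda pair: pair[1], reverse=True)
print(by_score)

# The equivalent named function gives the same ordering
def second(pair):
    return pair[1]

same = sorted(scores, key=second, reverse=True)
```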

First class functions


In [11]:
# Functions are objects: they can be assigned to variables, passed as arguments
# and returned from other functions, just like (say) an integer

Functions can be passed in as arguments


In [12]:
def grad(x, f, h=0.01):
    return (f(x+h) - f(x-h))/(2*h)

In [13]:
def f(x):
    return 3*x**2 + 5*x + 3

In [14]:
grad(0, f)


Out[14]:
5.000000000000004
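Because grad only needs something callable, it also accepts built-in functions and lambdas. The sketch below repeats the definition of grad so that it runs stand-alone; the two test functions are picked here only because their derivatives are easy to check by hand.

```python
import math

def grad(x, f, h=0.01):
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

# A built-in function: the true derivative of sin at 0 is cos(0) = 1
d_sin = grad(0.0, math.sin)

# A lambda: the true derivative of x**3 at 2 is 3 * 2**2 = 12
d_cube = grad(2.0, lambda x: x**3)

print(d_sin, d_cube)
```

Both estimates are approximate; the error shrinks as h gets smaller, until floating-point rounding starts to dominate.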

Functions can also be returned by functions


In [15]:
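import time

def timer(f):
    # Wrap f so that each call returns (result, elapsed seconds)
    def g(*args, **kwargs):
        start = time.perf_counter()  # monotonic clock intended for timing
        result = f(*args, **kwargs)
        elapsed = time.perf_counter() - start
        return result, elapsed
    return g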

In [16]:
def f(n=1000000):
    s = sum([x*x for x in range(n)])
    return s

timed_func = timer(f)

In [17]:
timed_func()


Out[17]:
(333332833333500000, 0.18930506706237793)

Decorators


In [18]:
@timer
def g(n=1000000):
    s = sum([x*x for x in range(n)])
    return s

In [19]:
g()


Out[19]:
(333332833333500000, 0.192213773727417)
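The @timer line is shorthand for g = timer(g). One side effect is that the decorated name now refers to the inner wrapper, so its __name__ and docstring are lost unless they are copied over. The sketch below is a variant of the timer above that uses functools.wraps for this; the function sum_squares is a made-up example.

```python
import functools
import time

def timer(f):
    @functools.wraps(f)  # copy f's __name__ and __doc__ onto the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = f(*args, **kwargs)
        elapsed = time.perf_counter() - start
        return result, elapsed
    return wrapper

@timer
def sum_squares(n=1000):
    """Sum of squares of the integers below n."""
    return sum(x * x for x in range(n))

result, elapsed = sum_squares()
print(sum_squares.__name__, result)
```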

Map, filter, reduce


In [20]:
map(lambda x: x*x, [1,2,3,4])


Out[20]:
<map at 0x10996d5c0>

In [21]:
list(map(lambda x: x*x, [1,2,3,4]))


Out[21]:
[1, 4, 9, 16]

In [22]:
list(filter(lambda x: x%2==0, [1,2,3,4]))


Out[22]:
[2, 4]

In [23]:
from functools import reduce

In [24]:
reduce(lambda x, y: x*y, [1,2,3,4], 10)


Out[24]:
240
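The third argument to reduce is the starting value of the accumulator, so the call above computes (((10*1)*2)*3)*4. A plain-loop sketch of the same idea, where my_reduce is a name made up for this illustration:

```python
from functools import reduce

def my_reduce(func, items, initial):
    """Plain-loop equivalent of reduce(func, items, initial)."""
    acc = initial
    for item in items:
        acc = func(acc, item)
    return acc

product = my_reduce(lambda x, y: x * y, [1, 2, 3, 4], 10)
builtin = reduce(lambda x, y: x * y, [1, 2, 3, 4], 10)
print(product, builtin)
```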

List comprehension


In [25]:
[x*x for x in [1,2,3,4]]


Out[25]:
[1, 4, 9, 16]

In [26]:
[x for x in [1,2,3,4] if x%2 == 0]


Out[26]:
[2, 4]
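A single comprehension can transform and filter at the same time, which covers the combined use of map and filter. A small sketch showing that the two forms agree:

```python
xs = [1, 2, 3, 4]

# Transform and filter in one comprehension
via_comprehension = [x * x for x in xs if x % 2 == 0]

# The same computation with map and filter
via_map_filter = list(map(lambda x: x * x, filter(lambda x: x % 2 == 0, xs)))

print(via_comprehension, via_map_filter)
```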

Set and dictionary comprehension


In [27]:
{i%3 for i in range(10)}


Out[27]:
{0, 1, 2}

In [28]:
{i: i%3 for i in range(10)}


Out[28]:
{0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0, 7: 1, 8: 2, 9: 0}

Generator expressions


In [29]:
(i**2 for i in range(10,15))


Out[29]:
<generator object <genexpr> at 0x109e5ca40>

In [30]:
for x in (i**2 for i in range(10,15)):
    print(x)


100
121
144
169
196

Generators

Generators produce a potentially infinite stream of values, but only one at a time, which spares memory. They are ubiquitous in Python 3 and allow us to handle arbitrarily large data sets.
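The memory saving can be seen by comparing a list with the corresponding generator expression. This is only a rough sketch: sys.getsizeof reports the size of the container object itself, not of the items it refers to, and the exact numbers depend on the Python version.

```python
import sys

n = 100000
squares_list = [i * i for i in range(n)]   # all values stored at once
squares_gen = (i * i for i in range(n))    # values produced on demand

print(sys.getsizeof(squares_list))  # grows with n
print(sys.getsizeof(squares_gen))   # small, does not depend on n

# A generator can be consumed only once
total = sum(squares_gen)
leftover = list(squares_gen)
```

After the call to sum the generator is exhausted, so leftover is an empty list.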


In [31]:
# Note that count can generate an infinite stream
def count(i=0):
    while True:
        yield i
        i += 1

In [32]:
c = count()
next(c)


Out[32]:
0

In [33]:
next(c)


Out[33]:
1

In [34]:
next(c)


Out[34]:
2

In [35]:
list(zip('abcde', count(10)))


Out[35]:
[('a', 10), ('b', 11), ('c', 12), ('d', 13), ('e', 14)]

In [36]:
for i in count():
    print(i)
    if i >= 10:
        break


0
1
2
3
4
5
6
7
8
9
10

In [37]:
def palindrome_numbers(n):
    yield from range(1, n+1)
    yield from range(n, 0, -1)

In [38]:
list(palindrome_numbers(5))


Out[38]:
[1, 2, 3, 4, 5, 5, 4, 3, 2, 1]

Itertools


In [39]:
import itertools as it

In [40]:
for i in it.islice(count(), 5, 10):
    print(i)


5
6
7
8
9

In [41]:
for i in it.takewhile(lambda i: i< 5, count()):
    print(i)


0
1
2
3
4

In [42]:
import operator as op

[i for i in it.starmap(op.add, [(1,2), (2,3), (3,4)])]


Out[42]:
[3, 5, 7]

In [43]:
fruits = ['apple', 'banana', 'cherry', 'durian', 'eggplant', 'fig']

for k, group in it.groupby(sorted(fruits, key=len), len):
    print(k, list(group))


3 ['fig']
5 ['apple']
6 ['banana', 'cherry', 'durian']
8 ['eggplant']
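Sorting by the grouping key first matters, because groupby only merges consecutive items with the same key. A short sketch of the difference, using a made-up word list:

```python
import itertools as it

words = ["fig", "banana", "kiwi", "cherry", "pear"]  # made-up data

# Unsorted input: a new group starts every time the key changes
unsorted_groups = [(k, list(g)) for k, g in it.groupby(words, len)]

# Input sorted by the same key: one group per distinct length
sorted_groups = [(k, list(g))
                 for k, g in it.groupby(sorted(words, key=len), len)]

print(unsorted_groups)
print(sorted_groups)
```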

Functools


In [44]:
import functools as fn

In [45]:
rng1 = fn.partial(np.random.normal, 2, .3)
rng2 = fn.partial(np.random.normal, 10, 1)

In [46]:
rng1(10)


Out[46]:
array([ 2.13849718,  1.5807533 ,  1.92939089,  2.32091577,  1.75429334,
        2.39892103,  2.13631947,  1.90810476,  1.54398362,  2.22273936])

In [47]:
rng2(10)


Out[47]:
array([  9.46427924,  10.75766948,   9.79962611,  10.46099347,
        10.44005324,   9.69270764,   8.788236  ,  10.32903729,
         8.98723117,   9.97326292])

In [48]:
fn.reduce(op.add, rng2(10))


Out[48]:
95.284222396568097
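partial returns a new callable with some arguments already filled in; the remaining arguments are supplied when it is called. A minimal sketch with deterministic functions, where power, square and cube are names made up for this illustration:

```python
import functools as fn

# Fix the keyword argument base=2 of the built-in int
parse_binary = fn.partial(int, base=2)

def power(base, exponent):
    return base ** exponent

# Fix the exponent to get specialised functions
square = fn.partial(power, exponent=2)
cube = fn.partial(power, exponent=3)

print(parse_binary("1011"), square(4), cube(2))

# The original function and the frozen arguments remain inspectable
print(square.func.__name__, square.keywords)
```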

Modules


In [49]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [50]:
from pandas import DataFrame, Series
import scipy.stats as ss

In [51]:
DataFrame(ss.beta(2,5).rvs((3,4)), columns=['a', 'b', 'c', 'd'])


Out[51]:
a b c d
0 0.128479 0.285557 0.380817 0.223367
1 0.391506 0.282001 0.231474 0.196180
2 0.567670 0.122379 0.278288 0.151692

Where does Python search for modules?


In [52]:
import sys
sys.path


Out[52]:
['',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/StreamLib-1.0.1-py3.5.egg',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/julia-0.1.1-py3.5.egg',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/pybind11-1.9.dev0-py3.5.egg',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/scikits.odes-2.3.0.dev0-py3.5-macosx-10.6-x86_64.egg',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/ReFlowRESTClient-0.4-py3.5.egg',
 '/Users/cliburn/spark/python',
 '/Users/cliburn/git-teach/sta-663-2017-public/notebook',
 '/Users/cliburn/anaconda2/envs/p3/lib/python35.zip',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/plat-darwin',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/lib-dynload',
 '/Users/cliburn/.local/lib/python3.5/site-packages',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/Sphinx-1.4.1-py3.5.egg',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/aeosa',
 '/Users/cliburn/anaconda2/envs/p3/lib/python3.5/site-packages/IPython/extensions',
 '/Users/cliburn/.ipython']
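Since sys.path is an ordinary list, a directory can be added to it at run time to make the modules inside it importable. The sketch below writes a tiny module into a temporary directory; the module name demo_path_module and its contents are made up for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny module into a temporary directory
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_path_module.py"), "w") as fh:
    fh.write("ANSWER = 42\n")

# The directory is not on the search path yet, so add it
sys.path.append(tmp_dir)
importlib.invalidate_caches()  # make sure the new file is noticed

import demo_path_module
print(demo_path_module.ANSWER)
```

Editing sys.path like this only affects the current session; for code that is used regularly, installing it as a package is usually the cleaner route.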

Creating your own module


In [53]:
%%file my_module.py

PI = 3.14

def my_f(x):
    return PI*x


Overwriting my_module.py

In [54]:
import my_module as mm

mm.PI


Out[54]:
3.14

In [55]:
mm.my_f(2)


Out[55]:
6.28

In [56]:
from my_module import PI

In [57]:
PI * 2 * 2


Out[57]:
12.56

Note: Modules can also be nested within each other to create a package, e.g. numpy.random is a module inside the numpy package. We will explore how to create packages in a later session.