John Parejko, Lia Corrales, Phil Marshall, Andrew Hearin and Your Name Here
This notebook demonstrates some ways to make your Python code go faster.
Profiling: because how can you optimize something if you haven't first evaluated it?
Multiprocessing: because you probably own more than one CPU.
In [2]:
import numpy as np
In [5]:
x = np.random.randn(1000)
In [6]:
%timeit np.power(x,2)
In [8]:
%timeit x**2
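The %timeit magic only exists inside IPython. As a rough sketch of the same comparison in a plain Python script, the standard-library timeit module can be used instead. The repeat and number values below are arbitrary choices for illustration, and x*x is added as a third candidate.

```python
import timeit

import numpy as np

x = np.random.randn(1000)

# Candidate ways of squaring the array; each is a zero-argument callable.
candidates = {
    "np.power(x, 2)": lambda: np.power(x, 2),
    "x**2": lambda: x**2,
    "x*x": lambda: x * x,
}

N_CALLS = 2000  # calls per timing run (illustrative value)

timings = {}
for label, func in candidates.items():
    # Five runs of N_CALLS calls each; keep the fastest run, which is the
    # one least disturbed by other activity on the machine.
    runs = timeit.repeat(func, repeat=5, number=N_CALLS)
    timings[label] = min(runs) / N_CALLS  # seconds per call

for label, seconds in timings.items():
    print(f"{label:>15}: {seconds * 1e6:.2f} microseconds per call")
```

The absolute numbers depend on the machine and the NumPy build, so only the relative ordering is worth reading off.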
In [11]:
import cProfile
import pstats
In [23]:
def square(x):
    for k in range(1000):
        sq = np.power(x,2)
        sq = x**2
        sq = x*x
    return
In [24]:
log = 'square.profile'
cProfile.run('square(x)',filename=log)
stats = pstats.Stats(log)
stats.strip_dirs()
stats.sort_stats('cumtime').print_stats(20)
In [30]:
def bettersquare(x):

    def powersquare(x):
        return np.power(x,2)

    def justsquare(x):
        return x**2

    def selfmultiply(x):
        return x*x

    for k in range(1000):
        sq = powersquare(x)
        sq = justsquare(x)
        sq = selfmultiply(x)

    return
In [31]:
log = 'bettersquare.profile'
cProfile.run('bettersquare(x)',filename=log)
stats = pstats.Stats(log)
stats.strip_dirs()
stats.sort_stats('cumtime').print_stats(20)
Much better: with each operation wrapped in its own function, the profile now reports the cumulative time spent in each one.
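The per-function cumulative times can also be collected without a profile file on disk. The following is a minimal sketch, not the notebook's own method: it uses pure-Python stand-ins for the three squaring functions so that it runs without NumPy, and it reads the stats attribute of pstats.Stats, which is long-standing but not formally documented.

```python
import cProfile
import io
import pstats


def powersquare(values):
    return [pow(v, 2) for v in values]


def justsquare(values):
    return [v**2 for v in values]


def selfmultiply(values):
    return [v * v for v in values]


def bettersquare(values):
    for k in range(200):
        powersquare(values)
        justsquare(values)
        selfmultiply(values)


# Profile a single call directly, with no intermediate .profile file.
profiler = cProfile.Profile()
profiler.runcall(bettersquare, list(range(1000)))

# Send the report to a string buffer instead of stdout.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.strip_dirs().sort_stats("cumtime").print_stats(10)
report = buffer.getvalue()
print(report)

# stats.stats maps (file, line, name) to
# (primitive calls, total calls, tottime, cumtime, callers).
# Assumption: this layout is an implementation detail of pstats.
wanted = ("powersquare", "justsquare", "selfmultiply")
cumulative = {
    key[2]: value[3] for key, value in stats.stats.items() if key[2] in wanted
}
for name in wanted:
    print(f"{name:>12}: {cumulative[name]:.4f} s cumulative")
```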
Another useful tool is the line_profiler, from rkern on GitHub.
In [70]:
!pip install --upgrade line_profiler
We could also run the line_profiler from the command line...
Which means the square function needs writing out to a file...
Can we do this from this notebook?
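One possible way to do this, sketched under assumptions: write the function to a script from Python, decorate it with @profile, and hand the script to kernprof, the command-line runner that ships with line_profiler. The file name square_script.py and the temporary directory are made up for illustration, and the profiling step is skipped if kernprof is not on the PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# The script to profile. kernprof -l makes the @profile decorator
# available at run time, so the script does not import it.
SCRIPT = '''\
import numpy as np


@profile
def square(x):
    for k in range(1000):
        sq = np.power(x, 2)
        sq = x**2
        sq = x*x
    return


square(np.random.randn(1000))
'''

# Hypothetical location; the directory is left behind for inspection.
workdir = Path(tempfile.mkdtemp())
script_path = workdir / "square_script.py"
script_path.write_text(SCRIPT)

kernprof = shutil.which("kernprof")
if kernprof is None:
    print("kernprof not found; install line_profiler to run this step")
else:
    # -l: profile line by line, -v: print the report when the script ends
    result = subprocess.run(
        [kernprof, "-l", "-v", script_path.name],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    print(result.stdout)
```

Inside the notebook itself, the %%writefile cell magic could replace the write_text call, and line_profiler also provides an IPython extension (%load_ext line_profiler, then %lprun -f square square(x)) that avoids the file altogether.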
Cython allows us to replace simple lines of math with the equivalent lines of C, while still coding in Python. Running cython -a file.pyx makes file.c, but also file.html. The HTML file shows, line by line, the C code that each Python line was translated into, and highlights the lines that still call into the Python interpreter. Can we demo this process from this notebook? Hmm.
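A rough sketch of how the annotation step might be driven from Python follows. The file name loops.pyx and the function sum_of_pairs are hypothetical, and the cython call is skipped if the executable is not on the PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# A small Cython source file with typed loop variables (hypothetical example).
PYX_SOURCE = '''\
def sum_of_pairs(int n):
    cdef int i, j, x
    x = 0
    for i in range(n):
        for j in range(n):
            x += i + j
    return x
'''

# Hypothetical location; the directory is left behind for inspection.
workdir = Path(tempfile.mkdtemp())
pyx_path = workdir / "loops.pyx"
pyx_path.write_text(PYX_SOURCE)

cython = shutil.which("cython")
if cython is None:
    print("cython not found; install Cython to run this step")
else:
    # -a asks for the annotated HTML report alongside the generated C file.
    subprocess.run([cython, "-a", pyx_path.name], cwd=workdir)
    print(sorted(p.name for p in workdir.iterdir()))
```

In a notebook, once the extension is loaded with %load_ext cython, starting a cell with %%cython -a displays the same annotated report inline.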
In [52]:
def my_expensive_loop(n):
    x = 0
    for i in range(int(n)):
        for j in range(int(n)):
            x += i + j
    # Return the total so the work is not simply discarded.
    return x
In [53]:
%timeit my_expensive_loop(1000)
Let's write the exact same function in Cython syntax:
In [63]:
%load_ext cython
In [64]:
%%cython
def my_cythonized_loop(int n):
    cdef int i, j, x
    x = 0
    for i in range(int(n)):
        for j in range(int(n)):
            x += i + j
    # Return the total so the C compiler cannot optimize the loops away.
    return x
In [65]:
%timeit my_cythonized_loop(1000)
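One thing to keep in mind with cdef int is that the accumulator has a fixed width. The sketch below checks the nested loop against its closed form, n**2 * (n - 1), and finds the smallest n at which the total would no longer fit, assuming a 32-bit C int (typical, but platform dependent). The function names are made up for illustration.

```python
def expensive_loop_total(n):
    """Pure-Python reference: the value the nested loop accumulates."""
    x = 0
    for i in range(n):
        for j in range(n):
            x += i + j
    return x


def closed_form_total(n):
    """Sum of (i + j) over 0 <= i, j < n, which equals n**2 * (n - 1)."""
    return n * n * (n - 1)


# Assumption: the C int used by "cdef int" is 32 bits wide.
INT32_MAX = 2**31 - 1

# The closed form agrees with the loop on a few small sizes.
for n in (0, 1, 2, 10, 50):
    assert expensive_loop_total(n) == closed_form_total(n)

print(closed_form_total(1000))  # 999000000, which still fits in 32 bits

# Find the smallest n whose total exceeds the 32-bit limit.
first_overflow_n = 1
while closed_form_total(first_overflow_n) <= INT32_MAX:
    first_overflow_n += 1
print(first_overflow_n)  # 1291
```

So the n = 1000 timing above is safe, but for somewhat larger n the typed version would need a wider accumulator, for example cdef long long x.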
In [66]:
"""
The multiprocessing joke.
"""
from __future__ import print_function
import multiprocessing

def print_function(word):
    print(word, end=' ')

def tell_the_joke():

    print()
    print('Why did the parallel chicken cross the road?')
    answer = 'To get to the other side.'
    print()

    # Summon a pool to handle some number of processes.
    # Think of N as the number of processors you have.
    N = 2
    pool = multiprocessing.Pool(processes=N)

    # Prepare a list of function inputs:
    args = answer.split()

    # Pass the function, and its arguments, to the pool:
    pool.map(print_function, args)

    # Tell the pool that no more work is coming.
    pool.close()

    # Wait for the pool members to finish and exit.
    pool.join()

    print()
    print()

    return
In [67]:
tell_the_joke()
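Here each word is printed by whichever member of the pool of processors handles it, so the output is not collected in one place and may not appear where you expect.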
In [68]:
def new_function(word):
    return word+' '

def tell_the_joke_better():

    print()
    print('Why did the parallel chicken cross the road?')
    answer = 'To get to the other side.'
    print()

    # Summon a pool to handle some number of processes.
    # Leave processes unset to have multiprocessing guess!
    # Or measure it yourself:
    N = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=N)

    # Prepare a list of function inputs:
    args = answer.split()

    # Pass the function, and its arguments, to the pool:
    punchline = pool.map(new_function, args)

    # Tell the pool that no more work is coming.
    pool.close()

    # Wait for the pool members to finish and exit.
    pool.join()

    # Use the outputs of the function, which map() returns as a list
    # in the same order as the inputs:
    print(punchline)

    print()
    print()

    return
In [69]:
tell_the_joke_better()
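As a possible variation on the cell above, the pool can be used as a context manager, and the returned words joined back into a sentence. This is a sketch rather than the notebook's own approach: add_space and tell_the_joke_in_order are hypothetical names, and the __main__ guard is there because some platforms start workers by re-importing the main module.

```python
import multiprocessing


def add_space(word):
    # Runs in a worker process; the return value is sent back to the parent.
    return word + ' '


def tell_the_joke_in_order(answer, n_workers=None):
    words = answer.split()
    # processes=None lets Pool choose the worker count from the CPU count.
    # Leaving the with block calls pool.terminate(); that is safe here
    # because map() has already returned every result.
    with multiprocessing.Pool(processes=n_workers) as pool:
        pieces = pool.map(add_space, words)
    # map() returns results in input order, whichever worker handled them.
    return ''.join(pieces).strip()


if __name__ == '__main__':
    print('Why did the parallel chicken cross the road?')
    print(tell_the_joke_in_order('To get to the other side.'))
```

Because the parent process does all the printing, the punchline comes out in order regardless of which worker finished first.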