In [1]:
name = "2019-10-08-speedy-python"
title = "Speeding up python using tools from the standard library"
tags = "basics, optimisation, numpy, multiprocessing"
author = "Callum Rollo"
In [2]:
from nb_tools import connect_notebook_to_post
from IPython.core.display import HTML
html = connect_notebook_to_post(name, title, tags, author)
To demonstrate Python's performance, we'll use a short function:
In [3]:
import math
import numpy as np
from datetime import datetime
In [4]:
def cart2pol(x, y):
    r = np.sqrt(x**2 + y**2)
    phi = np.arctan2(y, x)
    return r, phi
As the name suggests, cart2pol converts a pair of Cartesian coordinates [x, y] to polar coordinates [r, phi].
In [5]:
from IPython.core.display import Image
Image(url='https://upload.wikimedia.org/wikipedia/commons/thumb/7/78/Polar_to_cartesian.svg/1024px-Polar_to_cartesian.svg.png',width=400)
Out[5]:
In [6]:
x = 3
y = 4
r, phi = cart2pol(x, y)
print(r, phi)
All well and good. However, what if we want to convert a list of cartesian coordinates to polar coordinates?
We could loop through both lists and perform the conversion for each x-y pair:
In [7]:
def cart2pol_list(list_x, list_y):
    # Prepare empty arrays for the r and phi values
    r = np.empty(len(list_x))
    phi = np.empty(len(list_x))
    # Loop through the lists of x and y, calculating the r and phi values
    for i in range(len(list_x)):
        r[i] = np.sqrt(list_x[i]**2 + list_y[i]**2)
        phi[i] = np.arctan2(list_y[i], list_x[i])
    return r, phi
In [8]:
x_list = np.sin(np.arange(0, 2 * np.pi, 0.1))
y_list = np.cos(np.arange(0, 2 * np.pi, 0.1))
These coordinates trace out a circle of radius 1 centered at [0, 0]:
In [9]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1, figsize=(6, 6))
ax.scatter(x_list, y_list)
Out[9]:
In [10]:
r_list, phi_list = cart2pol_list(x_list, y_list)
print(r_list)
print(phi_list)
This is a bit time-consuming to type out though. Surely there is a better way to make our functions work for lists of inputs?
Step forward np.vectorize:
In [11]:
cart2pol_vec = np.vectorize(cart2pol)
In [12]:
r_list_vec, phi_list_vec = cart2pol_vec(x_list, y_list)
Like magic! We can assure ourselves that these two methods produce the same answers:
In [13]:
print(r_list == r_list_vec)
print(phi_list == phi_list_vec)
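(A caveat of my own, not in the original post: the exact == comparison only passes here because both methods perform identical operations. When comparing floating point results produced by different code paths, np.allclose is the safer check.)

print(np.allclose(r_list, r_list_vec))    # tolerates tiny float differences that == would flag
print(np.allclose(phi_list, phi_list_vec))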
But how do they perform?
We can use IPython's %timeit magic to test this:
In [14]:
%timeit cart2pol_list(x_list, y_list)
In [15]:
%timeit cart2pol_vec(x_list, y_list)
It is significantly faster, both to write and to run, to use np.vectorize rather than manually looping through lists.
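One further point, not timed in the original post: np.vectorize is documented as essentially a for loop under the hood, provided for convenience rather than performance. Since cart2pol is built entirely from NumPy ufuncs (np.sqrt and np.arctan2), it already accepts whole arrays, so calling it directly is usually faster still:

# cart2pol is composed of ufuncs, so it handles arrays natively with no wrapper
r_direct, phi_direct = cart2pol(x_list, y_list)
print(np.allclose(r_direct, r_list), np.allclose(phi_direct, phi_list))
%timeit cart2pol(x_list, y_list)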
Another important consideration when code becomes computationally intensive is multiprocessing. Python normally runs on a single core, so you won't feel the full benefit of a quad-core or better machine. You can see this in a system monitor while a heavy section of code runs: only one core is fully loaded. To demonstrate the effect of multiprocessing we'll need some heftier maths:
In [16]:
def do_maths(start=0, num=10):
    pos = start
    big = 1000 * 1000
    ave = 0
    while pos < num:
        pos += 1
        val = math.sqrt((pos - big) * (pos - big))
        ave += val / num
    return int(ave)
In [17]:
t0 = datetime.now()
do_maths(num=30000000)
dt = datetime.now() - t0
print("Done in {:,.2f} sec.".format(dt.total_seconds()))
In [18]:
import multiprocessing
In [19]:
t0 = datetime.now()
pool = multiprocessing.Pool()
processor_count = multiprocessing.cpu_count()
# processor_count = 2  # we can tell Python to use a specific number of cores if desired
print(f"Computing with {processor_count} processor(s)")
tasks = []
for n in range(1, processor_count + 1):
    task = pool.apply_async(do_maths, (30000000 * (n - 1) / processor_count,
                                       30000000 * n / processor_count))
    tasks.append(task)
pool.close()
pool.join()
dt = datetime.now() - t0
print("Done in {:,.2f} sec.".format(dt.total_seconds()))
Note that you can recover the results stored in the task list with get(). The results come back in the same order in which you spawned the processes:
In [20]:
for t in tasks:
    print(t.get())
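As an aside (a sketch of an alternative, not used in this post): if all you want is the results back in order, Pool.starmap bundles the submit/close/join/get dance into a single blocking call:

# Pool.starmap blocks until every task finishes and returns the results
# in submission order, so no explicit get() loop is needed. The chunk
# boundaries mirror the apply_async example above.
chunk = 30000000 // processor_count
args = [(i * chunk, (i + 1) * chunk) for i in range(processor_count)]
with multiprocessing.Pool() as pool:
    results = pool.starmap(do_maths, args)
print(results)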
The structure of a multiprocessing call is:
In [21]:
pool = multiprocessing.Pool()  # make a pool ready to receive tasks
results = []  # empty list for results
for n in range(1, processor_count + 1):  # loop to assign a number of tasks
    result = pool.apply_async(function, (arguments))  # create a task from a function and its arguments
    results.append(result)  # append the AsyncResult for this task to the list
pool.close()  # tell the pool that no more tasks are coming
pool.join()   # wait for all the tasks to finish
for t in results:
    t.get()  # retrieve your results; you could print or assign each one to a variable
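To make that skeleton concrete, here is a minimal runnable sketch; the square function and its inputs are placeholders of my own, not part of the original example:

import multiprocessing

def square(n):
    # stand-in for a computationally heavy function
    return n * n

if __name__ == "__main__":  # guard needed where processes are spawned, e.g. on Windows
    pool = multiprocessing.Pool()  # make a pool ready to receive tasks
    results = []
    for n in range(8):
        results.append(pool.apply_async(square, (n,)))  # tasks start as soon as they are submitted
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all the workers to finish
    print([t.get() for t in results])  # [0, 1, 4, 9, 16, 25, 36, 49]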
If you have experience with other programming languages, you may wonder why we can't assign tasks to multiple threads to speed up execution. We are prevented from doing this by the Global Interpreter Lock (GIL). This is a lock on the interpreter which ensures that only one thread can be in a state of execution at any one time. It is essential to protect Python's reference-counting system, which keeps track of all of the objects in memory.
To get around this lock we spawn several processes, each with its own instance of the interpreter and its own allocated memory, so they cannot block one another or cause mischief with each other's references. There's a great summary of the GIL on the Real Python website here
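If you're curious to see the GIL in action, here is a sketch of my own (not from the original post) that runs the same kind of CPU-bound work first with threads and then with processes. Run it as a script, since process pools and notebook-defined functions don't always mix; the worker count and problem size are illustrative:

# Because of the GIL, the threaded run takes roughly as long as a serial
# run would, while the process-based run can use several cores at once.
import math
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def heavy(n):
    # CPU-bound busywork: no I/O, so threads gain nothing under the GIL
    return sum(math.sqrt(i) for i in range(n))

def timed_run(executor_cls, n_workers=4, n=5_000_000):
    t0 = time.perf_counter()
    with executor_cls(max_workers=n_workers) as ex:
        list(ex.map(heavy, [n] * n_workers))
    return time.perf_counter() - t0

if __name__ == "__main__":
    print(f"threads:   {timed_run(ThreadPoolExecutor):.2f} s")
    print(f"processes: {timed_run(ProcessPoolExecutor):.2f} s")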
tl;dr multithreading won't speed up your compute-heavy calculations, as only one thread can execute at any one time. Use multiprocessing instead
Multiprocessing example adapted from Talk Python To Me Training: async techniques
In [22]:
HTML(html)
Out[22]: