Losing your Loops

Fast Numerical Computing with NumPy

Python is Fast

For Writing, Testing, and Developing of Code

# Hello World in Python
print("hello world")
/* Hello World in Java */
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}

In [ ]:
%matplotlib inline

In [ ]:
import seaborn as sns
data = sns.load_dataset("iris")
sns.pairplot(data, hue="species");

Python is Slow

Compared to compiled languages


In [ ]:
# A silly function implemented in Python

def func_python(N):
    d = 0.0
    for i in range(N):
        d += (i % 3 - 1) * i
    return d

In [ ]:
# Use IPython timeit magic to time the execution
%timeit func_python(10000)

Fortran version:


In [ ]:
%load_ext fortranmagic

In [ ]:
%%fortran
subroutine func_fort(n, d)
    integer, intent(in) :: n
    double precision, intent(out) :: d
    integer :: i
    d = 0
    do i = 0, n - 1
        d = d + (mod(i, 3) - 1) * i
    end do
end subroutine func_fort

In [ ]:
%%file func_fortran.f

      subroutine func_fort(n, d)
           integer, intent(in) :: n
           double precision, intent(out) :: d
           integer :: i
           d = 0
           do i = 0, n - 1
                d = d + (mod(i, 3) - 1) * i
           end do
      end subroutine func_fort

In [ ]:
# use f2py rather than f2py3 for Python 2
!f2py3 -c func_fortran.f -m func_fortran > /dev/null

In [ ]:
from func_fortran import func_fort

In [ ]:
%timeit func_fort(10000)

Outline

  1. Use Numpy ufuncs to your advantage

  2. Use Numpy aggregates to your advantage

  3. Use Numpy broadcasting to your advantage

  4. Use Numpy slicing and masking to your advantage

  5. Use a tool like SWIG, Weave, cython, f2py, Numba, etc. to compile Python or to interface to compiled code.

Fortran is about 100 times faster for this task!

Why is Python so slow?

We alluded to this yesterday, but languages tend to have a compromise between convenience and performance.

  • C, Fortran, etc.: static typing and compiled code leads to fast execution

    • But: lots of development overhead in declaring variables, no interactive prompt, etc.
  • Python, R, Matlab, IDL, etc.: dynamic typing and interpreted excecution leads to fast development

    • But: lots of execution overhead in dynamic type-checking, etc.

We like Python because our development time is generally more valuable than execution time. But sometimes speed can be an issue.

Strategies for making Python fast

  1. Use Numpy ufuncs to your advantage

  2. Use Numpy aggregates to your advantage

  3. Use Numpy broadcasting to your advantage

  4. Use Numpy slicing and masking to your advantage

  5. Use a tool like SWIG, cython or f2py to interface to compiled code.

Here we'll cover the first four, and leave the fifth strategy for a later session.

Strategy 1: Use ufuncs to your advantage

A ufunc in numpy is a Universal Function. This is a function which operates element-wise on an array. We've already seen examples of these in the various arithmetic operations:


In [ ]:
a = [1, 3, 2, 4, 3, 1, 4, 2]
b = [val + 5 for val in a]
print(b)

In [ ]:
import numpy as np
a = np.array(a)

In [ ]:
b = a + 5  # element-wise
print(b)

The speed of ufuncs


In [ ]:
a = list(range(100000))
%timeit [val + 5 for val in a]

In [ ]:
a = np.array(a)
%timeit a + 5

Other ufuncs

There are many, many ufuncs available:

  • Arithmetic: + - * / // % **
  • Bitwise Operations: & | ~ ^ >> <<
  • Comparisons: < > <= >= == !=
  • Trig Functions: np.sin np.cos np.tan ...etc.
  • Exponential Family: np.exp np.log np.log10 ...etc.
  • Special Functions: scipy.special.*

and many, many more.

Strategy 2. use aggregations to your advantage

Aggregations are functions over arrays which return smaller arrays.

Suppose you want to compute the minimum of an array


In [ ]:
from random import random
c = [random() for i in range(100000)]

In [ ]:
%timeit min(c)

In [ ]:
c = np.array(c)

In [ ]:
%timeit c.min()

Aggregates along axes


In [ ]:
M = np.random.randint(0, 10, (3, 5))
M

In [ ]:
M.sum()

In [ ]:
M.sum(axis=0)

In [ ]:
M.sum(axis=1)

Other Aggregation Functions

Numpy has many useful aggregation functions:

np.min np.max np.sum np.prod np.mean np.std np.var np.any np.all np.median np.percentile np.argmin np.argmax

Most also have a NaN-aware equivalent:

np.nanmin np.nanmax np.nansum ...etc.

Strategy 3: Use Broadcasting to your advantage

Broadcasting in NumPy is the set of rules for applying ufuncs on arrays of different sizes and/or dimensions.


In [ ]:
np.arange(3) + 5

In [ ]:
np.ones((3, 3)) + np.arange(3)

In [ ]:
np.arange(3).reshape((3, 1)) + np.arange(3)

Visualizing Broadcasting

Rules of Broadcasting

  1. If array dimensions differ, left-pad the smaller shape with 1s

  2. If any dimension does not match, stretch the dimension with size=1

  3. If neither non-matching dimension is 1, raise an error.

Examples

Example 1


In [ ]:
M = np.ones((2, 3))
M

In [ ]:
a = np.arange(3)
a

In [ ]:
M + a

Example 2


In [ ]:
a = np.arange(3).reshape((3, 1))
a

In [ ]:
b = np.arange(3)
b

In [ ]:
a + b

Example 3


In [ ]:
M = np.ones((3, 2))
M

In [ ]:
a = np.arange(3)
a

In [ ]:
M + a

Strategy 4: Use slicing and masking to your advantage

The last strategy we will cover is slicing and masking.

Python lists can be indexed with integers or slices:


In [ ]:
L = [2, 3, 5, 7, 11]

In [ ]:
L[0]  # integer index

In [ ]:
L[1:3]  # slice for multiple elements

NumPy arrays are like lists


In [ ]:
L = np.array(L)
L

In [ ]:
L[0]

In [ ]:
L[1:3]

Masking


In [ ]:
L

In [ ]:
mask = np.array([False, True, True,
                 False, True])
L[mask]

In [ ]:
mask = (L < 4) | (L > 8) # "|" = "bitwise OR"
L[mask]

Fancy Indexing


In [ ]:
L

In [ ]:
ind = [0, 4, 2]
L[ind]

Multiple Dimensions


In [ ]:
M = np.arange(6).reshape(2, 3)
M

In [ ]:
# multiple indices separated by comma
M[0, 1]

In [ ]:
# mixing slices and indices
M[:, 1]

In [ ]:
# masking the full array
M[abs(M - 3) < 2]

In [ ]:
# mixing fancy indexing and slicing
M[[1, 0], :2]

In [ ]:
# mixing masking and slicing 
M[M.sum(axis=1) > 4, 1:]

In [ ]:

Putting it All Together

Nearest Neighbors of some data


In [ ]:
# 1000 points in 3 dimensions
X = np.random.random((1000, 3))
X.shape

In [ ]:
# Broadcasting to find pairwise differences
diff = X.reshape(1000, 1, 3) - X
diff.shape

In [ ]:
# Aggregate to find pairwise distances
D = (diff ** 2).sum(2)
D.shape

In [ ]:
# set diagonal to infinity to skip self-neighbors
i = np.arange(1000)
D[i, i] = np.inf

In [ ]:
# print the indices of the nearest neighbor
i = np.argmin(D, 1)
print(i[:10])

In [ ]:
# double-check with scikit-learn
from sklearn.neighbors import NearestNeighbors
d, i = NearestNeighbors().fit(X).kneighbors(X, 2)
print(i[:10, 1])

Summary: Speeding up NumPy

It's all about moving loops into compiled code:

  1. Use Numpy ufuncs to your advantage (eliminate loops!)

  2. Use Numpy aggregates to your advantage (eliminate loops!)

  3. Use Numpy broadcasting to your advantage (eliminate loops!)

  4. Use Numpy slicing and masking to your advantage (eliminate loops!)

  5. Use a tool like SWIG, cython or f2py to interface to compiled code.


In [ ]: