Efficient Numerical Code With Numpy Vectorization

C and fortran paradigm: For loops

  • Python loops have python overhead
  • Numpy operations are lower level

Ex: adding two arrays

n = 100
x = np.random.rand(n)
y = np.random.rand(n)
f = []
for i in range(100):
    f.append(x[i] + y[i])

Numpy paradigm: Array operations

  • Useful when performing the same operation to many numerical pieces of data.
  • Ex: Adding all the elements from two lists of numbers
x = np.rand(100)
y = np.rand(100)
f = x + y

Simple operations

  • Addition two lists of numbers
  • numpy array functions
    • np.sum(), np.cumsum(), np.exp()

Some Numpy version have additional optimizations:

Under the hood, complex operations will automatically be run on multiple processors.

  • SIMD
  • Intel MKL
  • To get these benefits in anaconda:
conda install mkl

Some Types of Numerical Data Structures

  • Grid data - possibly regularly spaced
  • Point data

Diffusion Example


In [ ]:
# run these functions in iPython to see some 
from diffusion import *

%timeit diff1d_loop(n_x=1000, plotUpdate=False, showPlots=False)
%timeit diff1d_vec(n_x=1000, plotUpdate=False, showPlots=False)


1 loops, best of 3: 373 ms per loop
100 loops, best of 3: 7.43 ms per loop

In [ ]:
%timeit diff2d_loop(n_x=100, plotUpdate=False, showPlots=False)
%timeit diff2d_vec(n_x=100, plotUpdate=False, showPlots=False)

Monte Carlo Example

  • Gaussian motion
  • List of Particles

In [ ]:
from motion_points import *

%timeit simulateParticles_loop(showPlots=False)
%timeit simulateParticles_vec(showPlots=False)

Data reshaping

Often data is not in correct shape for easily performing array manipulations

  • Useful array member functions:
    • .T
    • .ravel()
    • .reshape(newShape)
  • Array views
  • Rotating 2-D grid of points
  • Rotating 2-D list of points from gaussian motion

In [ ]:
x = np.random.rand(10, 100)
y = x.ravel()
print x.T.shape
print y.shape
print y.reshape(x.shape).shape
print 'x flags:\n', x.flags
print 'y flags:\n', y.flags

In [4]: