EuroSciPy 2018: NumPy tutorial

Let's do some slicing


In [ ]:
mylist = list(range(10))
print(mylist)

Use slicing to produce the following outputs:

[2, 3, 4, 5]


In [ ]:

[0, 1, 2, 3, 4]


In [ ]:

[6, 7, 8, 9]


In [ ]:

[0, 2, 4, 6, 8]


In [ ]:

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]


In [ ]:

[7, 5, 3]


In [ ]:

Matrices and lists of lists


In [ ]:
matrix = [[0, 1, 2],
          [3, 4, 5],
          [6, 7, 8]]

Get the second row by slicing twice


In [ ]:

Try to get the second column by slicing. Do not use a list comprehension!


In [ ]:

Getting started

Import the NumPy package


In [ ]:

Create an array


In [ ]:
np.lookfor('create array')

In [ ]:
help(np.array)

The variable matrix contains a list of lists. Turn it into an ndarray and assign it to the variable myarray. Verify that its type is correct.


In [ ]:

For practicing purposes, arrays can conveniently be created with the arange method.


In [ ]:
myarray1 = np.arange(6)
myarray1

In [ ]:
def array_attributes(a):
    for attr in ('ndim', 'size', 'itemsize', 'dtype', 'shape', 'strides'):
        print('{:8s}: {}'.format(attr, getattr(a, attr)))

In [ ]:
array_attributes(myarray1)

Data types

Use np.array() to create arrays containing

  • floats
  • complex numbers
  • booleans
  • strings

and check the dtype attribute.


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Do you understand what is happening in the following statement?


In [ ]:
np.arange(1, 160, 10, dtype=np.int8)

Strides


In [ ]:
myarray2 = myarray1.reshape(2, 3)
myarray2

In [ ]:
array_attributes(myarray2)

In [ ]:
myarray3 = myarray1.reshape(3, 2)

In [ ]:
array_attributes(myarray3)

Views

Set the first entry of myarray1 to a new value, e.g. 42.


In [ ]:

What happened to myarray2?


In [ ]:


In [ ]:

What happens when a matrix is transposed?


In [ ]:
a = np.arange(9).reshape(3, 3)
a

In [ ]:
a.T

Check the strides!


In [ ]:
a.strides

In [ ]:
a.T.strides

View versus copy

identical object


In [ ]:
a = np.arange(4)
b = a
id(a), id(b)

view: a different object working on the same data


In [ ]:
b = a[:]
id(a), id(b)

In [ ]:
a[0] = 42
a, b

an independent copy


In [ ]:
a = np.arange(4)
b = np.copy(a)
id(a), id(b)

In [ ]:
a[0] = 42
a, b

Some array creation routines

numerical ranges

arange(start, stop, step), stop is not included in the array


In [ ]:
np.arange(5, 30, 5)

arange resembles range, but also works for floats

Create the array [1, 1.1, 1.2, 1.3, 1.4, 1.5]


In [ ]:

linspace(start, stop, num) determines the step to produce num equally spaced values, stop is included by default

Create the array [1., 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.]


In [ ]:

For equally spaced values on a logarithmic scale, use logspace.


In [ ]:
np.logspace(-2, 2, 5)

In [ ]:
np.logspace(0, 4, 9, base=2)

Application


In [ ]:
import matplotlib.pyplot as plt

In [ ]:
%matplotlib inline

In [ ]:
x = np.linspace(0, 10, 100)
y = np.cos(x)

In [ ]:
plt.plot(x, y)

Homogeneous data


In [ ]:
np.zeros((4, 4))

Create a 4x4 array with integer zeros


In [ ]:


In [ ]:
np.ones((2, 3, 3))

Create a 3x3 array filled with tens


In [ ]:

Diagonal elements


In [ ]:
np.diag([1, 2, 3, 4])

diag has an optional argument k. Try to find out what its effect is.


In [ ]:

Replace the 1d array by a 2d array. What does diag do?


In [ ]:


In [ ]:
np.info(np.eye)

Create the 3x3 array

[[2, 1, 0], [1, 2, 1], [0, 1, 2]]


In [ ]:

Random numbers


In [ ]:
np.random.rand(5, 2)

In [ ]:
np.random.seed(1234)
np.random.rand(5, 2)

In [ ]:
data = np.random.rand(20, 20)
plt.imshow(data, cmap=plt.cm.hot, interpolation='none')
plt.colorbar()

In [ ]:
casts = np.random.randint(1, 7, (100, 3))
plt.hist(casts, np.linspace(0.5, 6.5, 7))

Indexing and slicing

1d arrays


In [ ]:
a = np.arange(10)

Create the array [7, 8, 9]


In [ ]:

Create the array [2, 4, 6, 8]


In [ ]:

Create the array [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]


In [ ]:

Higher dimensions


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Fancy indexing ‒ Boolean mask


In [ ]:
a = np.arange(40).reshape(5, 8)

In [ ]:
a %3 == 0

In [ ]:
a[a %3 == 0]

Application: sieve of Eratosthenes


In [ ]:
nmax = 50
integers = np.arange(nmax)
is_prime = np.ones(nmax, dtype=bool)
is_prime[:2] = False
for j in range(2, int(np.sqrt(nmax))+1):
    if is_prime[j]:
        print(integers[is_prime])
        is_prime[j*j::j] = False
print(integers[is_prime])

Axes

Create an array and calculate the sum over all elements


In [ ]:

Now calculate the sum along axis 0 ...


In [ ]:

and now along axis 1


In [ ]:

Identify the axis in the following array


In [ ]:
a = np.arange(24).reshape(2, 3, 4)
a

Axes in more than two dimensions

Create a three-dimensional array


In [ ]:

Produce a two-dimensional array by cutting along axis 0 ...


In [ ]:

and axis 1 ...


In [ ]:

and axis 2


In [ ]:

What do you get by simply using the index [0]?


In [ ]:

What do you get by using [..., 0]?


In [ ]:

Exploring numerical operations


In [ ]:
a = np.arange(4)
b = np.arange(4, 8)
a, b

In [ ]:
a+b

In [ ]:
a*b

Operations are elementwise. Check this by multiplying two 2d array...


In [ ]:

... and now do a real matrix multiplication


In [ ]:

Application: Random walk


In [ ]:
length_of_walk = 10000
realizations = 5
angles = 2*np.pi*np.random.rand(length_of_walk, realizations)
x = np.cumsum(np.cos(angles), axis=0)
y = np.cumsum(np.sin(angles), axis=0)
plt.plot(x, y)
plt.axis('scaled')

In [ ]:
plt.plot(np.hypot(x, y))

In [ ]:
plt.plot(np.mean(x**2+y**2, axis=1))
plt.axis('scaled')

Let's check the speed


In [ ]:
%%timeit a = np.arange(1000000)
a**2

In [ ]:
%%timeit xvals = range(1000000)
[xval**2 for xval in xvals]

In [ ]:
%%timeit a = np.arange(100000)
np.sin(a)

In [ ]:
import math

In [ ]:
%%timeit xvals = range(100000)
[math.sin(xval) for xval in xvals]

Broadcasting


In [ ]:
a = np.arange(12).reshape(3, 4)
a

In [ ]:
a+1

In [ ]:
a+np.arange(4)

In [ ]:
a+np.arange(3)

In [ ]:
np.arange(3)

In [ ]:
np.arange(3).reshape(3, 1)

In [ ]:
a+np.arange(3).reshape(3, 1)

In [ ]:
%%timeit a = np.arange(10000).reshape(100, 100); b = np.ones((100, 100))
a+b

In [ ]:
%%timeit a = np.arange(10000).reshape(100, 100)
a+1

Create a multiplication table for the numbers from 1 to 10 starting from two appropriately chosen 1d arrays.


In [ ]:

As an alternative to reshape one can add additional axes with newaxes:


In [ ]:
a = np.arange(5)
b = a[:, np.newaxis]

Check the shapes.


In [ ]:

Functions of two variables


In [ ]:
x = np.linspace(-40, 40, 200)
y = x[:, np.newaxis]
z = np.sin(np.hypot(x-10, y))+np.sin(np.hypot(x+10, y))
plt.imshow(z, cmap='viridis')

In [ ]:
x, y = np.mgrid[-10:10:0.1, -10:10:0.1]

In [ ]:
x

In [ ]:
y

In [ ]:
plt.imshow(np.sin(x*y))

In [ ]:
x, y = np.mgrid[-10:10:50j, -10:10:50j]

In [ ]:
x

In [ ]:
y

In [ ]:
plt.imshow(np.arctan2(x, y))

It is natural to use broadcasting. Check out what happens when you replace mgrid by ogrid.


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Application: Mandelbrot set


In [ ]:
#
# put code here to create a Boolean array which contains True if a point
# belongs to the Mandelbrot set
#
# reasonable values: 50 iterations, threshold at 100, and a 300x300 grid
# but feel free to choose other values
#
plt.imshow(imdata, cmap='gray')

Application: π from random numbers

Create an array of random numbers and determine the fraction of points with distance from the origin smaller than one. Determine an approximation for π.


In [ ]:

Linear Algebra in NumPy


In [ ]:
import numpy.linalg as LA

In [ ]:
a = np.arange(4).reshape(2, 2)
eigenvalues, eigenvectors = LA.eig(a)
eigenvalues

In [ ]:
eigenvectors

Explore whether the eigenvectors are the rows or the columns.


In [ ]:

Try out eigvals and other methods offered by linalg which your are interested in


In [ ]:


In [ ]:


In [ ]:

Determine the eigenvalue larger than one appearing in the Fibonacci problem. Verify the result by calculating the ratio of successive Fibonacci numbers. Do you recognize the result?


In [ ]:

Application: Brownian motion

  1. Simulate several trajectories for a one-dimensional Brownian motion
    Hint: np.random.choice
  2. Plot the mean distance from the origin as a function of time
  3. Plot the variance of the trajectories as a function of time

In [ ]:

Application: identify entry closest to ½

Create a 2d array containing random numbers and generate a vector containing for each row the entry closest to one-half.


In [ ]:

Polynomials


In [ ]:
from numpy.polynomial import polynomial as P

Powers increase from left to right (index corresponds to power)


In [ ]:
p1 = P.Polynomial([1, 2])

In [ ]:
p1.degree()

In [ ]:
p1.roots()

In [ ]:
p4 = P.Polynomial([24, -50, 35, -10, 1])

In [ ]:
p4.degree()

In [ ]:
p4.roots()

In [ ]:
p4.deriv()

In [ ]:
p4.integ()

In [ ]:
P.polydiv(p4.coef, p1.coef)

Application: polynomial fit


In [ ]:

Application: image manipulation


In [ ]:
from scipy import misc
face = misc.face(gray=True)
face

In [ ]:
plt.imshow(face, cmap=plt.cm.gray)

Modify this image, e.g. convert it to a black and white image, put a black frame, change the contrast, ...


In [ ]:


In [ ]:


In [ ]:


In [ ]: