Python Numpy

NumPy is a numerical package used extensively in Python. You can install it with

pip install numpy

When you import a module, you can bind it to an alias. In the Python community, NumPy is usually imported like this:


In [1]:
import numpy as np

A numpy array

A NumPy array is a grid of values, all of the same type. The number of dimensions gives the rank of the array. To initialize a 1D array:


In [2]:
# 1D array
a = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97])

In [3]:
print(a.shape) # Return the ``shape`` of the array
print(a)


(25,)
[ 2  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89
 97]

In [4]:
# 2D array
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

In [5]:
print(b.shape) # Return the ``shape`` of the array
print(b)


(3, 3)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
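
As a small sketch of "all of the same type" and "rank", the attributes `ndim` and `dtype` can be inspected directly. The array `mixed` is a hypothetical example that is not part of the cells above; the integer dtype of `b` depends on the platform, so no output is shown for it.

```python
import numpy as np

a = np.array([2, 3, 5, 7])
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# ndim is the number of dimensions (the rank) of each array
print(a.ndim, b.ndim)   # 1 2

# dtype is the single element type shared by every value in the array
print(b.dtype)

# Mixing ints and floats upcasts every element to one common type
mixed = np.array([1, 2.5, 3])   # hypothetical example array
print(mixed.dtype)      # float64
```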

To read or change an element of the array, index it much as you would a list; for a 2D array, separate the row and column indices with a comma.


In [6]:
print(a[0], a[3], a[5])


2 7 13

In [7]:
print(b[0,0],b[1,1],b[2,2])


1 5 9
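
The cells above only read elements. A minimal sketch of changing them follows, using shortened copies of `a` and `b`. Note that assigning a float into an integer array silently drops the fractional part.

```python
import numpy as np

a = np.array([2, 3, 5, 7, 11])
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

a[0] = 100      # replace the first element of the 1D array
b[1, 1] = 50    # replace the centre element of the 2D array

# The array keeps its integer type, so 3.9 is truncated to 3
a[1] = 3.9

print(a)        # [100   3   5   7  11]
print(b[1])     # [ 4 50  6]
```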

Universal functions (ufunc)

A universal function (ufunc for short) is a function that operates on NumPy ndarrays element by element and is implemented in compiled C code. That is, a ufunc is a "vectorized" wrapper around a fast, compiled routine. The cells below compare Python's built-in sum with np.sum, which is built on the np.add ufunc.


In [8]:
x = range(100000)
sum(x)


Out[8]:
4999950000

In [9]:
y = np.array(x)
np.sum(y)


Out[9]:
4999950000

In [10]:
%timeit sum(x)


1.23 ms ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [11]:
%timeit np.sum(y)


40.7 µs ± 193 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
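
To make the element-by-element behaviour concrete, here is a short sketch using two ufuncs, `np.square` and `np.add`. It also shows that `np.sum` itself is an ordinary function layered on a ufunc reduction rather than a ufunc. The names `squares` and `shifted` are chosen for illustration only.

```python
import numpy as np

y = np.arange(5)                 # array([0, 1, 2, 3, 4])

squares = np.square(y)           # ufunc applied to every element
shifted = np.add(y, 10)          # equivalent to y + 10

print(squares)                   # [ 0  1  4  9 16]
print(shifted)                   # [10 11 12 13 14]

# np.square is a ufunc object; np.sum is a regular function
print(isinstance(np.square, np.ufunc))   # True
print(isinstance(np.sum, np.ufunc))      # False

# Reducing with the np.add ufunc gives the same total as np.sum
print(np.add.reduce(y), np.sum(y))       # 10 10
```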

Initialize arrays


In [12]:
# Create an array of all zeros
np.zeros((5, 5))


Out[12]:
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [13]:
# Create an array of all ones
np.ones((5,5))


Out[13]:
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [14]:
# Create a constant array
np.ones((5,5)) * 7


Out[14]:
array([[7., 7., 7., 7., 7.],
       [7., 7., 7., 7., 7.],
       [7., 7., 7., 7., 7.],
       [7., 7., 7., 7., 7.],
       [7., 7., 7., 7., 7.]])

In [15]:
np.full((5,5), 7)


Out[15]:
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

In [16]:
# Create a 3x3 identity matrix
np.eye(3)


Out[16]:
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
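
The two constant arrays above print differently (`7.` versus `7`) because they have different element types. The sketch below, on smaller 2x2 arrays, shows the difference and how the `dtype` argument of `np.full` can request floats explicitly.

```python
import numpy as np

from_ones = np.ones((2, 2)) * 7              # np.ones defaults to float64
from_full = np.full((2, 2), 7)               # dtype inferred from the fill value 7
as_float = np.full((2, 2), 7, dtype=float)   # ask for floats explicitly

print(from_ones.dtype)   # float64
print(from_full.dtype)   # an integer type (exact width depends on the platform)
print(as_float.dtype)    # float64
```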

Pseudo-random number generators


In [17]:
# Create an array filled with random values in the half-open interval [0, 1)
np.random.random((3,2))


Out[17]:
array([[0.09050258, 0.20141557],
       [0.59200162, 0.58821197],
       [0.16836812, 0.55997005]])

In [18]:
# Create a 1D array of 10 random integers from 1 (inclusive) to 1000 (exclusive)
np.random.randint(1, 1000, 10)


Out[18]:
array([978, 583, 984, 442, 880, 818, 627, 180, 999, 766])

In [19]:
# Seeding the generator gives the same "random" numbers on every run
# We use the answer to the Ultimate Question of Life, the Universe, and Everything as the seed
np.random.seed(42)

In [20]:
# Create a 1D array of 10 random integers from 1 (inclusive) to 1000 (exclusive)
np.random.randint(1, 1000, 10)


Out[20]:
array([103, 436, 861, 271, 107,  72, 701,  21, 615, 122])
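
A quick way to check the effect of seeding is to reseed and draw again, as sketched below. The second half uses `np.random.default_rng`, the newer Generator interface available in recent NumPy versions; it is shown only as an alternative and is not what the cells above use.

```python
import numpy as np

# Reseeding with the same value replays the same sequence
np.random.seed(42)
first = np.random.randint(1, 1000, 10)
np.random.seed(42)
second = np.random.randint(1, 1000, 10)
print(np.array_equal(first, second))   # True

# Alternative: a self-contained generator object with its own seed
rng = np.random.default_rng(42)
draws = rng.integers(1, 1000, size=10)  # upper bound is exclusive by default
print(draws.shape)                      # (10,)
```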

Array indexing


In [21]:
### Slicing

In [22]:
e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
e


Out[22]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [23]:
# Each dimension is sliced like a list: here the 1st dimension selects everything
# and the 2nd dimension selects from index 1 to the end.
e[:,1:]


Out[23]:
array([[2, 3],
       [5, 6],
       [8, 9]])

In [24]:
# Here both dimensions select from the start up to (but not including) index 2,
# i.e. the upper-left 2x2 block of the array.
e[:2,:2]


Out[24]:
array([[1, 2],
       [4, 5]])
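
One difference from list slicing is worth a sketch: a basic slice of a NumPy array is a view onto the same data, not a copy, so writing through the slice changes the original. The names `corner` and `safe` are illustrative only.

```python
import numpy as np

e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

corner = e[:2, :2]     # a view sharing memory with e
corner[0, 0] = 100
print(e[0, 0])         # 100

safe = e[:2, :2].copy()  # an independent copy
safe[0, 0] = -1
print(e[0, 0])         # 100
```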

In [25]:
# Comparing a NumPy array with a value returns a boolean array
e > 5


Out[25]:
array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]])

In [26]:
# Indexing the array with that boolean array selects all elements that satisfy the condition
e[e > 5]


Out[26]:
array([6, 7, 8, 9])
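
Boolean arrays can also be combined and used on the left-hand side of an assignment. The following is a minimal sketch on the same array `e`; the condition "even and greater than 2" is a made-up example.

```python
import numpy as np

e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Combine conditions with & (and) or | (or); the parentheses are required
mask = (e > 2) & (e % 2 == 0)
print(e[mask])        # [4 6 8]

# Assigning through a boolean mask changes only the selected elements
e[e > 5] = 0
print(e)
# [[1 2 3]
#  [4 5 0]
#  [0 0 0]]
```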

Summary

NumPy arrays are powerful tools for numerical calculations. Many ufuncs have been written to optimize numerical operations, with speed close to that of C because the underlying routines are written in C.