Python Numpy Tutorial

Enes Kemal Ergin

The advantages of Core Python:

  • High-level number objects: integers, floating point
  • Containers: Lists, with cheap insertion and append methods, dictionaries with fast look-up

Adventages of using Numpy with Python:

  • Array oriente computing
  • Efficiently implemented multi-dimensional arrays
  • Designed for scientific computation

One of the main adventages of NumPy is its advantage in time compared to standart Python.


In [17]:
import numpy as np
import time
size_of_vec = 1000
def pure_python_version():
    t1 = time.time()
    x = range(size_of_vec)
    y = range(size_of_vec)
    z = []
    for i in range(len(x)):
        z.append(x[i] + y[i])
    return time.time() - t1

def numpy_version():
    t1 = time.time()
    x=np.arange(size_of_vec)
    y = np.arange(size_of_vec)
    z = x + y
    return time.time() - t1

In [18]:
t1 = pure_python_version()
t2 = numpy_version()
print(t1, t2)
print("NumPy is in this example " + str(t1/t2) + " faster!")


(0.0009779930114746094, 6.699562072753906e-05)
NumPy is in this example 14.5978647687 faster!

Multi-dimensional arrays in numpy is possible in numpy. There are zero, one, and multi dimensional types in numpy.

Zero-dimensionl Arrays in Numpy

Zero-dimensional arrays are called scalars.


In [19]:
x = np.array(42) # Create a scalar array scalar 42
print type(x) # Print the type of the x
print np.ndim(x) # Print the dimension of the array


<type 'numpy.ndarray'>
0

One-Dimensional Arrays

One dimensional arrays are called vectors. Arrays are containers of the same type.


In [20]:
F = np.array([1, 1, 2, 3, 5, 8, 13, 21]) # Create an array with integer elements
V = np.array([3.4, 6.9, 99.8, 12.8]) # Create an array with float elements

print F.dtype # Get the type of a homogenous vector
print V.dtype # Get the type of a homogenous vector

print np.ndim(F) # Print the dimension of F
print np.ndim(V) # Print the dimension of V


int64
float64
1
1

Multi-Dimensional Arrays

Arrays of numpy are not limited to one dimension. We create them by passing nested lists to the array method of numpy.


In [21]:
A = np.array([ [3.4, 8.7, 9.9],
               [1.1, -7.8, -0.7],
               [4.1, 12.3, 4.8]])
print A
print A.ndim


[[  3.4   8.7   9.9]
 [  1.1  -7.8  -0.7]
 [  4.1  12.3   4.8]]
2

In [22]:
B = np.array([ [[111, 112], [121, 122]],
               [[211, 212], [221, 222]],
               [[311, 312], [321, 322]] ])
print B
print B.ndim


[[[111 112]
  [121 122]]

 [[211 212]
  [221 222]]

 [[311 312]
  [321 322]]]
3

Shape of an Array

The function shape returns the shape of an array. Returns the shape as tuple of integers


In [23]:
x = np.array([ [67, 63, 87],
               [77, 69, 59],
               [85, 87, 99],
               [79, 72, 71],
               [63, 89, 93],
               [68, 92, 78] ]) # Create an array
print(np.shape(x)) # Print the shape of array created
print(x.shape) # Also possible to print shape as array property


(6, 3)
(6, 3)

In [24]:
x = np.array(11) # Create an scalar 
print(np.shape(x)) # Which will return an empty tuple


()

In [25]:
B = np.array([ [[111, 112], [121, 122]],
               [[211, 212], [221, 222]],
               [[311, 312], [321, 322]] ]) # Create 3D matrix
print(B.shape) # And give the shape


(3, 2, 2)

Indexing and Slicing

Indexing and slicing is very similar to core python data structures. We have also many options to indexing, which makes indexing in Numpy very powerful.

Single indexing is the way, you will most probably expect it:


In [26]:
F = np.array([1, 1, 2, 3, 5, 8, 13, 21]) # Create a vector
print(F[0]) # print the first element of F, i.e. the element with the index 0
print(F[-1]) # print the last element of F
B = np.array([ [[111, 112], [121, 122]],
               [[211, 212], [221, 222]],
               [[311, 312], [321, 322]] ]) # Create 3D matrix
print(B[0][1][0])


1
21
121

In [27]:
A = np.array([ [3.4, 8.7, 9.9], 
               [1.1, -7.8, -0.7],
               [4.1, 12.3, 4.8]]) # Create a multi dimensional array
print(A[1][0]) # Access the element in second row and first column


1.1

There is another way to access elements of multidimensional arrays in numpy. We use only one pair of square brackets and all the indices are separated by commas:


In [28]:
print(A[1, 0]) # Access the element in second row and first column


1.1

You have to be aware of the fact, that the second way is more efficient. In the first case, we create an intermediate array A[1] from which we access the element with the index 0. So it behaves similar to this:


In [29]:
tmp = A[1] # Assign the second row as a vector
print(tmp) # Print the temporary
print(tmp[0]) # Print the first element


[ 1.1 -7.8 -0.7]
1.1

The general syntax of indexing for a one-dimensional array A looks like this:

A[start:stop:step]


In [30]:
S = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) # Create a vector
print(S[2:5]) 
print(S[:4])
print(S[6:])
print(S[:])


[2 3 4]
[0 1 2 3]
[6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]

Multidimensional slicing in the following examples. The ranges for each dimension are separated by commas:


In [31]:
A = np.array([
            [11,12,13,14,15],
            [21,22,23,24,25],
            [31,32,33,34,35],
            [41,42,43,44,45],
            [51,52,53,54,55]]) # Create a multi-dimensional array
print(A[:3,2:]) # Slice first 3 row, and columns except first, second columns
print ("-")*60
print(A[3:,:]) # Slice the last 2 row
print ("-")*60
print(A[:,4:]) # Slice the last column
print ("-")*60


[[13 14 15]
 [23 24 25]
 [33 34 35]]
------------------------------------------------------------
[[41 42 43 44 45]
 [51 52 53 54 55]]
------------------------------------------------------------
[[15]
 [25]
 [35]
 [45]
 [55]]
------------------------------------------------------------

The following two examples use the third parameter "step". The reshape module is used to construct the two-dimensional array.


In [32]:
X = np.arange(28).reshape(4,7) # Create a multi-dimensional array with reshape function
X # Print the array


Out[32]:
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27]])

In [33]:
print(X[::2, ::3]) # 
print ("-")*60
print(X[::, ::3])


[[ 0  3  6]
 [14 17 20]]
------------------------------------------------------------
[[ 0  3  6]
 [ 7 10 13]
 [14 17 20]
 [21 24 27]]

Attention: Whereas slicings on lists and tuples create new objects, a slicing operation on an array creates a view on the original array. So we get a another way to access the array, or better a part of the array. From this follows that if we modify a view, the original array will be modified as well. If you want to check, if two array names share the same memory block, you can use the function np.may_share_memory.

may_share_memory(A, B)

To determine if two arrays can share memory the memory-bounds of A and B are computed. The function returns True, if they overlap and False otherwise.


In [34]:
# In NumPy
A = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
S = A[2:6]
S[0] = 22
S[1] = 23
print(A)
print ("-")*60
# Same thing in lists
lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
lst2 = lst[2:6]
lst2[0] = 22
lst2[1] = 23
print(lst)


[ 0  1 22 23  4  5  6  7  8  9]
------------------------------------------------------------
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

flatten Function

flatten is a ndarry method with an optional keyword parameter "order". order can have the values "C", "F" and "A". The default of order is "C". "C" means to flatten C style in row-major ordering, i.e. the rightmost index "changes the fastest" or in other words: In row-major order, the row index varies the slowest, and the column index the quickest, so that a[0,1] follows [0,0].


In [35]:
A = np.array([[[ 0,  1],
               [ 2,  3],
               [ 4,  5],
               [ 6,  7]],
              [[ 8,  9],
               [10, 11],
               [12, 13],
               [14, 15]],
              [[16, 17],
               [18, 19],
               [20, 21],
               [22, 23]]])
Flattened_X = A.flatten() 
print(Flattened_X)
print(A.flatten(order="C"))
print(A.flatten(order="F"))
print(A.flatten(order="A"))


[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  8 16  2 10 18  4 12 20  6 14 22  1  9 17  3 11 19  5 13 21  7 15 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

ravel function

The order of the elements in the array returned by ravel() is normally "C-style".

ravel(a, order='C')

ravel returns a flattened one-dimensional array. A copy is made only if needed.

The optional keyword parameter "order" can be 'C','F', 'A', or 'K'

'C': C-like order, with the last axis index changing fastest, back to the first axis index changing slowest. "C" is the default!

'F': Fortran-like index order with the first index changing fastest, and the last index changing slowest.

'A': Fortran-like index order if the array "a" is Fortran contiguous in memory, C-like order otherwise.

'K': read the elements in the order they occur in memory, except for reversing the data when strides are negative.


In [36]:
print(A.ravel())
print(A.ravel(order="A"))
print(A.ravel(order="F"))
print(A.ravel(order="A"))
print(A.ravel(order="K"))


[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  8 16  2 10 18  4 12 20  6 14 22  1  9 17  3 11 19  5 13 21  7 15 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

reshape function

The method reshape() gives a new shape to an array without changing its data, i.e. it returns a new array with a new shape.

reshape(a, newshape, order='C')

Parameter | Description |

a | array_like, Array to be reshaped.|

newshape | int or tuple of ints|

order |'C', 'F', 'A', like in flatten or ravel |


In [37]:
X = np.array(range(24))
Y = X.reshape((3,4,2))
Y


Out[37]:
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19],
        [20, 21],
        [22, 23]]])

Concatenating Arrays


In [38]:
x = np.array([11,22])
y = np.array([18,7,6])
z = np.array([1,3,5])
c = np.concatenate((x,y,z))
print(c)


[11 22 18  7  6  1  3  5]

If we are concatenating multidimensional arrays, we can concatenate the arrays according to axis. Arrays must have the same shape to be concatenated with concatenate(). In the case of multidimensional arrays, we can arrange them according to the axis. The default value is axis = 0:


In [39]:
x = np.array(range(24))
x = x.reshape((3,4,2))
y = np.array(range(100,124))
y = y.reshape((3,4,2))
z = np.concatenate((x,y))
print(z)


[[[  0   1]
  [  2   3]
  [  4   5]
  [  6   7]]

 [[  8   9]
  [ 10  11]
  [ 12  13]
  [ 14  15]]

 [[ 16  17]
  [ 18  19]
  [ 20  21]
  [ 22  23]]

 [[100 101]
  [102 103]
  [104 105]
  [106 107]]

 [[108 109]
  [110 111]
  [112 113]
  [114 115]]

 [[116 117]
  [118 119]
  [120 121]
  [122 123]]]

In [40]:
# Same thing with axis 1
z = np.concatenate((x,y),axis = 1)
print(z)


[[[  0   1]
  [  2   3]
  [  4   5]
  [  6   7]
  [100 101]
  [102 103]
  [104 105]
  [106 107]]

 [[  8   9]
  [ 10  11]
  [ 12  13]
  [ 14  15]
  [108 109]
  [110 111]
  [112 113]
  [114 115]]

 [[ 16  17]
  [ 18  19]
  [ 20  21]
  [ 22  23]
  [116 117]
  [118 119]
  [120 121]
  [122 123]]]

Adding New Dimensions


In [41]:
x = np.array([2,5,18,14,4])
y = x[:, np.newaxis]
print(y)


[[ 2]
 [ 5]
 [18]
 [14]
 [ 4]]

Ones and Zeros


In [42]:
E = np.ones((2,3))
print(E)
F = np.ones((3,4),dtype=int)
print(F)


[[ 1.  1.  1.]
 [ 1.  1.  1.]]
[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]

In [43]:
Z = np.zeros((2,4))
print(Z)


[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]

In [44]:
x = np.array([2,5,18,14,4])
E = np.ones_like(x)
print(E)


[1 1 1 1 1]

In [45]:
Z = np.zeros_like(x)
print(Z)


[0 0 0 0 0]

Array Generating Functions

For larger arrays it is inpractical to initialize the data manually, using explicit python lists. Instead we can use one of the many functions in numpy that generate arrays of different forms. Some of the more common are:

arange():


In [47]:
# create a range
x = np.arange(0, 10, 1) # arguments: start, stop, step
x


Out[47]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

linspace() and logspace():


In [48]:
# using linspace, both end points ARE included
np.linspace(0, 10, 25)


Out[48]:
array([  0.        ,   0.41666667,   0.83333333,   1.25      ,
         1.66666667,   2.08333333,   2.5       ,   2.91666667,
         3.33333333,   3.75      ,   4.16666667,   4.58333333,
         5.        ,   5.41666667,   5.83333333,   6.25      ,
         6.66666667,   7.08333333,   7.5       ,   7.91666667,
         8.33333333,   8.75      ,   9.16666667,   9.58333333,  10.        ])

In [51]:
np.logspace(0, 10, 10)


Out[51]:
array([  1.00000000e+00,   1.29154967e+01,   1.66810054e+02,
         2.15443469e+03,   2.78255940e+04,   3.59381366e+05,
         4.64158883e+06,   5.99484250e+07,   7.74263683e+08,
         1.00000000e+10])

mgrid():


In [53]:
x, y = np.mgrid[0:5, 0:5] # ?

In [55]:
x


Out[55]:
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])

In [56]:
y


Out[56]:
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])

Randomizing Data:


In [59]:
from numpy import random
# uniform random numbers in [0,1]
random.rand(5,5)


Out[59]:
array([[ 0.39451979,  0.70957463,  0.08106091,  0.76094718,  0.2216067 ],
       [ 0.1755761 ,  0.36808268,  0.05464136,  0.90650999,  0.42170502],
       [ 0.91859219,  0.41244075,  0.19168834,  0.90732581,  0.8620175 ],
       [ 0.09217468,  0.39344227,  0.91732713,  0.04427794,  0.85354225],
       [ 0.08343417,  0.12703978,  0.9752614 ,  0.42759298,  0.82225003]])

In [60]:
random.rand(5,5) # They are different


Out[60]:
array([[ 0.6794475 ,  0.45968471,  0.46671798,  0.65564811,  0.83663506],
       [ 0.81863609,  0.57483191,  0.10650636,  0.64597465,  0.76828702],
       [ 0.96625782,  0.83881063,  0.5439683 ,  0.96777224,  0.62642134],
       [ 0.50152963,  0.11919023,  0.37199078,  0.04417827,  0.00167931],
       [ 0.99602782,  0.32498907,  0.318584  ,  0.37283322,  0.17712643]])

In [61]:
# standard normal distributed random numbers
random.randn(5,5)


Out[61]:
array([[ 0.0229682 , -0.54307186, -1.0804057 ,  0.13009614, -0.81884631],
       [ 0.45451923,  0.03674482, -0.01171764,  0.63988805, -0.73583658],
       [-0.12555472, -1.21912343, -0.02668445, -0.51155256, -2.19814439],
       [ 0.71917079,  0.01198344,  0.67122964, -0.01170156, -1.00739333],
       [ 0.16156657, -0.30443936,  0.75600987, -0.13558503, -1.54303398]])

diag():


In [64]:
# a diagonal matrix
np.diag([1,2,3])


Out[64]:
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

In [66]:
# diagonal with offset from the main diagonal
np.diag([1,2,3], k=1)


Out[66]:
array([[0, 1, 0, 0],
       [0, 0, 2, 0],
       [0, 0, 0, 3],
       [0, 0, 0, 0]])

Linear Algebra Using NumPy

  • First we can use the usual arithmetic operators to multiply, add, subtract, and divide arrays with scalar numbers.

In [67]:
from numpy import * # I don't want to write np everytime

In [71]:
v1 = arange(0, 5) # Create a vector
v1


Out[71]:
array([0, 1, 2, 3, 4])

In [70]:
# Scalar addition
v1 + 2


Out[70]:
array([2, 3, 4, 5, 6])

In [72]:
# scalar multiplication 
v1 * 2


Out[72]:
array([0, 2, 4, 6, 8])

In [74]:
A


Out[74]:
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19],
        [20, 21],
        [22, 23]]])

In [73]:
A * A # element-wise multiplication


Out[73]:
array([[[  0,   1],
        [  4,   9],
        [ 16,  25],
        [ 36,  49]],

       [[ 64,  81],
        [100, 121],
        [144, 169],
        [196, 225]],

       [[256, 289],
        [324, 361],
        [400, 441],
        [484, 529]]])

In [75]:
v1 * v1 # Another example of element-vise multiplication


Out[75]:
array([ 0,  1,  4,  9, 16])

In [76]:
A.shape, v1.shape


Out[76]:
((3, 4, 2), (5,))

Matrix Algebra

What about matrix mutiplication? There are two ways. We can either use the dot function, which applies a matrix-matrix, matrix-vector, or inner vector multiplication to its two arguments:


In [78]:
# Let's create a multi-dimensional array
A = array([[n+m*10 for n in range(5)] for m in range(5)])

In [79]:
A


Out[79]:
array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

In [80]:
dot(A, A) # Matrix-Matrix Multiplication using dot function


Out[80]:
array([[ 300,  310,  320,  330,  340],
       [1300, 1360, 1420, 1480, 1540],
       [2300, 2410, 2520, 2630, 2740],
       [3300, 3460, 3620, 3780, 3940],
       [4300, 4510, 4720, 4930, 5140]])

In [81]:
dot(A, v1) # Matrix-Vector Multiplication using dot function


Out[81]:
array([ 30, 130, 230, 330, 430])

In [82]:
dot(v1, v1) # Vector-Vector Mutliplication using dot function


Out[82]:
30

Alternatively, we can cast the array objects to the type matrix. This changes the behavior of the standard arithmetic operators +, -, * to use matrix algebra.


In [83]:
M = matrix(A)
v = matrix(v1).T # make it a column vector

In [85]:
v # Print column vector


Out[85]:
matrix([[0],
        [1],
        [2],
        [3],
        [4]])

In [86]:
M * M # Now we can use simple multiplication operator since we make them matrix object


Out[86]:
matrix([[ 300,  310,  320,  330,  340],
        [1300, 1360, 1420, 1480, 1540],
        [2300, 2410, 2520, 2630, 2740],
        [3300, 3460, 3620, 3780, 3940],
        [4300, 4510, 4720, 4930, 5140]])

In [87]:
M * v


Out[87]:
matrix([[ 30],
        [130],
        [230],
        [330],
        [430]])

In [88]:
# inner product
v.T * v


Out[88]:
matrix([[30]])

In [89]:
v = matrix([1,2,3,4,5,6]).T
shape(M), shape(v)


Out[89]:
((5, 5), (6, 1))

In [ ]: