NumPy - multidimensional data arrays

Ondrej Lexa 2016

Introduction

The numpy package (module) is used in almost all numerical computation using Python. It provides high-performance vector, matrix and higher-dimensional data structures for Python. It is implemented in C and Fortran, so when calculations are vectorized (formulated with vectors and matrices), performance is very good.

NumPy adds basic MATLAB-like capability to Python:

multidimensional arrays with homogeneous data types
specific numeric data types (e.g. int8, uint32, float64)
array manipulation functions (e.g. reshape, transpose, concatenate)
array generation (e.g. ones, zeros, eye, random)
element-wise math operations (e.g. add, multiply, max, sin)
matrix math operations (e.g. inner/outer product, rank, trace)
linear algebra (e.g. inv, pinv, svd, eig, det, qr)

SciPy builds on NumPy (much like MATLAB toolboxes) adding:

multidimensional image processing
non-linear solvers, optimization, root finding
signal processing, fast Fourier transforms
numerical integration, interpolation, statistical functions
sparse matrices, sparse solvers
clustering algorithms, distance metrics, spatial data structures
file IO (including to MATLAB .mat files)

Matplotlib adds MATLAB-like plotting capability on top of NumPy.

Interactive Scientific Python (aka PyLab)

PyLab is a meta-package that imports most of NumPy, SciPy and Matplotlib into the global namespace. It is the easiest (and most MATLAB-like) way to work with scientific Python.



In [1]:

    
from pylab import *

Writing scripts

When writing scripts it is recommended that you:

only import what you need, for efficiency
import packages into namespaces, to avoid name clashes

The community has adopted abbreviated naming conventions:



In [2]:

    
import numpy as np
import scipy as sp
import matplotlib as mpl
import matplotlib.pyplot as plt

Some different ways of working with NumPy are:



In [3]:

    
from numpy import eye, array   # Import only what you need
from numpy.linalg import svd

`NumPy` arrays

In the numpy package the terminology used for vectors, matrices and higher-dimensional data sets is array. There are a number of ways to initialize new numpy arrays, for example from

Python lists or tuples
using functions that are dedicated to generating numpy arrays, such as arange, linspace, etc.
reading data from files

From lists

For example, to create new vector and matrix arrays from Python lists we can use the numpy.array function.



In [4]:

    
# a vector: the argument to the array function is a Python list
v = array([1,2,3,4])
v









    Out[4]:





array([1, 2, 3, 4])



In [5]:

    
# a matrix: the argument to the array function is a nested Python list
M = array([[1, 2], [3, 4]])
M









    Out[5]:





array([[1, 2],
       [3, 4]])

The v and M objects are both of the type ndarray that the numpy module provides.



In [6]:

    
type(v), type(M)









    Out[6]:





(numpy.ndarray, numpy.ndarray)

The difference between the v and M arrays is only their shapes. We can get information about the shape of an array by using the ndarray.shape property.



In [7]:

    
v.shape









    Out[7]:





(4,)



In [8]:

    
M.shape









    Out[8]:





(2, 2)

The number of elements in the array is available through the ndarray.size property:



In [9]:

    
M.size









    Out[9]:





4

Equivalently, we could use the functions numpy.shape and numpy.size:



In [10]:

    
shape(M)









    Out[10]:





(2, 2)



In [11]:

    
size(M)









    Out[11]:





4

The number of dimensions of the array is available through the ndarray.ndim property:



In [12]:

    
v.ndim









    Out[12]:





1



In [13]:

    
M.ndim









    Out[13]:





2

So far the numpy.ndarray looks awfully much like a Python list (or nested list). Why not simply use Python lists for computations instead of creating a new array type?

There are several reasons:

Python lists are very general. They can contain any kind of object and are dynamically typed. They do not support mathematical operations such as matrix and dot multiplication, and implementing such operations for Python lists would not be very efficient because of the dynamic typing.
NumPy arrays are statically typed and homogeneous. The type of the elements is determined when the array is created.
NumPy arrays are memory efficient.
Because of the static typing, mathematical functions such as multiplication and addition of NumPy arrays can be implemented efficiently in a compiled language (C and Fortran are used).
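The contrast between lists and arrays can be sketched as follows. This is a minimal illustration, not a benchmark; it uses the np namespace import rather than the pylab star import, and the variable names are chosen only for this example.

```python
import numpy as np

values = [0, 1, 2, 3, 4]

# With a list, doubling every element needs an explicit loop or comprehension.
doubled_list = [2 * v for v in values]

# With an array, the same operation is a single vectorized expression.
arr = np.array(values)
doubled_arr = 2 * arr

# Note that `*` applied to a list means repetition, not arithmetic.
repeated = values * 2

print(doubled_list)   # [0, 2, 4, 6, 8]
print(doubled_arr)    # [0 2 4 6 8]
print(len(repeated))  # 10

# Element type, bytes per element, and total bytes of the data buffer.
print(arr.dtype, arr.itemsize, arr.nbytes)
```

Because all elements share one type, the data buffer occupies exactly size times itemsize bytes.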

Using the dtype (data type) property of an ndarray, we can see what type the data of an array has:



In [14]:

    
M.dtype









    Out[14]:





dtype('int64')

We get an error if we try to assign a value of the wrong type to an element in a numpy array:



In [15]:

    
M[0,0] = "hello"









    



---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-a09d72434238> in <module>()
----> 1 M[0,0] = "hello"

ValueError: invalid literal for int() with base 10: 'hello'
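The element type can also be chosen explicitly, either when the array is created or by converting afterwards. The sketch below assumes the np namespace import; the names M_complex, M_int and M_float are illustrative only. It also shows a pitfall: not every mismatched assignment raises an error, since a float assigned into an integer array is truncated silently.

```python
import numpy as np

# Request a specific element type when the array is created.
M_complex = np.array([[1, 2], [3, 4]], dtype=complex)

# Convert an existing array to another type; astype returns a new array.
M_int = np.array([[1, 2], [3, 4]])
M_float = M_int.astype(float)

# Assigning a float into an integer array does not raise an error:
# the fractional part is dropped silently.
M_int[0, 0] = 3.7

print(M_complex.dtype)  # complex128
print(M_float.dtype)    # float64
print(M_int[0, 0])      # 3
```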

Using array-generating functions

For larger arrays it is impractical to initialize the data manually, using explicit Python lists. Instead we can use one of the many functions in numpy that generate arrays of different forms. Some of the more common are:

arange



In [16]:

    
# create a range
x = arange(0, 10, 1) # arguments: start, stop, step
x









    Out[16]:





array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])



In [17]:

    
x = arange(-1, 1, 0.1)
x









    Out[17]:





array([ -1.00000000e+00,  -9.00000000e-01,  -8.00000000e-01,
        -7.00000000e-01,  -6.00000000e-01,  -5.00000000e-01,
        -4.00000000e-01,  -3.00000000e-01,  -2.00000000e-01,
        -1.00000000e-01,  -2.22044605e-16,   1.00000000e-01,
         2.00000000e-01,   3.00000000e-01,   4.00000000e-01,
         5.00000000e-01,   6.00000000e-01,   7.00000000e-01,
         8.00000000e-01,   9.00000000e-01])
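The value -2.22044605e-16 in the output above, where 0 would be expected, is a rounding artifact of the fractional step. When the number of points is known, linspace with endpoint=False is a common alternative. The following is a sketch assuming the np namespace import:

```python
import numpy as np

# arange with a fractional step is subject to floating-point rounding.
x_arange = np.arange(-1, 1, 0.1)

# linspace takes the number of points instead of the step;
# endpoint=False excludes the stop value, as arange does.
x_linspace = np.linspace(-1, 1, 20, endpoint=False)

print(len(x_arange), len(x_linspace))
print(np.allclose(x_arange, x_linspace))  # equal up to rounding error
```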

linspace and logspace



In [18]:

    
# using linspace, both end points ARE included
linspace(0, 10, 25)









    Out[18]:





array([  0.        ,   0.41666667,   0.83333333,   1.25      ,
         1.66666667,   2.08333333,   2.5       ,   2.91666667,
         3.33333333,   3.75      ,   4.16666667,   4.58333333,
         5.        ,   5.41666667,   5.83333333,   6.25      ,
         6.66666667,   7.08333333,   7.5       ,   7.91666667,
         8.33333333,   8.75      ,   9.16666667,   9.58333333,  10.        ])



In [19]:

    
logspace(0, 10, 10, base=e)









    Out[19]:





array([  1.00000000e+00,   3.03773178e+00,   9.22781435e+00,
         2.80316249e+01,   8.51525577e+01,   2.58670631e+02,
         7.85771994e+02,   2.38696456e+03,   7.25095809e+03,
         2.20264658e+04])

mgrid



In [20]:

    
x, y = mgrid[0:5, 0:5] # similar to meshgrid in MATLAB



In [21]:

    
x









    Out[21]:





array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])



In [22]:

    
y









    Out[22]:





array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])
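Coordinate arrays like these are typically used to evaluate a function of two variables on the whole grid at once. The function below is an arbitrary example chosen for illustration. The sketch also shows that meshgrid with indexing="ij" produces the same arrays as mgrid.

```python
import numpy as np

x, y = np.mgrid[0:5, 0:5]

# Evaluate f(x, y) = x**2 + y**2 at every grid point without a loop.
z = x**2 + y**2

# meshgrid with indexing="ij" yields the same coordinate arrays as mgrid.
xx, yy = np.meshgrid(np.arange(5), np.arange(5), indexing="ij")

print(z.shape)  # (5, 5)
print(z[3, 4])  # 25
```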

diag



In [23]:

    
# a diagonal matrix
diag([1,2,3])









    Out[23]:





array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])



In [24]:

    
# diagonal with offset from the main diagonal
diag([1,2,3], k=1)









    Out[24]:





array([[0, 1, 0, 0],
       [0, 0, 2, 0],
       [0, 0, 0, 3],
       [0, 0, 0, 0]])

zeros and ones



In [25]:

    
zeros((3,3))









    Out[25]:





array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])



In [26]:

    
ones((3,3))









    Out[26]:





array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

zeros_like and ones_like



In [27]:

    
zeros_like(x)









    Out[27]:





array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])



In [28]:

    
ones_like(x)









    Out[28]:





array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])

Manipulating arrays

Indexing

We can index elements in an array using square brackets and indices:



In [29]:

    
# v is a vector, and has only one dimension, taking one index
v[0]









    Out[29]:





1



In [30]:

    
# M is a matrix, or a 2 dimensional array, taking two indices 
M[1,1]









    Out[30]:





4

If we omit an index of a multidimensional array it returns the whole row (or, in general, an (N-1)-dimensional array):



In [31]:

    
M









    Out[31]:





array([[1, 2],
       [3, 4]])



In [32]:

    
M[1]









    Out[32]:





array([3, 4])

The same thing can be achieved by using : instead of an index:



In [33]:

    
M[1,:] # row 1









    Out[33]:





array([3, 4])



In [34]:

    
M[:,1] # column 1









    Out[34]:





array([2, 4])

We can assign new values to elements in an array using indexing:



In [35]:

    
M[0,0] = -1



In [36]:

    
M









    Out[36]:





array([[-1,  2],
       [ 3,  4]])



In [37]:

    
# also works for rows and columns
M[0,:] = 0
M[:,1] = -1



In [38]:

    
M









    Out[38]:





array([[ 0, -1],
       [ 3, -1]])

Index slicing

Index slicing is the technical name for the syntax M[lower:upper:step] to extract part of an array:



In [39]:

    
A = array([1,2,3,4,5])
A









    Out[39]:





array([1, 2, 3, 4, 5])



In [40]:

    
A[1:3]









    Out[40]:





array([2, 3])

Array slices are mutable: if they are assigned a new value the original array from which the slice was extracted is modified:



In [41]:

    
A[1:3] = [-2,-3]

A









    Out[41]:





array([ 1, -2, -3,  4,  5])
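The reason is that a basic slice is a view onto the original data, not a copy. A small sketch, assuming the np namespace import (the names view and independent are illustrative):

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

view = A[1:3]   # a view: no data is copied
view[0] = -2    # writes through to A

independent = A[1:3].copy()  # an explicit copy
independent[1] = 99          # A is not affected

print(A)                                 # [ 1 -2  3  4  5]
print(np.shares_memory(A, view))         # True
print(np.shares_memory(A, independent))  # False
```

When a slice has to be modified without touching the original array, an explicit copy is needed.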

We can omit any of the three parameters in M[lower:upper:step]:



In [42]:

    
A[::] # lower, upper, step all take the default values









    Out[42]:





array([ 1, -2, -3,  4,  5])



In [43]:

    
A[::2] # step is 2, lower and upper default to the beginning and end of the array









    Out[43]:





array([ 1, -3,  5])



In [44]:

    
A[:3] # first three elements









    Out[44]:





array([ 1, -2, -3])



In [45]:

    
A[3:] # elements from index 3









    Out[45]:





array([4, 5])

Negative indices count from the end of the array (positive indices from the beginning):



In [46]:

    
A = array([1,2,3,4,5])



In [47]:

    
A[-1] # the last element in the array









    Out[47]:





5



In [48]:

    
A[-3:] # the last three elements









    Out[48]:





array([3, 4, 5])

Index slicing works exactly the same way for multidimensional arrays:



In [49]:

    
A = array([[n+m*10 for n in range(5)] for m in range(5)])
A









    Out[49]:





array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])



In [50]:

    
# a block from the original array
A[1:4, 1:4]









    Out[50]:





array([[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]])



In [51]:

    
# strides
A[::2, ::2]









    Out[51]:





array([[ 0,  2,  4],
       [20, 22, 24],
       [40, 42, 44]])

Fancy indexing

Fancy indexing is the name for when an array or list is used in place of an index:



In [52]:

    
row_indices = [1, 2, 3]
A[row_indices]









    Out[52]:





array([[10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34]])



In [53]:

    
col_indices = [1, 2, -1] # remember, index -1 means the last element
A[row_indices, col_indices]









    Out[53]:





array([11, 22, 34])
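Note that two index lists are paired element by element, which is why the result above has three elements rather than a 3x3 block. To select every combination of the listed rows and columns, one option is the np.ix_ helper, sketched here with the same A as above:

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
row_indices = [1, 2, 3]
col_indices = [1, 2, -1]

# Paired index lists pick individual elements: (1,1), (2,2), (3,-1).
pairs = A[row_indices, col_indices]

# np.ix_ builds an open mesh, selecting every row/column combination.
block = A[np.ix_(row_indices, col_indices)]

print(pairs)  # [11 22 34]
print(block)
# [[11 12 14]
#  [21 22 24]
#  [31 32 34]]
```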

We can also use index masks: if the index mask is a NumPy array of data type bool, then an element is selected (True) or not (False) depending on the value of the index mask at the position of each element:



In [54]:

    
B = array([n for n in range(5)])
B









    Out[54]:





array([0, 1, 2, 3, 4])



In [55]:

    
row_mask = array([True, False, True, False, False])
B[row_mask]









    Out[55]:





array([0, 2])



In [56]:

    
# same thing
row_mask = array([1,0,1,0,0], dtype=bool)
B[row_mask]









    Out[56]:





array([0, 2])

This feature is very useful to conditionally select elements from an array, using for example comparison operators:



In [57]:

    
x = arange(0, 10, 0.5)
x









    Out[57]:





array([ 0. ,  0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5,  5. ,
        5.5,  6. ,  6.5,  7. ,  7.5,  8. ,  8.5,  9. ,  9.5])



In [58]:

    
mask = (5 < x) * (x <= 7)
mask









    Out[58]:





array([False, False, False, False, False, False, False, False, False,
       False, False,  True,  True,  True,  True, False, False, False,
       False, False], dtype=bool)



In [59]:

    
x[mask]









    Out[59]:





array([ 5.5,  6. ,  6.5,  7. ])
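Multiplying two boolean arrays works as a logical AND, but the same mask is more commonly written with the & operator. If the positions of the selected elements are needed rather than their values, np.where can be used. A sketch, assuming the np namespace import:

```python
import numpy as np

x = np.arange(0, 10, 0.5)

# & is the element-wise logical AND; the parentheses are required because
# & binds more tightly than the comparison operators.
mask = (5 < x) & (x <= 7)

# np.where with a single argument returns the indices where the mask is True.
indices = np.where(mask)[0]

print(indices)     # [11 12 13 14]
print(x[indices])  # [5.5 6.  6.5 7. ]
```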

Linear algebra

Vectorizing code is the key to writing efficient numerical calculation with Python/Numpy. That means that as much as possible of a program should be formulated in terms of matrix and vector operations, like matrix-matrix multiplication.

Scalar-array operations

We can use the usual arithmetic operators to multiply, add, subtract, and divide arrays with scalar numbers.



In [60]:

    
v1 = arange(0, 5)



In [61]:

    
v1 * 2









    Out[61]:





array([0, 2, 4, 6, 8])



In [62]:

    
v1 + 2









    Out[62]:





array([2, 3, 4, 5, 6])



In [63]:

    
A * 2









    Out[63]:





array([[ 0,  2,  4,  6,  8],
       [20, 22, 24, 26, 28],
       [40, 42, 44, 46, 48],
       [60, 62, 64, 66, 68],
       [80, 82, 84, 86, 88]])



In [64]:

    
A + 2









    Out[64]:





array([[ 2,  3,  4,  5,  6],
       [12, 13, 14, 15, 16],
       [22, 23, 24, 25, 26],
       [32, 33, 34, 35, 36],
       [42, 43, 44, 45, 46]])



In [65]:

    
A + A.T









    Out[65]:





array([[ 0, 11, 22, 33, 44],
       [11, 22, 33, 44, 55],
       [22, 33, 44, 55, 66],
       [33, 44, 55, 66, 77],
       [44, 55, 66, 77, 88]])

Above we have used .T to transpose the matrix object A. We could also have used the transpose function to accomplish the same thing.

Element-wise array-array operations

When we add, subtract, multiply and divide arrays with each other, the default behaviour is element-wise operations:



In [66]:

    
A * A # element-wise multiplication









    Out[66]:





array([[   0,    1,    4,    9,   16],
       [ 100,  121,  144,  169,  196],
       [ 400,  441,  484,  529,  576],
       [ 900,  961, 1024, 1089, 1156],
       [1600, 1681, 1764, 1849, 1936]])



In [67]:

    
v1 * v1









    Out[67]:





array([ 0,  1,  4,  9, 16])

If we multiply arrays with compatible shapes, we get an element-wise multiplication of each row:



In [68]:

    
A.shape, v1.shape









    Out[68]:





((5, 5), (5,))



In [69]:

    
A * v1









    Out[69]:





array([[  0,   1,   4,   9,  16],
       [  0,  11,  24,  39,  56],
       [  0,  21,  44,  69,  96],
       [  0,  31,  64,  99, 136],
       [  0,  41,  84, 129, 176]])
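This mechanism is called broadcasting: the vector of shape (5,) is stretched along the rows, so column j of A is multiplied by v1[j]. To scale each row instead, the vector can be given an extra axis so that it has shape (5, 1). A sketch with the same A and v1 as above (the names by_columns and by_rows are illustrative):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
v1 = np.arange(0, 5)

# Shape (5,) is broadcast along the rows: column j is scaled by v1[j].
by_columns = A * v1

# Adding an axis gives shape (5, 1): row i is scaled by v1[i].
by_rows = A * v1[:, np.newaxis]

print(by_columns[1])  # [ 0 11 24 39 56]
print(by_rows[1])     # [10 11 12 13 14]
```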

Matrix algebra

What about matrix multiplication? There are two ways. We can either use the dot function, which applies a matrix-matrix, matrix-vector, or inner vector multiplication to its two arguments:



In [70]:

    
dot(A, A)









    Out[70]:





array([[ 300,  310,  320,  330,  340],
       [1300, 1360, 1420, 1480, 1540],
       [2300, 2410, 2520, 2630, 2740],
       [3300, 3460, 3620, 3780, 3940],
       [4300, 4510, 4720, 4930, 5140]])



In [71]:

    
dot(A, v1)









    Out[71]:





array([ 30, 130, 230, 330, 430])



In [72]:

    
dot(v1, v1)









    Out[72]:





30

Alternatively, we can cast the array objects to the type matrix. This changes the behavior of the standard arithmetic operators +, -, * to use matrix algebra.



In [73]:

    
M = matrix(A)
v = matrix(v1).T # make it a column vector



In [74]:

    
v









    Out[74]:





matrix([[0],
        [1],
        [2],
        [3],
        [4]])



In [75]:

    
M * M









    Out[75]:





matrix([[ 300,  310,  320,  330,  340],
        [1300, 1360, 1420, 1480, 1540],
        [2300, 2410, 2520, 2630, 2740],
        [3300, 3460, 3620, 3780, 3940],
        [4300, 4510, 4720, 4930, 5140]])



In [76]:

    
M * v









    Out[76]:





matrix([[ 30],
        [130],
        [230],
        [330],
        [430]])



In [77]:

    
# inner product
v.T * v









    Out[77]:





matrix([[30]])



In [78]:

    
# with matrix objects, standard matrix algebra applies
v + M*v









    Out[78]:





matrix([[ 30],
        [131],
        [232],
        [333],
        [434]])
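A third option, available from Python 3.5 and NumPy 1.10 onwards, is the @ operator, which performs matrix multiplication directly on ordinary arrays. The current NumPy documentation no longer recommends the matrix class for new code, so the sketch below may be the more future-proof form; it assumes the np namespace import and reuses A and v1 from above.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
v1 = np.arange(0, 5)

AA = A @ A   # matrix-matrix product, same result as dot(A, A)
Av = A @ v1  # matrix-vector product
vv = v1 @ v1 # inner product of two vectors

print(Av)  # [ 30 130 230 330 430]
print(vv)  # 30
```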

If we try to add, subtract or multiply objects with incompatible shapes we get an error:



In [79]:

    
v = matrix([1,2,3,4,5,6]).T



In [80]:

    
shape(M), shape(v)









    Out[80]:





((5, 5), (6, 1))



In [81]:

    
M * v









    



---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-81-995fb48ad0cc> in <module>()
----> 1 M * v

/home/ondro/miniconda3/lib/python3.4/site-packages/numpy/matrixlib/defmatrix.py in __mul__(self, other)
    341         if isinstance(other, (N.ndarray, list, tuple)) :
    342             # This promotes 1-D vectors to row vectors
--> 343             return N.dot(self, asmatrix(other))
    344         if isscalar(other) or not hasattr(other, '__rmul__') :
    345             return N.dot(self, other)

ValueError: shapes (5,5) and (6,1) not aligned: 5 (dim 1) != 6 (dim 0)

See also the related functions: inner, outer, cross, kron, tensordot. Try for example help(kron).

Matrix computations



In [82]:

    
M = array([[1, 2], [3, 4]])

Inverse



In [83]:

    
inv(M) # for matrix objects this is equivalent to M.I









    Out[83]:





array([[-2. ,  1. ],
       [ 1.5, -0.5]])



In [84]:

    
dot(inv(M), M)









    Out[84]:





array([[  1.00000000e+00,   4.44089210e-16],
       [ -5.55111512e-17,   1.00000000e+00]])

Determinant



In [85]:

    
det(M)









    Out[85]:





-2.0000000000000004



In [86]:

    
det(inv(M))









    Out[86]:





-0.50000000000000011

Data processing

Often it is useful to store datasets in Numpy arrays. Numpy provides a number of functions to calculate statistics of datasets in arrays.



In [87]:

    
d = arange(0, 10)
d









    Out[87]:





array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

mean



In [88]:

    
mean(d)









    Out[88]:





4.5

standard deviations and variance



In [89]:

    
std(d), var(d)









    Out[89]:





(2.8722813232690143, 8.25)
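By default std and var divide by the number of elements N (the population estimate). For the sample estimate, which divides by N - 1, the ddof ("delta degrees of freedom") argument can be set to 1. A sketch, assuming the np namespace import:

```python
import numpy as np

d = np.arange(0, 10)

population_var = np.var(d)      # divides by N
sample_var = np.var(d, ddof=1)  # divides by N - 1
sample_std = np.std(d, ddof=1)

print(population_var)  # 8.25
print(sample_var, sample_std)
```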

min and max



In [90]:

    
d.min()









    Out[90]:





0



In [91]:

    
d.max()









    Out[91]:





9

sum, prod, and trace



In [92]:

    
# sum up all elements
sum(d)









    Out[92]:





45



In [93]:

    
# product of all elements
prod(d+1)









    Out[93]:





3628800



In [94]:

    
# cumulative sum
cumsum(d)









    Out[94]:





array([ 0,  1,  3,  6, 10, 15, 21, 28, 36, 45])



In [95]:

    
# cumulative product
cumprod(d+1)









    Out[95]:





array([      1,       2,       6,      24,     120,     720,    5040,
         40320,  362880, 3628800])



In [96]:

    
A









    Out[96]:





array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])



In [97]:

    
# same as: diag(A).sum()
trace(A)









    Out[97]:





110

Calculations with higher-dimensional data

When functions such as min, max, etc. are applied to multidimensional arrays, it is sometimes useful to apply the calculation to the entire array, and sometimes only on a row or column basis. Using the axis argument we can specify how these functions should behave:



In [98]:

    
M = rand(3,4)
M









    Out[98]:





array([[ 0.04512605,  0.10592677,  0.8258787 ,  0.3923606 ],
       [ 0.61702031,  0.8113163 ,  0.52166172,  0.748868  ],
       [ 0.43768503,  0.26005353,  0.12467702,  0.98016084]])



In [99]:

    
# global max
M.max()









    Out[99]:





0.98016083845119628



In [100]:

    
# max in each column
M.max(axis=0)









    Out[100]:





array([ 0.61702031,  0.8113163 ,  0.8258787 ,  0.98016084])



In [101]:

    
# max in each row
M.max(axis=1)









    Out[101]:





array([ 0.8258787 ,  0.8113163 ,  0.98016084])

Many other functions and methods in the array and matrix classes accept the same (optional) axis keyword argument.
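As an illustration with deterministic data, the sketch below sums along each axis and then uses keepdims=True so that the reduced result can be combined with the original array again. The array and variable names are chosen for this example only.

```python
import numpy as np

M = np.arange(12).reshape(3, 4)

col_sums = M.sum(axis=0)  # collapse the rows: one value per column
row_sums = M.sum(axis=1)  # collapse the columns: one value per row

# keepdims=True keeps the reduced axis with length 1, so the result
# broadcasts against M; here each row is centred on its own mean.
centred = M - M.mean(axis=1, keepdims=True)

print(col_sums)  # [12 15 18 21]
print(row_sums)  # [ 6 22 38]
print(centred.sum(axis=1))
```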

Reshaping, resizing and stacking arrays

The shape of a NumPy array can be modified without copying the underlying data, which makes it a fast operation even for large arrays.



In [102]:

    
M









    Out[102]:





array([[ 0.04512605,  0.10592677,  0.8258787 ,  0.3923606 ],
       [ 0.61702031,  0.8113163 ,  0.52166172,  0.748868  ],
       [ 0.43768503,  0.26005353,  0.12467702,  0.98016084]])



In [103]:

    
n, m = M.shape
n, m









    Out[103]:





(3, 4)



In [104]:

    
N = M.reshape((6, 2))
N









    Out[104]:





array([[ 0.04512605,  0.10592677],
       [ 0.8258787 ,  0.3923606 ],
       [ 0.61702031,  0.8113163 ],
       [ 0.52166172,  0.748868  ],
       [ 0.43768503,  0.26005353],
       [ 0.12467702,  0.98016084]])



In [105]:

    
O = M.reshape((1, 12))
O









    Out[105]:





array([[ 0.04512605,  0.10592677,  0.8258787 ,  0.3923606 ,  0.61702031,
         0.8113163 ,  0.52166172,  0.748868  ,  0.43768503,  0.26005353,
         0.12467702,  0.98016084]])



In [106]:

    
N[0:2,:] = 1 # modify the array
N









    Out[106]:





array([[ 1.        ,  1.        ],
       [ 1.        ,  1.        ],
       [ 0.61702031,  0.8113163 ],
       [ 0.52166172,  0.748868  ],
       [ 0.43768503,  0.26005353],
       [ 0.12467702,  0.98016084]])



In [107]:

    
M # and the original variable is also changed. N is only a different view of the same data









    Out[107]:





array([[ 1.        ,  1.        ,  1.        ,  1.        ],
       [ 0.61702031,  0.8113163 ,  0.52166172,  0.748868  ],
       [ 0.43768503,  0.26005353,  0.12467702,  0.98016084]])



In [108]:

    
O









    Out[108]:





array([[ 1.        ,  1.        ,  1.        ,  1.        ,  0.61702031,
         0.8113163 ,  0.52166172,  0.748868  ,  0.43768503,  0.26005353,
         0.12467702,  0.98016084]])

We can also use the function flatten to make a higher-dimensional array into a vector. But this function creates a copy of the data.



In [109]:

    
F = M.flatten()
F









    Out[109]:





array([ 1.        ,  1.        ,  1.        ,  1.        ,  0.61702031,
        0.8113163 ,  0.52166172,  0.748868  ,  0.43768503,  0.26005353,
        0.12467702,  0.98016084])



In [110]:

    
F[0:5] = 0
F









    Out[110]:





array([ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.8113163 ,  0.52166172,  0.748868  ,  0.43768503,  0.26005353,
        0.12467702,  0.98016084])



In [111]:

    
M # now M has not changed, because F's data is a copy of M's, not referring to the same data









    Out[111]:





array([[ 1.        ,  1.        ,  1.        ,  1.        ],
       [ 0.61702031,  0.8113163 ,  0.52166172,  0.748868  ],
       [ 0.43768503,  0.26005353,  0.12467702,  0.98016084]])
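Two related conveniences are worth knowing: a dimension given as -1 in reshape is inferred from the array size, and ravel flattens like flatten but returns a view when the memory layout allows it. The sketch below uses deterministic data instead of rand so the results are reproducible; the names N, R and F are illustrative.

```python
import numpy as np

M = np.arange(12.0).reshape(3, 4)

N = M.reshape(-1, 2)  # -1 lets NumPy infer the length: shape (6, 2)
R = M.ravel()         # a view when the memory layout allows it
F = M.flatten()       # always a copy

print(N.shape)                 # (6, 2)
print(np.shares_memory(M, R))  # True
print(np.shares_memory(M, F))  # False
```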

Stacking and repeating arrays

Using the functions repeat, tile, vstack, hstack, and concatenate we can create larger vectors and matrices from smaller ones:

tile and repeat



In [112]:

    
a = array([[1, 2], [3, 4]])



In [113]:

    
# repeat each element 3 times
repeat(a, 3)









    Out[113]:





array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])



In [114]:

    
# tile the matrix 3 times 
tile(a, 3)









    Out[114]:





array([[1, 2, 1, 2, 1, 2],
       [3, 4, 3, 4, 3, 4]])

concatenate



In [115]:

    
b = array([[5, 6]])



In [116]:

    
concatenate((a, b), axis=0)









    Out[116]:





array([[1, 2],
       [3, 4],
       [5, 6]])



In [117]:

    
concatenate((a, b.T), axis=1)









    Out[117]:





array([[1, 2, 5],
       [3, 4, 6]])

hstack and vstack



In [118]:

    
vstack((a,b))









    Out[118]:





array([[1, 2],
       [3, 4],
       [5, 6]])



In [119]:

    
hstack((a,b.T))









    Out[119]:





array([[1, 2, 5],
       [3, 4, 6]])

Linear equations

A system of linear equations like

\begin{array}{rcl} x + 2y & = & 5\\ 3x + 4y & = & 7 \end{array}

or, equivalently,

\left[ \begin{array}{cc} 1 & 2\\ 3 & 4 \end{array} \right] \left[ \begin{array}{c} x\\ y \end{array} \right] = \left[ \begin{array}{c} 5\\ 7 \end{array} \right]

can be written in matrix form as $\mathbf {Ax} = \mathbf b$ and solved using numpy solve:



In [120]:

    
A = array([[1, 2], [3, 4]])
b = array([5,7])
solve(A,b)









    Out[120]:





array([-3.,  4.])



In [121]:

    
dot(inv(A),b)









    Out[121]:





array([-3.,  4.])
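Both approaches give the same answer here, but solve is generally preferred in numerical work because it avoids forming the inverse explicitly. A solution can be checked by substituting it back into the system, as in this sketch (assuming the np namespace import and Python 3.5 or later for the @ operator):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
b = np.array([5, 7])

x = np.linalg.solve(A, b)

# Check the solution by substituting it back into A x = b.
residual = A @ x - b

print(x)
print(np.allclose(residual, 0))  # True
```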

Copy and "deep copy"

To achieve high performance, assignments in Python usually do not copy the underlying objects. This is important, for example, when objects are passed between functions, to avoid an excessive amount of memory copying when it is not necessary (technical term: pass by reference).



In [122]:

    
# now B is referring to the same array data as A 
B = A



In [123]:

    
# changing B affects A
B[0,0] = 10
B









    Out[123]:





array([[10,  2],
       [ 3,  4]])



In [124]:

    
A









    Out[124]:





array([[10,  2],
       [ 3,  4]])

If we want to avoid this behavior, so that B is a new, completely independent object copied from A, we need to do a so-called "deep copy" using the function copy:



In [125]:

    
B = copy(A)



In [126]:

    
# now, if we modify B, A is not affected
B[0,0] = -5
B









    Out[126]:





array([[-5,  2],
       [ 3,  4]])



In [127]:

    
A









    Out[127]:





array([[10,  2],
       [ 3,  4]])
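The same applies when an array is passed to a function: the function receives a reference and can modify the caller's data. The sketch below uses a hypothetical helper, zero_first_row, written only for this illustration, together with the copy method, which is the method form of the copy function.

```python
import numpy as np

def zero_first_row(arr):
    """Hypothetical helper: sets the first row of arr to zero, in place."""
    arr[0, :] = 0

data = np.array([[1, 2], [3, 4]])
backup = data.copy()  # independent copy taken before the call

zero_first_row(data)  # the function modifies the caller's array

print(data)    # first row is now [0 0]
print(backup)  # still [[1 2], [3 4]]
```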

Iterating over array elements

Generally, we want to avoid iterating over the elements of arrays whenever we can (at all costs). The reason is that in an interpreted language like Python (or MATLAB), iterations are really slow compared to vectorized operations.

However, sometimes iterations are unavoidable. For such cases, the Python for loop is the most convenient way to iterate over an array:



In [128]:

    
v = array([1,2,3,4])

for element in v:
    print(element)



In [129]:

    
M = array([[1,2], [3,4]])

for row in M:
    print('Row', row)
    
    for element in row:
        print('Element', element)









    



Row [1 2]
Element 1
Element 2
Row [3 4]
Element 3
Element 4

When we need to iterate over each element of an array and modify its elements, it is convenient to use the enumerate function to obtain both the element and its index in the for loop:



In [130]:

    
for row_idx, row in enumerate(M):
    print("row_idx", row_idx, "row", row)
    
    for col_idx, element in enumerate(row):
        print("col_idx", col_idx, "element", element)
       
        # update the matrix M: square each element
        M[row_idx, col_idx] = element ** 2









    



row_idx 0 row [1 2]
col_idx 0 element 1
col_idx 1 element 2
row_idx 1 row [3 4]
col_idx 0 element 3
col_idx 1 element 4



In [131]:

    
# each element in M is now squared
M









    Out[131]:





array([[ 1,  4],
       [ 9, 16]])
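For this particular task no loop is needed, since squaring is an element-wise operation. When a loop over a multidimensional array is unavoidable, np.ndenumerate yields the full index tuple together with each element, which avoids nested enumerate calls. A sketch comparing the two (the names squared and looped are illustrative):

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

# Vectorized: square every element in one expression.
squared = M ** 2

# np.ndenumerate yields ((row, col), value) pairs when a loop is needed.
looped = np.empty_like(M)
for (row_idx, col_idx), element in np.ndenumerate(M):
    looped[row_idx, col_idx] = element ** 2

print(squared)
print(np.array_equal(squared, looped))  # True
```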

Using arrays in conditions

When using arrays in conditions, for example in if statements and other boolean expressions, one needs to use any or all, which require that any or all elements in the array evaluate to True:



In [132]:

    
M









    Out[132]:





array([[ 1,  4],
       [ 9, 16]])



In [133]:

    
if (M > 5).any():
    print("at least one element in M is larger than 5")
else:
    print("no element in M is larger than 5")









    



at least one element in M is larger than 5



In [134]:

    
if (M > 5).all():
    print("all elements in M are larger than 5")
else:
    print("not all elements in M are larger than 5")









    



not all elements in M are larger than 5
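The reason any or all is required is that a comparison on an array yields one truth value per element, and Python cannot reduce these to a single True or False on its own. The sketch below triggers the resulting error deliberately and catches it; the variable name message is illustrative.

```python
import numpy as np

M = np.array([[1, 4], [9, 16]])

try:
    if M > 5:  # ambiguous: four truth values for one condition
        pass
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

print((M > 5).any())  # True
print((M > 5).all())  # False
```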


