IHE Python course, 2017

Arrays for engineers and scientists

by T.N. Olsthoorn

Arrays are the workhorse for engineers and scientists.

Arrays are part of the numpy package. For a good overvies see here.

For visualization we use matplotlib as always.


In [272]:
import matplotlib.pyplot as plt
import numpy as np

"""You may have to uncomment one of the two following lines to make in line plots available.
   But this is not always necessary, for instance not on the mac.
"""

#%matplotlib inline
#%matplotlib notebook   # for intractive plots


Out[272]:
'You may have to uncomment one of the two following lines to make in line plots available.\n   But this is not always necessary, for instance not on the mac.\n'

Arrays

Arrays are contiguous blocks of data in memory with all elements of the same type. That is mostly ints or floats, but can be anything. This type is called dtype. Arrays can have as many dimensions as you like and may be very large. THey can be treated as a single unit, which makes them very powerful.

There are many ways to create arrays. One is to type them.

A 1D array is looks a bit like a list but it is not. A 2D array looks like a list of lists. A 3D array looks like a list of lists of lists and so on.


In [273]:
# A one-D array
A = np.array([1, 7, 2, 12]) # specify a one-dimensional array

# A 2D array
B = np.array([[1, 3, 5], [3, 1, 6.2], [4.1, 6, 1], [4, 1, 3]])

# A 2D array
C = np.array([[ 1,   3, 5],
              [ 3,   1, 6.2],
              [ 4.1, 6, 1],
              [ 4,   1, 3]])

#A 3D array
D = np.array([ [ [ 1,  2,  3],
                 [ 4,  5,  6],
                 [ 7,  8,  9]  ],
               [ [10, 11, 12],
                 [13, 14, 15],
                 [16, 17, 18] ],
               [ [19, 20, 21],
                 [22, 23, 24],
                 [25, 26, 27] ] ])


print("A = \n", A, "\n")
print("B = \n", B, "\n")
print("C = \n", C, "\n")
print("D = \b", D, "\n")


A = 
 [ 1  7  2 12] 

B = 
 [[ 1.   3.   5. ]
 [ 3.   1.   6.2]
 [ 4.1  6.   1. ]
 [ 4.   1.   3. ]] 

C = 
 [[ 1.   3.   5. ]
 [ 3.   1.   6.2]
 [ 4.1  6.   1. ]
 [ 4.   1.   3. ]] 

D =  [[[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]]

 [[10 11 12]
  [13 14 15]
  [16 17 18]]

 [[19 20 21]
  [22 23 24]
  [25 26 27]]] 


In [237]:
print("The type of the data in array A is :", A.dtype)
print("The type of the data in array C is :", C.dtype)
print("The type of the data in array D is :", D.dtype)
print("Size of A: ", A.size)
print("size of C: ", C.size)
print("Size of D: ", D.size)
print("Shape of A: ", A.shape, " <-- Notice the comma !")
print("Shape of C: ", C.shape)
print("Shape of D: ", D.shape)
print("Nr of dimensions of A: ", A.ndim)
print("Nr of dimensions of C: ", C.ndim)
print("Nr of dimensions of D: ", D.ndim)


The type of the data in array A is : int64
The type of the data in array C is : float64
The type of the data in array D is : int64
Size of A:  4
size of C:  12
Size of D:  27
Shape of A:  (4,)  <-- Notice the comma !
Shape of C:  (4, 3)
Shape of D:  (3, 3, 3)
Nr of dimensions of A:  1
Nr of dimensions of C:  2
Nr of dimensions of D:  3

In [238]:
A = np.array([1, 2, 3, 4], dtype=float)
print("Array A :", A, " now has dtype=", A.dtype)


Array A : [ 1.  2.  3.  4.]  now has dtype= float64

To fast generate arrays of ones or zeros we can use the function np.ones(tuple) or np.zeros(tuple).


In [239]:
A = np.zeros((4, 5)) # note (4, 5) is a tuple, parantheses needed.
B = np.ones((3, 6))
print("A =\n", A)
print("B =\n", B)


A =
 [[ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]]
B =
 [[ 1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.]]

Or generate an array with a shape that is the same as an already defined other array:


In [240]:
C = np.ones_like(A)
print("C =\n", C)


C =
 [[ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]]

Use the function np.arange(start, end, step) to generate a regular sequence.


In [241]:
print(np.arange(1, 6)) # start, end with default steps 1
print(np.arange(7))   #  starts at 0 end ends at 6 (7 numbers in total)
print(np.arange(2, 20, 3))


[1 2 3 4 5]
[0 1 2 3 4 5 6]
[ 2  5  8 11 14 17]

A practical way to generate one-dimensional arrays is by using the functions linspace(start, end, number of points) and logspace(log10(start), log10(end), number of points)


In [242]:
x = np.linspace(0, 6 * np.pi, 201)

y = np.sin(x)
z = y * np.exp(-x/5.0)

plt.figure()
plt.plot(x, y, label= '$sin(x)$')
plt.plot(x, z, label= '$sin(x) e^{-x/(2 \pi)}$')
plt.title('Some graphs')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(loc='best', fontsize='small')
plt.show()


Individual elements of an array can be addressed by index. Indices in Python start at 0 and are placed in square brackets.

Some examples


In [243]:
x = np.arange(20, 30) # starts at 20 and stops before 30, 10 elements
print(x)
print(x[0:5]) # 0 through 4 is 5 elements
print(x[:5])  # same, zero assumed by Python
print(x[3:7]) # 3 through 6, 7-3 = 4 elements
print(x[2:9:2]) # starts at 2 stops before 9 with steps equal to 2


[20 21 22 23 24 25 26 27 28 29]
[20 21 22 23 24]
[20 21 22 23 24]
[23 24 25 26]
[22 24 26 28]

Backcounting:


In [244]:
print(x[-1:0:-1])  # starts at last ends before first with steps -1
print(x[-1:0:-2])  # starts at last end before first with steps -2
print(x[-1::-1])   # starts at last ends at first with setps of -1 !! (note the difference)


[29 28 27 26 25 24 23 22 21]
[29 27 25 23 21]
[29 28 27 26 25 24 23 22 21 20]

More-dimensional arrays. It's easy to generate multidimensional arrays from one-dimensional ones using the reshape function. For instance to show the element number of an array at which it is stored in memory:


In [245]:
A = np.arange(3 * 4 * 5).reshape((3, 4, 5))
print("3D array A: \n", A)


3D array A: 
 [[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]
  [30 31 32 33 34]
  [35 36 37 38 39]]

 [[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]

The easiest way to work with 3D spatial arrays is to consider them as [Layer Row Col] or [z, y, x]. When looked upon in this way, each bock is a layer, each row is along the y-axis on paper ane each row along the x-axis.

Of course, you could associate the first dimension with y the second with x and the thrid with z, as it s natural in Matlab, but that is completely confusing in Python. This is because the elements in a numpy array are stored with the last dimension changing fastest (Fortran order), while in Matlab it's the other way around (C order). This has consequences for the way arrays are printed. To match the print with the imagination, stick with the layer, row, col association in Python numpy arrays.

The reshape function often very useful. Notice that it requires a tuple with the dimensions and it requires that the number of elements in the array does not change by the reshape.

We can address any cell by using indices:


In [246]:
print(A[2, 2, 1])


51

This is is layer 3, row 3 column 2 (because indexing starts at 0).


In [247]:
print(A[1:3, 0:2, 1:3])


[[[21 22]
  [26 27]]

 [[41 42]
  [46 47]]]

That is: of layer 1 and 2 we take rows 0, and 1 and of them columns 1 and 2.

The other way around, starting at the last index: Columns 1:3 (= 1 and 2) of rows 0:2 (=0 and 1) of layers 1:3 (= 1 and 2)

We can simply select layers by using only the first dimension:


In [248]:
print(A[1:3]) # selects block 1 and 2


[[[20 21 22 23 24]
  [25 26 27 28 29]
  [30 31 32 33 34]
  [35 36 37 38 39]]

 [[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]

And the rows of them by using only the first 2 dimensions:


In [249]:
print(A[1:3, 0:2])


[[[20 21 22 23 24]
  [25 26 27 28 29]]

 [[40 41 42 43 44]
  [45 46 47 48 49]]]

And, finally


In [250]:
print(A[1:3, 0:2, 1:3])


[[[21 22]
  [26 27]]

 [[41 42]
  [46 47]]]

You can assign one value to a range of an array by specifying a range of indices, or you can assign an array to a range of another array, as long as the ranges have equal length. In the last example below, the first 5 values of x (specified as x[0:5]) are given the values [40,42,44,46,48].


In [251]:
print(A[1:3])


[[[20 21 22 23 24]
  [25 26 27 28 29]
  [30 31 32 33 34]
  [35 36 37 38 39]]

 [[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]

Indexing in sequence has a different meaning:


In [252]:
print(A[1:3][0:2][1:3])


[[[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]

This should be read as


In [253]:
(((A[1:3])[0:2])[1:3])


Out[253]:
array([[[40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])

A[1:3] picks blocks 1 and 2 of the 3D array A, which is also a 3D array. Of that, the next index [0:2] takes blocks 0 and 1, which is, in fact the entire array A[1:3] Of that array, the index [1:3] picks blocks 1 and 2.

Notice that if the end is larger than the size of the array in that dimensions, its value defaults to the last. In some languages this would be considered a bug.


In [254]:
((A[1:3])[1:10000])[0:1000]


Out[254]:
array([[[40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])

It's easy to revert an array:


In [255]:
A[-1::-1, -1::-1, -1::-1]


Out[255]:
array([[[59, 58, 57, 56, 55],
        [54, 53, 52, 51, 50],
        [49, 48, 47, 46, 45],
        [44, 43, 42, 41, 40]],

       [[39, 38, 37, 36, 35],
        [34, 33, 32, 31, 30],
        [29, 28, 27, 26, 25],
        [24, 23, 22, 21, 20]],

       [[19, 18, 17, 16, 15],
        [14, 13, 12, 11, 10],
        [ 9,  8,  7,  6,  5],
        [ 4,  3,  2,  1,  0]]])

Or with the well-known short hand:


In [256]:
A[::-1, :, ::-1] # only the layers and columns reverted, not the rows


Out[256]:
array([[[44, 43, 42, 41, 40],
        [49, 48, 47, 46, 45],
        [54, 53, 52, 51, 50],
        [59, 58, 57, 56, 55]],

       [[24, 23, 22, 21, 20],
        [29, 28, 27, 26, 25],
        [34, 33, 32, 31, 30],
        [39, 38, 37, 36, 35]],

       [[ 4,  3,  2,  1,  0],
        [ 9,  8,  7,  6,  5],
        [14, 13, 12, 11, 10],
        [19, 18, 17, 16, 15]]])

Arrays use one dimension if only one layer, row or column in selected:


In [257]:
print("A=\n", A)
print("\nA.shape = ", A.shape, " <--- 3 dimensions")
print("\nA.ndim  = ", A.ndim)
print("\nA[1]=\n", A[1])
print("\nA[1].shape =", A[1].shape, " <--- 2 dimensions")
print("\nA[1].ndim  =", A[1].ndim)
print("\nA[1][1] =\n", A[1][1])
print("\nA[1][1].shape =", A[1][1].shape, " <--- 1 dimension")
print("\nA[1][1].ndim  =", A[1][1].ndim)
print("\nA[1][1][1] =", A[1][1][1])
print("\nA[1][1][1].shape =", A[1][1][1].shape, "<--- 0 dimensions")
print("\nA[1][1][1].ndim  =", A[1][1][1].ndim)


A=
 [[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]
  [30 31 32 33 34]
  [35 36 37 38 39]]

 [[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]

A.shape =  (3, 4, 5)  <--- 3 dimensions

A.ndim  =  3

A[1]=
 [[20 21 22 23 24]
 [25 26 27 28 29]
 [30 31 32 33 34]
 [35 36 37 38 39]]

A[1].shape = (4, 5)  <--- 2 dimensions

A[1].ndim  = 2

A[1][1] =
 [25 26 27 28 29]

A[1][1].shape = (5,)  <--- 1 dimension

A[1][1].ndim  = 1

A[1][1][1] = 26

A[1][1][1].shape = () <--- 0 dimensions

A[1][1][1].ndim  = 0

Broadcasting

Arrays of different size can be broadcasted if the broadcasting rul applies.

The broadcasting rule

Two arrays are compatible for broadcasting if that is, starting from the end, the axis lengths match or if either of the lengths is 1. Broadcasting is then performed over the missing and / or length 1 dimensions.

This sounds very complicated, but one may also say. Both arrays must have the same number of dimensions and the lengths along the dimimsions in the two arrray must either match or be 1.

For example:

We want to generate a 3D array with conductivities, one for every layer as defined by a vector. The array in the xy plane has 5 by 4 elements


In [258]:
D = np.ones((5, 4))  # size of array in xy  plane
k = np.array([4.2, 11.7, 13.1, 25.2, 33.]) # layer conductivities

To generate the desired array K, we have to turn the vector k in to a 3D vector


In [259]:
print("Original k.shape :", k.shape, " <-- 1D")
k = k.reshape((len(k), 1, 1))
print("New k.shape      :", k.shape, " <-- 3D")


Original k.shape : (5,)  <-- 1D
New k.shape      : (5, 1, 1)  <-- 3D

Still we cannot multiply by D because D is not threedimensional. So turn D into a 3D array with only one layer


In [260]:
print("Origianal D.shape :", D.shape)  #  <-- 2D
D = D.reshape((1, *D.shape))
print("New D.shape       :", D.shape) # <-- 3D


Origianal D.shape : (5, 4)
New D.shape       : (1, 5, 4)

Now generate the conductivity array K


In [261]:
K = D * k
print("K =\n", K)


K =
 [[[  4.2   4.2   4.2   4.2]
  [  4.2   4.2   4.2   4.2]
  [  4.2   4.2   4.2   4.2]
  [  4.2   4.2   4.2   4.2]
  [  4.2   4.2   4.2   4.2]]

 [[ 11.7  11.7  11.7  11.7]
  [ 11.7  11.7  11.7  11.7]
  [ 11.7  11.7  11.7  11.7]
  [ 11.7  11.7  11.7  11.7]
  [ 11.7  11.7  11.7  11.7]]

 [[ 13.1  13.1  13.1  13.1]
  [ 13.1  13.1  13.1  13.1]
  [ 13.1  13.1  13.1  13.1]
  [ 13.1  13.1  13.1  13.1]
  [ 13.1  13.1  13.1  13.1]]

 [[ 25.2  25.2  25.2  25.2]
  [ 25.2  25.2  25.2  25.2]
  [ 25.2  25.2  25.2  25.2]
  [ 25.2  25.2  25.2  25.2]
  [ 25.2  25.2  25.2  25.2]]

 [[ 33.   33.   33.   33. ]
  [ 33.   33.   33.   33. ]
  [ 33.   33.   33.   33. ]
  [ 33.   33.   33.   33. ]
  [ 33.   33.   33.   33. ]]]

A few more examples to exercise

First, only one dimension has length 1.


In [262]:
n, m, k = A.shape
B = np.ones((1, m, k))
A * B
C = np.ones((n, 1, k))
A * C
D = np.ones((n, m, 1))
A * D
A * B * C * D


Out[262]:
array([[[  0.,   1.,   2.,   3.,   4.],
        [  5.,   6.,   7.,   8.,   9.],
        [ 10.,  11.,  12.,  13.,  14.],
        [ 15.,  16.,  17.,  18.,  19.]],

       [[ 20.,  21.,  22.,  23.,  24.],
        [ 25.,  26.,  27.,  28.,  29.],
        [ 30.,  31.,  32.,  33.,  34.],
        [ 35.,  36.,  37.,  38.,  39.]],

       [[ 40.,  41.,  42.,  43.,  44.],
        [ 45.,  46.,  47.,  48.,  49.],
        [ 50.,  51.,  52.,  53.,  54.],
        [ 55.,  56.,  57.,  58.,  59.]]])

Next two dimensions have lengt 1.


In [263]:
n, m, k = A.shape
B = 3 * np.ones((n, 1, 1))
A * B
C = 3 * np.ones((1, m, 1))
A * C
D = 3 * np.ones((1, 1, k))
A * D
A * B * C * D


Out[263]:
array([[[    0.,    27.,    54.,    81.,   108.],
        [  135.,   162.,   189.,   216.,   243.],
        [  270.,   297.,   324.,   351.,   378.],
        [  405.,   432.,   459.,   486.,   513.]],

       [[  540.,   567.,   594.,   621.,   648.],
        [  675.,   702.,   729.,   756.,   783.],
        [  810.,   837.,   864.,   891.,   918.],
        [  945.,   972.,   999.,  1026.,  1053.]],

       [[ 1080.,  1107.,  1134.,  1161.,  1188.],
        [ 1215.,  1242.,  1269.,  1296.,  1323.],
        [ 1350.,  1377.,  1404.,  1431.,  1458.],
        [ 1485.,  1512.,  1539.,  1566.,  1593.]]])

B, C an D are vectors pointing in the z, y an x directions respectively:


In [264]:
n, m, k = A.shape
B = A[:, 0:1, 0:1]
print("B=\n", B)
C = A[0:1, :, 0:1]
print("C=\n", C)
D = A[0:1, 0:1, :]
print("D=\n", D)
print("B * C * D = \n", B * C * D)


B=
 [[[ 0]]

 [[20]]

 [[40]]]
C=
 [[[ 0]
  [ 5]
  [10]
  [15]]]
D=
 [[[0 1 2 3 4]]]
B * C * D = 
 [[[   0    0    0    0    0]
  [   0    0    0    0    0]
  [   0    0    0    0    0]
  [   0    0    0    0    0]]

 [[   0    0    0    0    0]
  [   0  100  200  300  400]
  [   0  200  400  600  800]
  [   0  300  600  900 1200]]

 [[   0    0    0    0    0]
  [   0  200  400  600  800]
  [   0  400  800 1200 1600]
  [   0  600 1200 1800 2400]]]

Arrays are not matrices

Arithmatic between arrays is alway done element wise. For matrix multiplication with 2D arrays one must use the dot function


In [265]:
B = np.random.randint(0, 17, size=(7, 3))
print("B = \n", B)
print("\nB*B=\n", B * B)
print("\nnp.dot(B.T, B)=\n", np.dot(B.T, B))
print("\n B @ B = \n", B.T @ B)  # is the same as np.dot(B.T, B)


B = 
 [[ 1 16  6]
 [ 2  0  9]
 [14 10 15]
 [ 3 15 11]
 [11  6  7]
 [ 3 10 16]
 [ 5  3 13]]

B*B=
 [[  1 256  36]
 [  4   0  81]
 [196 100 225]
 [  9 225 121]
 [121  36  49]
 [  9 100 256]
 [ 25   9 169]]

np.dot(B.T, B)=
 [[365 312 457]
 [312 726 652]
 [457 652 937]]

 B @ B = 
 [[365 312 457]
 [312 726 652]
 [457 652 937]]

For true linear algebra, one may choose to use the np.linalg package which provides matrices and linear algebra functions.

With the array converted to a matrix the multiplication is treated as a matrix multiplication.


In [266]:
print("Using linear algebra, matrix:")
B = np.matrix(B)
print("\nB.T * B = \n ", B.T * B)
print("\nInverse of B = B.I = \n",B.I)


Using linear algebra, matrix:

B.T * B = 
  [[365 312 457]
 [312 726 652]
 [457 652 937]]

Inverse of B = B.I = 
 [[-0.01173367 -0.01780205  0.04701094 -0.01551803  0.05360584 -0.03400974
  -0.01037925]
 [ 0.04316563 -0.02338962 -0.0005714   0.02663365  0.00532715 -0.00491121
  -0.02243054]
 [-0.01791004  0.03456304 -0.00652235  0.00077545 -0.02238118  0.03708064
   0.03454432]]

Using conditional indexing

A condition on an array like y<20 yields a boolean array of the same dimension as y with the value True wherever the condition is met and False otherwise. We can use conditions, i.e. boolean arrays, on for logical indexing of arrays. Only the values where the boolean array is True will be used.

so if y has 100 values and y<3 is true for only 20 values, then y[y<3] also has only 20 values because only the y's are picked from the array y where the condition is True. Logical indexing is very often the most effective, intuitive and elegant way of picking parts from arrays, without using any numerical indices.


In [271]:
x = np.linspace(0, 4 * np.pi, 201)

y = np.sin(x)
z = y * np.exp(-x/5.)
w = (x - 2.3) * (x - 5.1 ) * (x - 10.3) * (x - 12) 

# some logical indexing
w[w > z] = z[w > z]
w[w < y] = y[w < y]

plt.plot(x, y, label="y")
plt.plot(x, z, label="z")
plt.plot(x, w, label="w", lw=2)
plt.legend()
plt.show()

# or, to show how intuitive this indexing is:
high = w > z  # w > z yields a boolean array
low  = w < y

w[high] = z[high]
w[low ] = y[low]
plt.plot(x, w, label="w", lw=2)
plt.show()


Pick out the values within a circle and a box

First generate a grid of coordinates using broad casting

Then compute the distance to a point in the grid

Next select all points within a given distance

Also select points within a box


In [269]:
AND = np.logical_and

x = np.linspace(-500, 500, 200)
y = np.linspace(-500, 500, 200)
xBox = [-300, 50]
yBox = [-200, 300]
R0 = 300.
x0, y0 = 50, 120

y = y.reshape((len(y), 1))
x = x.reshape((1, len(x)))
X = np.ones_like(y) * x
Y = y * np.ones_like(x)
R = np.sqrt((X - x0)**2 + (Y - y0)**2)
inBox = AND(AND(X > xBox[0], X < xBox[1]), AND(Y > yBox[0], Y < yBox[1]))
inIsland = R < R0

V = np.zeros_like(X)
V[inBox]  = 1
V[inIsland] = 2

plt.matshow(V)
plt.show()