Multi-Dimensional Arrays

Matrices and Arrays You will delve into this in great depth when you take linear algebra, which is crucial for astronomy (think of representing astronomical images as 2-D matrices)!

Standard matrix notation is $A_{i,j}$, where i and j are the row and column numbers, respectively.

Read $A_{i,j}$ as "A-sub-i-sub-j" or "A-sub-i-j". Commas are often not used in the subscripts or have different meanings.

In standard mathematics, the indexing starts with 1.

In Python, the indexing starts with 0.

Matrices (arrays) can have an arbitrary number of dimensions. The number of dimensions is the "rank".

Q. What is the rank of $A_{i,j,k,l}$?

The shape of an array is a $d$-vector (or 1-D array) that holds the number of elements in each dimension. $d$ represents the dimensionality of the array.

E.g., the shape of a $A_{i,j,k,l}$ is ($n_i$, $n_j$, $n_k$, $n_l$), where n denotes the number of elements in dimensions $i$, $j$, $k$, and $l$.

Two-Dimensional Numerical Python Arrays

A 2-D array is a matrix, and is analogous to an array of arrays, though each element of an array must have the same data type.

Example: $$wave = \frac{c}{freq}$$

with wavelength in meters, c = 3.00e8 m/s, and frequency in Hz.

We will convert wavelengths of 1 mm to 3 mm to frequencies, bracketing the peak in the cosmic microwave background radiation.



In [ ]:

    
import numpy as np



In [143]:

    
from scipy.constants import c, G, h



In [151]:

    
# Create a wavelength array (in mm):
waves = np.linspace(1.0, 3.0, 21)

Q. What will the maximum (last element) of wave be? How to check?



In [152]:

    
print(waves.max())
waves









    



3.0






    Out[152]:





array([ 1. ,  1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8,  1.9,  2. ,
        2.1,  2.2,  2.3,  2.4,  2.5,  2.6,  2.7,  2.8,  2.9,  3. ])



In [154]:

    
# Now, convert to frequency 
# (note conversion from mm to m):

freqs = c / (waves / 1e3)
freqs









    Out[154]:





array([  2.99792458e+11,   2.72538598e+11,   2.49827048e+11,
         2.30609583e+11,   2.14137470e+11,   1.99861639e+11,
         1.87370286e+11,   1.76348505e+11,   1.66551366e+11,
         1.57785504e+11,   1.49896229e+11,   1.42758313e+11,
         1.36269299e+11,   1.30344547e+11,   1.24913524e+11,
         1.19916983e+11,   1.15304792e+11,   1.11034244e+11,
         1.07068735e+11,   1.03376710e+11,   9.99308193e+10])



In [155]:

    
# Make a table & print (zip pairs up wave and freq 
# into a list of tuples):

table = [[wave, freq] for wave, freq in zip(waves, freqs)]

for row in table:
    print(row)









    



[1.0, 299792458000.0]
[1.1000000000000001, 272538598181.81818]
[1.2, 249827048333.33334]
[1.3, 230609583076.9231]
[1.3999999999999999, 214137470000.0]
[1.5, 199861638666.66666]
[1.6000000000000001, 187370286250.0]
[1.7000000000000002, 176348504705.88235]
[1.8, 166551365555.55557]
[1.8999999999999999, 157785504210.52631]
[2.0, 149896229000.0]
[2.1000000000000001, 142758313333.33331]
[2.2000000000000002, 136269299090.90909]
[2.2999999999999998, 130344546956.52174]
[2.4000000000000004, 124913524166.66666]
[2.5, 119916983200.0]
[2.6000000000000001, 115304791538.46155]
[2.7000000000000002, 111034243703.7037]
[2.7999999999999998, 107068735000.0]
[2.9000000000000004, 103376709655.17241]
[3.0, 99930819333.333328]



In [159]:

    
print(np.array(table))









    



[[  1.00000000e+00   2.99792458e+11]
 [  1.10000000e+00   2.72538598e+11]
 [  1.20000000e+00   2.49827048e+11]
 [  1.30000000e+00   2.30609583e+11]
 [  1.40000000e+00   2.14137470e+11]
 [  1.50000000e+00   1.99861639e+11]
 [  1.60000000e+00   1.87370286e+11]
 [  1.70000000e+00   1.76348505e+11]
 [  1.80000000e+00   1.66551366e+11]
 [  1.90000000e+00   1.57785504e+11]
 [  2.00000000e+00   1.49896229e+11]
 [  2.10000000e+00   1.42758313e+11]
 [  2.20000000e+00   1.36269299e+11]
 [  2.30000000e+00   1.30344547e+11]
 [  2.40000000e+00   1.24913524e+11]
 [  2.50000000e+00   1.19916983e+11]
 [  2.60000000e+00   1.15304792e+11]
 [  2.70000000e+00   1.11034244e+11]
 [  2.80000000e+00   1.07068735e+11]
 [  2.90000000e+00   1.03376710e+11]
 [  3.00000000e+00   9.99308193e+10]]



In [160]:

    
# Just for review:

print(list(zip(waves, freqs)))









    



[(1.0, 299792458000.0), (1.1000000000000001, 272538598181.81818), (1.2, 249827048333.33334), (1.3, 230609583076.9231), (1.3999999999999999, 214137470000.0), (1.5, 199861638666.66666), (1.6000000000000001, 187370286250.0), (1.7000000000000002, 176348504705.88235), (1.8, 166551365555.55557), (1.8999999999999999, 157785504210.52631), (2.0, 149896229000.0), (2.1000000000000001, 142758313333.33331), (2.2000000000000002, 136269299090.90909), (2.2999999999999998, 130344546956.52174), (2.4000000000000004, 124913524166.66666), (2.5, 119916983200.0), (2.6000000000000001, 115304791538.46155), (2.7000000000000002, 111034243703.7037), (2.7999999999999998, 107068735000.0), (2.9000000000000004, 103376709655.17241), (3.0, 99930819333.333328)]

Alternatively,



In [170]:

    
table = np.array([waves, freqs])
table









    Out[170]:





array([[  1.00000000e+00,   1.10000000e+00,   1.20000000e+00,
          1.30000000e+00,   1.40000000e+00,   1.50000000e+00,
          1.60000000e+00,   1.70000000e+00,   1.80000000e+00,
          1.90000000e+00,   2.00000000e+00,   2.10000000e+00,
          2.20000000e+00,   2.30000000e+00,   2.40000000e+00,
          2.50000000e+00,   2.60000000e+00,   2.70000000e+00,
          2.80000000e+00,   2.90000000e+00,   3.00000000e+00],
       [  2.99792458e+11,   2.72538598e+11,   2.49827048e+11,
          2.30609583e+11,   2.14137470e+11,   1.99861639e+11,
          1.87370286e+11,   1.76348505e+11,   1.66551366e+11,
          1.57785504e+11,   1.49896229e+11,   1.42758313e+11,
          1.36269299e+11,   1.30344547e+11,   1.24913524e+11,
          1.19916983e+11,   1.15304792e+11,   1.11034244e+11,
          1.07068735e+11,   1.03376710e+11,   9.99308193e+10]])

This isn't quite what we had above. Instead of (wavelength, frequency) pairs, all wavelengths are in one sub-array, and all the frequencies in another. The table is column major now.

Q. How could we regroup elements to match the previous incarnation? (row major)



In [171]:

    
table.transpose()









    Out[171]:





array([[  1.00000000e+00,   2.99792458e+11],
       [  1.10000000e+00,   2.72538598e+11],
       [  1.20000000e+00,   2.49827048e+11],
       [  1.30000000e+00,   2.30609583e+11],
       [  1.40000000e+00,   2.14137470e+11],
       [  1.50000000e+00,   1.99861639e+11],
       [  1.60000000e+00,   1.87370286e+11],
       [  1.70000000e+00,   1.76348505e+11],
       [  1.80000000e+00,   1.66551366e+11],
       [  1.90000000e+00,   1.57785504e+11],
       [  2.00000000e+00,   1.49896229e+11],
       [  2.10000000e+00,   1.42758313e+11],
       [  2.20000000e+00,   1.36269299e+11],
       [  2.30000000e+00,   1.30344547e+11],
       [  2.40000000e+00,   1.24913524e+11],
       [  2.50000000e+00,   1.19916983e+11],
       [  2.60000000e+00,   1.15304792e+11],
       [  2.70000000e+00,   1.11034244e+11],
       [  2.80000000e+00,   1.07068735e+11],
       [  2.90000000e+00,   1.03376710e+11],
       [  3.00000000e+00,   9.99308193e+10]])

Now, table is a two-dimensional array with 21 rows and 2 columns.



In [172]:

    
# let's just work with the transpose

table = table.T

Q. What should this yield?



In [169]:

    
table.shape









    Out[169]:





(21, 2)

Arrays can be indexed in one of two ways:

Q. What should this be?



In [173]:

    
table[20][0]









    Out[173]:





3.0

Or, alternatively:



In [174]:

    
table[20,0]









    Out[174]:





3.0

Not possible for lists! :



In [175]:

    
l = list(table)
print(l[20][0])
l[20,0]









    



3.0






    



---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-175-48cdfca09bf5> in <module>()
      1 l = list(table)
      2 print(l[20][0])
----> 3 l[20,0]

TypeError: list indices must be integers or slices, not tuple

To loop over & print all the dimensions of the array:



In [177]:

    
table.shape









    Out[177]:





(21, 2)



In [178]:

    
for index1 in range(table.shape[0]):

    # Q. What is table.shape[0]?
    
    for index2 in range(table.shape[1]):
        print('table[{}, {}] = {:g}'.format(index1, index2, 
                                            table[index1, index2]))
        
        # Q. What will this loop print?









    



table[0, 0] = 1
table[0, 1] = 2.99792e+11
table[1, 0] = 1.1
table[1, 1] = 2.72539e+11
table[2, 0] = 1.2
table[2, 1] = 2.49827e+11
table[3, 0] = 1.3
table[3, 1] = 2.3061e+11
table[4, 0] = 1.4
table[4, 1] = 2.14137e+11
table[5, 0] = 1.5
table[5, 1] = 1.99862e+11
table[6, 0] = 1.6
table[6, 1] = 1.8737e+11
table[7, 0] = 1.7
table[7, 1] = 1.76349e+11
table[8, 0] = 1.8
table[8, 1] = 1.66551e+11
table[9, 0] = 1.9
table[9, 1] = 1.57786e+11
table[10, 0] = 2
table[10, 1] = 1.49896e+11
table[11, 0] = 2.1
table[11, 1] = 1.42758e+11
table[12, 0] = 2.2
table[12, 1] = 1.36269e+11
table[13, 0] = 2.3
table[13, 1] = 1.30345e+11
table[14, 0] = 2.4
table[14, 1] = 1.24914e+11
table[15, 0] = 2.5
table[15, 1] = 1.19917e+11
table[16, 0] = 2.6
table[16, 1] = 1.15305e+11
table[17, 0] = 2.7
table[17, 1] = 1.11034e+11
table[18, 0] = 2.8
table[18, 1] = 1.07069e+11
table[19, 0] = 2.9
table[19, 1] = 1.03377e+11
table[20, 0] = 3
table[20, 1] = 9.99308e+10

When you just loop over the elements of an array, you get rows:



In [179]:

    
table.shape[0]









    Out[179]:





21



In [180]:

    
for row in table:   # don't be fooled, it's not my naming of the looper that does that!
    print(row)









    



[  1.00000000e+00   2.99792458e+11]
[  1.10000000e+00   2.72538598e+11]
[  1.20000000e+00   2.49827048e+11]
[  1.30000000e+00   2.30609583e+11]
[  1.40000000e+00   2.14137470e+11]
[  1.50000000e+00   1.99861639e+11]
[  1.60000000e+00   1.87370286e+11]
[  1.70000000e+00   1.76348505e+11]
[  1.80000000e+00   1.66551366e+11]
[  1.90000000e+00   1.57785504e+11]
[  2.00000000e+00   1.49896229e+11]
[  2.10000000e+00   1.42758313e+11]
[  2.20000000e+00   1.36269299e+11]
[  2.30000000e+00   1.30344547e+11]
[  2.40000000e+00   1.24913524e+11]
[  2.50000000e+00   1.19916983e+11]
[  2.60000000e+00   1.15304792e+11]
[  2.70000000e+00   1.11034244e+11]
[  2.80000000e+00   1.07068735e+11]
[  2.90000000e+00   1.03376710e+11]
[  3.00000000e+00   9.99308193e+10]



In [181]:

    
for idontknowwhat in table:   
    print(idontknowwhat)









    



[  1.00000000e+00   2.99792458e+11]
[  1.10000000e+00   2.72538598e+11]
[  1.20000000e+00   2.49827048e+11]
[  1.30000000e+00   2.30609583e+11]
[  1.40000000e+00   2.14137470e+11]
[  1.50000000e+00   1.99861639e+11]
[  1.60000000e+00   1.87370286e+11]
[  1.70000000e+00   1.76348505e+11]
[  1.80000000e+00   1.66551366e+11]
[  1.90000000e+00   1.57785504e+11]
[  2.00000000e+00   1.49896229e+11]
[  2.10000000e+00   1.42758313e+11]
[  2.20000000e+00   1.36269299e+11]
[  2.30000000e+00   1.30344547e+11]
[  2.40000000e+00   1.24913524e+11]
[  2.50000000e+00   1.19916983e+11]
[  2.60000000e+00   1.15304792e+11]
[  2.70000000e+00   1.11034244e+11]
[  2.80000000e+00   1.07068735e+11]
[  2.90000000e+00   1.03376710e+11]
[  3.00000000e+00   9.99308193e+10]

This could also be done with one loop using numpy's ndenumerate.

ndenumerate will enumerate the rows and columns of the array:



In [182]:

    
for index_tuple, value in np.ndenumerate(table):
    print('index {} has value {:.2e}'.format(index_tuple, value))









    



index (0, 0) has value 1.00e+00
index (0, 1) has value 3.00e+11
index (1, 0) has value 1.10e+00
index (1, 1) has value 2.73e+11
index (2, 0) has value 1.20e+00
index (2, 1) has value 2.50e+11
index (3, 0) has value 1.30e+00
index (3, 1) has value 2.31e+11
index (4, 0) has value 1.40e+00
index (4, 1) has value 2.14e+11
index (5, 0) has value 1.50e+00
index (5, 1) has value 2.00e+11
index (6, 0) has value 1.60e+00
index (6, 1) has value 1.87e+11
index (7, 0) has value 1.70e+00
index (7, 1) has value 1.76e+11
index (8, 0) has value 1.80e+00
index (8, 1) has value 1.67e+11
index (9, 0) has value 1.90e+00
index (9, 1) has value 1.58e+11
index (10, 0) has value 2.00e+00
index (10, 1) has value 1.50e+11
index (11, 0) has value 2.10e+00
index (11, 1) has value 1.43e+11
index (12, 0) has value 2.20e+00
index (12, 1) has value 1.36e+11
index (13, 0) has value 2.30e+00
index (13, 1) has value 1.30e+11
index (14, 0) has value 2.40e+00
index (14, 1) has value 1.25e+11
index (15, 0) has value 2.50e+00
index (15, 1) has value 1.20e+11
index (16, 0) has value 2.60e+00
index (16, 1) has value 1.15e+11
index (17, 0) has value 2.70e+00
index (17, 1) has value 1.11e+11
index (18, 0) has value 2.80e+00
index (18, 1) has value 1.07e+11
index (19, 0) has value 2.90e+00
index (19, 1) has value 1.03e+11
index (20, 0) has value 3.00e+00
index (20, 1) has value 9.99e+10

Q. Reminder: what is the shape of table?



In [183]:

    
print(table.shape)
print(type(table.shape))









    



(21, 2)
<class 'tuple'>

Q. So what is table.shape[0]?



In [184]:

    
table.shape[0]









    Out[184]:





21

Q. And table.shape[1]?



In [185]:

    
table.shape[1]









    Out[185]:





2

Arrays can be sliced analogously to lists.

But we already saw, there's more indexing posssibilities on top with numpy.



In [186]:

    
table[0]









    Out[186]:





array([  1.00000000e+00,   2.99792458e+11])

Q: How to get the first column instead?



In [187]:

    
table[:, 0]









    Out[187]:





array([ 1. ,  1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8,  1.9,  2. ,
        2.1,  2.2,  2.3,  2.4,  2.5,  2.6,  2.7,  2.8,  2.9,  3. ])



In [188]:

    
# Note that this is different.

# Q. What is this?

table[:][0]









    Out[188]:





array([  1.00000000e+00,   2.99792458e+11])



In [189]:

    
# This will print the second column:

table[:, 1]









    Out[189]:





array([  2.99792458e+11,   2.72538598e+11,   2.49827048e+11,
         2.30609583e+11,   2.14137470e+11,   1.99861639e+11,
         1.87370286e+11,   1.76348505e+11,   1.66551366e+11,
         1.57785504e+11,   1.49896229e+11,   1.42758313e+11,
         1.36269299e+11,   1.30344547e+11,   1.24913524e+11,
         1.19916983e+11,   1.15304792e+11,   1.11034244e+11,
         1.07068735e+11,   1.03376710e+11,   9.99308193e+10])



In [191]:

    
# To get the first five rows of the table:
    
print(table[:5, :])

print()

# Same as:
print(table[:5])









    



[[  1.00000000e+00   2.99792458e+11]
 [  1.10000000e+00   2.72538598e+11]
 [  1.20000000e+00   2.49827048e+11]
 [  1.30000000e+00   2.30609583e+11]
 [  1.40000000e+00   2.14137470e+11]]

[[  1.00000000e+00   2.99792458e+11]
 [  1.10000000e+00   2.72538598e+11]
 [  1.20000000e+00   2.49827048e+11]
 [  1.30000000e+00   2.30609583e+11]
 [  1.40000000e+00   2.14137470e+11]]

Numpy also has a multi-dimensional lazy indexing trick under its sleeve:



In [192]:

    
ndarray = np.zeros(2,3,4)    # will fail. Why? Hint: Look at error message









    



---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-192-bc2beecb780f> in <module>()
----> 1 ndarray = np.zeros(2,3,4)    # will fail. Why? Hint: Look at error message

TypeError: data type not understood



In [194]:

    
ndarray = np.zeros((2,3,4))



In [195]:

    
ndarray = np.arange(2*3*4).reshape((2,3,4))    # will fail. Why?



In [196]:

    
ndarray









    Out[196]:





array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])



In [198]:

    
ndarray[:, :, 0]









    Out[198]:





array([[ 0,  4,  8],
       [12, 16, 20]])



In [199]:

    
ndarray[..., 0]









    Out[199]:





array([[ 0,  4,  8],
       [12, 16, 20]])

Array Computing

For an array $A$ of any rank, $f(A)$ means applying the function $f$ to each element of $A$.

Matrix Objects

numpy can create matrices, an object type that enables matrix operations. Some matrix manipulation examples follow.



In [204]:

    
xArray1 = np.array([1, 2, 3], float)
xArray1









    Out[204]:





array([ 1.,  2.,  3.])



In [205]:

    
xArray1.T









    Out[205]:





array([ 1.,  2.,  3.])



In [201]:

    
xMatrix = np.matrix(xArray1)
print(type(xMatrix))
xMatrix









    



<class 'numpy.matrixlib.defmatrix.matrix'>






    Out[201]:





matrix([[ 1.,  2.,  3.]])

Note extra set of brackets!



In [206]:

    
xMatrix.shape









    Out[206]:





(1, 3)

Now, to transpose it:



In [207]:

    
xMatrix2 = xMatrix.transpose()
xMatrix2









    Out[207]:





matrix([[ 1.],
        [ 2.],
        [ 3.]])



In [208]:

    
# Or
xMatrix.T









    Out[208]:





matrix([[ 1.],
        [ 2.],
        [ 3.]])

To create an identity array then convert it to a matrix:

Q. What is the identity matrix?



In [210]:

    
iMatrix = np.eye(3)  # or np.identity
iMatrix









    Out[210]:





array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])



In [211]:

    
# And
iMatrix2 = np.mat(iMatrix)  # 'mat' short for 'matrix'
iMatrix2









    Out[211]:





matrix([[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]])



In [214]:

    
# Array multiplication.
# Reminder of xMatrix?
xMatrix









    Out[214]:





matrix([[ 1.,  2.,  3.]])



In [215]:

    
# Multiplication of any matrix by the identity matrix
# yields that matrix:
xMatrix * iMatrix









    Out[215]:





matrix([[ 1.,  2.,  3.]])



In [216]:

    
# Reminder of xMatrix2:
xMatrix2









    Out[216]:





matrix([[ 1.],
        [ 2.],
        [ 3.]])



In [217]:

    
xMatrix2 = iMatrix * xMatrix2
xMatrix2









    Out[217]:





matrix([[ 1.],
        [ 2.],
        [ 3.]])

Multiplication of matrices:



In [219]:

    
xMatrix * xMatrix2









    Out[219]:





matrix([[ 14.]])

IN THIS CASE IT IS equivalent to the dot product:



In [220]:

    
np.dot(xMatrix, xMatrix2)









    Out[220]:





matrix([[ 14.]])

MULTIPLICATION OF ARRAYS IS DIFFERENT (IT IS NOT MATRIX MATH):



In [221]:

    
xMatrix









    Out[221]:





matrix([[ 1.,  2.,  3.]])



In [222]:

    
xMatrix2









    Out[222]:





matrix([[ 1.],
        [ 2.],
        [ 3.]])



In [223]:

    
xArray  = np.array(xMatrix)
xArray2 = np.array(xMatrix2)
xArray * xArray2









    Out[223]:





array([[ 1.,  2.,  3.],
       [ 2.,  4.,  6.],
       [ 3.,  6.,  9.]])



In [224]:

    
xMatrix.shape, xMatrix2.shape









    Out[224]:





((1, 3), (3, 1))



In [225]:

    
xArray.shape









    Out[225]:





(1, 3)

Multiplying arrays of the same shape performs element-wise multiplication:



In [226]:

    
np.array(xMatrix) * np.array(xMatrix2).T









    Out[226]:





array([[ 1.,  4.,  9.]])

Moral of the story: **be careful**



In [ ]: