by Rob Hicks http://rlhick.people.wm.edu/stories/linear-algebra-python-basics.html
In this chapter, I will be discussing some linear algebra basics that will provide sufficient linear algebra background for effective programming in Python for our purposes. We will be doing very basic linear algebra that by no means covers the full breadth of this topic. Why linear algebra? Linear algebra allows us to express relatively complex linear expressions in a very compact way.
Being comfortable with the rules for scalar and matrix addition, subtraction, multiplication, and division (known as inversion) is important for our class.
Before we can implement any of these ideas in code, we need to talk a bit about python and how data is stored.
There are numerous ways to run python code. I will show you two and both are easily accessible after installing Anaconda:
The Spyder integrated development environment. The major advantages of Spyder is that it provides a graphical way for viewing matrices, vectors, and other objects you want to check as you work on a problem. It also has the most intuitive way of debugging code.
Spyder looks like this:
The Ipython Notebook (now called Jupyter). The major advantages of this approach is that you use your web browser for all of your python work and you can mix code, videos, notes, graphics from the web, and mathematical notation to tell the whole story of your python project. In fact, I am using the ipython notebook for writing these notes.
The Ipython Notebook looks like this:
In my work flow, I usually only use the Ipython Notebook, but for some coding problems where I need access to the easy debugging capabilities of Spyder, I use it. We will be using the Ipython Notebook interface (web browser) mostly in this class.
The python universe has a huge number of libraries that extend the capabilities of python. Nearly all of these are open source, unlike packages like stata or matlab where some key libraries are proprietary (and can cost lots of money). In lots of my code, you will see this at the top:
In [33]:
%matplotlib inline
##import sympy as sympy
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sbn
from scipy import *
This code sets up Ipython Notebook environments (lines beginning with %
), and loads several libraries and functions. The core scientific stack in python consists of a number of free libraries. The ones I have loaded above include:
Scalars can be created easily like this:
In [3]:
x = .5
print x
In [4]:
x_vector = np.array([1,2,3])
print x_vector
We could have done this by defining a python list and converting it to an array:
In [5]:
c_list = [1,2]
print "The list:",c_list
print "Has length:", len(c_list)
c_vector = np.array(c_list)
print "The vector:", c_vector
print "Has shape:",c_vector.shape
In [6]:
z = [5,6]
print "This is a list, not an array:",z
print type(z)
In [7]:
zarray = np.array(z)
print "This is an array, not a list",zarray
print type(zarray)
In [8]:
b = zip(z,c_vector)
print b
print "Note that the length of our zipped list is 2 not (2 by 2):",len(b)
In [9]:
print "But we can convert the list to a matrix like this:"
A = np.array(b)
print A
print type(A)
print "A has shape:",A.shape
To learn the basics, consider a small matrix of dimension $2 \times 2$, where $2 \times 2$ denotes the number of rows $\times$ the number of columns. Let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$. Consider adding a scalar value (e.g. 3) to the A. $$ \begin{equation} A+3=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}+3 =\begin{bmatrix} a_{11}+3 & a_{12}+3 \\ a_{21}+3 & a_{22}+3 \end{bmatrix} \end{equation} $$ The same basic principle holds true for A-3: $$ \begin{equation} A-3=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}-3 =\begin{bmatrix} a_{11}-3 & a_{12}-3 \\ a_{21}-3 & a_{22}-3 \end{bmatrix} \end{equation} $$ Notice that we add (or subtract) the scalar value to each element in the matrix A. A can be of any dimension.
This is trivial to implement, now that we have defined our matrix A:
In [10]:
result = A + 3
#or
result = 3 + A
print result
Consider two small $2 \times 2$ matrices, where $2 \times 2$ denotes the # of rows $\times$ the # of columns. Let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$ and $B$=$\bigl( \begin{smallmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{smallmatrix} \bigr)$. To find the result of $A-B$, simply subtract each element of A with the corresponding element of B:
$$ \begin{equation} A -B = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} - \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}-b_{11} & a_{12}-b_{12} \\ a_{21}-b_{21} & a_{22}-b_{22} \end{bmatrix} \end{equation} $$Addition works exactly the same way:
$$ \begin{equation} A + B = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{bmatrix} \end{equation} $$An important point to know about matrix addition and subtraction is that it is only defined when $A$ and $B$ are of the same size. Here, both are $2 \times 2$. Since operations are performed element by element, these two matrices must be conformable- and for addition and subtraction that means they must have the same numbers of rows and columns. I like to be explicit about the dimensions of matrices for checking conformability as I write the equations, so write
$$ A_{2 \times 2} + B_{2 \times 2}= \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{bmatrix}_{2 \times 2} $$Notice that the result of a matrix addition or subtraction operation is always of the same dimension as the two operands.
Let's define another matrix, B, that is also $2 \times 2$ and add it to A:
In [11]:
B = np.random.randn(2,2)
print B
In [12]:
result = A + B
result
Out[12]:
As before, let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$. Suppose we want to multiply A times a scalar value (e.g. $3 \times A$)
$$ \begin{equation} 3 \times A = 3 \times \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} 3a_{11} & 3a_{12} \\ 3a_{21} & 3a_{22} \end{bmatrix} \end{equation} $$is of dimension (2,2). Scalar multiplication is commutative, so that $3 \times A$=$A \times 3$. Notice that the product is defined for a matrix A of any dimension.
Similar to scalar addition and subtration, the code is simple:
In [13]:
A * 3
Out[13]:
Now, consider the $2 \times 1$ vector $C=\bigl( \begin{smallmatrix} c_{11} \\ c_{21} \end{smallmatrix} \bigr)$
Consider multiplying matrix $A_{2 \times 2}$ and the vector $C_{2 \times 1}$. Unlike the addition and subtraction case, this product is defined. Here, conformability depends not on the row and column dimensions, but rather on the column dimensions of the first operand and the row dimensions of the second operand. We can write this operation as follows
$$ \begin{equation} A_{2 \times 2} \times C_{2 \times 1} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}_{2 \times 2} \times \begin{bmatrix} c_{11} \\ c_{21} \end{bmatrix}_{2 \times 1} = \begin{bmatrix} a_{11}c_{11} + a_{12}c_{21} \\ a_{21}c_{11} + a_{22}c_{21} \end{bmatrix}_{2 \times 1} \end{equation} $$Alternatively, consider a matrix C of dimension $2 \times 3$ and a matrix A of dimension $3 \times 2$
$$ \begin{equation} A_{3 \times 2}=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}_{3 \times 2} , C_{2 \times 3} = \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ \end{bmatrix}_{2 \times 3} \end{equation} $$Here, A $\times$ C is
$$ \begin{align} A_{3 \times 2} \times C_{2 \times 3}=& \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}_{3 \times 2} \times \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \end{bmatrix}_{2 \times 3} \\ =& \begin{bmatrix} a_{11} c_{11}+a_{12} c_{21} & a_{11} c_{12}+a_{12} c_{22} & a_{11} c_{13}+a_{12} c_{23} \\ a_{21} c_{11}+a_{22} c_{21} & a_{21} c_{12}+a_{22} c_{22} & a_{21} c_{13}+a_{22} c_{23} \\ a_{31} c_{11}+a_{32} c_{21} & a_{31} c_{12}+a_{32} c_{22} & a_{31} c_{13}+a_{32} c_{23} \end{bmatrix}_{3 \times 3} \end{align} $$So in general, $X_{r_x \times c_x} \times Y_{r_y \times c_y}$ we have two important things to remember:
Given these facts, you should convince yourself that matrix multiplication is not generally commutative, that the relationship $X \times Y = Y \times X$ does not hold in all cases. For this reason, we will always be very explicit about whether we are pre multiplying ($X \times Y$) or post multiplying ($Y \times X$) the vectors/matrices $X$ and $Y$.
For more information on this topic, see this http://en.wikipedia.org/wiki/Matrix_multiplication.
In [14]:
# Let's redefine A and C to demonstrate matrix multiplication:
A = np.arange(6).reshape((3,2))
C = np.random.randn(2,2)
print A.shape
print C.shape
We will use the numpy dot operator to perform the these multiplications. You can use it two ways to yield the same result:
In [15]:
print A.dot(C)
print np.dot(A,C)
Suppose instead of pre-multiplying C by A, we post-multiply. The product doesn't exist because we don't have conformability as described above:
In [16]:
C.dot(A)
The term matrix division is actually a misnomer. To divide in a matrix algebra world we first need to invert the matrix. It is useful to consider the analog case in a scalar work. Suppose we want to divide the $f$ by $g$. We could do this in two different ways: $$ \begin{equation} \frac{f}{g}=f \times g^{-1}. \end{equation} $$ In a scalar seeting, these are equivalent ways of solving the division problem. The second one requires two steps: first we invert g and then we multiply f times g. In a matrix world, we need to think about this second approach. First we have to invert the matrix g and then we will need to pre or post multiply depending on the exact situation we encounter (this is intended to be vague for now).
As before, consider the square $2 \times 2$ matrix $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22}\end{smallmatrix} \bigr)$. Let the inverse of matrix A (denoted as $A^{-1}$) be
$$ \begin{equation} A^{-1}=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}^{-1}=\frac{1}{a_{11}a_{22}-a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix} \end{equation} $$The inverted matrix $A^{-1}$ has a useful property: $$ \begin{equation} A \times A^{-1}=A^{-1} \times A=I \end{equation} $$ where I, the identity matrix (the matrix equivalent of the scalar value 1), is $$ \begin{equation} I_{2 \times 2}=\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \end{equation} $$ furthermore, $A \times I = A$ and $I \times A = A$.
An important feature about matrix inversion is that it is undefined if (in the $2 \times 2$ case), $a_{11}a_{22}-a_{12}a_{21}=0$. If this relationship is equal to zero the inverse of A does not exist. If this term is very close to zero, an inverse may exist but $A^{-1}$ may be poorly conditioned meaning it is prone to rounding error and is likely not well identified computationally. The term $a_{11}a_{22}-a_{12}a_{21}$ is the determinant of matrix A, and for square matrices of size greater than $2 \times 2$, if equal to zero indicates that you have a problem with your data matrix (columns are linearly dependent on other columns). The inverse of matrix A exists if A is square and is of full rank (ie. the columns of A are not linear combinations of other columns of A).
For more information on this topic, see this http://en.wikipedia.org/wiki/Matrix_inversion, for example, on inverting matrices.
In [17]:
# note, we need a square matrix (# rows = # cols), use C:
C_inverse = np.linalg.inv(C)
print C_inverse
Check that $C\times C^{-1} = I$:
In [18]:
print C.dot(C_inverse)
print "Is identical to:"
print C_inverse.dot(C)
At times it is useful to pivot a matrix for conformability- that is in order to matrix divide or multiply, we need to switch the rows and column dimensions of matrices. Consider the matrix $$ \begin{equation} A_{3 \times 2}=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}_{3 \times 2} \end{equation} $$ The transpose of A (denoted as $A^{\prime}$) is $$ \begin{equation} A^{\prime}=\begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ \end{bmatrix}_{2 \times 3} \end{equation} $$
In [21]:
A = np.arange(6).reshape((3,2))
B = np.arange(8).reshape((2,4))
print "A is"
print A
print "The Transpose of A is"
print A.T
One important property of transposing a matrix is the transpose of a product of two matrices. Let matrix A be of dimension $N \times M$ and let B of of dimension $M \times P$. Then $$ \begin{equation} (AB)^{\prime}=B^{\prime}A^{\prime} \end{equation} $$ For more information, see this http://en.wikipedia.org/wiki/Matrix_transposition on matrix transposition. This is also easy to implement:
In [22]:
print B.T.dot(A.T)
print "Is identical to:"
print (A.dot(B)).T
In [23]:
print A
A[:,0]
Out[23]:
or, we could grab a particular element (in this case, the second column, last row):
In [24]:
A[2,1]
Out[24]:
In [25]:
print A
In [26]:
print A[:,1]>4
A[A[:,1]>4]
Out[26]:
In [27]:
A = np.arange(24).reshape((12,2))
print A
print A.shape
The code below loops over the rows (12 of them) of our matrix A. For each row, it slices A and prints the row values across all columns. Notice the form of the for loop. The colon defines the statement we are looping over. For each iteration of the loop idented lines will be executed:
In [28]:
for rows in A:
print rows
In [29]:
for rows in A:
print rows
In [27]:
for cols in A.T:
print cols
In [30]:
x=.4
if x<.5:
print "Heads"
print 100
elif x>.5:
print "Tails"
print 0
else:
print "Tie"
print 50
In [31]:
x=0
while x<10:
x+=1
print x<10
print x
In [37]:
v = np.random.beta(56, 23, 100)
In [40]:
plt.hist(v)
Out[40]:
In [54]:
np.savetxt("myvextor.txt", v.reshape(20,5))
In [55]:
%%sh
head -10 myvextor.txt
In [44]:
w = np.loadtxt("myvextor.txt")
In [52]:
(w*v).sum()/(w.)
Out[52]:
In [56]:
a = v.reshape(10,10)
In [63]:
type(a)
Out[63]:
In [64]:
np.matrix(a)
Out[64]:
In [68]:
v =
type(v)
Out[68]:
In [76]:
v = np.matrix(np.random.rand(10,1))
w = np.matrix(np.random.rand(10,1))
In [78]:
a = v.dot(w.transpose())
In [81]:
a.shape
Out[81]:
In [82]:
type(a)
Out[82]:
In [84]:
lag = np.linalg
In [85]:
Out[85]:
In [ ]: