In [1]:
import numpy as np
from numpy.random import randn, randint
A vector can be represented by an array of real numbers
$$\mathbf{x} = [x_1, x_2, \ldots, x_n]$$
Geometrically, a vector specifies the coordinates of the tip of the vector if the tail were placed at the origin.
In [2]:
from IPython.display import Image
Image('images/vector.png')
Out[2]:
In [3]:
x = np.array([1,2,3,4])
print(x)
print(x.shape)
The norm of a vector $\mathbf{x}$ is defined by
$$||\boldsymbol{x}|| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$
In [4]:
print(np.sqrt(np.sum(x**2)))
print(np.linalg.norm(x))
Adding a constant to a vector adds the constant to each element
$$a + \boldsymbol{x} = [a + x_1, a + x_2, \ldots, a + x_n]$$
In [5]:
a = 4
print(x)
print(x + a)
Multiplying a vector by a constant multiplies each term by the constant.
$$a \boldsymbol{x} = [ax_1, ax_2, \ldots, ax_n]$$
In [6]:
print(x)
print(x*4)
# The norm scales by the same factor: ||a*x|| = |a| * ||x||
print(np.linalg.norm(x*4))
print(np.linalg.norm(x))
If we have two vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ of the same length $(n)$, then the dot product is given by
$$\boldsymbol{x} \cdot \boldsymbol{y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$
In [7]:
y = np.array([4, 3, 2, 1])
print(x)
print(y)
np.dot(x,y)
Out[7]:
If $\mathbf{x} \cdot \mathbf{y} = 0$ then $\mathbf{x}$ and $\mathbf{y}$ are orthogonal (this aligns with the intuitive notion of perpendicular).
In [8]:
w = np.array([1, 2])
v = np.array([-2, 1])
np.dot(w,v)
Out[8]:
The norm squared of a vector is just the dot product of the vector with itself
$$||\boldsymbol{x}||^2 = \boldsymbol{x} \cdot \boldsymbol{x}$$
In [9]:
print(np.linalg.norm(x)**2)
print(np.dot(x,x))
The distance between two vectors is the norm of their difference.
$$d(\boldsymbol{x}, \boldsymbol{y}) = ||\boldsymbol{x} - \boldsymbol{y}||$$
In [10]:
np.linalg.norm(x-y)
Out[10]:
Cosine similarity is the cosine of the angle between two vectors, given by
$$\cos(\theta) = \frac{\boldsymbol{x} \cdot \boldsymbol{y}}{||\boldsymbol{x}|| \text{ } ||\boldsymbol{y}||}$$
In [11]:
x = np.array([1,2,3,4])
y = np.array([5,6,7,8])
np.dot(x,y)/(np.linalg.norm(x)*np.linalg.norm(y))
Out[11]:
If both $\boldsymbol{x}$ and $\boldsymbol{y}$ are zero-centered, this calculation is the correlation between $\boldsymbol{x}$ and $\boldsymbol{y}$
In [12]:
x_centered = x - np.mean(x)
y_centered = y - np.mean(y)
# The following gives the "Centered Cosine Similarity"
# ... which is equivalent to the "Sample Pearson Correlation Coefficient"
# ... (in the correlation case, we're interpreting the vector as a list of samples)
# ... see: https://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient#For_a_sample
np.dot(x_centered,y_centered)/(np.linalg.norm(x_centered)*np.linalg.norm(y_centered))
Out[12]:
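As a quick check, NumPy's built-in Pearson correlation should give the same value (`np.corrcoef` returns the full correlation matrix, so we take an off-diagonal entry):
In [ ]:
# Pearson correlation from np.corrcoef should match the centered cosine similarity above
print(np.corrcoef(x, y)[0, 1])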
If we have two vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ of the same length $(n)$, then
$$\boldsymbol{x} + \boldsymbol{y} = [x_1+y_1, x_2+y_2, \ldots, x_n+y_n]$$
In [13]:
x = np.array([1,2,3,4])
y = np.array([5,6,7,8])
print(x+y)
In [14]:
a=2
x = np.array([1,2,3,4])
print(a*x)
A linear combination of a collection of vectors $(\boldsymbol{x}_1, \boldsymbol{x}_2, \ldots, \boldsymbol{x}_m)$ is a vector of the form
$$a_1 \cdot \boldsymbol{x}_1 + a_2 \cdot \boldsymbol{x}_2 + \cdots + a_m \cdot \boldsymbol{x}_m$$
In [15]:
a1 = 2
x1 = np.array([1,2,3,4])
print(a1*x1)
a2 = 4
x2 = np.array([5,6,7,8])
print(a2*x2)
print(a1*x1 + a2*x2)
An $n \times p$ matrix is an array of numbers with $n$ rows and $p$ columns:
$$ X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} $$
$n$ = the number of subjects
$p$ = the number of features
For the following $2 \times 3$ matrix $$ X = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{bmatrix} $$
We can create this in Python using NumPy:
In [16]:
X = np.array([[1,2,3],[4,5,6]])
print(X[1, 2])  # entry in row 1, column 2 (zero-indexed), i.e. the 6
print(X)
print(X.shape)
Let $X$ and $Y$ be matrices of dimension $n \times p$, with entries $x_{ij}$ and $y_{ij}$ for $i=1,2,\ldots,n$ and $j=1,2,\ldots,p$. Then $X + Y$ is the $n \times p$ matrix with entries $x_{ij} + y_{ij}$, and $X - Y$ is the $n \times p$ matrix with entries $x_{ij} - y_{ij}$.
In [17]:
X = np.array([[1,2,3],[4,5,6]])
print(X)
Y = np.array([[7,8,9],[10,11,12]])
print(Y)
print(X+Y)
In [18]:
X = np.array([[1,2,3],[4,5,6]])
print(X)
Y = np.array([[7,8,9],[10,11,12]])
print(Y)
print(X-Y)
In [19]:
X = np.array([[1,2,3],[4,5,6]])
print(X)
a = 5
print(a*X)
In order to multiply two matrices, they must be conformable: the number of columns of the first matrix must equal the number of rows of the second matrix.
Let $X$ be a matrix of dimension $n \times k$ and let $Y$ be a matrix of dimension $k \times p$, then the product $XY$ will be a matrix of dimension $n \times p$ whose $(i,j)^{th}$ element is given by the dot product of the $i^{th}$ row of $X$ and the $j^{th}$ column of $Y$
$$\sum_{s=1}^k x_{is}y_{sj} = x_{i1}y_{1j} + \cdots + x_{ik}y_{kj}$$
If $X$ and $Y$ are square matrices of the same dimension, then both products $XY$ and $YX$ exist; however, there is no guarantee the two products will be the same.
In [20]:
X = np.array([[2,1,0],[-1,2,3]])
print(X)
Y = np.array([[0,-2],[1,2],[1,1]])
print(Y)
# Matrix multiply with dot
print(np.dot(X,Y))
print(X.dot(Y))
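To see that the order of multiplication matters, here is a small sketch with two arbitrarily chosen $2 \times 2$ matrices:
In [ ]:
# For square matrices both AB and BA exist, but they are generally different
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
print(A.dot(B))
print(B.dot(A))
print(np.array_equal(A.dot(B), B.dot(A)))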
In [21]:
# The regular * operator is element-wise multiplication, not matrix multiplication
print(X)
print(Y.transpose())
print(X*Y.T)
The transpose of an $n \times p$ matrix is a $p \times n$ matrix with rows and columns interchanged
$$ X^T = \begin{bmatrix} x_{11} & x_{21} & \cdots & x_{n1} \\ x_{12} & x_{22} & \cdots & x_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1p} & x_{2p} & \cdots & x_{np} \end{bmatrix} $$
In [22]:
print(X)
X_T = X.transpose()
print(X_T)
print(X_T.shape)
In NumPy, a vector entered as a flat list has only one dimension (its shape is `(n,)`, with no second axis), so to get an explicit column vector we can reshape it or add a new axis.
In [23]:
x = np.array([1,2,3,4])
print(x)
print(x.shape)
In [24]:
y = x.reshape(4,1)
z = x[:,np.newaxis]
print(y)
print(z)
print(y.shape)
print(z.shape)
A vector written this way (as an $n \times 1$ matrix) is a column vector, and a row vector is generally written as its transpose
$$\boldsymbol{x}^T = [x_1, x_2, \ldots, x_n]$$
In [25]:
x_T = y.transpose()
print(x_T)
print(x_T.shape)
print(x)
If we have two vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ of the same length $(n)$, then the dot product is given by matrix multiplication
$$\boldsymbol{x}^T \boldsymbol{y} = \begin{bmatrix} x_1 & x_2 & \ldots & x_n \end{bmatrix} \begin{bmatrix} y_{1}\\ y_{2}\\ \vdots\\ y_{n} \end{bmatrix} = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$
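A small sketch of the row-vector-times-column-vector view (the names `x_col` and `y_col` are just illustrative):
In [ ]:
# Dot product via matrix multiplication: (1 x n) row vector times (n x 1) column vector
x_col = np.array([1,2,3,4]).reshape(4,1)
y_col = np.array([5,6,7,8]).reshape(4,1)
print(x_col.T.dot(y_col))             # a 1x1 matrix containing the dot product
print(np.dot([1,2,3,4], [5,6,7,8]))   # the same value as a plain scalar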
The inverse of a square $n \times n$ matrix $X$ is an $n \times n$ matrix $X^{-1}$ such that
$$X^{-1}X = XX^{-1} = I$$
where $I$ is the identity matrix, an $n \times n$ diagonal matrix with 1's along the diagonal.
If such a matrix exists, then $X$ is said to be invertible or nonsingular; otherwise $X$ is said to be noninvertible or singular.
In [26]:
print(np.identity(4))
X = np.array([[1,2,3], [0,1,0], [-2, -1, 0]])
Y = np.linalg.inv(X)
print(Y)
print(Y.dot(X))
print(np.allclose(np.identity(3), Y.dot(X)))
A square matrix is invertible exactly when its columns are linearly independent. In the special case where the columns are all orthogonal to each other and have unit length, $X$ is called an orthogonal matrix and $X^{-1} = X^T$.
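A minimal sketch of the orthogonal case, using a $2 \times 2$ rotation matrix (the angle `theta` is an arbitrary choice):
In [ ]:
# A rotation matrix has orthonormal columns, so its inverse equals its transpose
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.allclose(np.linalg.inv(Q), Q.T))
print(Q.T.dot(Q))   # approximately the identity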
A system of equations of the form
\begin{align*} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ \vdots \hspace{1in} \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m \end{align*}
can be written as a matrix equation
$$ A\mathbf{x} = \mathbf{b} $$
and hence, provided $A$ is invertible, has solution
$$ \mathbf{x} = A^{-1}\mathbf{b} $$
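A quick sketch of solving such a system (the matrix `A` and right-hand side `b` here are made up for illustration; `np.linalg.solve` is generally preferred over forming the inverse explicitly):
In [ ]:
# Solve A x = b two ways: with the explicit inverse and with np.linalg.solve
A = np.array([[2., 1.], [1., 3.]])
b = np.array([1., 2.])
print(np.linalg.inv(A).dot(b))
print(np.linalg.solve(A, b))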
Let $A$ be an $n \times n$ matrix and $\boldsymbol{x}$ be an $n \times 1$ nonzero vector. An eigenvalue of $A$ is a number $\lambda$ such that
$$A \boldsymbol{x} = \lambda \boldsymbol{x}$$
A vector $\boldsymbol{x}$ satisfying this equation is called an eigenvector associated with $\lambda$.
Eigenvectors and eigenvalues will play a huge role in matrix methods later in the course (PCA, SVD, NMF).
In [27]:
A = np.array([[1, 1], [1, 2]])
vals, vecs = np.linalg.eig(A)
print(vals)
print(vecs)
In [28]:
lam = vals[0]
vec = vecs[:,0]
print(A.dot(vec))
print(lam * vec)
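To check every eigenpair at once: the columns of `vecs` line up with the entries of `vals`, so broadcasting scales each column by its eigenvalue.
In [ ]:
# A V should equal V scaled column-wise by the eigenvalues
print(np.allclose(A.dot(vecs), vecs * vals))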
In [ ]: