Consider the following system of two linear algebraic equations:
\begin{align} a_{11}x_1 + a_{12}x_2 &= b_1\\ a_{21}x_1 + a_{22}x_2 &= b_2. \end{align}Each equation describes a line in the $(x_1,x_2)$-plane. By geometry, this system has a unique solution if and only if the lines intersect at a single point. This happens when the slopes of these lines are different, i.e. there is a unique solution if and only if
$$ \frac{-a_{11}}{a_{12}}\neq\frac{-a_{21}}{a_{22}} $$provided that $a_{12}$ and $a_{22}$ are not zero. If these lines are parallel (and non-intersecting), there is no solution. If these lines coincide (lie on top of each other, meaning that one equation is a constant multiple of the other), there are infinitely many solutions, since the system reduces to a single linear equation in two unknowns.
The system of linear equations above can be expressed in the form
$$ \begin{bmatrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, $$or, if we label the $2\times 2$ matrix, the $2\times 1$ column vector unknown, and the $2\times 1$ right hand side column vector above as
\begin{align} \boldsymbol{A} &=\begin{bmatrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},\\ \boldsymbol{x} &= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},\\ \boldsymbol{b} &=\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \end{align}we can rewrite the system in the following compact, matrix form:
$$ \boldsymbol{A}\boldsymbol{x} = \boldsymbol{b} $$The matrix $\boldsymbol{A}$ is called the coefficient matrix since it contains the coefficients of the unknowns in the linear system of equations.
At this point, observe that if we use the substitution method to solve for $x_1$ and $x_2$, we obtain the following formulae:
\begin{align} x_1 &= \frac{b_1 a_{22} - b_2 a_{12}}{a_{11}a_{22}-a_{12}a_{21}}\\ x_2 &= \frac{a_{11}b_2 - a_{21}b_1}{a_{11}a_{22}-a_{12}a_{21}} \end{align}The denominators in these expressions are the same, and this denominator is independent of the values $b_1$ and $b_2$. Note that $x_1$ and $x_2$ are uniquely determined if and only if the denominators are non-zero, that is,
$$ a_{11}a_{22}-a_{12}a_{21}\neq 0. $$The property that $a_{11}a_{22}-a_{12}a_{21}\neq 0$ is equivalent to the property that the lines determined by the two equations in the linear system have different slopes. In fact, this expression has a name: $a_{11}a_{22}-a_{12}a_{21}$ is called the determinant of the matrix $\boldsymbol{A}$. We write:
$$ \det(\boldsymbol{A}) = \det \left( \begin{bmatrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\right)=\left\lvert\begin{matrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{matrix} \right\rvert=a_{11}a_{22}-a_{12}a_{21}. $$We now state the first fact of today's lecture.
Fact: A $2\times 2$ system of linear algebraic equations
$$ \boldsymbol{A}\boldsymbol{x}=\boldsymbol{b} $$has a unique solution $\boldsymbol{x}$ if and only if the determinant of the coefficient matrix is non-zero, i.e. if and only if
$$ \det(\boldsymbol{A}) \neq 0. $$Observe that the formulae we got for $x_1$ and $x_2$ using the substitution method are nothing but ratios of determinants:
\begin{align} x_1 &= \frac{b_1 a_{22} - b_2 a_{12}}{a_{11}a_{22}-a_{12}a_{21}}=\frac{\left\lvert \begin{matrix}b_{1} & a_{12} \\ b_{2} & a_{22} \end{matrix}\right\rvert}{\left\lvert \begin{matrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{matrix}\right\rvert}\\ x_2 &= \frac{a_{11}b_2 - a_{21}b_1}{a_{11}a_{22}-a_{12}a_{21}}=\frac{\left\lvert \begin{matrix}a_{11} & b_{1} \\ a_{21} & b_{2} \end{matrix}\right\rvert}{\left\lvert \begin{matrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{matrix}\right\rvert} \end{align}where the matrices whose determinants appear in the numerators are obtained by replacing the first column of the coefficient matrix by the right hand side vector $\boldsymbol{b}$ for $x_1$, and by replacing the second column of the coefficient matrix by the right hand side vector $\boldsymbol{b}$ for $x_2$. This is called Cramer's Rule in linear algebra.
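Cramer's Rule is easy to check numerically. Below is a minimal sketch in NumPy; the coefficient matrix and right-hand side are made-up values chosen only for illustration:

```python
import numpy as np

def det2(M):
    """Determinant of a 2x2 matrix, straight from the formula."""
    return M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]

def cramer_2x2(A, b):
    """Solve the 2x2 system A x = b by Cramer's Rule."""
    d = det2(A)
    if d == 0:
        raise ValueError("det(A) = 0: no unique solution")
    A1 = A.copy(); A1[:, 0] = b   # replace first column by b (for x1)
    A2 = A.copy(); A2[:, 1] = b   # replace second column by b (for x2)
    return np.array([det2(A1) / d, det2(A2) / d])

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])
print(cramer_2x2(A, b))        # [1. 3.]
print(np.linalg.solve(A, b))   # same answer from NumPy's solver
```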
Before proceeding further we define the identity matrix, denoted by $\mathbf{I}$:
$$ \mathbf{I} = \begin{bmatrix} 1&0\\0&1\end{bmatrix} $$This matrix is the identity element of the multiplication operation on matrices. It just acts like the number $1$ does in ordinary multiplication of real numbers. For any matrix $\boldsymbol{A}$, it holds that
$$ \mathbf{I}\,\boldsymbol{A} = \boldsymbol{A}\,\mathbf{I} = \boldsymbol{A}. $$We also define the inverse of a matrix. Given a matrix $\boldsymbol{A}$, if there exists a matrix $\boldsymbol{B}$ such that
$$ \boldsymbol{A}\boldsymbol{B} = \boldsymbol{B}\boldsymbol{A} = \mathbf{I} $$then $\boldsymbol{A}$ is said to be invertible and the matrix $\boldsymbol{B}$ is called the inverse of $\boldsymbol{A}$. We denote the inverse of $\boldsymbol{A}$ by the matrix $\boldsymbol{A}^{-1}$ (just like reciprocals of real numbers). Moreover, the formula for the inverse of
$$ \boldsymbol{A} = \begin{bmatrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} $$is given explicitly by
$$ \boldsymbol{A}^{-1} =\frac{1}{\det(\boldsymbol{A})} \begin{bmatrix}a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix} $$We arrive at the second fact of today's lecture.
Fact: A matrix $\boldsymbol{A}$ is invertible if and only if $\det(\boldsymbol{A})\neq 0$.
You can directly verify this fact from the formula for the inverse. Finally, if a matrix has non-zero determinant, it is called non-singular or invertible. On the other hand, if the determinant of $\boldsymbol{A}$ is zero, then $\boldsymbol{A}$ is called singular or non-invertible.
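The inverse formula can likewise be verified numerically. A minimal sketch (the sample matrix is an arbitrary invertible choice):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
d = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # det(A) = -2, non-zero

# Explicit 2x2 inverse: swap the diagonal entries, negate the off-diagonal ones.
A_inv = (1.0 / d) * np.array([[ A[1, 1], -A[0, 1]],
                              [-A[1, 0],  A[0, 0]]])

print(A_inv)
print(np.linalg.inv(A))   # agrees with NumPy's built-in inverse
print(A @ A_inv)          # the identity matrix, up to round-off
```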
Now consider the homogeneous system
\begin{align} a_{11}x_1 + a_{12}x_2 &= 0\\ a_{21}x_1 + a_{22}x_2 &= 0, \end{align}or equivalently, the matrix equation
$$ \boldsymbol{A}\boldsymbol{x} = \boldsymbol{0}. $$Note that each of the equations in the system determines a line that passes through the origin and, clearly, the zero vector is always a solution. That is,
$$ \boldsymbol{x}=\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}=\boldsymbol{0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$is always a solution of the homogeneous linear system. It is therefore called the trivial solution. Now, recall that this linear system has a unique solution if and only if $\det(\boldsymbol{A})\neq 0$. Therefore, in order to find a non-trivial (different from the zero vector) solution to this system, it must be the case that the coefficient matrix is singular, that is, $\det(\boldsymbol{A}) = 0$. Note that in this case there are infinitely many solutions. Since both of the lines pass through the origin, if they have the same slope (i.e. $\det(\boldsymbol{A})=0$), they must be the same line (coincident lines), in which case the linear system above reduces to a single equation, leading to infinitely many solutions.
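Here is a concrete illustration with a singular matrix; the matrix below is a made-up example whose second row is twice the first:

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])   # rows proportional, so det(A) = 0
print(np.linalg.det(A))                  # 0.0: the matrix is singular

# From the first row, x1 + 2 x2 = 0, so every multiple of [2, -1] is a solution.
for c in (1.0, -3.0, 0.5):
    x = c * np.array([2.0, -1.0])
    print(A @ x)                         # [0. 0.] each time: nontrivial solutions
```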
Now, given a matrix $\boldsymbol{A}$, consider the operation of multiplying a vector $\boldsymbol{x}$ by the matrix $\boldsymbol{A}$ as a transformation (or a mapping that maps vectors to vectors):
$$ \boldsymbol{x}\mapsto\boldsymbol{A}\boldsymbol{x}. $$This viewpoint amounts to asking the question: what does the matrix $\boldsymbol{A}$ do to the vectors it acts on? To be concrete, consider, for example, the matrix
$$ \boldsymbol{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, $$and observe that this matrix transforms the vector $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$:
$$ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \mapsto \boldsymbol{A}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} $$You can see that $\boldsymbol{A}$ rotates and stretches $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ when it acts on it. We may now ask the following question:
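In NumPy this action is plain matrix-vector multiplication. A quick check of the example above, together with the length and angle of the image vector:

```python
import numpy as np

A  = np.array([[1.0, 2.0], [3.0, 4.0]])
e1 = np.array([1.0, 0.0])

y = A @ e1
print(y)                                    # [1. 3.]
print(np.linalg.norm(y))                    # ~3.16: stretched from length 1
print(np.degrees(np.arctan2(y[1], y[0])))   # ~71.6 degrees: rotated off the x1-axis
```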
Question: Are there any directions in the $(x_1,x_2)$ plane that are left invariant by $\boldsymbol{A}$? In other words, are there any vectors that are only stretched by the matrix $\boldsymbol{A}$?
Mathematical formulation of this question naturally takes the following form: Given a matrix $\boldsymbol{A}$, find all possible vectors $\boldsymbol{x}$ such that
$$ \boldsymbol{A}\boldsymbol{x} = \lambda \boldsymbol{x} $$for some scalar (complex or real number) $\lambda$.
Such a vector $\boldsymbol{x}$ is called an eigenvector of $\boldsymbol{A}$ and the scalar $\lambda$ is called an eigenvalue of $\boldsymbol{A}$.
Note that solving
$$ \boldsymbol{A}\boldsymbol{x} = \lambda \boldsymbol{x} $$for $\boldsymbol{x}$ is equivalent to solving the homogeneous equation
$$ (\boldsymbol{A}-\lambda\,\mathbf{I})\,\boldsymbol{x} = \boldsymbol{0} $$Since we look for non-zero (non-trivial) solutions $\boldsymbol{x}$ of this equation, we require the coefficient matrix $(\boldsymbol{A}-\lambda\,\mathbf{I})$ of this homogeneous equation to be singular. In other words, eigenvectors exist for those values of $\lambda$ which lead to
$$ \det(\boldsymbol{A}-\lambda\,\mathbf{I})=0. $$Such values of $\lambda$ are called eigenvalues of $\boldsymbol{A}$. As we will see in a moment, for a $2\times 2$ matrix this equation is a quadratic equation in the $\lambda$ variable. Indeed, letting
$$ \boldsymbol{A} = \begin{bmatrix}a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} $$as before, we have
$$ \det(\boldsymbol{A}-\lambda\,\mathbf{I})=\left\lvert\begin{matrix}a_{11}-\lambda & a_{12} \\ a_{21} & a_{22}-\lambda \end{matrix}\right\rvert = \lambda^2 -(a_{11}+a_{22})\lambda + a_{11}a_{22}-a_{12}a_{21} = \lambda^2 - \text{tr}(\boldsymbol{A})\lambda + \det(\boldsymbol{A}), $$where $\text{tr}(\boldsymbol{A})=a_{11}+a_{22}$ is the sum of the diagonal elements of the matrix $\boldsymbol{A}$ and it is called the trace of $\boldsymbol{A}$.
To sum up, in order to find eigenvectors of a matrix $\boldsymbol{A}$, we first need to determine for which values of $\lambda$ we have
$$ \det(\boldsymbol{A}-\lambda\,\mathbf{I})=0, $$or equivalently
$$ \lambda^2 - \text{tr}(\boldsymbol{A})\lambda + \det(\boldsymbol{A})=0. $$This equation is called the characteristic equation of $\boldsymbol{A}$. Since this is a quadratic polynomial equation in $\lambda$, there are three possible scenarios, determined by the sign of the discriminant $\text{tr}(\boldsymbol{A})^2-4\det(\boldsymbol{A})$: two distinct real roots, one repeated real root, or a pair of complex conjugate roots.
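A short sketch that builds the characteristic polynomial from the trace and determinant and classifies its roots by the discriminant; the sample matrices are made up to produce each scenario:

```python
import numpy as np

def classify_eigenvalues(A):
    """Classify the roots of lambda^2 - tr(A) lambda + det(A) = 0."""
    tr = A[0, 0] + A[1, 1]
    det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    disc = tr**2 - 4 * det              # discriminant of the quadratic
    roots = np.roots([1.0, -tr, det])   # the eigenvalues
    if disc > 0:
        kind = "two distinct real eigenvalues"
    elif disc == 0:
        kind = "one repeated real eigenvalue"
    else:
        kind = "a complex conjugate pair"
    return kind, roots

print(classify_eigenvalues(np.array([[1.0, 2.0], [3.0, 4.0]])))    # distinct real
print(classify_eigenvalues(np.array([[2.0, 1.0], [0.0, 2.0]])))    # repeated
print(classify_eigenvalues(np.array([[0.0, -1.0], [1.0, 0.0]])))   # complex pair
```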
We now calculate these for a given concrete matrix.
Example: Find the eigenvalues and eigenvectors of the matrix
$$ \boldsymbol{A}=\begin{bmatrix} 3 & 7 \\ 5 & 5 \end{bmatrix}. $$We calculate the roots of the equation
$$ \det(\boldsymbol{A}-\lambda\,\mathbf{I})=\left\lvert\begin{matrix} 3-\lambda & 7 \\ 5 & 5-\lambda \end{matrix}\right\rvert = \lambda^2 -8\lambda -20=0 $$This equation factors as $(\lambda- 10)(\lambda+2) = 0$. We find that $\lambda_1=10$ and $\lambda_2=-2$ are the two eigenvalues of the matrix.
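These roots are easy to confirm numerically:

```python
import numpy as np

A = np.array([[3.0, 7.0], [5.0, 5.0]])
print(np.roots([1.0, -8.0, -20.0]))   # roots of lambda^2 - 8 lambda - 20: 10 and -2
print(np.linalg.eigvals(A))           # eigenvalues computed directly: 10 and -2
```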
_Eigenvector associated to the eigenvalue $\lambda_1 = 10$:_ Need to solve
$$ (\boldsymbol{A}-10\,\mathbf{I}\,)\boldsymbol{x} = \boldsymbol{0} $$for $\boldsymbol{x}$. More precisely,
$$ \begin{bmatrix} -7 & 7 \\ 5 & -5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. $$Since the coefficient matrix is singular (we already set $\lambda$ equal to an eigenvalue we found), all the information is contained in one of the rows of this equation:
$$ -7x_1 +7 x_2 = 0 \iff x_1=x_2. $$Therefore if the value of $x_1$ is given, say $x_1=c$, then
$$\begin{bmatrix}x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix}c \\ c \end{bmatrix}=c \begin{bmatrix}1 \\ 1 \end{bmatrix}.$$This means that the eigenvector associated to the eigenvalue $\lambda_1 = 10$ lies in the direction of $\begin{bmatrix}1 \\ 1 \end{bmatrix}$ and we might as well take it to be
$$ \boldsymbol{u}_1 = \begin{bmatrix}1 \\ 1 \end{bmatrix}. $$(Be careful: the subscript here indicates that the vector is associated with $\lambda_1$; it does not refer to a component.)
We now do the analogous calculation using the other eigenvalue.
_Eigenvector associated to the eigenvalue $\lambda_2 = -2$:_ Need to solve
$$ (\boldsymbol{A}-(-2)\,\mathbf{I}\,)\boldsymbol{x} =(\boldsymbol{A}+2\,\mathbf{I}\,)\boldsymbol{x}= \boldsymbol{0} $$for $\boldsymbol{x}$. More precisely,
$$ \begin{bmatrix} 5 & 7 \\ 5 & 7 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. $$Again, since the coefficient matrix is singular, all the information is contained in one of the rows of this equation:
$$ 5x_1 +7 x_2 = 0 \iff x_1=-\frac{7}{5} x_2. $$Therefore if the value of $x_2$ is given, say $x_2=c$, then $x_1 = \frac{-7c}{5}$ and
$$\begin{bmatrix}x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix}\frac{-7c}{5} \\ c \end{bmatrix}=c \begin{bmatrix} \frac{-7}{5} \\ 1 \end{bmatrix}.$$This means that the eigenvector associated to the eigenvalue $\lambda_2 = -2$ lies in the direction of $\begin{bmatrix} -7 \\ 5 \end{bmatrix}$ and we might as well take it to be (take $c=5$ above)
$$ \boldsymbol{u}_2 = \begin{bmatrix} -7 \\ 5 \end{bmatrix}. $$To sum up, $\boldsymbol{A}$ has the following eigenvalue-eigenvector pairs:
$$ \lambda_1 = 10,\quad \boldsymbol{u}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \qquad\text{and}\qquad \lambda_2 = -2,\quad \boldsymbol{u}_2 = \begin{bmatrix} -7 \\ 5 \end{bmatrix}. $$
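A quick check that both pairs satisfy $\boldsymbol{A}\boldsymbol{u}=\lambda\boldsymbol{u}$:

```python
import numpy as np

A = np.array([[3.0, 7.0], [5.0, 5.0]])
pairs = [(10.0, np.array([1.0, 1.0])),    # (lambda_1, u_1)
         (-2.0, np.array([-7.0, 5.0]))]   # (lambda_2, u_2)

for lam, u in pairs:
    print(A @ u, lam * u)   # both sides agree: A only stretches u
```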
Please go over the other examples done in class, especially those with different eigenvalue structures, such as complex eigenvalues and repeated eigenvalues.
We close this topic with one last fact.
Fact: Suppose that a matrix has two distinct eigenvalues (real or complex) $\lambda_1\neq \lambda_2$ and let $\boldsymbol{u}_1$ and $\boldsymbol{u}_2$ denote the corresponding eigenvectors with the entries
\begin{align} \boldsymbol{u}_1 &= \begin{bmatrix} u_{11} \\ u_{21} \end{bmatrix}\\ \boldsymbol{u}_2 &= \begin{bmatrix} u_{12} \\ u_{22} \end{bmatrix} \end{align}Then the matrix of eigenvectors (the matrix obtained by placing the eigenvectors side by side as columns)
$$ \boldsymbol{X} = \begin{bmatrix} \boldsymbol{u}_{1} & \boldsymbol{u}_{2} \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12}\\ u_{21} & u_{22}\end{bmatrix} $$is nonsingular, i.e.
$$ \det(\boldsymbol{X})\neq 0. $$
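For the eigenvectors found in the example above this is easy to check:

```python
import numpy as np

# Columns are the eigenvectors u1 = [1, 1] and u2 = [-7, 5] found above.
X = np.array([[1.0, -7.0],
              [1.0,  5.0]])
print(np.linalg.det(X))   # ~12: non-zero, so X is nonsingular
```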