In [1]:
import numpy as np
from scipy import linalg
The usual rules for real matrices extend without modification to complex-valued matrices.
a. $\det(\boldsymbol{A}) = (4 + 4i)(4 + i) - (2-i)(-3 + 2i) = 16 + 13 i$
b. $\det(\boldsymbol{A}^{H}) = (4 - 4i)(4 - i) - (-3 - 2i)(2 + i) = 16 - 13i$
c. Inverse:
$$ \boldsymbol{A}^{-1} = \frac{1}{16 + 13 i } \begin{bmatrix} 4 + i & -2 + i \\ 3 - 2i & 4 + 4i \end{bmatrix} $$
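As a quick numerical cross-check (a sketch using NumPy with the matrix entries from above):

A = np.array([[4 + 4j, 2 - 1j], [-3 + 2j, 4 + 1j]])
print(np.linalg.det(A))             # ≈ 16 + 13j
print((16 + 13j)*np.linalg.inv(A))  # ≈ the adjugate matrix written above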
Extra: SymPy can be used for symbolic computations:
In [2]:
import sympy
sympy.interactive.printing.init_printing(use_unicode=False, wrap_line=False)
A = sympy.Matrix([[4 + sympy.I*4 , 2- sympy.I], [-3 + sympy.I*2 , 4 + sympy.I]])
A
Out[2]:
In [3]:
det = A.det()
sympy.simplify(det)
Out[3]:
In [4]:
detH = A.H.det()
sympy.simplify(detH)
Out[4]:
In [5]:
Ainv = A.inv()
sympy.simplify(Ainv)
Out[5]:
In [6]:
sympy.simplify(A.multiply(A.inv()))
Out[6]:
Computing roots,
\begin{align} \lambda &= \cos \theta \pm \sqrt{\cos^{2}\theta - 1} \\ &= \cos \theta \pm i \sin\theta \end{align}
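A quick numerical check of the roots (a sketch: the quadratic-formula step above corresponds to the characteristic polynomial $\lambda^{2} - 2\cos\theta\,\lambda + 1 = 0$; the angle is an arbitrary example value):

θ = 0.3                                      # arbitrary example angle
print(np.roots([1.0, -2.0*np.cos(θ), 1.0]))  # numerical roots of λ² - 2 cosθ λ + 1
print(np.cos(θ) + 1j*np.sin(θ))              # cos θ + i sin θ for comparison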
a. If $\boldsymbol{x}$ is an eigenvector and $\lambda$ is an eigenvalue,
$$ \boldsymbol{M} \boldsymbol{x} = \lambda \boldsymbol{x} $$
Premultiplying by $\boldsymbol{x}^{H}$,
$$ \boldsymbol{x}^{H} \boldsymbol{M} \boldsymbol{x} = \lambda \boldsymbol{x}^{H} \boldsymbol{x} $$
Taking the conjugate transpose (Hermitian transpose) of both sides and noting that $\boldsymbol{x}^{H} \boldsymbol{x}$ is real,
$$ \left(\boldsymbol{x}^{H} \boldsymbol{M} \boldsymbol{x}\right)^{H} = \boldsymbol{x}^{H} \boldsymbol{M}^{H} \boldsymbol{x} = \bar{\lambda} \boldsymbol{x}^{H} \boldsymbol{x} $$
Since $\boldsymbol{M}$ is Hermitian ($\boldsymbol{M} =\boldsymbol{M}^{H}$), the left-hand sides of the above two equations are equal, so both can hold only if $\lambda = \bar{\lambda}$, which is true only if $\lambda$ is real.
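This can be illustrated numerically (a sketch: B is an arbitrary random complex matrix, and B + B.conj().T is Hermitian by construction):

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
M = B + B.conj().T                 # Hermitian by construction
print(np.linalg.eigvals(M).imag)   # imaginary parts are zero to round-off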
b. Consider two eigenpairs $(\lambda_{1}, \boldsymbol{x}_{1})$ and $( \lambda_{2}, \boldsymbol{x}_{2})$, where $\lambda_{1} \ne \lambda_{2}$. We have:
\begin{align} \boldsymbol{M} \boldsymbol{x}_{1} &= \lambda_{1} \boldsymbol{x}_{1} \\ \boldsymbol{M} \boldsymbol{x}_{2} &= \lambda_{2} \boldsymbol{x}_{2} \end{align}
Premultiplying by $\boldsymbol{x}_{2}^{H}$ and $\boldsymbol{x}_{1}^{H}$, respectively,
\begin{align} \boldsymbol{x}_{2}^{H} \boldsymbol{M} \boldsymbol{x}_{1} &= \lambda_{1} \boldsymbol{x}_{2}^{H} \boldsymbol{x}_{1} \\ \boldsymbol{x}_{1}^{H} \boldsymbol{M} \boldsymbol{x}_{2} &= \lambda_{2} \boldsymbol{x}_{1}^{H} \boldsymbol{x}_{2} \end{align}
Taking the complex conjugate transpose of the first equation, and exploiting that $\boldsymbol{M} = \boldsymbol{M}^{H}$ and that $\lambda_{1}$ is real (from part a),
\begin{equation} \left(\boldsymbol{x}_{2}^{H} \boldsymbol{M} \boldsymbol{x}_{1}\right)^{H} = \boldsymbol{x}_{1}^{H} \boldsymbol{M} \boldsymbol{x}_{2} = \lambda_{1} \boldsymbol{x}_{1}^{H} \boldsymbol{x}_{2} \end{equation}
Comparing this to the second equation above, $\boldsymbol{x}_{1}^{H}\boldsymbol{M} \boldsymbol{x}_{2} = \lambda_{2} \boldsymbol{x}_{1}^{H}\boldsymbol{x}_{2}$: since $\lambda_{1} \ne \lambda_{2}$, both can hold only if $\boldsymbol{x}_{1}^{H} \boldsymbol{x}_{2} = 0$, i.e. the eigenvectors are orthogonal.
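Numerically, np.linalg.eigh (for Hermitian matrices) returns orthonormal eigenvectors, so $\boldsymbol{X}^{H}\boldsymbol{X} = \boldsymbol{I}$ (a sketch reusing an arbitrary random Hermitian matrix):

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
M = B + B.conj().T
evals, X = np.linalg.eigh(M)                            # columns of X are eigenvectors
print(np.allclose(X.conj().T.dot(X), np.identity(4)))   # eigenvectors are orthonormal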
To show that $\boldsymbol{A}^{H} \boldsymbol{A}$ is positive semi-definite: \begin{equation} \boldsymbol{x}^{H} \boldsymbol{A}^{H} \boldsymbol{A} \boldsymbol{x} = (\boldsymbol{A}\boldsymbol{x})^{H} (\boldsymbol{A} \boldsymbol{x}) \ge 0 \end{equation} If $(\lambda, \boldsymbol{x})$ is an eigenpair of $\boldsymbol{A}^{H} \boldsymbol{A}$,
\begin{equation} \boldsymbol{A}^{H} \boldsymbol{A} \boldsymbol{x} = \lambda \boldsymbol{x} \end{equation} Premultiplying both sides by $\boldsymbol{x}^{H}$:
\begin{equation} \boldsymbol{x}^{H} \boldsymbol{A}^{H} \boldsymbol{A} \boldsymbol{x} = \lambda \boldsymbol{x}^{H} \boldsymbol{x} \end{equation} Since $\boldsymbol{x}^{H} \boldsymbol{A}^{H} \boldsymbol{A} \boldsymbol{x} \ge 0$ and $\boldsymbol{x}^{H} \boldsymbol{x} > 0$ (an eigenvector is non-zero), all eigenvalues $\lambda$ must be greater than or equal to zero.
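A numerical check that all eigenvalues of $\boldsymbol{A}^{H}\boldsymbol{A}$ are non-negative (a sketch with an arbitrary random complex matrix C):

rng = np.random.default_rng(3)
C = rng.standard_normal((5, 3)) + 1j*rng.standard_normal((5, 3))
print(np.linalg.eigvalsh(C.conj().T.dot(C)))   # eigenvalues of C^H C are all >= 0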
a. Eigenvalues of $\boldsymbol{A}$ satisfy $\det(\boldsymbol{A} - \lambda\boldsymbol{I}) = 0$. For the transpose, we have \begin{equation} \det\left(\boldsymbol{A}^{T} - \lambda\boldsymbol{I}\right) = \det\left(\left(\boldsymbol{A} - \lambda \boldsymbol{I}\right)^{T}\right) = \det(\boldsymbol{A} - \lambda \boldsymbol{I}), \end{equation} since $\det\boldsymbol{A} = \det \boldsymbol{A}^{T}$. Hence the eigenvalues of $\boldsymbol{A}$ and $\boldsymbol{A}^{T}$ are the same.
b. Noting that $\det\left(\boldsymbol{A}^{H}\right) = \det\left(\overline{\boldsymbol{A}}^{T}\right) = \overline{\det\left(\boldsymbol{A}^{T}\right)} = \overline{\det\left(\boldsymbol{A}\right)}$, \begin{equation} \det\left(\boldsymbol{A}^{H} - \lambda\boldsymbol{I}\right) = \det\left(\left(\boldsymbol{A} - \bar{\lambda} \boldsymbol{I}\right)^{H}\right) = \det\left( \overline{(\boldsymbol{A} - \bar{\lambda}\boldsymbol{I})^{T}}\right) = \overline{\det(\boldsymbol{A} - \bar{\lambda} \boldsymbol{I})}. \end{equation} Hence $\det(\boldsymbol{A}^{H} - \lambda\boldsymbol{I}) = 0$ if and only if $\det(\boldsymbol{A} - \bar{\lambda}\boldsymbol{I}) = 0$, i.e. the eigenvalues of $\boldsymbol{A}^{H}$ are the complex conjugates of the eigenvalues of $\boldsymbol{A}$.
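Numerically comparing the spectra of $\boldsymbol{A}$, $\boldsymbol{A}^{T}$ and $\boldsymbol{A}^{H}$ (a sketch with an arbitrary random complex matrix; sorting makes the comparison independent of ordering):

rng = np.random.default_rng(4)
C = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
print(np.sort_complex(np.linalg.eigvals(C)))           # eigenvalues of C
print(np.sort_complex(np.linalg.eigvals(C.T)))         # same as for C
print(np.sort_complex(np.linalg.eigvals(C.conj().T)))  # complex conjugates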
c. If $\lambda$ and $\boldsymbol{x}$ are an eigenvalue and eigenvector, respectively, of
$\boldsymbol{A}\boldsymbol{B}$, then
\begin{equation}
\boldsymbol{A}\boldsymbol{B} \boldsymbol{x} = \lambda \boldsymbol{x}
\end{equation}
Premultiplying both sides by $\boldsymbol{B}$,
\begin{equation}
(\boldsymbol{B}\boldsymbol{A}) (\boldsymbol{B} \boldsymbol{x})
= \lambda (\boldsymbol{B}\boldsymbol{x})
\end{equation}
We see that $\lambda$ is also an eigenvalue of $\boldsymbol{B}\boldsymbol{A}$, with eigenvector $\boldsymbol{B} \boldsymbol{x}$ (provided $\boldsymbol{B}\boldsymbol{x} \ne \boldsymbol{0}$, which is guaranteed when $\lambda \ne 0$).
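A numerical check that $\boldsymbol{A}\boldsymbol{B}$ and $\boldsymbol{B}\boldsymbol{A}$ share eigenvalues (a sketch with arbitrary random square matrices):

rng = np.random.default_rng(5)
C = rng.standard_normal((4, 4))
D = rng.standard_normal((4, 4))
print(np.sort_complex(np.linalg.eigvals(C.dot(D))))  # eigenvalues of CD
print(np.sort_complex(np.linalg.eigvals(D.dot(C))))  # same eigenvalues for DC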
Recall that from the definition of a matrix operator norm it follows that $\| \boldsymbol{A}\boldsymbol{x} \| \le \| \boldsymbol{A} \| \| \boldsymbol{x} \|$.
a. From the definition of the matrix norm:
\begin{equation} \| \boldsymbol{A} \boldsymbol{B} \boldsymbol{x} \| \le \| \boldsymbol{A} \| \| \boldsymbol{B} \boldsymbol{x}\| \le \| \boldsymbol{A} \| \| \boldsymbol{B}\| \| \boldsymbol{x}\| \end{equation}
for all $\boldsymbol{x}$. Re-arranging,
\begin{equation} \frac{\| \boldsymbol{A} \boldsymbol{B} \boldsymbol{x} \|}{\|\boldsymbol{x}\|} \le \| \boldsymbol{A} \| \| \boldsymbol{B}\| \end{equation}
for all $\boldsymbol{x} \ne \boldsymbol{0}$. From the definition of the norm, $\| \boldsymbol{A} \boldsymbol{B} \|$ is the largest possible value of $\| \boldsymbol{A} \boldsymbol{B} \boldsymbol{x} \| / \|\boldsymbol{x}\|$ (over all $\boldsymbol{x}$, excluding the zero vector), hence
\begin{equation} \| \boldsymbol{A} \boldsymbol{B} \| \le \| \boldsymbol{A} \| \| \boldsymbol{B}\| \end{equation}
b. Consider \begin{equation} \| (\boldsymbol{A} + \boldsymbol{B}) \boldsymbol{x} \| = \| \boldsymbol{A}\boldsymbol{x} + \boldsymbol{B} \boldsymbol{x} \| \end{equation}
From the triangle inequality for vectors,
\begin{align} \| \boldsymbol{A}\boldsymbol{x} + \boldsymbol{B} \boldsymbol{x} \| &\le \| \boldsymbol{A}\boldsymbol{x} \| + \| \boldsymbol{B} \boldsymbol{x} \| \\ &\le \| \boldsymbol{A} \| \| \boldsymbol{x} \| + \| \boldsymbol{B} \| \|\boldsymbol{x} \| \end{align}
Re-arranging
\begin{equation} \frac{\| (\boldsymbol{A} + \boldsymbol{B}) \boldsymbol{x} \|}{\| \boldsymbol{x} \|} \le \| \boldsymbol{A} \| + \| \boldsymbol{B} \| \end{equation}
Using the same argument as in part a), we get $\| \boldsymbol{A} + \boldsymbol{B} \| \le \| \boldsymbol{A} \| + \| \boldsymbol{B} \|$.
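Both inequalities can be checked numerically for the operator 2-norm (a sketch with arbitrary random matrices; np.linalg.norm(..., 2) is the operator 2-norm):

rng = np.random.default_rng(6)
C = rng.standard_normal((4, 4))
D = rng.standard_normal((4, 4))
print(np.linalg.norm(C.dot(D), 2) <= np.linalg.norm(C, 2)*np.linalg.norm(D, 2))  # True
print(np.linalg.norm(C + D, 2) <= np.linalg.norm(C, 2) + np.linalg.norm(D, 2))   # True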
Recall that the smallest possible value of $C$ for which $\|{\boldsymbol{A}\boldsymbol{x}}\| \le C \| \boldsymbol{x}\|$ holds for all $\boldsymbol{x}$ is the operator norm $\|\boldsymbol{A}\|$. The task is to find $C$ for the different norms. It simplifies the proofs if we consider only vectors with $\|\boldsymbol{x}\| = 1$ (this is possible since $\|\alpha \boldsymbol{x}\| = |\alpha| \|\boldsymbol{x}\|$), in which case the task is to find the smallest possible $C$ such that $\|\boldsymbol{A}\boldsymbol{x}\| \le C$.
What we need to do is (i) prove the inequality, and then (ii) exhibit a case of equality to show that the bound cannot be improved, i.e. that it is the smallest such constant.
a. For the 1-norm, denoting the $j$th column of $\boldsymbol{A}$ by $\boldsymbol{a}_{j}$, and for a vector $\|\boldsymbol{x}\|_{1} = 1$:
\begin{align*} \| \boldsymbol{A} \boldsymbol{x}\|_{1} = \|\sum_{j=1}^{n} \boldsymbol{a}_{j} x_{j}\|_{1} \le \sum_{j=1}^{n} \|\boldsymbol{a}_{j}\|_{1} |x_{j}| \le \max_{j=1}^{n} \|\boldsymbol{a}_{j}\|_{1} \sum_{j=1}^{n} |x_{j}| = \max_{j=1}^{n} \|\boldsymbol{a}_{j}\|_{1}, \end{align*} since $\sum_{j=1}^{n} |x_{j}| = \|\boldsymbol{x}\|_{1} = 1$.
This is the maximum column sum.
The term on the right could possibly be larger than $\| \boldsymbol{A} \|_{1}$, whereas the norm is the smallest possible value of the RHS that still satisfies the inequalities. If we can show a case for which equality is reached, $\|\boldsymbol{A} \boldsymbol{x}\|_{1} = \max_{j=1}^{n} \|\boldsymbol{a}_{j}\|_{1}$, we have the norm. For the unit vector $\boldsymbol{e}_{j}$ with $e_{j} = 1$ and $e_{i} = 0$ for $i \ne j$, where $j$ is the column with the greatest 1-norm, we have \begin{equation} \|\boldsymbol{A} \boldsymbol{e}_{j}\|_{1} = \max_{j=1}^{n} \|\boldsymbol{a}_{j}\|_{1} \end{equation} Therefore, \begin{equation} \|\boldsymbol{A}\|_{1} = \max_{j=1}^{n} \|\boldsymbol{a}_{j}\|_{1} \end{equation}
b. For a vector $\|\boldsymbol{x}\|_{\infty} = 1$:
\begin{align*} \|\boldsymbol{A} \boldsymbol{x}\|_{\infty} = \max_{i=1}^{m} |\sum_{j=1}^{n} a_{ij} x_{j}| &\le \max_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}| |x_{j}| \\ &\le \max_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}| \end{align*}
The RHS is the maximum row sum.
As before, we need to find a case with equality. If the row with the maximum sum is row $k$, we choose a vector $\boldsymbol{x}$ with $x_{j} = \pm 1$ such that the sign of $x_{j}$ matches the sign of the entry $a_{kj}$ (so $\|\boldsymbol{x}\|_{\infty} = 1$ and $\|\boldsymbol{A}\boldsymbol{x}\|_{\infty} = \sum_{j=1}^{n} |a_{kj}|$). Therefore
\begin{equation} \|\boldsymbol{A}\|_{\infty} = \max_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}| \end{equation}
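Both formulae (parts a and b) can be checked against NumPy's built-in operator norms (a sketch with an arbitrary random matrix):

rng = np.random.default_rng(7)
C = rng.standard_normal((4, 3))
print(np.linalg.norm(C, 1), np.max(np.sum(np.abs(C), axis=0)))       # maximum column sum
print(np.linalg.norm(C, np.inf), np.max(np.sum(np.abs(C), axis=1)))  # maximum row sum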
Recall that the $l_{2}$ norm is the square root of the maximum eigenvalue of $\boldsymbol{A}^{T}\boldsymbol{A}$ (matrix is real in this question). We have $$
\boldsymbol{A}^{T}\boldsymbol{A}
=
\begin{bmatrix}
26 & 5 \\ 5 & 1
\end{bmatrix}
$$
For this matrix, the characteristic equation is $\lambda^{2} - 27\lambda + 1 = 0$, so $\lambda = (27 \pm \sqrt{27^{2} -4})/2$.
Hence $\|\boldsymbol{A}\|_{2} \approx 5.1926$.
Checking for a vector of length $1$:
$$
\boldsymbol{A}
\begin{bmatrix} \cos \theta \\ \sin \theta \end{bmatrix}
=
\begin{bmatrix} \cos \theta \\ 5\cos \theta + \sin \theta \end{bmatrix}
$$
Computing the norm,
$$F(\theta) = \|\boldsymbol{A}\boldsymbol{x}\|_{2}^{2} = 25 \cos^{2} \theta + 10 \cos\theta\sin\theta + 1 $$ To find the extreme points of the function, we differentiate $F$ with respect to $\theta$ and set the derivative equal to zero: \begin{align} \frac{d F}{d \theta} &= -50 \cos\theta \sin\theta + 10 (\cos^{2} \theta - \sin^{2} \theta) \\ &= -25 \sin 2\theta + 10 \cos 2\theta \\ &= 0 \end{align} The norm is maximised when $\theta = \arctan(2/5) /2$. Evaluating:
In [7]:
θ = np.arctan(2/5)/2
print("θ =", θ)
F = 25*np.cos(θ)*np.cos(θ) + 10*np.cos(θ)*np.sin(θ) + 1
print("F =", np.sqrt(F))
Recall that $\kappa(\boldsymbol{A}) = \| \boldsymbol{A} \| \| \boldsymbol{A}^{-1} \|$.
We have $\| \boldsymbol{A}\|_{1} = 3$ and $\| \boldsymbol{A}\|_{\infty} \approx 2$. Note that $$ \boldsymbol{A}^{-1} \approx -\frac{1}{2} \begin{bmatrix} 1 & -2 \\ -1 & 10^{-4} \end{bmatrix} $$ so $\| \boldsymbol{A}^{-1} \|_{1} \approx 1$ and $\| \boldsymbol{A}^{-1} \|_{\infty} \approx 3/2$. Therefore $\kappa_{1}(\boldsymbol{A}) \approx 3$ and $\kappa_{\infty}(\boldsymbol{A}) \approx 3$.
Recall that $\kappa_{2}(\boldsymbol{A}) = \sqrt{\lambda_{\max}(\boldsymbol{A}^{T}\boldsymbol{A})}/\sqrt{\lambda_{\min}(\boldsymbol{A}^{T}\boldsymbol{A})}$. As an approximation we neglect the small $10^{-4}$ entry of $\boldsymbol{A}$: $$ \boldsymbol{A}^{T}\boldsymbol{A} \approx \begin{bmatrix} 1 & 1 \\ 1 & 5 \end{bmatrix} $$ Computing the eigenvalues, $\lambda \approx 3 \pm \sqrt{5}$, therefore $\kappa_{2} \approx \sqrt{3 + \sqrt{5}}/\sqrt{3 - \sqrt{5}} \approx 2.618$.
The matrix is very well conditioned; however, LU factorisation may require pivoting because of the small $(1,1)$ entry. This is an issue with LU factorisation rather than a pathological problem with the matrix.
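A numerical cross-check (a sketch: the matrix is assumed to be $\boldsymbol{A} = \begin{bmatrix} 10^{-4} & 2 \\ 1 & 1 \end{bmatrix}$, which is consistent with the inverse quoted above):

C = np.array([[1.0e-4, 2.0], [1.0, 1.0]])  # assumed matrix (see above)
print(np.linalg.cond(C, 1))                # ≈ 3
print(np.linalg.cond(C, np.inf))           # ≈ 3
print(np.linalg.cond(C, 2))                # ≈ 2.618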
A projection matrix $\boldsymbol{P}$ has the property $\boldsymbol{P} = \boldsymbol{P}\boldsymbol{P}$, and $\boldsymbol{P}^{H} = \boldsymbol{P}$.
a. The solution to the least squares problem is $\hat{\boldsymbol{x}} = (\boldsymbol{A}^{H} \boldsymbol{A})^{-1} \boldsymbol{A}^{H} \boldsymbol{b}$. Therefore $\boldsymbol{r} = \boldsymbol{A} \hat{\boldsymbol{x}} - \boldsymbol{b} = \boldsymbol{A}(\boldsymbol{A}^{H} \boldsymbol{A})^{-1} \boldsymbol{A}^{H} \boldsymbol{b} - \boldsymbol{b}$. Insert this expression for $\boldsymbol{r}$ into the expression in the question and re-arrange to show the result.
Vectors $\boldsymbol{A}\boldsymbol{z}$ lie in the column space of $\boldsymbol{A}$, hence the expression says that the least-squares residual is orthogonal to the column space of $\boldsymbol{A}$.
b. $$ \boldsymbol{P}\boldsymbol{P} = \boldsymbol{A} (\boldsymbol{A}^{H}\boldsymbol{A})^{-1}\boldsymbol{A}^{H} \boldsymbol{A}(\boldsymbol{A}^{H}\boldsymbol{A})^{-1}\boldsymbol{A}^{H} = \boldsymbol{A} (\boldsymbol{A}^{H}\boldsymbol{A})^{-1}\boldsymbol{A}^{H} = \boldsymbol{P} $$ and $$ \boldsymbol{P}^{H} = \boldsymbol{A}(\boldsymbol{A}^{H}\boldsymbol{A})^{-H}\boldsymbol{A}^{H} = \boldsymbol{A}(\boldsymbol{A}^{H}\boldsymbol{A})^{-1}\boldsymbol{A}^{H} = \boldsymbol{P}, $$ since $\boldsymbol{A}^{H}\boldsymbol{A}$ (and therefore its inverse) is Hermitian.
c. We can phrase a least squares problem as $$ \boldsymbol{A} \hat{\boldsymbol{x}} = \boldsymbol{A}(\boldsymbol{A}^{H}\boldsymbol{A})^{-1}\boldsymbol{A}^{H} \boldsymbol{b} = \boldsymbol{P} \boldsymbol{b} $$ which says that $\boldsymbol{P}$ projects $\boldsymbol{b}$ into the column space of $\boldsymbol{A}$. If $\boldsymbol{b}^{\prime} = \boldsymbol{P}\boldsymbol{b}$ is in the column space of $\boldsymbol{A}$, then $\boldsymbol{A} \hat{\boldsymbol{x}} = \boldsymbol{b}^{\prime}$ has a solution.
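The projection properties can be verified numerically (a sketch: C is an arbitrary random matrix with linearly independent columns, and b an arbitrary right-hand side):

rng = np.random.default_rng(8)
C = rng.standard_normal((5, 3)) + 1j*rng.standard_normal((5, 3))
P = C.dot(np.linalg.inv(C.conj().T.dot(C))).dot(C.conj().T)  # P = A (A^H A)^{-1} A^H
print(np.allclose(P.dot(P), P))           # P P = P
print(np.allclose(P.conj().T, P))         # P^H = P
b = rng.standard_normal(5) + 1j*rng.standard_normal(5)
r = P.dot(b) - b                          # least-squares residual A x̂ - b
print(np.allclose(C.conj().T.dot(r), 0))  # residual is orthogonal to the column space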
a.
Firstly, if the $m \times n$ matrix $\boldsymbol{A}$ has linearly independent rows, then the rank of $\boldsymbol{A}$ is $m$, the column space of $\boldsymbol{A}$ spans $\mathbb{C}^{m}$, and the nullspace of $\boldsymbol{A}^{H}$ contains the zero vector only.
Now, consider the nullspace of $\boldsymbol{A}\boldsymbol{A}^{H}$:
\begin{equation} \boldsymbol{A}\boldsymbol{A}^{H} \boldsymbol{x} = \boldsymbol{0} \ \rightarrow \ \boldsymbol{x}^{H}\boldsymbol{A} \boldsymbol{A}^{H} \boldsymbol{x} = 0 \ \rightarrow \ (\boldsymbol{A}^{H} \boldsymbol{x})^{H}\boldsymbol{A}^{H} \boldsymbol{x} = 0 \end{equation}
The above holds only if $\boldsymbol{A}^{H} \boldsymbol{x} = \boldsymbol{0}$, which says that $\boldsymbol{x}$ must come from the nullspace of $\boldsymbol{A}^{H}$. We have already determined that the nullspace of $\boldsymbol{A}^{H}$ contains only the zero vector, therefore $\boldsymbol{A}\boldsymbol{A}^{H}$ is full rank (the nullspace of $\boldsymbol{A}\boldsymbol{A}^{H}$ contains the zero vector only) and can be inverted.
b. Since $\boldsymbol{A}\boldsymbol{A}^{H}$ is square and full rank, it can be inverted, so $$ \boldsymbol{A} \boldsymbol{A}^{+} = \boldsymbol{A} \boldsymbol{A}^{H} (\boldsymbol{A}\boldsymbol{A}^{H})^{-1} = \boldsymbol{I}, $$ hence $\boldsymbol{A}^{+} = \boldsymbol{A}^{H} (\boldsymbol{A}\boldsymbol{A}^{H})^{-1}$ is a right-inverse of $\boldsymbol{A}$.
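A numerical check of the right-inverse (a sketch: C is an arbitrary random matrix with linearly independent rows, and $\boldsymbol{C}^{+} = \boldsymbol{C}^{H}(\boldsymbol{C}\boldsymbol{C}^{H})^{-1}$):

rng = np.random.default_rng(9)
C = rng.standard_normal((3, 5)) + 1j*rng.standard_normal((3, 5))  # rows are (generically) independent
Cp = C.conj().T.dot(np.linalg.inv(C.dot(C.conj().T)))             # C^+ = C^H (C C^H)^{-1}
print(np.allclose(C.dot(Cp), np.identity(3)))                     # C C^+ = I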
In [8]:
A = np.array([[2, -1], [-1, 2]])
print(A)
We split $\boldsymbol{A}$ such that $\boldsymbol{A} = \boldsymbol{N} - \boldsymbol{P}$. A method will converge if the largest absolute eigenvalue of $\boldsymbol{N}^{-1}\boldsymbol{P}$ is less than one.
For the Richardson method $\boldsymbol{N} = \boldsymbol{I}$:
In [9]:
# Richardson
N = np.identity(2)
P = N - A
M = np.linalg.inv(N).dot(P)
print(np.linalg.eigvals(M))
The largest eigenvalue (in absolute value) is greater than one, therefore the method will not converge.
For the Jacobi method, $\boldsymbol{N} = \text{diag}(\boldsymbol{A})$:
In [10]:
# Jacobi
N = np.diag(np.diag(A))
P = N - A
M = np.linalg.inv(N).dot(P)
print(np.linalg.eigvals(M))
The largest eigenvalue (in absolute value) is less than one, therefore the method will converge.
For Gauss-Seidel, $\boldsymbol{N}$ is the lower triangular part of $\boldsymbol{A}$:
In [11]:
# Gauss-Seidel
N = np.tril(A)
P = N - A
M = np.linalg.inv(N).dot(P)
print(np.linalg.eigvals(M))
Gauss-Seidel will converge because the largest eigenvalue (in absolute value) is less than one.
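The convergence can also be observed by iterating directly (a sketch: the right-hand side and starting vector are arbitrary, and the update is $\boldsymbol{x}_{k+1} = \boldsymbol{N}^{-1}(\boldsymbol{P}\boldsymbol{x}_{k} + \boldsymbol{b})$, shown here for the Jacobi splitting):

b = np.array([1.0, 1.0])      # arbitrary right-hand side
x = np.zeros(2)               # arbitrary starting vector
N = np.diag(np.diag(A))       # Jacobi splitting
P = N - A
for k in range(20):
    x = np.linalg.solve(N, P.dot(x) + b)
print(x)                      # iterate after 20 Jacobi steps
print(np.linalg.solve(A, b))  # exact solution for comparison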
A non-singular matrix cannot have any zero singular values. In fact, the smallest singular value is a measure of the 'distance' to the nearest singular matrix.
In [12]:
A = np.array([[1, 3], [2, 2], [3, 1]])
print(A)
Compute the reduced SVD (recall that NumPy's svd returns $\boldsymbol{V}^{T}$ rather than $\boldsymbol{V}$ as its third output, i.e. the returned array V already holds $\boldsymbol{V}^{T}$):
In [13]:
U, s, V = np.linalg.svd(A, full_matrices=False)
print(s)
print(U)
print(V)
Compute the pseudoinverse by creating $\boldsymbol{\Sigma}^{+} = \boldsymbol{\Sigma}_{1}^{-1}$:
In [14]:
# Pseudoinverse
Sigma_p = np.diag(1.0/s)
Ap = (V.T).dot(Sigma_p.dot(U.T))
print(Ap)
Compute $\boldsymbol{A}^{+}\boldsymbol{A}$:
In [15]:
# Check that A^{+}A = I
print(Ap.dot(A))
which is the identity. Now compute $\boldsymbol{A}\boldsymbol{A}^{+}$:
In [16]:
# Check that AA^{+} \ne I
print(A.dot(Ap))
This is clearly not the identity.
(e) Recall that from $\boldsymbol{A}^{T} \boldsymbol{A} \hat{\boldsymbol{x}} = \boldsymbol{A}^{T} \boldsymbol{b}$ we have \begin{align} \hat{\boldsymbol{x}} &= (\boldsymbol{A}^{T} \boldsymbol{A})^{-1}\boldsymbol{A}^{T} \boldsymbol{b} \\ &= \boldsymbol{A}^{+} \boldsymbol{b} \end{align} Multiplying both sides by $\boldsymbol{A}$, $$ \boldsymbol{A}\hat{\boldsymbol{x}} = \underbrace{\boldsymbol{A} \boldsymbol{A}^{+}}_{\boldsymbol{P}} \boldsymbol{b} $$ where $\boldsymbol{P}$ is the projection matrix from an earlier question. Recall that $\boldsymbol{P}$ projects a vector into the column space of $\boldsymbol{A}$. Since $\boldsymbol{P}$ is a projector, it does nothing if $\boldsymbol{b}$ is already in the column space. Therefore, for any vector $\boldsymbol{b}$ in the column space of $\boldsymbol{A}$ the least-squares solution satisfies $\boldsymbol{A}\hat{\boldsymbol{x}} = \boldsymbol{b}$ exactly.
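This can be illustrated with the matrix and pseudoinverse computed above (a sketch: b is constructed to lie in the column space of $\boldsymbol{A}$):

b = A.dot(np.array([1.0, 2.0]))              # a vector in the column space of A
xhat = Ap.dot(b)                             # least-squares solution via A^+
print(np.allclose(A.dot(xhat), b))           # A x̂ = b holds exactly (zero residual)
print(np.linalg.lstsq(A, b, rcond=None)[0])  # same solution from np.linalg.lstsq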
In [17]:
A = np.array([[0, 3, 0], [2, 0, 0]])
print(A)
Compute SVD:
In [18]:
U, s, V = np.linalg.svd(A, full_matrices=False)
print(U)
print(s)
print(V)
Compute $\boldsymbol{\Sigma}^{+} = \boldsymbol{\Sigma}_{1}^{-1}$, and then $\boldsymbol{A}^{+}$
In [19]:
# \Sigma^{+}
Sigma_p = np.diag(1.0/s)
# A^{+}
Ap = (V.T).dot(Sigma_p).dot(U.T)
print(Ap)
Compute $\boldsymbol{A}^{+}\boldsymbol{A}$
In [20]:
ApA = Ap.dot(A)
print(ApA)
For this matrix, any $\boldsymbol{x} = [x_{1} \ \ x_{2} \ \ 0]$ (which is from the row space of $\boldsymbol{A}$) satisfies $\boldsymbol{A}^{+} \boldsymbol{A} \boldsymbol{x} = \boldsymbol{x}$.
In [21]:
A = np.array([[1, 0, 0], [1, 0, 0], [1, 1, 1]])
print(A)
b = np.array([0, 2, 2])
print(b)
Compute the SVD and print singular values:
In [22]:
# Compute SVD
U, s, V = np.linalg.svd(A)
print(s)
There is one zero singular value, so we need to 'trim' the last column from $\boldsymbol{U}$ and the last row from $\boldsymbol{V}^{T}$, and compute $\boldsymbol{\Sigma}^{+} = \boldsymbol{\Sigma}_{1}^{-1}$:
In [23]:
# Create view of U without the last column
U1 = U[:, :2]
print(U1)
# Create view of V (i.e. V^T) without the last row
V1 = V[:2,:]
print(V1)
# Create Sigma^{+}
S1 = np.diag(1.0/s[:-1])
print(S1)
Solve the problem $\boldsymbol{x} = \boldsymbol{V}_{1} \boldsymbol{\Sigma}_{1}^{-1} \boldsymbol{U}_{1}^{T} \boldsymbol{b}$:
In [24]:
x = np.transpose(V1).dot(S1.dot(U1.T).dot(b))
print(x)
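As a cross-check (a sketch: np.linalg.pinv and np.linalg.lstsq both compute the minimum-norm least-squares solution via the SVD):

print(np.linalg.pinv(A).dot(b))              # minimum-norm least-squares solution
print(np.linalg.lstsq(A, b, rcond=None)[0])  # same solution from np.linalg.lstsq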