Notes from chapter 1 of Numerical Algorithm by Justin Solomon
Natural numbers $$\mathbb{N}=\{n\in\mathbb{Z}: n>0\}$$
Euclidean product $$A\times B=\{(a, b): a\in A\ and\ b \in B\}$$
Power of sets $$A^n=\underbrace{A\times A \times\ \ldots \times A}_{n\text{ times}}$$
A vector space over $\mathbb{R}$ is a set $\mathcal{V}$ closed under addition and scalar multiplication with:
A member $\vec{v}\in\mathcal{V}$ is a vector
All the vectors we can reach with addition and scalar multiplication.
The span of a set $S\subseteq\mathcal{V}$ is: $$spanS=\{a_1\vec{v_1}+\ldots+a_k\vec{v_k}: \vec{v_i}\in S\ and\ a_i\in\mathbb{R}\ for\ all\ i\}$$
A set $S\subseteq\mathcal{V}$ is linearly dependent if:
$\lVert\vec{b}-A\vec{x}\lVert_2^2=(\vec{b}-A\vec{x}).(\vec{b}-A\vec{x})=(\vec{b}-A\vec{x})^T(\vec{b}-A\vec{x})=(\vec{b}^T-\vec{x}^TA^T)(\vec{b}-A\vec{x}) \\ = \vec{b}^T\vec{b}-\vec{b}^TA\vec{x}-\vec{x}^TA^T\vec{b}+\vec{x}^TA^TA\vec{x}$
Since $\vec{x}^TA^T\vec{b}$ is a scalar we can transpose it: $\vec{x}^TA^T\vec{b}=(\vec{x}^TA^T\vec{b})^T=\vec{b}^TA\vec{x}$
Which results in: $\lVert\vec{b}-A\vec{x}\lVert_2^2=\lVert A\vec{x}\lVert_2^2-2\vec{b}^TA\vec{x}+\lVert\vec{b}\lVert_2^2$
The same results can be derived using dot product identities:
$\lVert\vec{b}-A\vec{x}\lVert_2^2=(\vec{b}-A\vec{x}).(\vec{b}-A\vec{x}) \\ =\vec{b}.(\vec{b}-A\vec{x})-A\vec{x}.(\vec{b}-A\vec{x}) \\ =\vec{b}.\vec{b}-\vec{b}.A\vec{x}-A\vec{x}.\vec{b}+A\vec{x}.A\vec{x} \\ =\lVert A\vec{x}\lVert_2^2+\lVert \vec{b}\lVert_2^2-2A\vec{x}.b \\ =\lVert A\vec{x}\lVert_2^2+\lVert \vec{b}\lVert_2^2-2\vec{b}^TA\vec{x} $
$Q\vec{x}=\vec{r}$
$E=f(\vec{x})=(A\vec{x}-\vec{a})^T(A\vec{x}-\vec{a})+(B\vec{x}-\vec{b})^T(B\vec{x}-\vec{b}) \\ =(\vec{x}^TA^T-\vec{a}^T)(A\vec{x}-\vec{a}) + (\vec{x}^TB^T-\vec{b}^T)(B\vec{x}-\vec{b}) \\ =\vec{x}^TA^TA\vec{x}-\vec{x}^TA^T\vec{a}-\vec{a}^TA\vec{x}+\vec{a}^T\vec{a}+\vec{x}^TB^TB\vec{x}-\vec{x}^TB^T\vec{b}-\vec{b}^TB\vec{x}+\vec{b}^T\vec{b} \\ =\vec{x}^TA^TA\vec{x}-2\vec{a}^TA\vec{x}+\vec{a}^T\vec{a}+\vec{x}^TB^TB\vec{x}-2\vec{b}^TB\vec{x}+\vec{b}^T\vec{b} $
$\nabla\ f(\vec{x})=(A^TA+(A^TA)^T)\vec{x}-2\vec{a}^TA+(B^TB+(B^TB)^T)\vec{x}-2\vec{b}^TB=0 \\ 2A^TA\vec{x}+2B^TB\vec{x}-2\vec{a}^TA-2\vec{b}^TB=0 \\ 2(A^TA+B^TB)\vec{x}=2(\vec{a}^TA+\vec{b}^TB) \\ (A^TA+B^TB)\vec{x}=(\vec{a}^TA+\vec{b}^TB) $
$Q=A^TA+B^TB \\ \vec{r}=\vec{a}^TA+\vec{b}^TB $
In [2]:
import numpy as np
In [14]:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# np.dot(a, a.T) == np.matmul(a, a.T)
np.dot(a, a.T) == np.dot(a.T, a)
Out[14]:
In [15]:
np.dot(a, a.T)
Out[15]:
In [16]:
np.dot(a.T, a)
Out[16]:
In [18]:
np.matmul(a.T, a)
Out[18]:
In [19]:
np.matmul(a, a.T)
Out[19]:
In [20]:
b = np.array([[1, 2], [3, 4]])
np.dot(b, b.T)
Out[20]:
In [21]:
np.dot(b.T, b)
Out[21]:
$A^T=\begin{pmatrix} \mid & \mid & & \mid \\ \vec{r}_1 & \vec{r}_2 & \ldots & \vec{r}_m \\ \mid & \mid & & \mid\end{pmatrix}= \begin{pmatrix} \_\_\_ & \vec{c}^T_1 & \_\_\_ \\ \_\_\_ & \vec{c}^T_2 & \_\_\_ \\ & \ldots & \\ \_\_\_ & \vec{c}^T_n & \_\_\_ \\ \end{pmatrix}$
$A^TA=\begin{pmatrix} \mid & \mid & & \mid \\ A^T\vec{c}_1 & A^T\vec{c}_2 & \ldots & A^T\vec{c}_n \\ \mid & \mid & & \mid\end{pmatrix}=\begin{pmatrix} \vec{c}^T_1\vec{c}_1 & \vec{c}^T_1\vec{c}_2 & \ldots & \vec{c}^T_1\vec{c}_n \\ \vec{c}^T_2\vec{c}_1 & \vec{c}^T_2\vec{c}_2 & \ldots & \vec{c}^T_2\vec{c}_n \\ \vdots & \vdots & \ddots & \vdots \\ \vec{c}^T_n\vec{c}_1 & \vec{c}^T_n\vec{c}_2 & \ldots & \vec{c}^T_n\vec{c}_n \end{pmatrix}$
$AA^T=\begin{pmatrix} \mid & \mid & & \mid \\ A\vec{r}_1 & A\vec{r}_2 & \ldots & A\vec{r}_m \\ \mid & \mid & & \mid\end{pmatrix}=\begin{pmatrix} \vec{r}^T_1\vec{r}_1 & \vec{r}^T_1\vec{r}_2 & \ldots & \vec{r}^T_1\vec{r}_m \\ \vec{r}^T_2\vec{r}_1 & \vec{r}^T_2\vec{r}_2 & \ldots & \vec{r}^T_2\vec{r}_m \\ \vdots & \vdots & \ddots & \vdots \\ \vec{r}^T_m\vec{r}_1 & \vec{r}^T_m\vec{r}_2 & \ldots & \vec{r}^T_m\vec{r}_m \end{pmatrix}$
$\Lambda(x,\ \lambda)=\lVert A\vec{x}\lVert_2^2-\lambda(\lVert B\vec{x}\lVert _2^2-1) \\ =\vec{x}^TA^TA\vec{x}+\lambda-\lambda\vec{x}^TB^TB\vec{x} $
$\frac{\partial \Lambda}{\partial \vec{x}}=2A^TA\vec{x}-2\lambda B^TB\vec{x}=A^TA\vec{x}-\lambda B^TB\vec{x}=0 \\ \frac{\partial \Lambda}{\partial \vec{\lambda}}=1-\vec{x}^TB^TB\vec{x}=0 $
$\lVert A\vec{x}\lVert_2^2=\vec{x}^TA^TA\vec{x}=\vec{x}^T\lambda B^TB\vec{x} \\ \lVert A\vec{x}\lVert_2 = \sqrt{\lambda\vec{x}^T B^TB\vec{x}}$
$\Lambda(\vec{x}, \lambda)=\vec{a}.\vec{x}-\lambda(\lVert \vec{x}\lVert_2-1)$
$\lVert \vec{x}\lVert_2=(\vec{x}.\vec{x})^\frac{1}{2}$
$\frac{\partial \lVert \vec{x}\lVert_2}{\partial \vec{x}}=\frac{1}{2}(\vec{x}.\vec{x})^{1/2}(2\vec{x})=\frac{\vec{x}}{\lVert \vec{x}\lVert_2}$
$\frac{\partial \Lambda}{\partial \vec{x}}=\vec{a}-\frac{\lambda\vec{x}}{\lVert \vec{x}\lVert_2} \rightarrow \vec{x}=\frac{\vec{a}}{\lambda \lVert \vec{x}\lVert_2}$
$\frac{\partial \Lambda}{\partial \lambda}=\lVert x\lVert_2-1=0 \rightarrow \lVert x\lVert_2=1$
$\vec{x}=\frac{\vec{a}}{\lambda}$
$X=B(B^TB)^{-1}B^T$
$X^2=(B(B^TB)^{-1}B^T)(B(B^TB)^{-1}B^T) \\ =B(B^TB)^{-1}(B^TB)(B^TB)^{-1}B^T$
$X^2=X$
$I^N=I$
In [45]:
from functools import partial, reduce
In [33]:
eye = lambda x: np.eye(x)
In [42]:
for i in range(1, 5):
assert np.all(np.dot(eye(i), eye(i)) == eye(i))
In [57]:
all([
np.all(eye(i) == reduce(np.dot,
[eye(i) for _ in range(n)]))
for i in range(1, 5)
for n in range(2, 10)
])
Out[57]:
In [75]:
A = np.random.randint(1, 100, (3, 3))
assert np.all(np.dot(A, eye(3)) == A)
assert np.all(np.dot(eye(3), A) == A)
$AA=A$
$X=(I-A)$
$X^2=(I-A)(I-A) \\ =I - IA - AI + A^2 \\ =I - 2A + AA $
$X^2=I-2A+A \\ =I-A=X $
$\left(\frac{1}{2}I_{n\times n}-A\right)\left(\alpha I_{n\times n}+\beta A\right)=I \\ \frac{\alpha}{2}I + \frac{\beta}{2}A-\alpha A-\beta AA=I $
$(\frac{\beta}{2}-\alpha-\beta)A + (\frac{\alpha}{2}-1)I=0$
$\frac{\alpha}{2}-1=0 \rightarrow \alpha=2$
$\frac{\beta}{2}-2-\beta=0 \rightarrow \beta=-4$
$X^{-1}=2I -4A$
$\lambda^2\vec{x}-\lambda\vec{x} = \lambda(\lambda-1)\vec{x} = 0$