In [1]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg as spla
%matplotlib inline
# https://scikit-learn.org/stable/modules/classes.html#module-sklearn.datasets
from sklearn import datasets
import ipywidgets as widgets
import matplotlib as mpl
mpl.rcParams['font.size'] = 14
mpl.rcParams['axes.labelsize'] = 20
mpl.rcParams['xtick.labelsize'] = 14
mpl.rcParams['ytick.labelsize'] = 14
M=8
This algorithm orthogonalizes a set of input vectors, returning an orthonormal set that spans the same column space. For now, we will only consider the case where the input vectors are linearly independent.
Let $A=[\mathbf{a}_1\, \dots\, \mathbf{a}_n]$ be a matrix with linearly independent column vectors $\in\mathbb{R}^m$ and $n \le m$.
We know the following about the orthonormal set:
\begin{align*} \mathbf{q}_i^*\,\mathbf{q}_i & = \|\mathbf{q}_i\|_2^2= 1\\ \mathbf{q}_i^*\,\mathbf{q}_j & = 0, \, \text{ for } i\neq j \end{align*}Then the Gram-Schmidt orthogonalization finds the $\mathbf{q}_i$ and $r_{ij}$ from the following set of equations, subject to the previous constraints: \begin{align*} \mathbf{a}_1 &= r_{11}\,\mathbf{q}_1\\ r_{11} &= \|\mathbf{a}_1\|_2\\ \mathbf{q}_1 &= \dfrac{\mathbf{a}_1}{r_{11}}\\ \mathbf{a}_2 &= r_{12}\,\mathbf{q}_1+r_{22}\,\mathbf{q}_2\\ r_{12} &= \mathbf{q}_1^*\,\mathbf{a}_2\\ r_{22} &= \|\mathbf{a}_2-r_{12}\,\mathbf{q}_1\|_2\\ \mathbf{q}_2 &= \dfrac{\mathbf{a}_2-r_{12}\,\mathbf{q}_1}{r_{22}}\\ \vdots &= \vdots\\ \mathbf{a}_j &= \sum_{i=1}^j r_{ij}\,\mathbf{q}_i\\ r_{ij} &= \mathbf{q}_i^*\,\mathbf{a}_j, \, \text{ for } i<j\\ r_{jj} &= \left\|\mathbf{a}_j-\sum_{i=1}^{j-1} r_{ij}\,\mathbf{q}_i\right\|_2\\ \mathbf{q}_j &= \dfrac{\mathbf{a}_j-\sum_{i=1}^{j-1} r_{ij}\,\mathbf{q}_i}{r_{jj}}\\ \vdots &= \vdots\\ \mathbf{a}_n &= \sum_{i=1}^n r_{in}\,\mathbf{q}_i\\ r_{in} &= \mathbf{q}_i^*\,\mathbf{a}_n, \, \text{ for } i<n\\ r_{nn} &= \left\|\mathbf{a}_n-\sum_{i=1}^{n-1} r_{in}\,\mathbf{q}_i\right\|_2\\ \mathbf{q}_n &= \dfrac{\mathbf{a}_n-\sum_{i=1}^{n-1} r_{in}\,\mathbf{q}_i}{r_{nn}} \end{align*}
Thus, we obtain the QR decomposition as follows:
\begin{equation} A_{m\times n} = Q_{m\times n}R_{n\times n} \end{equation}where $Q$ is the matrix whose columns are the vectors $\mathbf{q}_j$, and $R$ is an upper-triangular matrix containing the coefficients $r_{ij}$.
This is known as the Reduced QR Factorization.
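To make the formulas above concrete, here is a small sketch (not part of the implementation below; it relies on the numpy import from the first cell) that applies the first two Gram-Schmidt steps to the columns of the matrix $A$ used a few cells below, and checks that $\mathbf{q}_1$ and $\mathbf{q}_2$ come out orthonormal:
a1 = np.array([1., 2., 2.])   # first column of the example matrix A used later
a2 = np.array([-4., 3., 2.])  # second column
r11 = np.linalg.norm(a1)      # r_11 = ||a_1||_2
q1 = a1/r11                   # q_1 = a_1/r_11
r12 = np.dot(q1, a2)          # r_12 = q_1* a_2
r22 = np.linalg.norm(a2 - r12*q1)
q2 = (a2 - r12*q1)/r22        # q_2 = (a_2 - r_12 q_1)/r_22
print(np.dot(q1, q1), np.dot(q2, q2), np.dot(q1, q2))  # expect 1, 1 and ~0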
[IMPORTANT] What, then, is a full QR decomposition?
In [2]:
# Inputs:
# A: A set of linearly independent columns
# type_factorization: reduced or full
# type_gram_schmidt: classic or modified
def QR(A, type_factorization='reduced', type_gram_schmidt='classic'):
    A = A.astype('float')
    if type_factorization == 'reduced':
        Q = np.zeros(A.shape)
        R = np.zeros((A.shape[1], A.shape[1]))
    elif type_factorization == 'full':
        Q = np.zeros((A.shape[0], A.shape[0]))
        R = np.zeros(A.shape)
    for j in np.arange(A.shape[1]):
        y = A[:, j]
        for i in np.arange(j):
            if type_gram_schmidt == 'classic':
                # Classic GS: project against the original column a_j
                R[i, j] = np.dot(Q[:, i], A[:, j])
            elif type_gram_schmidt == 'modified':
                # Modified GS: project against the partially orthogonalized vector y
                R[i, j] = np.dot(Q[:, i], y)
            y = y - R[i, j]*Q[:, i]
        R[j, j] = np.linalg.norm(y)
        Q[:, j] = y/R[j, j]
    # The following lines must be completed by you!
    # if type_factorization == 'full':
    #     (1) We need to add 0's to the R matrix so it is of the same shape as the matrix A,
    #         fortunately this was already done!
    #     (2) We need to add orthogonal vectors to Q so it is square,
    #         how do we do this?
    return Q, R
In [3]:
A = np.array([[1,-4],[2,3],[2,2]])
Qa, Ra = QR(A, type_factorization ='reduced', type_gram_schmidt='classic')
print(np.dot(Qa,Ra))
print(Qa)
print(Ra)
This method lets us solve a system of equations. However, there also exists a Full QR Factorization, which produces the following decomposition:
\begin{equation} A_{m\times n} = Q_{m\times m}R_{m\times n} \end{equation}Here $Q$ is a square orthogonal matrix, obtained by appending $m-n$ additional orthonormal columns, and $R$ grows by appending $m-n$ zero rows.
A square matrix $Q$ is orthogonal if $Q^* = Q^{-1}$.
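As a reference point (and one way to check the full factorization you are asked to complete in the QR function above), NumPy's built-in np.linalg.qr supports both modes. This short sketch, using an arbitrary tall matrix B, only compares the resulting shapes and the orthogonality of the square Q:
B = np.random.rand(5, 3)                    # tall matrix: m=5 > n=3
Qr, Rr = np.linalg.qr(B, mode='reduced')    # Qr: 5x3, Rr: 3x3
Qf, Rf = np.linalg.qr(B, mode='complete')   # Qf: 5x5, Rf: 5x3
print(Qr.shape, Rr.shape, Qf.shape, Rf.shape)
print(np.allclose(np.dot(Qf, Rf), B))             # the full QR also reconstructs B
print(np.allclose(np.dot(Qf.T, Qf), np.eye(5)))   # Qf is square and orthogonal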
In [4]:
d = 1e-10
A = np.array([[1,1,1],[d,0,0],[0,d,0],[0,0,d]])
Q1,R1 = QR(A, type_gram_schmidt = 'classic')
Q2,R2 = QR(A, type_gram_schmidt = 'modified')
# What are the Q's?
print('What are the Q\'s?')
print(Q1)
print(Q2)
# Are the Q's truly orthogonal?
print('Are the Q\'s truly orthogonal?')
print(np.dot(Q1.transpose(),Q1)) # Warning: We are just using the transpose since the matrices are real!
print(np.dot(Q2.transpose(),Q2))
# Do we recover A?
print('Do we recover A?')
print(np.dot(Q1,R1)-A)
print(np.dot(Q2,R2)-A)
There are cases where the number of equations is greater than the number of unknowns. Often such systems do not have an exact solution (they are inconsistent), so instead we look for the approximation that best fits the data. The idea is based on orthogonality: the shortest distance from a point to a plane is along the orthogonal direction, so the orthogonal residual is the one with minimum error.
\begin{equation} \mathbf{b} - A\,\overline{\mathbf{x}} = \mathbf{r}, \qquad \mathbf{b} - A\,\overline{\mathbf{x}} \perp \{A\,\mathbf{x}\, |\, \mathbf{x} \in \mathbb{R}^n\} \end{equation}The idea is that $\mathbf{r}$ should be as close to zero as possible. We apply orthogonality to find the vector $\overline{\mathbf{x}}$ that satisfies this condition.
\begin{equation} (A\,\mathbf{x})^*\,(\mathbf{b}-A\,\overline{\mathbf{x}})=0, \hspace{1cm} \text{for all } \mathbf{x} \in \mathbb{R}^n\\ \mathbf{x}^*\, A^*\,(\mathbf{b}-A\,\overline{\mathbf{x}})=0, \hspace{1cm} \text{for all } \mathbf{x} \in \mathbb{R}^n\\ A^*\,(\mathbf{b}-A\,\overline{\mathbf{x}})=\mathbf{0} \\ A^*\,A\,\overline{\mathbf{x}}= A^*\,\mathbf{b} \\ \end{equation}This last equation gives a square $n\times n$ linear system that we can solve for $\overline{\mathbf{x}}$. It is known as the Normal Equations.
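As a quick numeric illustration (the matrix and right-hand side below are just an arbitrary overdetermined example), we can solve the Normal Equations directly and compare against NumPy's least-squares routine; the last print also checks that the residual is orthogonal to the columns of $A$:
A_ex = np.array([[1., 1.], [1., 2.], [1., 3.], [1., 4.]])  # 4 equations, 2 unknowns
b_ex = np.array([1., 2., 2., 4.])
x_ne = np.linalg.solve(np.dot(A_ex.T, A_ex), np.dot(A_ex.T, b_ex))  # A*A x = A*b
x_ls = np.linalg.lstsq(A_ex, b_ex, rcond=None)[0]                   # reference solution
print(x_ne, x_ls)                                  # both solutions should agree
print(np.dot(A_ex.T, b_ex - np.dot(A_ex, x_ne)))   # ~0: residual is orthogonal to range(A)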
In [5]:
def least_squares(A, b):
    Q, R = QR(A, type_gram_schmidt='modified')
    return spla.solve_triangular(R, np.dot(Q.T, b))

def solve_model(M):
    A = M['A']
    b = M['b']
    M['x_bar'] = least_squares(A, b)
    return M

def create_model(data, type_model='linear'):
    if type_model == 'linear':      # f(x) = a0 + a1*x
        A = np.ones((data.shape[0], 2))
        A[:, 1] = data[:, 0]
        b = data[:, 1]
    if type_model == 'parabollic':  # f(x) = a0 + a1*x + a2*x^2
        A = np.ones((data.shape[0], 3))
        A[:, 1] = data[:, 0]
        A[:, 2] = data[:, 0]**2
        b = data[:, 1]
    if type_model == 'exponential': # f(x) = a0*exp(a1*x) -> log(f(x)) = log(a0) + a1*x = A0 + a1*x (it is linear now!)
        A = np.ones((data.shape[0], 2))
        A[:, 1] = data[:, 0]
        b = np.log(data[:, 1])
    M = {'A': A,
         'b': b,
         'type_model': type_model}
    M = solve_model(M)
    return M

def evaluate_model(M, x):
    x_bar = M['x_bar']
    if M['type_model'] == 'linear':
        return x_bar[0] + x_bar[1]*x
    if M['type_model'] == 'parabollic':
        return x_bar[0] + x_bar[1]*x + x_bar[2]*x**2
    if M['type_model'] == 'exponential':
        return np.exp(x_bar[0] + x_bar[1]*x)
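Before generating the synthetic datasets, here is a short sanity check (it assumes the QR and least_squares functions defined above and uses a small made-up overdetermined system) comparing our QR-based solver against np.linalg.lstsq:
A_test = np.array([[1., 0.], [1., 1.], [1., 2.], [1., 3.]])
b_test = np.array([0.1, 0.9, 2.1, 2.9])
print(least_squares(A_test, b_test))                    # QR-based solution
print(np.linalg.lstsq(A_test, b_test, rcond=None)[0])   # NumPy reference; should match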
In [6]:
def generate_data(type_of_data='linear'):
    n = 40
    np.random.seed(0)
    x = np.linspace(0, 10, n)
    y = np.random.rand(n)
    # Add extra sample points in [0,1] so the sampling density is not uniform
    x = np.concatenate((x, x, y), axis=0)
    n = 3*n
    if type_of_data == 'linear':
        y = x + 0.1*np.random.normal(0, 1, n) + 1.5
    elif type_of_data == 'parabollic':
        y = 4*x**2 + 0.1*x*np.random.normal(0, 1, n) + 1.5
    elif type_of_data == 'exponential':
        y = np.exp(x + 0.1*np.random.normal(0, 1, n) + 1.5)
    elif type_of_data == 'sinusoidal':
        y = np.sin(2*np.pi*x/10) + 0.1*np.random.normal(0, 1, n) + 1.5
    elif type_of_data == 'random':
        y = 0.1*np.random.normal(0, 1, n) + 1.5
    elif type_of_data == 'boston house-prices':
        # Note: load_boston was removed in scikit-learn 1.2; this option needs an older version
        x, y = datasets.load_boston(return_X_y=True)
        x = x[:, 5]
    elif type_of_data == 'diabetes':
        x, y = datasets.load_diabetes(return_X_y=True)
        x = x[:, 2]
    data = np.stack((x, y)).T
    return data
In [7]:
def looking_at_data(type_of_data='diabetes'):
    data = generate_data(type_of_data)
    Ml = create_model(data, type_model='linear')
    Mp = create_model(data, type_model='parabollic')
    Me = create_model(data, type_model='exponential')
    xx = np.linspace(np.min(data[:,0])-0.1, np.max(data[:,0])+0.1, 1000)
    yyl = evaluate_model(Ml, xx)
    yyp = evaluate_model(Mp, xx)
    yye = evaluate_model(Me, xx)
    error_l = data[:,1] - evaluate_model(Ml, data[:,0])
    error_p = data[:,1] - evaluate_model(Mp, data[:,0])
    error_e = data[:,1] - evaluate_model(Me, data[:,0])
    plt.figure(figsize=(2*M, M))
    plt.subplot(1, 2, 1)
    plt.plot(xx, yyl, 'k-', linewidth=5, label='linear model')
    plt.plot(xx, yyp, 'y-', linewidth=20, label='parabollic model', alpha=0.4)
    plt.plot(xx, yye, 'g-', linewidth=5, label='exponential model')
    plt.plot(data[:,0], data[:,1], '.b', markersize=20, label='original data', alpha=0.3)
    plt.grid(True)
    plt.xlabel(r'$x$')
    plt.ylabel(r'$y$')
    plt.legend(loc='best')
    #plt.ylim(0,10)
    plt.subplot(1, 2, 2)
    plt.title('What does this histogram tell us?')
    three_errors = np.vstack((error_l, error_p, error_e)).T
    plt.hist(three_errors, bins=20,
             label=['linear', 'parabollic', 'exponential'],
             color=['k', 'y', 'g'], alpha=0.5)
    plt.legend(loc='best')
    plt.grid(True)
    plt.show()
widgets.interact(looking_at_data,type_of_data=['linear','parabollic','exponential','sinusoidal','random','boston house-prices','diabetes'])
Out[7]:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.qr.html
https://scikit-learn.org/stable/modules/classes.html#module-sklearn.datasets
Notebook by ctorres@inf.utfsm.cl and assistants: Laura Bermeo, Alvaro Salinas, Axel Símonsen and Martín Villanueva. DI UTFSM. April 2016. Updated by ctorres@inf.utfsm.cl (DI UTFSM): June 2017, July 2019, August 2019.
In [ ]: