Regression

Fitting a function $f$ to a dataset consisting of input-output pairs $x, y$. Fitting means that $y$ is approximately equal to $f(x)$, written as $y \approx f(x)$ for each $x,y$ pair.

One way of measuring the quality of the approximation is to define the residual $$ e = y - f(x) $$ and measure the magnitude of $e$.

Parametric Regression a.k.a. parametric function fitting

Fitting a parametric function $f(x; \theta)$ with parameter $\theta$ to a dataset consisting of input-output pairs $x, y$.

Linear Regression

Fitting a parametric function $f(x; \theta)$ with parameter $\theta$ that is linear in the parameters to a dataset consisting of input-output pairs $x, y$.
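As a minimal sketch of these definitions (on assumed toy data, not part of the notebook): the model $f(x; \theta) = \theta_0 + \theta_1 \sin(x)$ is nonlinear in $x$ yet linear in the parameters $\theta$, and the residuals $e = y - f(x; \theta)$ measure the quality of the fit.

import numpy as np

x = np.linspace(0, 6, 20)                            # toy inputs (assumed)
y = 1.0 + 2.0*np.sin(x) + 0.1*np.random.randn(20)    # toy outputs (assumed)

def f(x, theta0, theta1):
    # f(x; theta) = theta0 + theta1*sin(x): linear in (theta0, theta1)
    return theta0 + theta1*np.sin(x)

e = y - f(x, 1.0, 2.0)                               # residuals
print('residual norm:', np.linalg.norm(e))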


In [1]:
# http://matplotlib.org/1.5.1/examples/pylab_examples/anscombe.html

from __future__ import print_function
"""
Edward Tufte uses this example from Anscombe to show 4 datasets of x
and y that have the same mean, standard deviation, and regression
line, but which are qualitatively different.
"""

import matplotlib.pyplot as plt
import numpy as np

x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])
y3 = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8])
y4 = np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])


def fit(x):
    return 3 + 0.5*x

plt.figure(figsize=(16,4))

xfit = np.array([np.amin(x), np.amax(x)])

plt.subplot(141)
plt.plot(x, y1, 'ks', xfit, fit(xfit), 'r-', lw=2)
plt.axis([2, 20, 2, 14])
plt.setp(plt.gca(), xticklabels=[], yticks=(4, 8, 12), xticks=(0, 10, 20))
plt.text(3, 12, 'I', fontsize=20)

plt.subplot(142)
plt.plot(x, y2, 'ks', xfit, fit(xfit), 'r-', lw=2)
plt.axis([2, 20, 2, 14])
plt.setp(plt.gca(), xticklabels=[], yticks=(4, 8, 12), yticklabels=[], xticks=(0, 10, 20))
plt.text(3, 12, 'II', fontsize=20)

plt.subplot(143)
plt.plot(x, y3, 'ks', xfit, fit(xfit), 'r-', lw=2)
plt.axis([2, 20, 2, 14])
plt.text(3, 12, 'III', fontsize=20)
plt.setp(plt.gca(), yticks=(4, 8, 12), xticks=(0, 10, 20))

plt.subplot(144)

xfit = np.array([np.amin(x4), np.amax(x4)])
plt.plot(x4, y4, 'ks', xfit, fit(xfit), 'r-', lw=2)
plt.axis([2, 20, 2, 14])
plt.setp(plt.gca(), yticklabels=[], yticks=(4, 8, 12), xticks=(0, 10, 20))
plt.text(3, 12, 'IV', fontsize=20)

# verify the stats
#pairs = (x, y1), (x, y2), (x, y3), (x4, y4)
#for x, y in pairs:
#    print('mean=%1.2f, std=%1.2f, r=%1.2f' % (np.mean(y), np.std(y), np.corrcoef(x, y)[0][1]))

plt.show()
print(r'http://matplotlib.org/1.5.1/examples/pylab_examples/anscombe.html')


http://matplotlib.org/1.5.1/examples/pylab_examples/anscombe.html

The regression problem is about finding a function $f(x; w)$ with parameters $w$, given example input vectors $x_i$ and corresponding outputs $y_i$ for $i=1,\dots, N$ such that

$$ y_i \approx f(x_i; w) $$

This is done by minimizing a suitably defined function that measures the error

$$ E(w) = \sum_i D(y_i, f(x_i; w)) $$

A popular choice is $$ D(y, f) = \frac{1}{2}(y-f)^2 $$

The collection $\{x_i, y_i\}$ for $i=1\dots N$ is called a data set.

If the mapping $f$ is linear in $w$, the problem is a linear regression problem.

Note that the mapping $f$ itself does not need to be linear in $x$. It is the linearity in $w$ that makes the problem a linear regression problem.

Definition: Linearity:

A function $g$ is linear when $$ g(aw_1 + b w_2) = a g(w_1) + b g(w_2) $$ for all scalar $a,b$ and vectors $w_1$ and $w_2$.

Example 1: Line Fitting

  • We wish to fit a line to data where we are given pairs $$ (y_i, x_i) $$ for $i=1\dots N$ (or $i=0\dots N-1$)

  • Model $$ y_i \approx f(x_i; w_1, w_0) = w_0 + w_1 x_i $$

$x$ : Input

$w_1$: The slope

$w_0$: Intercept

$f_i \equiv f(x_i; w_1, w_0)$

Example 2: Parabola Fitting

  • We wish to fit a parabola to data where we are given pairs $$ (y_i, x_i) $$ for $i=1\dots N$ (or $i=0\dots N-1$)

  • Model $$ y_i \approx f(x_i; w_2, w_1, w_0) = w_0 + w_1 x_i + w_2 x_i^2 $$

$x$ : Input

$w_2$: Coefficient of the quadratic term; its sign determines whether the parabola opens upward or downward

$w_1$: Coefficient of the Linear term

$w_0$: Constant

$f_i \equiv f(x_i; w_2, w_1, w_0)$

A parabola is not a linear function of $x$, but it is linear in the parameters $w_2, w_1, w_0$, as the following check illustrates.
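A quick numerical illustration of the linearity definition above (with assumed toy values): evaluated as a function of its parameter vector $w$, the parabola model satisfies $f(x; aw + bv) = a f(x; w) + b f(x; v)$, while as a function of $x$ it does not.

import numpy as np

def f(x, w):
    # parabola model: f(x; w) = w[0] + w[1]*x + w[2]*x**2
    return w[0] + w[1]*x + w[2]*x**2

wa = np.array([1.0, 2.0, 3.0])      # assumed toy parameter vectors
wb = np.array([-0.5, 0.0, 1.0])
a, b = 2.0, -3.0
x = 1.7                             # an arbitrary input (assumed)

# linear in the parameters:
print(np.isclose(f(x, a*wa + b*wb), a*f(x, wa) + b*f(x, wb)))     # True

# not linear in the input x (for fixed parameters):
x1, x2 = 1.0, 2.0
print(np.isclose(f(a*x1 + b*x2, wa), a*f(x1, wa) + b*f(x2, wa)))  # False in general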


In [1]:
import matplotlib.pyplot as plt
import numpy as np

x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])-2
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])-2
#y = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])-2
#y = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])-2

def plot_fit(w1, w0):
    f = w0 + w1*x

    plt.figure(figsize=(6,6))
    plt.plot(x,y,'sk')
    plt.plot(x,f,'o-r')
    #plt.axis('equal')
    plt.xlim((0,15))
    plt.ylim((0,15))
    for i in range(len(x)):
        plt.plot((x[i],x[i]),(f[i],y[i]),'b')
    plt.show()
    plt.figure(figsize=(6,2))
    plt.bar(x,(f-y)**2/2)
    plt.title(np.sum((f-y)**2/2))
    plt.ylim((0,30))
    plt.xlim((0,15))
    plt.show()
    
plot_fit(0.38,3.99)



In [2]:
%matplotlib inline
from __future__ import print_function
from ipywidgets import interact, interactive, fixed
import ipywidgets as widgets
import matplotlib.pylab as plt
from IPython.display import clear_output, display, HTML


interact(plot_fit, w1=(-2, 2, 0.01), w0=(-5, 5, 0.01));


Example Dataset

Number of registered cars in Turkey, as a function of years.

$i$: Index

$x_i$: Year

$y_i$: Number of Cars (in millions)

In [4]:
%matplotlib inline

import scipy as sc
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pylab as plt

df_arac = pd.read_csv(u'data/arac.csv',sep=';')
df_arac[['Year','Car']]
#df_arac


Out[4]:
Year Car
0 1966 91469
1 1967 112367
2 1968 125375
3 1969 137345
4 1970 137771
5 1971 153676
6 1972 187272
7 1973 240360
8 1974 313160
9 1975 403546
10 1976 488894
11 1977 560424
12 1978 624438
13 1979 688687
14 1980 742252
15 1981 776432
16 1982 811465
17 1983 856350
18 1984 919577
19 1985 983444
20 1986 1087234
21 1987 1193021
22 1988 1310257
23 1989 1434830
24 1990 1649879
25 1991 1864344
26 1992 2181388
27 1993 2619852
28 1994 2861640
29 1995 3058511
30 1996 3274156
31 1997 3570105
32 1998 3838288
33 1999 4072326
34 2000 4422180
35 2001 4534803
36 2002 4600140
37 2003 4700343
38 2004 5400440
39 2005 5772745
40 2006 6140992
41 2007 6472156
42 2008 6796629
43 2009 7093964
44 2010 7544871
45 2011 8113111
46 2012 8648875
47 2013 9283923
48 2014 9857915
49 2015 10589337
50 2016 11317998
51 2017 12035978

Visualizing data


In [5]:
BaseYear = 1995
x = np.matrix(df_arac.Year[0:]).T-BaseYear
y = np.matrix(df_arac.Car[0:]).T/1000000.

plt.plot(x+BaseYear, y, 'o-')
plt.xlabel('Year')
plt.ylabel('Number of Cars (Millions)')

plt.show()


Manual Model Fitting


In [6]:
%matplotlib inline
from __future__ import print_function
from ipywidgets import interact, interactive, fixed
import ipywidgets as widgets
import matplotlib.pylab as plt
from IPython.display import clear_output, display, HTML


w_0 = 0.27150786
w_1 = 0.37332256

BaseYear = 1995
x = np.matrix(df_arac.Year[0:]).T-BaseYear
y = np.matrix(df_arac.Car[0:]).T/1000000.

fig, ax = plt.subplots()

f = w_1*x + w_0
plt.plot(x+BaseYear, y, 'o-')
ln, = plt.plot(x+BaseYear, f, 'r')


plt.xlabel('Years')
plt.ylabel('Number of Cars (Millions)')
ax.set_ylim((-2,12))
plt.close(fig)

def set_line(w_1, w_0):

    f = w_1*x + w_0
    e = y - f

    ln.set_ydata(f)
    ax.set_title('Total Error = {} '.format(np.asscalar(e.T*e/2)))
    display(fig)

set_line(0.32,3)



In [7]:
interact(set_line, w_1=(-2, 2, 0.01), w_0=(-5, 5, 0.01));



In [8]:
w_0 = 0.27150786
w_1 = 0.37332256
w_2 = 0.1

BaseYear = 1995
x = np.array(df_arac.Year[0:]).T-BaseYear
y = np.array(df_arac.Car[0:]).T/1000000.

fig, ax = plt.subplots()

f = w_2*x**2 + w_1*x + w_0
plt.plot(x+BaseYear, y, 'o-')
ln, = plt.plot(x+BaseYear, f, 'r')


plt.xlabel('Years')
plt.ylabel('Number of Cars (Millions)')
ax.set_ylim((-2,12))
plt.close(fig)

def set_line(w_2, w_1, w_0):
    f = w_2*x**2 + w_1*x + w_0
    e = y - f
    ln.set_ydata(f)
    ax.set_title('Total Error = {} '.format(np.sum(e*e/2)))
    display(fig)

set_line(w_2, w_1, w_0)



In [16]:
interact(set_line, w_2=(-0.1,0.1,0.001), w_1=(-2, 2, 0.01), w_0=(-5, 5, 0.01))


Out[16]:
<function __main__.set_line>

Example

  • Fitting the model: estimating $w = [w_0, w_1]$

  • As there is noise, we can't hope to fit our model exactly

  • Define the error for each observation

$$e_i = y_i - f(x_i; w)$$

  • Total Error (sum over all data points)
$$ E(w) = \frac{1}{2} \sum_i (y_i - f(x_i; w))^2 $$
This is half the squared Euclidean norm of the residual vector; the constant $1/2$ is a cosmetic simplification that cancels when differentiating.
  • We can minimize the total error by adjusting $w_0$ and $w_1$; a direct computation of $E(w)$ is sketched below.
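For instance, using the first Anscombe dataset and the regression line $f(x) = 3 + 0.5x$ from the plot above, the total error can be evaluated directly (a minimal sketch):

import numpy as np

x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])

def total_error(w0, w1):
    # E(w) = 1/2 * sum_i (y_i - (w0 + w1*x_i))**2
    return 0.5*np.sum((y - (w0 + w1*x))**2)

print(total_error(3.0, 0.5))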

Visualization of the error surface

A good approach for low-dimensional problems is to visualize the error surface: we evaluate the error exhaustively on a grid of possible settings of the parameter vector $w$.


In [100]:
from itertools import product

BaseYear = 1995
x = np.matrix(df_arac.Year[0:]).T-BaseYear
y = np.matrix(df_arac.Car[0:]).T/1000000.

# Setup the vandermonde matrix
N = len(x)
A = np.hstack((np.ones((N,1)), x))

left = -5
right = 15
bottom = -4
top = 6
step = 0.05
W0 = np.arange(left,right, step)
W1 = np.arange(bottom,top, step)

ErrSurf = np.zeros((len(W1),len(W0)))

for i,j in product(range(len(W1)), range(len(W0))):
    e = y - A*np.matrix([W0[j], W1[i]]).T
    ErrSurf[i,j] = e.T*e/2

plt.figure(figsize=(10,10))
plt.imshow(ErrSurf, interpolation='nearest', 
           vmin=0, vmax=1000,origin='lower',
           extent=(left,right,bottom,top),cmap='jet')
plt.xlabel('w0')
plt.ylabel('w1')
plt.title('Error Surface')
plt.colorbar(orientation='horizontal')
plt.show()


How to fit the model automatically?

Idea: Least Squares

  • Compute the derivative of the total error w.r.t. $w_0$ and $w_1$ and solve the equations
  • Derivation in Vector Notation
\begin{eqnarray} \left( \begin{array}{c} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{array} \right) \approx \left( \begin{array}{cc} 1 & x_0 \\ 1 & x_1 \\ \vdots & \vdots \\ 1 & x_{N-1} \end{array} \right) \left( \begin{array}{c} w_0 \\ w_1 \end{array} \right) \end{eqnarray}

\begin{eqnarray} y \approx A w \end{eqnarray}

$A = A(x)$: Design Matrix

$w$: Model Parameters

$y$: Observations

  • Error vector: $$e = y - Aw$$
\begin{eqnarray} E(w) & = & \frac{1}{2}e^\top e = \frac{1}{2}(y - Aw)^\top (y - Aw)\\ & = & \frac{1}{2}y^\top y - \frac{1}{2} y^\top Aw - \frac{1}{2} w^\top A^\top y + \frac{1}{2} w^\top A^\top Aw \\ & = & \frac{1}{2} y^\top y - y^\top Aw + \frac{1}{2} w^\top A^\top Aw \\ \end{eqnarray}

Gradient

Derivative of the function $E$ with respect to a vector \begin{eqnarray} \frac{d E}{d w } & = & \left(\begin{array}{c} \partial E/\partial w_0 \\ \partial E/\partial w_1 \\ \vdots \\ \partial E/\partial w_{K-1} \end{array}\right) \end{eqnarray}

We derive the gradient of the total error as \begin{eqnarray} \frac{d}{d w }E(w) & = & \frac{d}{d w }\left(\frac{1}{2} y^\top y\right) + \frac{d}{d w }\left(- y^\top Aw\right) + \frac{d}{d w }\left(\frac{1}{2} w^\top A^\top Aw\right) \\ & = & 0 - A^\top y + A^\top A w \\ & = & - A^\top (y - Aw) \\ & = & - A^\top e \\ & \equiv & \nabla E(w) \end{eqnarray}

Two basic results that everyone should know (both are checked numerically in the sketch below)

  • Gradient of an inner product \begin{eqnarray} \frac{d}{d w }(h^\top w) & = & h \end{eqnarray}

  • Gradient of a Quadratic form \begin{eqnarray} \frac{d}{d w }(w^\top K w) & = & (K+K^\top) w \end{eqnarray}
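These identities, and the resulting gradient $\nabla E(w) = -A^\top(y - Aw)$, can be checked with finite differences. A sketch on assumed random data (not part of the notebook; the names A_demo, y_demo, w_demo are illustrative):

import numpy as np

rng = np.random.default_rng(0)
A_demo = rng.normal(size=(20, 2))        # assumed random design matrix
y_demo = rng.normal(size=20)             # assumed random observations
w_demo = rng.normal(size=2)

def E_demo(w):
    e = y_demo - A_demo @ w
    return 0.5 * e @ e

grad_analytic = -A_demo.T @ (y_demo - A_demo @ w_demo)

# central finite differences, one coordinate at a time
eps = 1e-6
grad_fd = np.array([(E_demo(w_demo + eps*np.eye(2)[k]) - E_demo(w_demo - eps*np.eye(2)[k])) / (2*eps)
                    for k in range(2)])

print(np.allclose(grad_analytic, grad_fd, atol=1e-4))   # expected: True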

Least Squares solution: Directly solving the equations

  • Find
\begin{eqnarray} w^* & = & \arg\min_{w} E(w) \end{eqnarray}
  • Optimality Condition (from basic Calculus, the gradient should vanish)
\begin{eqnarray} \nabla E(w^*) & = & 0 \end{eqnarray}
  • Set the derivative to zero and solve for $w^*$
\begin{eqnarray} 0 & = & - A^\top y + A^\top A w^* \\ A^\top y & = & A^\top A w^* \\ w^* & = & (A^\top A)^{-1} A^\top y \end{eqnarray}
  • Projection interpretation:
\begin{eqnarray} f & = & A w^* = A (A^\top A)^{-1} A^\top y = P y \end{eqnarray}

$P$ is the orthogonal projection matrix onto the space spanned by the columns of $A$ \begin{eqnarray} P & = & A (A^\top A)^{-1} A^\top \end{eqnarray}

To understand the last equation, consider the case when $A = a$ is a column vector. The projector is \begin{eqnarray} P & = & \frac{a}{\|a\|} \frac{a^\top}{\|a\|} \end{eqnarray} Hence the approximation is given by \begin{eqnarray} f & = & a w^* = \frac{a}{\|a\|} \left( \frac{a^\top}{\|a\|} y \right) \end{eqnarray}

This expression shows that the vector $y$ is projected onto the unit vector pointing in the direction of $a$.
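A quick numerical check of the normal equations and the projection interpretation, on assumed random data (a sketch, not part of the notebook; A_demo and y_demo are illustrative names):

import numpy as np

rng = np.random.default_rng(1)
A_demo = rng.normal(size=(30, 2))                       # assumed design matrix
y_demo = rng.normal(size=30)                            # assumed observations

w_star = np.linalg.solve(A_demo.T @ A_demo, A_demo.T @ y_demo)   # normal equations
P = A_demo @ np.linalg.inv(A_demo.T @ A_demo) @ A_demo.T         # projector onto col(A)

print(np.allclose(P @ P, P))                            # idempotent, expected: True
print(np.allclose(P @ y_demo, A_demo @ w_star))         # f = P y = A w*, expected: True
print(np.allclose(A_demo.T @ (y_demo - A_demo @ w_star), 0))  # residual orthogonal to col(A)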


In [101]:
# Solving the least squares problem (equivalent to solving the normal equations)

# Setup the Design matrix
N = len(x)
A = np.hstack((np.ones((N,1)), x))

#plt.imshow(A, interpolation='nearest')
# Solve the least squares problem
w_ls,E,rank,sigma = np.linalg.lstsq(A, y)

print('Parameters: \nw0 = ', w_ls[0],'\nw1 = ', w_ls[1] )
print('Error:', E/2)

f = np.asscalar(w_ls[1])*x + np.asscalar(w_ls[0])
plt.plot(x+BaseYear, y, 'o-')
plt.plot(x+BaseYear, f, 'r')


plt.xlabel('Years')
plt.ylabel('Number of Cars (Millions)')
plt.show()


Parameters: 
w0 =  [[ 3.94060309]] 
w1 =  [[ 0.194523]]
Error: [[ 26.15676891]]

Fitting a polynomial

Often, fitting a line between inputs and outputs is not enough. A good candidate is fitting a higher order polynomial.

Parabola

\begin{eqnarray} \left( \begin{array}{c} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{array} \right) \approx \left( \begin{array}{ccc} 1 & x_0 & x_0^2 \\ 1 & x_1 & x_1^2 \\ \vdots \\ 1 & x_{N-1} & x_{N-1}^2 \end{array} \right) \left( \begin{array}{c} w_0 \\ w_1 \\ w_2 \end{array} \right) \end{eqnarray}

Polynomial of order $K$

\begin{eqnarray} \left( \begin{array}{c} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{array} \right) \approx \left( \begin{array}{ccccc} 1 & x_0 & x_0^2 & \dots & x_0^K \\ 1 & x_1 & x_1^2 & \dots & x_1^K\\ \vdots \\ 1 & x_{N-1} & x_{N-1}^2 & \dots & x_{N-1}^K \end{array} \right) \left( \begin{array}{c} w_0 \\ w_1 \\ w_2 \\ \vdots \\ w_K \end{array} \right) \end{eqnarray}

The model is identical

\begin{eqnarray} y \approx A w \end{eqnarray}

$A = A(x)$: Design Matrix

$w$: Model Parameters

$y$: Observations

Design matrices for polynomial fitting are also known as Vandermonde matrices.
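As an aside, NumPy can build such matrices directly with np.vander; a small sketch using the Anscombe x values from above (x_a is an illustrative name to avoid clobbering the notebook's x):

import numpy as np

x_a = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
degree = 2
A_vander = np.vander(x_a, N=degree+1, increasing=True)   # columns: 1, x, x**2
print(A_vander.shape)                                    # (11, 3)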


In [106]:
# Setup the vandermonde matrix
N = len(x)
degree = 9
#A = np.hstack((np.power(x,0), np.power(x,1), np.power(x,2)))
A = np.hstack([np.power(x,i) for i in range(degree+1)])
xx = np.matrix(np.linspace(np.asscalar(x[0])-10,np.asscalar(x[-1])+10,30)).T
A2 = np.hstack([np.power(xx,i) for i in range(degree+1)])

#plt.imshow(A, interpolation='nearest')
# Solve the least squares problem
w_ls,E,rank,sigma = np.linalg.lstsq(A, y)

f = A2*w_ls
plt.plot(x+BaseYear, y, 'o-')
plt.plot(xx+BaseYear, f, 'r')

plt.xlabel('Years')
plt.ylabel('Number of Cars')

plt.gca().set_ylim((-2,12))
plt.gca().set_xlim((1950,2025))

plt.show()



In [17]:
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
N = len(x)
x = x.reshape((N,1))
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]).reshape((N,1))
#y = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]).reshape((N,1))
#y = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]).reshape((N,1))

def fit_and_plot_poly(degree):

    #A = np.hstack((np.power(x,0), np.power(x,1), np.power(x,2)))
    # Set up the Vandermonde matrix: columns 1, x, ..., x**degree
    A = np.hstack([np.power(x,i) for i in range(degree+1)])
    xx = np.matrix(np.linspace(np.asscalar(min(x))-1,np.asscalar(max(x))+1,300)).T
    A2 = np.hstack([np.power(xx,i) for i in range(degree+1)])

    #plt.imshow(A, interpolation='nearest')
    # Solve the least squares problem
    w_ls,E,rank,sigma = np.linalg.lstsq(A, y)
    f = A2*w_ls
    plt.plot(x, y, 'o')
    plt.plot(xx, f, 'r')

    plt.xlabel('x')
    plt.ylabel('y')

    plt.gca().set_ylim((0,20))
    #plt.gca().set_xlim((1950,2025))
    plt.title(E)
    plt.show()

fit_and_plot_poly(9)



In [18]:
interact(fit_and_plot_poly, degree=(0,12))


Out[18]:
<function __main__.fit_and_plot_poly>

Beyond polynomial fitting

We can fit functions other than polynomials. For example, if the data are known to be nearly periodic with period $T$, one possibility is fitting a sinusoid (with a constant term):

\begin{eqnarray} y_k & \approx & w_0 + w_1 \cos(2\pi x_k /T ) + w_2 \sin(2\pi x_k /T ) \end{eqnarray}

With $\omega = 2\pi/T$, the design matrix then has the entries

\begin{eqnarray} \left( \begin{array}{c} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{array} \right) \approx \left( \begin{array}{ccc} 1 & \cos(\omega x_0) & \sin(\omega x_0) \\ 1 & \cos(\omega x_1) & \sin(\omega x_1) \\ \vdots \\ 1 & \cos(\omega x_{N-1}) & \sin(\omega x_{N-1}) \end{array} \right) \left( \begin{array}{c} w_0 \\ w_1 \\ w_2 \end{array} \right) \end{eqnarray}
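A sketch of this fit on synthetic near-periodic data (assumed, not from the notebook): build the three-column design matrix and solve with least squares.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
T = 12.0                                  # assumed known period
x = np.arange(48)
omega = 2*np.pi/T
y = 1.5 + 2.0*np.cos(omega*x) - 1.0*np.sin(omega*x) + 0.3*rng.normal(size=x.size)

# design matrix with columns [1, cos(omega x), sin(omega x)]
A = np.column_stack((np.ones(x.size), np.cos(omega*x), np.sin(omega*x)))
w_ls, *_ = np.linalg.lstsq(A, y, rcond=None)
print(w_ls)                               # close to the generating coefficients

plt.plot(x, y, 'o', x, A @ w_ls, 'r-')
plt.show()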

Autoregressive (AR) Models

Idea: Use past values of a time series as inputs. \begin{eqnarray} y_k & = & w_1 y_{k-1} + w_2 y_{k-2} + w_3 y_{k-3} + \dots + w_L y_{k-L} + \epsilon_k \end{eqnarray}

The design matrix consists of past values of the process. Such a design matrix is also known as a Toeplitz matrix. \begin{eqnarray} \left( \begin{array}{c} y_L \\ \vdots \\ y_k \\ y_{k+1} \\ \vdots \\ y_{N} \end{array} \right) \approx \left( \begin{array}{ccccc} y_{L-1} & y_{L-2} & y_{L-3} & \dots & y_{0}\\ \vdots \\ y_{k-1} & y_{k-2} & y_{k-3} & \dots & y_{k-L} \\ y_{k} & y_{k-1} & y_{k-2} & \dots & y_{k-L+1} \\ \vdots \\ y_{N-1} & y_{N-2} & y_{N-3} & \dots & y_{N-L} \end{array} \right) \left( \begin{array}{c} w_1 \\ w_2 \\ w_3 \\ \vdots \\ w_L \end{array} \right) \end{eqnarray}
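As a self-contained sketch (synthetic data and assumed coefficients, unrelated to the car dataset): generate an AR(2) series, build the lagged (Toeplitz-structured) design matrix, and recover the coefficients with least squares.

import numpy as np

rng = np.random.default_rng(3)
w_true = np.array([0.6, 0.3])          # assumed AR(2) coefficients (stable)
N = 200
y_syn = np.zeros(N)
y_syn[:2] = rng.normal(size=2)
for k in range(2, N):
    y_syn[k] = w_true[0]*y_syn[k-1] + w_true[1]*y_syn[k-2] + 0.1*rng.normal()

L = 2
# row for index k holds the lagged values (y_{k-1}, y_{k-2})
A_syn = np.column_stack([y_syn[L-i:N-i] for i in range(1, L+1)])
yy_syn = y_syn[L:]
w_hat, *_ = np.linalg.lstsq(A_syn, yy_syn, rcond=None)
print(w_hat)                           # close to w_true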


In [20]:
import scipy as sc
import scipy.linalg as la

# Order
L = 2

BaseYear = 1995
idx0 = 10
x = np.matrix(df_arac.Year[idx0:]).T-BaseYear
y = np.matrix(df_arac.Car[idx0:]).T/1000000.


def AR_model(y,L):
    N = len(y) - L
    A = np.hstack([np.matrix(y[L-i:L-i+N]) for i in range(1,L+1)])
    yy = np.matrix(y[L:(L+N+1)])
    return A, yy
   
def make_AR_model(y, L):
    B = la.toeplitz(y[:-1], y[0:L])
    AA = B[L-1:,:]
    yy = y[L:]
    return AA, yy

A,yy = AR_model(y, L)
#plt.imshow(A, interpolation='nearest')
# Solve the least squares problem
w_ls,E,rank,sigma = np.linalg.lstsq(A, yy)

f = A*w_ls

plt.figure(figsize=(14,8))
plt.plot(x[L:]+BaseYear, yy, 'o-r')
plt.plot(x[L:]+BaseYear, f, '+b')

plt.gca().set_ylim((-2,12))
plt.gca().set_xlim((1965+idx0,2020))

plt.show()



In [22]:
idx = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])-4
sg = np.argsort(idx)
yt = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]).reshape((-1,1))
y = yt[sg]
N = len(y)

def fit_and_plot_ar(degree):

    #A = np.hstack((np.power(x,0), np.power(x,1), np.power(x,2)))
    A,yy = AR_model(y, degree)
    # Setup the vandermonde matrix

    #plt.imshow(A, interpolation='nearest')
    # Solve the least squares problem
    w_ls,E,rank,sigma = np.linalg.lstsq(A, yy)
    f = A*w_ls
    plt.plot(y, 'o')
    
    x = np.arange(degree,N)
    plt.plot(x, f, '+')

    #plt.xlabel('x')
    #plt.ylabel('y')

    plt.gca().set_ylim((2,12.5))
    #plt.gca().set_xlim((1950,2025))
    plt.title('RMS = '+str(np.sqrt(E[0]/len(yy))[0]))
    plt.show()
    print(w_ls)

fit_and_plot_ar(2)


[[ 0.45032605]
 [ 0.61769257]]

In [23]:
interact(fit_and_plot_ar, degree=(1,5))


Out[23]:
<function __main__.fit_and_plot_ar>
\begin{eqnarray} \left( \begin{array}{c} y_0 \\ y_1 \\ y_2 \\ y_{3} \\ y_{4} \\ z_{5} \\ z_{6} \\ z_{7} \\ \end{array} \right) \approx \left( \begin{array}{ccccc} z_{-1} & z_{-2} \\ y_{0} & z_{-1} \\ y_{1} & y_{0} \\ y_{2} & y_{1} \\ y_{3} & y_{2} \\ y_{4} & y_{3} \\ z_{5} & y_{4} \\ z_{6} & z_{5} \\ \end{array} \right) \left( \begin{array}{c} w_1 \\ w_2 \end{array} \right) \end{eqnarray}
Each row of this system expresses the AR(2) relation $$ 0 \approx y_k - w_1 y_{k-1} - w_2 y_{k-2} $$

With $v_0 = 1, v_1 = -w_1, v_2 = -w_2$, this is a convolution of the filter $v$ with the series:

$$ 0 \approx \sum_r v_r y_{k-r} = \sum_{r} \sum_{s} v_r y_{s} \, \delta(k-r, s) $$
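A sketch of this convolution reading, with assumed AR(2) coefficients and a noise-free synthetic series (so the filtered output is essentially zero):

import numpy as np

w1, w2 = 0.6, 0.3                        # assumed illustrative coefficients
rng = np.random.default_rng(4)
y_sim = np.zeros(100)
y_sim[:2] = rng.normal(size=2)
for k in range(2, 100):
    y_sim[k] = w1*y_sim[k-1] + w2*y_sim[k-2]     # noise-free AR(2) recursion

v = np.array([1.0, -w1, -w2])
r = np.convolve(v, y_sim, mode='valid')  # r_k = y_k - w1*y_{k-1} - w2*y_{k-2}
print(np.max(np.abs(r)))                 # essentially zero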

In [112]:
%matplotlib inline
#from cvxpy import *
import cvxpy as cvx
import numpy as np
import matplotlib as mpl
import matplotlib.pylab as plt

np.set_printoptions(precision=3)

x = np.matrix('[0,1,2,3,4,5]').T
xx = np.matrix(np.arange(-1,6,0.1)).T
y = np.matrix('[2,4,6,-1,10,12]').T
N = len(x)
degree = 3
#A = np.hstack((np.power(x,0), np.power(x,1), np.power(x,2)))
A = np.hstack([np.power(x,i) for i in range(degree+1)])
A2 = np.hstack([np.power(xx,i) for i in range(degree+1)])


# Construct the problem.
w = cvx.Variable(degree+1)
objective = cvx.Minimize(cvx.norm(A*w - y, 1.1))
#constraints = [0 <= x, x <= 10]
#prob = Problem(objective, constraints)
prob = cvx.Problem(objective)

# The optimal objective is returned by prob.solve().
result = prob.solve()
# The optimal value for x is stored in x.value.
print(w.value)
# The optimal Lagrange multiplier for a constraint
# is stored in constraint.dual_value.
#print(constraints[0].dual_value)

plt.plot(x.T.tolist(), y.T.tolist(), 'o')
plt.plot(xx, A2*w.value, '-')
plt.show()


[[ 2.   ]
 [ 2.08 ]
 [-0.096]
 [ 0.016]]
