Curve fitting

Curve fitting is the process of constructing a curve (a function) that best fits a series of data points. Fitted curves can be used as an aid in data visualization, to infer values of a function where no data are available, and to summarize the relationship between variables.

0. Introduction

Consider a polynomial of degree one:

$$y = \beta_1 x + \beta_0.$$

This is a straight line with slope $\beta_1$. We know that there is exactly one line through any two distinct points, so a first-degree polynomial is an exact fit through two points.

If we now consider a second-degree polynomial,

$$y = \beta_2 x^2 + \beta_1 x + \beta_0,$$

it will fit exactly three points. If we increase the degree to that of a third-degree polynomial, we obtain:

$$y = \beta_3 x^3 + \beta_2 x^2 + \beta_1 x + \beta_0,$$

which will fit exactly four points.

Examples

  1. Find the straight line that passes exactly through the points $(0,1)$ and $(1,0)$.
  2. Find the parabola that passes exactly through the points $(-1,1)$, $(0,0)$ and $(1,1)$.

Solution

  1. Consider $y=\beta_1 x + \beta_0$. Evaluating at the point $(0,1)$ gives $\beta_1(0) + \beta_0 = 1$. Evaluating at the point $(1,0)$ gives $\beta_1(1) + \beta_0 = 0$. Thus, $$\left[\begin{array}{cc} 1 & 0 \\ 1 & 1\end{array}\right]\left[\begin{array}{c} \beta_0 \\ \beta_1\end{array}\right]=\left[\begin{array}{c} 1 \\ 0\end{array}\right].$$ Solving, $\beta_0=-\beta_1=1$.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

In [2]:
P1 = [0, 1]
P2 = [1, 0]

X = np.array([[1, 0], [1, 1]])
y = np.array([1, 0])
b0, b1 = np.linalg.inv(X).dot(y)
b0, b1


Out[2]:
(1.0, -1.0)

In [3]:
x = np.linspace(-0.2, 1.2, 100)
y = b1*x+b0

plt.figure(figsize=(8,6))
plt.plot([P1[0], P2[0]], [P1[1], P2[1]], 'r*', label = 'puntos')
plt.plot(x, y, 'b', label = 'recta ajustada')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.legend(loc = 'best')
plt.grid()
plt.show()


  2. Consider $y=\beta_2 x^2 + \beta_1 x + \beta_0$. Evaluating at the point $(-1,1)$ gives $\beta_2(-1)^2 + \beta_1(-1) + \beta_0 = 1$. Evaluating at the point $(0,0)$ gives $\beta_2(0)^2 + \beta_1(0) + \beta_0 = 0$. Finally, evaluating at the point $(1,1)$ gives $\beta_2(1)^2 + \beta_1(1) + \beta_0 = 1$. Thus, $$\left[\begin{array}{ccc} 1 & -1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{array}\right]\left[\begin{array}{c} \beta_0 \\ \beta_1 \\ \beta_2 \end{array}\right]=\left[\begin{array}{c} 1 \\ 0 \\ 1 \end{array}\right].$$ Solving, $\beta_0=\beta_1=0$ and $\beta_2=1$.

In [4]:
P1 = [-1, 1]
P2 = [0, 0]
P3 = [1, 1]

X = np.array([[1, -1, 1], [1, 0, 0], [1, 1, 1]])
y = np.array([1, 0, 1])
b0, b1, b2 = np.linalg.inv(X).dot(y)
b0, b1, b2


Out[4]:
(0.0, 0.0, 1.0)

In [5]:
x = np.linspace(-1.2, 1.2, 100)
y = b2*x**2+b1*x+b0

plt.figure(figsize=(8,6))
plt.plot([P1[0], P2[0], P3[0]], [P1[1], P2[1], P3[1]], 'r*', label = 'puntos')
plt.plot(x, y, 'b', label = 'parábola ajustada')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.legend(loc = 'best')
plt.grid()
plt.show()


What do the problems above have in common?

The curves are completely determined by the points (clean data, exactly as many as needed).

In other words, when the problem is written as a system of linear equations, there is a unique solution: there is nothing to optimize, and no need to.
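A quick illustration of this point (with made-up points, not taken from the examples above): for $n$ points with distinct $x$ values, the system for a polynomial of degree $n-1$ is square and has exactly one solution.

In [ ]:
import numpy as np

# Hypothetical example: 4 points determine a cubic polynomial exactly
xp = np.array([0.0, 1.0, 2.0, 3.0])
yp = np.array([1.0, 3.0, 2.0, 5.0])

# Square Vandermonde system: one equation per point, one unknown per coefficient
V = np.vander(xp, 4, increasing=True)    # columns: 1, x, x^2, x^3
coeffs = np.linalg.solve(V, yp)          # unique solution, nothing to optimize
print(coeffs)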

Will we get data this 'nice' in real life?

In reality, the data we encounter in our professional lives looks more like this...


In [37]:
x = np.linspace(0, 1, 30)
y = 10*x + 2 + np.random.randn(30)

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


How do we fit a curve to this?

1. Basic problem

Suppose we have a set of $n$ ordered pairs of data $(x_i,y_i)$, for $i=1,2,3,\dots,n$.

Which straight line best fits these data?

We consider fits of the form $\hat{f}(x) = \beta_0+\beta_1 x = \left[1 \quad x\right]\left[\begin{array}{c} \beta_0 \\ \beta_1 \end{array}\right]=\left[1 \quad x\right]\boldsymbol{\beta}$ (straight lines).

To say 'best', we have to define a sense in which one line fits better than another.

Least squares: the goal is to choose the coefficients $\boldsymbol{\beta}=\left[\beta_0 \quad \beta_1 \right]^T$ so that the function evaluated at the points $x_i$, that is $\hat{f}(x_i)$, approximates the corresponding values $y_i$.

The least-squares formulation finds the $\boldsymbol{\beta}=\left[\beta_0 \quad \beta_1 \right]^T$ that minimizes $$\sum_{i=1}^{n}(y_i-\hat{f}(x_i))^2=\sum_{i=1}^{n}(y_i-\left[1 \quad x_i\right]\boldsymbol{\beta})^2=\left|\left|\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right|\right|^2,$$

where $\boldsymbol{y}=\left[y_1\quad\dots\quad y_n\right]^T$ and $\boldsymbol{X}=\left[\begin{array}{cc}1 & x_1\\ \vdots & \vdots \\ 1 & x_n\end{array}\right].$ That is,

$$\boldsymbol{\beta}^{ls} = \arg \min_{\boldsymbol{\beta}} \left|\left|\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right|\right|^2$$
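As an aside (this is not part of the notebook's Pyomo-based workflow), this particular problem also has a well-known closed-form solution given by the normal equations $\boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\beta}=\boldsymbol{X}^T\boldsymbol{y}$. A minimal NumPy sketch:

In [ ]:
import numpy as np

# Noisy data around the line y = 10x + 2, as in the example above
x = np.linspace(0, 1, 30)
y = 10*x + 2 + np.random.randn(30)

# Design matrix with a column of ones and a column of x values
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution of ||y - X beta||^2
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_ls)    # approximately [2, 10]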

To carry out the minimization above, the pyomo_utilities.py module has been updated with the function curve_polyfit(x, y, order, reg_mode = None, robust = False); a rough sketch of what such a helper might look like internally is shown after the list below. This function receives:

  • x: independent-variable data
  • y: dependent-variable data
  • order: order of the polynomial to be fitted
  • reg_mode: regularization mode (used together with a regularization coefficient reg_coef, as in the calls further below)
  • robust: whether or not to use robust fitting.
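The actual implementation of pyomo_utilities.curve_polyfit is not listed in this notebook. Purely as an illustration (a minimal sketch, not the author's code), an unregularized, non-robust version of such a helper could be built with Pyomo roughly like this:

In [ ]:
import numpy as np
from pyomo.environ import ConcreteModel, Var, Objective, SolverFactory, minimize

def curve_polyfit_sketch(x, y, order):
    # Sketch: least-squares polynomial fit of a given order with Pyomo + Ipopt
    model = ConcreteModel()
    model.beta = Var(range(order + 1))    # coefficients beta_0, ..., beta_order
    model.obj = Objective(
        expr=sum((y[i] - sum(model.beta[j]*x[i]**j for j in range(order + 1)))**2
                 for i in range(len(x))),
        sense=minimize)
    SolverFactory('ipopt').solve(model)
    return np.array([model.beta[j].value for j in range(order + 1)])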

For the data above we wanted to fit a straight line (order = 1)...


In [2]:
import pyomo_utilities

In [39]:
beta = pyomo_utilities.curve_polyfit(x, y, 1)


# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 2
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  Message: Ipopt 3.12.8: Optimal Solution Found
  Termination condition: optimal
  Id: 0
  Error rc: 0
  Time: 0.08418607711791992
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

In [40]:
yhat = beta[0]+beta[1]*x

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat, '-r', label = 'ajuste')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()



In [41]:
beta


Out[41]:
array([  1.90607363,  10.22873982])

Note that the slope is approximately $10$ and the intercept is approximately $2$.

The same idea can be extended to polynomial fitting...

2. Polynomial fitting

Now, consider the following data set...


In [42]:
n = 100
x = np.linspace(np.pi/3, 5*np.pi/3, n)
y = 4*np.sin(x) + 0.5*np.random.randn(n)

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


Will a straight line fit well?


In [43]:
beta1 = pyomo_utilities.curve_polyfit(x, y, 1)
yhat1 = beta1[0]+beta1[1]*x

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat1, '-r', label = 'ajuste 1')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 2
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  Message: Ipopt 3.12.8: Optimal Solution Found
  Termination condition: optimal
  Id: 0
  Error rc: 0
  Time: 0.1688530445098877
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

Let's look at $\beta$ for the straight-line fit


In [44]:
beta1


Out[44]:
array([ 7.5613924 , -2.40593379])

I don't think so. But what about a parabola?


In [45]:
beta2 = pyomo_utilities.curve_polyfit(x, y, 2)
yhat2 = beta2[0]+beta2[1]*x+beta2[2]*x**2

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat1, '-r', label = 'ajuste 1')
plt.plot(x, yhat2, '-g', label = 'ajuste 2')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 3
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  Message: Ipopt 3.12.8: Optimal Solution Found
  Termination condition: optimal
  Id: 0
  Error rc: 0
  Time: 0.07372069358825684
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

Let's look at $\beta$ for the parabola fit


In [46]:
beta2


Out[46]:
array([ 7.06103439, -2.03067975, -0.05972353])

Not that one either. Maybe a cubic polynomial...


In [47]:
beta3 = pyomo_utilities.curve_polyfit(x, y, 3)
yhat3 = beta3[0]+beta3[1]*x+beta3[2]*x**2+beta3[3]*x**3

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat1, '-r', label = 'ajuste 1')
plt.plot(x, yhat2, '-g', label = 'ajuste 2')
plt.plot(x, yhat3, '-k', label = 'ajuste 3')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 4
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  Message: Ipopt 3.12.8: Optimal Solution Found
  Termination condition: optimal
  Id: 0
  Error rc: 0
  Time: 0.07972860336303711
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

Let's look at $\beta$ for the cubic fit


In [48]:
beta3


Out[48]:
array([ -3.92858421,  11.07782448,  -4.64835053,   0.48686828])

Much better. So, the higher the order, the better the approximation?

Careful! OVERFITTING...


In [49]:
beta20 = pyomo_utilities.curve_polyfit(x, y, 20)
yhat20 = np.array([x**i for i in range(21)]).T.dot(beta20)

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat1, '-r', label = 'ajuste 1')
plt.plot(x, yhat2, '-g', label = 'ajuste 2')
plt.plot(x, yhat3, '-k', label = 'ajuste 3')
plt.plot(x, yhat20, '-c', label = 'ajuste 20')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


WARNING - Loading a SolverResults object with a warning status into model=unknown; message from solver=Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 21
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: warning
  Message: Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
  Termination condition: maxIterations
  Id: 400
  Error rc: 0
  Time: 47.06028485298157
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

Again, let's look at $\beta$


In [50]:
beta20


Out[50]:
array([ -4.93468972e+02,   1.81288474e+03,  -2.77119281e+03,
         2.27717188e+03,  -1.04857751e+03,   2.28935518e+02,
         1.24838389e+01,  -2.06797776e+01,   4.58586762e+00,
        -6.67443366e-02,  -1.24087628e-01,   4.41802905e-03,
         5.14054019e-03,  -3.77033885e-04,  -8.85796924e-05,
        -1.14858592e-05,  -2.25031570e-06,   4.60166262e-06,
        -1.15488415e-06,   1.16209009e-07,  -4.30313138e-09])

Careful! Look at the size of some of the coefficients. When the coefficients are this large, what happens?

It is useful to look at the error as a function of the polynomial order... model selection


In [51]:
e_ms = []
for i in range(10):
    beta = pyomo_utilities.curve_polyfit(x, y, i+1);
    yhat = np.array([x**j for j in range(i+2)]).T.dot(beta)
    e_ms.append(sum((y - yhat)**2))
    
plt.figure(figsize=(8,5))
plt.plot(np.arange(10)+1, e_ms, 'o')
plt.xlabel('orden')
plt.ylabel('error')
plt.show()


(Solver output repeated for each of the 10 fits in the loop: Ipopt reports "Optimal Solution Found" for every polynomial order from 1 to 10, with 2 to 11 decision variables and solve times under 0.14 s.)

Indeed, it seems that order $3$ is enough.
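As a side check (not part of the original workflow, and assuming x and y still hold the noisy sine data from above), the same degree sweep can also be done with NumPy's built-in polynomial fitting:

In [ ]:
import numpy as np

errors = []
for order in range(1, 11):
    coeffs = np.polyfit(x, y, order)      # coefficients, highest degree first
    yhat = np.polyval(coeffs, x)
    errors.append(np.sum((y - yhat)**2))
print(errors)                             # should flatten out around order 3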

How can we prevent overfitting regardless of the model order?

3. Regularization

We saw that the least-squares solution is: $$\boldsymbol{\beta}^{ls} = \arg \min_{\boldsymbol{\beta}} \left|\left|\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right|\right|^2.$$

However, if we increase the model order there is overfitting, and some of the optimal coefficients $\boldsymbol{\beta}$ become very large. A very large coefficient means that a lot of importance is being given to some feature (which may just be noise... useless for prediction).

Regularization consists of penalizing the magnitude of the coefficients $\boldsymbol{\beta}$ in the optimization problem, so that they do not grow so much.

3.1. Ridge

$$\boldsymbol{\beta}^{ridge} = \arg \min_{\boldsymbol{\beta}} \left|\left|\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right|\right|^2 + \lambda\left|\left|\boldsymbol{\beta}\right|\right|^2$$
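Ridge regression also has a closed-form solution, $\boldsymbol{\beta}^{ridge}=(\boldsymbol{X}^T\boldsymbol{X}+\lambda \boldsymbol{I})^{-1}\boldsymbol{X}^T\boldsymbol{y}$. A minimal NumPy sketch of it (only an illustration; the notebook itself uses pyomo_utilities, and note that this version penalizes the intercept as well, exactly as in the formulation above):

In [ ]:
import numpy as np

def ridge_fit(x, y, order, lam):
    # Polynomial design matrix with columns 1, x, x^2, ..., x^order
    X = np.vander(x, order + 1, increasing=True)
    # Closed-form ridge solution: (X^T X + lam*I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam*np.eye(order + 1), X.T @ y)

# e.g. ridge_fit(x, y, 20, 0.1) for the data above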

In [52]:
beta20_ridge = pyomo_utilities.curve_polyfit(x, y, 20, reg_mode = 'ridge', reg_coef = 0.1)
yhat20_ridge = np.array([x**i for i in range(21)]).T.dot(beta20_ridge)

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat20, '-c', label = 'ajuste 20')
plt.plot(x, yhat20_ridge, '--r', label = 'ajuste 20_ridge')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


WARNING - Loading a SolverResults object with a warning status into model=unknown; message from solver=Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 21
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: warning
  Message: Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
  Termination condition: maxIterations
  Id: 400
  Error rc: 0
  Time: 44.8629846572876
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

In [53]:
beta20_ridge


Out[53]:
array([  1.15283050e+00,   9.73783290e-01,   6.30820326e-01,
         2.39028377e-01,  -1.64995663e-02,  -8.67068901e-03,
         8.71349947e-02,  -1.77323934e-01,  -6.87241957e-02,
         1.36359678e-01,  -3.21264688e-02,  -1.17461256e-02,
         5.12748798e-03,   1.10170418e-04,  -2.54450400e-04,
         2.42661845e-05,  -4.75117266e-06,   3.89368686e-06,
        -9.71334378e-07,   1.00167062e-07,  -3.82114908e-09])

3.2. Lasso

$$\boldsymbol{\beta}^{lasso} = \arg \min_{\boldsymbol{\beta}} \left|\left|\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right|\right|^2 + \lambda\left|\left|\boldsymbol{\beta}\right|\right|_1$$

The 1-norm is simply the sum of the absolute values of the components: $\left|\left|\boldsymbol{\beta}\right|\right|_1=\sum_{j=0}^m\left|\beta_j\right|$.
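Unlike ridge, the lasso has no closed-form solution, so the objective has to be minimized numerically (which is what makes an optimization tool such as Pyomo convenient here). Purely as an illustration of the objective (not what pyomo_utilities does internally), a small-order fit could be sketched with scipy:

In [ ]:
import numpy as np
from scipy.optimize import minimize

def lasso_fit(x, y, order, lam):
    # Minimize ||y - X beta||^2 + lam*||beta||_1 with a derivative-free method
    X = np.vander(x, order + 1, increasing=True)
    objective = lambda beta: np.sum((y - X @ beta)**2) + lam*np.sum(np.abs(beta))
    return minimize(objective, np.zeros(order + 1), method='Powell').x

# Works reasonably for low orders; for order 20 a dedicated solver is preferable.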


In [54]:
beta20_lasso = pyomo_utilities.curve_polyfit(x, y, 20, reg_mode = 'lasso', reg_coef = 0.001)
yhat20_lasso = np.array([x**i for i in range(21)]).T.dot(beta20_lasso)

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat20, '-c', label = 'ajuste 20')
plt.plot(x, yhat20_ridge, '--r', label = 'ajuste 20_ridge')
plt.plot(x, yhat20_lasso, '--k', label = 'ajuste 20_lasso')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


WARNING - Loading a SolverResults object with a warning status into model=unknown; message from solver=Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 21
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: warning
  Message: Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
  Termination condition: maxIterations
  Id: 400
  Error rc: 0
  Time: 38.333542346954346
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

In [55]:
beta20_lasso


Out[55]:
array([  4.74339449e+01,  -1.06689580e+02,   2.04186957e-04,
         2.28022744e+02,  -2.84491378e+02,   1.49246185e+02,
        -2.69738510e+01,  -6.87339214e+00,   3.47746344e+00,
        -8.81874852e-02,  -1.42530121e-01,   4.84261280e-03,
         4.75904665e-03,   2.35323378e-04,  -1.66801836e-04,
        -1.34792667e-05,  -1.89291441e-06,   3.51306280e-06,
        -7.17352334e-07,   5.69759860e-08,  -1.57529980e-09])

4. Robust fitting

Now, let us consider again the straight-line case, with a couple of outliers at the beginning and at the end...


In [3]:
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 30)
y = 10*x + 2 + np.random.randn(30)

y[0] = 16
y[-1] = 0


plt.figure(figsize=(8,6))
plt.plot(x, y, '*b')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()



In [4]:
beta = pyomo_utilities.curve_polyfit(x, y, 1)
yhat = beta[0]+beta[1]*x

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat, '-r', label = 'ajuste')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 2
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  Message: Ipopt 3.12.8: Optimal Solution Found
  Termination condition: optimal
  Id: 0
  Error rc: 0
  Time: 0.10215640068054199
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

In [5]:
beta


Out[5]:
array([ 4.26260884,  5.23797593])

If these apparently atypical points come from a 'bad measurement', we see that the fit we obtain to the remaining points is very poor...

How can we avoid this? The answer is robust fitting.
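'Robust' here means replacing the squared loss by a loss that is less sensitive to outliers, such as the absolute error (least absolute deviations) or the Huber loss. What curve_polyfit does internally with robust=True is not shown in this notebook; purely as an illustration of the idea, a least-absolute-deviations line fit could be sketched as:

In [ ]:
import numpy as np
from scipy.optimize import minimize

def robust_line_fit(x, y):
    # Minimize the sum of absolute residuals instead of squared residuals
    objective = lambda beta: np.sum(np.abs(y - (beta[0] + beta[1]*x)))
    return minimize(objective, np.zeros(2), method='Nelder-Mead').x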


In [6]:
beta = pyomo_utilities.curve_polyfit(x, y, 1, robust=True)
yhat = beta[0]+beta[1]*x

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.plot(x, yhat, '-r', label = 'ajuste')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()


WARNING - Loading a SolverResults object with a warning status into model=unknown; message from solver=Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
# ==========================================================
# = Solver Results                                         =
# ==========================================================
# ----------------------------------------------------------
#   Problem Information
# ----------------------------------------------------------
Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 0
  Number of variables: 2
  Sense: unknown
# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: warning
  Message: Ipopt 3.12.8: Maximum Number of Iterations Exceeded.
  Termination condition: maxIterations
  Id: 400
  Error rc: 0
  Time: 26.388857126235962
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

In [7]:
beta


Out[7]:
array([ 4.02193863,  6.93022593])

Better...

5. Activity

  1. Fit polynomials of degree 1 through 10 to the following data.
  2. Plot the accumulated squared error against the number of terms, and choose a polynomial that fits well and whose degree is not too high.
  3. For the chosen polynomial degree, perform the fit with ridge regularization with a coefficient of 0.001.
  4. Compare the betas.

Open a new notebook, name it Tarea7_ApellidoNombre, and upload it to Moodle in the space provided. If you do not finish this in class, you have until tomorrow at 23:00.


In [30]:
def f(x):
    return np.exp(-x**2/2)/np.sqrt(2*np.pi)

In [31]:
x = np.linspace(-3, 3)
y = f(x)

plt.figure(figsize=(8,6))
plt.plot(x, y, '*b', label = 'datos')
plt.legend(loc = 'best')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.grid()
plt.show()



In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:

Important:

  • Next Tuesday, November 21, there is no class.
  • Make-up session: Wednesday, November 22, from 16:00 to 18:00 (review class; I will confirm the room on Friday).
  • Exam on Friday, November 24.
  • Project due on Tuesday, November 28.
  • I did not have time to review the projects. I will send feedback to your institutional email by tomorrow.
Created with Jupyter by Esteban Jiménez Rodríguez.