Neural Networks and Fuzzy Control

Guide 4

Martin Noblía


<span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">Guia4 ejercicio4</span> by <a xmlns:cc="http://creativecommons.org/ns#" href="http://nbviewer.ipython.org/urls/raw.githubusercontent.com/elsuizo/Redes_neuronales_Fuzzy/master/guia1.ipynb?create=1" property="cc:attributionName" rel="cc:attributionURL">Martin Noblía</a> is distributed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Exercise 4

The aim of this exercise is to study overfitting. Suppose we have a data set that we want to fit with a network that has one hidden layer. The data are stored in the file datos_guia4_ej4.mat. Our goal is to determine the optimal number of neurons in the hidden layer.

  • a) Devise a strategy for splitting the data into two sets: {ptrain,ttrain} and {ptest,ttest}.
  • b) Fit Nest neural networks with Nneuro neurons in the hidden layer.
  • c) Plot the mean absolute error of the fits on the training data {ptrain,ttrain} and on the test data {ptest,ttest}.
  • d) Explain the behavior of both errors.
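
Steps a) to c) can be sketched without a neural network library. The example below is only an illustration under stated assumptions: the data are a synthetic noisy sine (not the contents of datos_guia4_ej4.mat), and the polynomial degree stands in for Nneuro as the measure of model capacity. The helper `mean_absolute_error` is a hypothetical name introduced here.

```python
import numpy as np

def mean_absolute_error(targets, outputs):
    """Average of |target - output| over all samples."""
    return float(np.mean(np.abs(np.asarray(targets) - np.asarray(outputs))))

# Synthetic stand-in data (assumption): a noisy sine sampled at 47 points.
rng = np.random.default_rng(0)
p = np.linspace(0.0, 1.0, 47)
t = np.sin(2 * np.pi * p) + rng.normal(scale=0.2, size=p.size)

# a) Hold out a random subset of 5 samples for testing.
idx = rng.permutation(p.size)
test_idx, train_idx = idx[:5], idx[5:]
p_train, t_train = p[train_idx], t[train_idx]
p_test, t_test = p[test_idx], t[test_idx]

# b) Fit models of increasing capacity (polynomial degree replaces Nneuro here).
degrees = [1, 3, 5, 7]
train_mae, test_mae = [], []
for d in degrees:
    coeffs = np.polyfit(p_train, t_train, d)
    # c) Record the mean absolute error on both sets.
    train_mae.append(mean_absolute_error(t_train, np.polyval(coeffs, p_train)))
    test_mae.append(mean_absolute_error(t_test, np.polyval(coeffs, p_test)))

for d, tr, te in zip(degrees, train_mae, test_mae):
    print("degree %d: train MAE %.3f, test MAE %.3f" % (d, tr, te))
```

Plotting `train_mae` and `test_mae` against `degrees` gives the two curves that part c) asks for; with a network, the same loop would run over the number of hidden neurons instead.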


In [1]:
# Imports
import neurolab as nl
import numpy as np
import scipy.io as sio
import matplotlib.pyplot as plt
from sklearn import model_selection as cross_validation  # sklearn.cross_validation was removed in scikit-learn 0.20

# Parameters
%matplotlib inline
plt.rcParams['figure.figsize'] = 8,6  # figure size

a)

To split the data we hold out $10\%$ of the total for testing, because the data set is very small (47 samples). A test fraction between $20\%$ and $30\%$ is the more usual choice.
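
As a rough illustration of what a $10\%$ hold-out means for 47 samples, the sketch below shuffles the sample indices and reserves a fraction of them for testing. The function `split_train_test` is a hypothetical helper written for this note, not a scikit-learn function, and the arrays are placeholders rather than the exercise data. Rounding the test size up is an assumption, chosen so that 47 samples give 5 test samples.

```python
import numpy as np

def split_train_test(p, t, test_fraction=0.10, seed=0):
    """Shuffle the sample indices and hold out a fraction for testing."""
    n_samples = len(p)
    n_test = int(np.ceil(test_fraction * n_samples))  # 10% of 47 rounds up to 5
    idx = np.random.default_rng(seed).permutation(n_samples)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return p[train_idx], p[test_idx], t[train_idx], t[test_idx]

# Placeholder arrays shaped like the exercise data (47 samples, 1 column).
p = np.arange(47, dtype=float).reshape(-1, 1)
t = 2.0 * p
X_train, X_test, y_train, y_test = split_train_test(p, t)
print(X_train.shape, X_test.shape)  # (42, 1) (5, 1)
```

With so few samples the test error rests on only 5 points, so it can change noticeably from one random split to another.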


In [2]:
datos = sio.loadmat('datos_guia4_ej4.mat')
t = datos['t']
p = datos['p']
t = t.reshape(47,1)
p = p.reshape(47,1)

In [3]:
X_train, X_test, y_train, y_test = cross_validation.train_test_split(p, t, test_size=0.10, random_state=0)

In [4]:
neuronas = 50
net = nl.net.newff([[0, 1]], [1, neuronas, 1])

In [5]:
net.trainf = nl.train.train_gdx

In [6]:
net.out_minmax = np.array([[0,1]])

In [7]:
e = net.train(X_train, y_train, show=100, epochs=500, goal=0.01)


Epoch: 100; Error: 5.48841602214;
Epoch: 200; Error: 0.88264224914;
Epoch: 300; Error: 0.618295349848;
Epoch: 400; Error: 0.609319460669;
Epoch: 500; Error: 0.474157845702;
The maximum number of train epochs is reached

In [8]:
# Simulate the network on the test set
out = net.sim(X_test)
out.shape


Out[8]:
(5, 1)

In [9]:
# Compare the network output (green) with the test targets (red)
plt.plot(y_test, 'ro-')

plt.plot(out, 'go-')
plt.show()



In [10]:
plt.plot(e)


Out[10]:
[<matplotlib.lines.Line2D at 0x398ec10>]

In [11]:
# Train and simulate networks with different numbers of neurons
neuronas = np.arange(10, 300, 10)
e = np.zeros((500, len(neuronas)))

X_train, X_test, y_train, y_test = cross_validation.train_test_split(p, t, test_size=0.10, random_state=0)

for i,n in enumerate(neuronas):
    # Build, train and simulate a network with n neurons
    net = nl.net.newff([[0, 1]], [1, n, 1])
    net.trainf = nl.train.train_gdm
    net.out_minmax = np.array([[0,1]])
    # Column i stores the training error of this network at every epoch
    e[:,i] = np.asarray(net.train(X_train, y_train, show=500, epochs=500, goal=0.001))
    out = net.sim(X_test)

    # Plots: test targets vs. network output (left), training error (right)
    fig,ax1 = plt.subplots(1,2)
    ax1[0].plot(y_test, 'ro-')
    ax1[0].plot(out, 'go-')
    ax1[0].set_title('Network trained with %s neurons' % n)
    ax1[0].legend(['y_test', 'network output'])
    ax1[1].plot(e[:,i])
    ax1[1].set_title('error')


Epoch: 500; Error: 0.201895669092;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.135622241868;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.178878821683;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.1468501139;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.664530682851;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.168973271317;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.146711476789;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.50351122387;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.50348939671;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.038238549246;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.548189105838;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.0101929658472;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.0147591954818;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112234;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.50350905878;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.50351119298;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 0.447310043409;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 1.2677435841;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.5035112239;
The maximum number of train epochs is reached
Epoch: 500; Error: 5.50351122361;
The maximum number of train epochs is reached

In [12]:
for i,v in enumerate(neuronas):
    plt.plot(v, e[:,i].mean(),'go')
plt.grid()
plt.title('Mean training error for each number of neurons')


Out[12]:
<matplotlib.text.Text at 0x9212750>
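
The per-column mean computed in In [12] can also be written in vectorized form. The sketch below uses a small made-up error matrix (the values are hypothetical, not results from this notebook) with the same layout as `e`: one row per epoch and one column per network size. It also reads the error at the last epoch, which may rank the networks differently from the mean over epochs.

```python
import numpy as np

# Hypothetical error histories: 4 epochs (rows) x 3 network sizes (columns).
neuronas = np.array([10, 20, 30])
e = np.array([
    [4.0, 6.0, 8.0],
    [2.0, 3.0, 8.0],
    [1.0, 1.0, 8.0],
    [1.0, 0.5, 8.0],
])

mean_error = e.mean(axis=0)   # same values as e[:, i].mean() for each column i
final_error = e[-1, :]        # error at the last epoch for each network size

best_by_mean = neuronas[np.argmin(mean_error)]
best_by_final = neuronas[np.argmin(final_error)]
print(mean_error, final_error, best_by_mean, best_by_final)
```

Both quantities are training errors, so a low value here does not by itself indicate good generalization; the comparison that part c) asks for also needs the error on the test set.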
