In [1]:
import warnings
warnings.filterwarnings('ignore')

1.- Small Circle inside Large Circle

a) Write a function that randomly generates $n$ labeled data points of the form $\{(x_1, y_1), \dots, (x_n, y_n)\}$, $x_i \in \mathbb{R}^2$, $y_i \in \{0, 1\}$, with a probability distribution that reflects the linearly inseparable configuration shown in Fig. 1. Use this function to create 1000 training points and 1000 test points. To measure the models' tendency to overfit, add 5% noise to the dataset, generating x's close to the class boundary. Produce a plot showing the training and test data, identifying each class with a different color.


In [2]:
import numpy as np
from sklearn.utils import check_random_state
from sklearn.model_selection import train_test_split

def do_circles(n=2000, noisy_n=0.05):
    # Generate n points on two concentric circles: outer circle = class 0,
    # inner circle (radius 0.3) = class 1.
    generator = check_random_state(8)
    linspace = np.linspace(0, 2 * np.pi, n // 2 + 1)[:-1]
    outer_circ_x = np.cos(linspace)
    outer_circ_y = np.sin(linspace)
    inner_circ_x = outer_circ_x * .3
    inner_circ_y = outer_circ_y * .3
    X = np.vstack((np.append(outer_circ_x, inner_circ_x),
                   np.append(outer_circ_y, inner_circ_y))).T
    y = np.hstack([np.zeros(n // 2, dtype=np.intp),
                   np.ones(n // 2, dtype=np.intp)])
    # Gaussian jitter (scale noisy_n) blurs the circles and pushes some
    # points toward the class boundary.
    X += generator.normal(scale=noisy_n, size=X.shape)

    # 50/50 split: 1000 training and 1000 test points for n=2000.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)
    return X_train, y_train, X_test, y_test

X_train,Y_train,X_test,Y_test = do_circles()
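
Part a) also asks for a plot of the training and test data with each class in a different color; a minimal sketch using the arrays just created:

In [ ]:
import matplotlib.pyplot as plt

# Training and test data side by side, colored by class.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5), sharex=True, sharey=True)
ax1.scatter(X_train[:, 0], X_train[:, 1], c=Y_train, s=15, cmap='cool')
ax1.set_title('Training data')
ax2.scatter(X_test[:, 0], X_test[:, 1], c=Y_test, s=15, cmap='cool')
ax2.set_title('Test data')
plt.show()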

For the rest of the activity, use the following function to plot the classification boundaries based on the probability, as defined by a given algorithm, that an example belongs to a particular class.


In [3]:
import matplotlib.pyplot as plt


def plot_classifier(clf, X_train, Y_train, X_test, Y_test, model_type):
    fig, axis = plt.subplots(1, 1, sharex='col', sharey='row', figsize=(12, 8))
    # Training points colored by class; test points in gray scale on top.
    axis.scatter(X_train[:, 0], X_train[:, 1], s=30, c=Y_train, zorder=10, cmap='cool')
    axis.scatter(X_test[:, 0], X_test[:, 1], s=20, c=Y_test, zorder=10, cmap='Greys')
    # Evaluate the model on a 200x200 grid covering [-2, 2]^2.
    XX, YY = np.mgrid[-2:2:200j, -2:2:200j]
    if model_type == 'DecisionTree':
        # Probability of class 0 at each grid point.
        Z = clf.predict_proba(np.c_[XX.ravel(), YY.ravel()])[:, 0]
    else:
        # Predicted label (or decision value) at each grid point.
        Z = clf.predict(np.c_[XX.ravel(), YY.ravel()])
    Z = Z.reshape(XX.shape)
    # Color the mesh by the hard decision; overlay the boundary contour.
    Zplot = Z >= 0.5
    axis.pcolormesh(XX, YY, Zplot, cmap='YlGn')
    axis.contour(XX, YY, Z, alpha=1, colors=["k", "k", "k"],
                 linestyles=["--", "-", "--"], levels=[-2, 0, 2])
    plt.show()

In [4]:
from sklearn.tree import DecisionTreeClassifier
Tree = DecisionTreeClassifier(max_leaf_nodes=3, random_state=0)
Tree.fit(X_train, Y_train)
plot_classifier(Tree,X_train,Y_train,X_test,Y_test,"DecisionTree")



In [5]:
from sklearn.linear_model import LogisticRegression
LR = LogisticRegression(C=100, penalty='l2', tol=0.01)
LR.fit(X_train, Y_train)
plot_classifier(LR,X_train,Y_train,X_test,Y_test,"Logistic Regression")



In [6]:
from sklearn.svm import SVC as SVM
model_SVM = SVM(kernel='rbf')
model_SVM.fit(X_train,Y_train)
plot_classifier(model_SVM,X_train,Y_train,X_test,Y_test,"SVM")
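
The statement asks for metrics as well as plots; a minimal sketch that scores the three fitted baseline models (assuming Tree, LR and model_SVM from the cells above are still in scope):

In [ ]:
# Train/test accuracy for the three baseline models on the raw data.
for name, m in [("DecisionTree", Tree), ("LogisticRegression", LR), ("SVM (rbf)", model_SVM)]:
    print("%s: train acc = %.3f, test acc = %.3f"
          % (name, m.score(X_train, Y_train), m.score(X_test, Y_test)))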


b) Demonstrate experimentally that an artificial neural network consisting of a single neuron (i.e., with no hidden layers) cannot satisfactorily solve the problem. You may use whatever activation function and training method you prefer. Be convincing: for example, try modifying the parameters of the learning machine, reporting metrics that allow the model's performance on the problem to be evaluated after each change. Also adapt the plot_classifier function so that it graphically represents the solution found by the network. Describe and explain what you observe, reporting plots of the solution only for a few representative cases.


In [7]:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD,RMSprop,Adam,Adagrad

def train_models():
    # Grid over output activation, optimizer and gradient-clipping settings
    # for a network with a single hidden unit (n_h = 1).
    activ = ['sigmoid', 'relu', 'softmax']
    opti = [SGD, RMSprop, Adam]
    clipn = [0.7, 1.0]
    clipv = [0.7, 1.0]
    n_h = 1
    total = len(activ) * len(opti) * len(clipn) * len(clipv)
    actual = 0
    scores = []
    for a in activ:
        for o in opti:
            for cn in clipn:
                for cv in clipv:
                    actual += 1
                    print("Training %d/%d" % (actual, total))
                    model = Sequential()
                    model.add(Dense(n_h, input_dim=X_train.shape[1], kernel_initializer='uniform', activation='relu'))
                    model.add(Dense(1, kernel_initializer='uniform', activation=a))
                    model.compile(optimizer=o(lr=1, clipnorm=cn, clipvalue=cv), loss="binary_crossentropy", metrics=["accuracy"])

                    model.fit(X_train, Y_train, epochs=50, batch_size=100, verbose=0)
                    train_scores = model.evaluate(X_train, Y_train)  # [loss, accuracy]
                    test_scores = model.evaluate(X_test, Y_test)
                    scores.append((model, train_scores, test_scores))
                    print(model, train_scores, test_scores)
    return scores

scores = train_models()


Using TensorFlow backend.
Training 1/36
1000/1000 [==============================] - 0s 46us/step
1000/1000 [==============================] - 0s 63us/step
<keras.models.Sequential object at 0x7fe3ae3f9668> [0.53329177474975586, 0.70699999999999996] [0.56765610313415527, 0.66300000000000003]
Training 2/36
1000/1000 [==============================] - 0s 66us/step
1000/1000 [==============================] - 0s 51us/step
<keras.models.Sequential object at 0x7fe397c445f8> [0.53357252645492559, 0.70699999999999996] [0.57050743103027346, 0.65600000000000003]
Training 3/36
1000/1000 [==============================] - 0s 70us/step
1000/1000 [==============================] - 0s 44us/step
<keras.models.Sequential object at 0x7fe3958fb518> [0.53917391490936284, 0.69799999999999995] [0.5662380132675171, 0.66800000000000004]
Training 4/36
1000/1000 [==============================] - 0s 66us/step
1000/1000 [==============================] - 0s 41us/step
<keras.models.Sequential object at 0x7fe3955e5e10> [0.56024616289138796, 0.67800000000000005] [0.53560415887832646, 0.69399999999999995]
Training 5/36
1000/1000 [==============================] - 0s 83us/step
1000/1000 [==============================] - 0s 36us/step
<keras.models.Sequential object at 0x7fe394fd6d68> [0.59809335136413577, 0.68500000000000005] [0.62616265010833738, 0.66600000000000004]
Training 6/36
1000/1000 [==============================] - 0s 115us/step
1000/1000 [==============================] - 0s 48us/step
<keras.models.Sequential object at 0x7fe394fa7748> [0.56732887554168698, 0.67700000000000005] [0.56402896499633792, 0.67500000000000004]
Training 7/36
1000/1000 [==============================] - 0s 164us/step
1000/1000 [==============================] - 0s 61us/step
<keras.models.Sequential object at 0x7fe394cc89e8> [0.54051197791099548, 0.69799999999999995] [0.55144263792037962, 0.67300000000000004]
Training 8/36
1000/1000 [==============================] - 0s 103us/step
1000/1000 [==============================] - 0s 47us/step
<keras.models.Sequential object at 0x7fe3953883c8> [0.54970223951339725, 0.69299999999999995] [0.58320280885696407, 0.65900000000000003]
Training 9/36
1000/1000 [==============================] - 0s 102us/step
1000/1000 [==============================] - 0s 39us/step
<keras.models.Sequential object at 0x7fe3945f89e8> [0.69332344055175776, 0.49299999999999999] [0.69305961608886724, 0.50700000000000001]
Training 10/36
1000/1000 [==============================] - 0s 118us/step
1000/1000 [==============================] - 0s 77us/step
<keras.models.Sequential object at 0x7fe3942adc50> [0.543179594039917, 0.69299999999999995] [0.55762294673919677, 0.67000000000000004]
Training 11/36
1000/1000 [==============================] - 0s 127us/step
1000/1000 [==============================] - 0s 40us/step
<keras.models.Sequential object at 0x7fe387f064e0> [0.71778130054473877, 0.50700000000000001] [0.72443349170684812, 0.49299999999999999]
Training 12/36
1000/1000 [==============================] - 0s 136us/step
1000/1000 [==============================] - 0s 48us/step
<keras.models.Sequential object at 0x7fe387b6b780> [0.53452560091018675, 0.69999999999999996] [0.54976681327819821, 0.67800000000000005]
Training 13/36
1000/1000 [==============================] - 0s 385us/step
1000/1000 [==============================] - 0s 87us/step
<keras.models.Sequential object at 0x7fe38774bba8> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 14/36
1000/1000 [==============================] - 0s 229us/step
1000/1000 [==============================] - 0s 65us/step
<keras.models.Sequential object at 0x7fe38739d518> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 15/36
1000/1000 [==============================] - 0s 217us/step
1000/1000 [==============================] - 0s 61us/step
<keras.models.Sequential object at 0x7fe387088470> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 16/36
1000/1000 [==============================] - 0s 198us/step
1000/1000 [==============================] - 0s 251us/step
<keras.models.Sequential object at 0x7fe386e1e780> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 17/36
1000/1000 [==============================] - 0s 256us/step
1000/1000 [==============================] - 0s 62us/step
<keras.models.Sequential object at 0x7fe386a81390> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 18/36
1000/1000 [==============================] - 0s 315us/step
1000/1000 [==============================] - 0s 60us/step
<keras.models.Sequential object at 0x7fe38644fc50> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 19/36
1000/1000 [==============================] - 0s 261us/step
1000/1000 [==============================] - 0s 79us/step
<keras.models.Sequential object at 0x7fe3864039e8> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 20/36
1000/1000 [==============================] - 0s 216us/step
1000/1000 [==============================] - 0s 49us/step
<keras.models.Sequential object at 0x7fe386115080> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 21/36
1000/1000 [==============================] - 0s 334us/step
1000/1000 [==============================] - 0s 51us/step
<keras.models.Sequential object at 0x7fe385d895f8> [5.5323078126715259, 0.20999999999999999] [5.6853757247924808, 0.20799999999999999]
Training 22/36
1000/1000 [==============================] - 1s 540us/step
1000/1000 [==============================] - 0s 50us/step
<keras.models.Sequential object at 0x7fe385abd470> [0.6004863793849945, 0.69099999999999995] [0.63163766050338743, 0.66900000000000004]
Training 23/36
1000/1000 [==============================] - 0s 321us/step
1000/1000 [==============================] - 0s 294us/step
<keras.models.Sequential object at 0x7fe3856e8f60> [4.9580818815231327, 0.182] [5.4044686164855955, 0.16800000000000001]
Training 24/36
1000/1000 [==============================] - 0s 352us/step
1000/1000 [==============================] - 0s 67us/step
<keras.models.Sequential object at 0x7fe38532a9b0> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 25/36
1000/1000 [==============================] - 0s 408us/step
1000/1000 [==============================] - 0s 65us/step
<keras.models.Sequential object at 0x7fe3867c1898> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 26/36
1000/1000 [==============================] - 0s 472us/step
1000/1000 [==============================] - 0s 78us/step
<keras.models.Sequential object at 0x7fe384c31f28> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 27/36
1000/1000 [==============================] - 0s 388us/step
1000/1000 [==============================] - 0s 83us/step
<keras.models.Sequential object at 0x7fe3848eadd8> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 28/36
1000/1000 [==============================] - 0s 306us/step
1000/1000 [==============================] - 0s 61us/step
<keras.models.Sequential object at 0x7fe3845b0eb8> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 29/36
1000/1000 [==============================] - 0s 272us/step
1000/1000 [==============================] - 0s 64us/step
<keras.models.Sequential object at 0x7fe38426d6a0> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 30/36
1000/1000 [==============================] - 0s 286us/step
1000/1000 [==============================] - 0s 66us/step
<keras.models.Sequential object at 0x7fe377f7bf28> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 31/36
1000/1000 [==============================] - 0s 291us/step
1000/1000 [==============================] - 0s 44us/step
<keras.models.Sequential object at 0x7fe377b87ac8> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 32/36
1000/1000 [==============================] - 0s 473us/step
1000/1000 [==============================] - 0s 45us/step
<keras.models.Sequential object at 0x7fe3778b6630> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 33/36
1000/1000 [==============================] - 0s 319us/step
1000/1000 [==============================] - 0s 59us/step
<keras.models.Sequential object at 0x7fe3775f49b0> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 34/36
1000/1000 [==============================] - 0s 329us/step
1000/1000 [==============================] - 0s 50us/step
<keras.models.Sequential object at 0x7fe377226780> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 35/36
1000/1000 [==============================] - 0s 331us/step
1000/1000 [==============================] - 0s 54us/step
<keras.models.Sequential object at 0x7fe376e0f860> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]
Training 36/36
1000/1000 [==============================] - 1s 884us/step
1000/1000 [==============================] - 0s 67us/step
<keras.models.Sequential object at 0x7fe376a0a710> [7.8595956602096555, 0.50700000000000001] [8.0827890396118161, 0.49299999999999999]

In [8]:
def plot_scores(scores, n=3, title="Scores"):
    # Show the n middle-ranked models and the n best models.
    count = len(scores)
    models = []
    for x in scores[int(count/2):int(count/2)+n]: models.append(x)
    for x in scores[-n:]: models.append(x)

    N = len(models)
    ind = np.arange(N)  # the x locations for the groups
    width = 0.45        # the width of the bars

    fig, ax = plt.subplots()

    # Test loss (clipped at 1 for readability) and test accuracy.
    test_loss = list(min([m[2][0], 1]) for m in models)
    test_acc = list(m[2][1] for m in models)

    rects1 = ax.bar(ind, test_loss, width, color='#00b1ff')
    rects1b = ax.bar(ind+width, test_acc, width, color='#0070ff')

    # add some text for labels, title and axes ticks
    ax.set_ylabel('Score')
    ax.set_title(title)
    ax.set_xticks(ind + width/2)
    # Label each group with the output-layer activation.
    ax.set_xticklabels((m[0].get_config()[1]['config']['activation'] for m in models), fontsize='xx-small')

    ax.legend((rects1[0], rects1b[0]), ('Loss (clipped at 1)', 'Accuracy'), loc='upper right')

    def autolabel(rects):
        # Annotate each bar with its height as a percentage.
        for rect in rects:
            height = rect.get_height()
            ax.text(rect.get_x() + rect.get_width()/2., height,
                    '%.1f%%' % (100*height),
                    ha='center', va='bottom', fontsize='x-small')

    autolabel(rects1)
    autolabel(rects1b)
    fig.set_dpi(170)

    plt.show()

    # Decision-boundary plots for the n best models.
    for m in models[-n:]:
        plot_classifier(m[0], X_train, Y_train, X_test, Y_test, "Sequential")

scores = sorted(scores, key=(lambda x: x[2][1]), reverse=False)  # ascending by test accuracy
plot_scores(scores)


The plots show the 3 middle-ranked classifiers and the 3 best ones.

As these plots of the generated models show, every decision boundary found is a linear separation, and, as demonstrated above, this problem cannot be solved with such boundaries.
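
This is expected: a single neuron's output is a function of the scalar projection $w^\top x + b$ alone, so any decision rule based on it splits the plane along a level set

$$\{x \in \mathbb{R}^2 : w^\top x + b = c\},$$

which is a straight line whatever the activation, optimizer, or clipping used. No straight line can separate a circle enclosed by another circle.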

c) Demonstrate experimentally that an artificial neural network with 1 hidden layer can satisfactorily solve the problem obtained in a). You may use whatever architecture and training method you prefer, but for this activity you can safely use the hyper-parameters given as a reference in the example code. Vary the number of neurons Nh in the network between 2 and 32 in powers of 2, plotting the training and test error as a function of Nh. Describe and explain what you observe. Use the plot_classifier function designed earlier to build plots of the solution in a few representative cases.


In [9]:
def train_models_nh():
    # Same setup as before, but varying the hidden-layer width n_h
    # over powers of 2 from 2 to 32.
    activ = ['sigmoid', 'relu']
    opti = [SGD, RMSprop]
    n_h = [2, 4, 8, 16, 32]
    total = len(activ) * len(opti) * len(n_h)
    actual = 0
    scores = []
    for a in activ:
        for o in opti:
            for h in n_h:
                actual += 1
                print("Training %d/%d" % (actual, total))
                model = Sequential()
                model.add(Dense(h, input_dim=X_train.shape[1], kernel_initializer='uniform', activation='relu'))
                model.add(Dense(1, kernel_initializer='uniform', activation=a))
                model.compile(optimizer=o(lr=1), loss="binary_crossentropy", metrics=["accuracy"])

                model.fit(X_train, Y_train, epochs=50, batch_size=100, verbose=0)
                train_scores = model.evaluate(X_train, Y_train)  # [loss, accuracy]
                test_scores = model.evaluate(X_test, Y_test)
                scores.append((model, train_scores, test_scores))
                print(model, train_scores, test_scores)
    return scores

scores_nh = train_models_nh()


Training 1/20
1000/1000 [==============================] - 0s 382us/step
1000/1000 [==============================] - 0s 74us/step
<keras.models.Sequential object at 0x7fe37517eef0> [0.33459080398082736, 0.86599999999999999] [0.31241905522346497, 0.88100000000000001]
Training 2/20
1000/1000 [==============================] - 0s 358us/step
1000/1000 [==============================] - 0s 58us/step
<keras.models.Sequential object at 0x7fe375118ac8> [0.0096573284417390826, 1.0] [0.009352339550852776, 1.0]
Training 3/20
1000/1000 [==============================] - 1s 767us/step
1000/1000 [==============================] - 0s 130us/step
<keras.models.Sequential object at 0x7fe374ea2550> [0.0048589595258235934, 1.0] [0.0054451972618699074, 1.0]
Training 4/20
1000/1000 [==============================] - 1s 753us/step
1000/1000 [==============================] - 0s 95us/step
<keras.models.Sequential object at 0x7fe374c04390> [0.0043087821602821354, 1.0] [0.0049561949297785759, 1.0]
Training 5/20
1000/1000 [==============================] - 0s 389us/step
1000/1000 [==============================] - 0s 83us/step
<keras.models.Sequential object at 0x7fe376ab3630> [0.0046773646511137487, 1.0] [0.0050071214213967324, 1.0]
Training 6/20
1000/1000 [==============================] - 1s 800us/step
1000/1000 [==============================] - 0s 91us/step
<keras.models.Sequential object at 0x7fe3746df710> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 7/20
1000/1000 [==============================] - 1s 613us/step
1000/1000 [==============================] - 0s 110us/step
<keras.models.Sequential object at 0x7fe374729b00> [1.1086857443842746e-07, 1.0] [2.3520203429256979e-05, 1.0]
Training 8/20
1000/1000 [==============================] - 1s 519us/step
1000/1000 [==============================] - 0s 77us/step
<keras.models.Sequential object at 0x7fe3743bee80> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 9/20
1000/1000 [==============================] - 1s 698us/step
1000/1000 [==============================] - 0s 85us/step
<keras.models.Sequential object at 0x7fe3741747b8> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 10/20
1000/1000 [==============================] - 0s 447us/step
1000/1000 [==============================] - 0s 64us/step
<keras.models.Sequential object at 0x7fe373e9dcf8> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 11/20
1000/1000 [==============================] - 1s 568us/step
1000/1000 [==============================] - 0s 64us/step
<keras.models.Sequential object at 0x7fe373bdf6a0> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 12/20
1000/1000 [==============================] - 0s 422us/step
1000/1000 [==============================] - 0s 63us/step
<keras.models.Sequential object at 0x7fe37390d470> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 13/20
1000/1000 [==============================] - 0s 435us/step
1000/1000 [==============================] - 0s 65us/step
<keras.models.Sequential object at 0x7fe3736a1cc0> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 14/20
1000/1000 [==============================] - 0s 466us/step
1000/1000 [==============================] - 0s 69us/step
<keras.models.Sequential object at 0x7fe37347a320> [8.1718744888305661, 0.49299999999999999] [7.946221115112305, 0.50700000000000001]
Training 15/20
1000/1000 [==============================] - 1s 708us/step
1000/1000 [==============================] - 0s 90us/step
<keras.models.Sequential object at 0x7fe373212eb8> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 16/20
1000/1000 [==============================] - 1s 1ms/step
1000/1000 [==============================] - 0s 83us/step
<keras.models.Sequential object at 0x7fe372f761d0> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 17/20
1000/1000 [==============================] - 1s 852us/step
1000/1000 [==============================] - 0s 68us/step
<keras.models.Sequential object at 0x7fe372d23438> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 18/20
1000/1000 [==============================] - 1s 668us/step
1000/1000 [==============================] - 0s 69us/step
<keras.models.Sequential object at 0x7fe372a175c0> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 19/20
1000/1000 [==============================] - 0s 486us/step
1000/1000 [==============================] - 0s 64us/step
<keras.models.Sequential object at 0x7fe3731d0208> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]
Training 20/20
1000/1000 [==============================] - 1s 994us/step
1000/1000 [==============================] - 0s 80us/step
<keras.models.Sequential object at 0x7fe37251f5c0> [7.8595956602096555, 0.0] [8.0827890396118161, 0.0]

In [10]:
scores_nh = sorted(scores_nh, key=(lambda x: x[2][1]), reverse=False)
plot_scores(scores_nh)


In contrast to the previous experiment, with one hidden layer the model classifies the data perfectly. As the plots show, the decision regions are mostly circular and 'enclose' the inner class.
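
The statement also asks for the training and test error as a function of $N_h$; a minimal line-plot sketch built from scores_nh (assuming it still holds the (model, train_scores, test_scores) tuples created above, and that the Keras Dense layers expose a `units` attribute):

In [ ]:
import collections

# Group (train error, test error) = (1 - accuracy) by hidden-layer width.
by_nh = collections.defaultdict(list)
for model, tr, te in scores_nh:
    by_nh[model.layers[0].units].append((1 - tr[1], 1 - te[1]))

nhs = sorted(by_nh)
train_err = [np.mean([e[0] for e in by_nh[h]]) for h in nhs]  # mean over the activation/optimizer grid
test_err = [np.mean([e[1] for e in by_nh[h]]) for h in nhs]

plt.plot(nhs, train_err, 'o-', label='train error')
plt.plot(nhs, test_err, 's--', label='test error')
plt.xticks(nhs)
plt.xlabel('$N_h$ (hidden neurons)')
plt.ylabel('error (1 - accuracy)')
plt.legend()
plt.show()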

d) Demonstrate experimentally that a stump (a 1-level classification tree) cannot satisfactorily solve the previous problem. You may use whatever criterion and split function you prefer. Be convincing: for example, try modifying the machine's parameters, reporting metrics that allow the model's performance on the problem to be evaluated after each change. Also adapt the plot_classifier function so that it graphically represents the solution found by the tree. Describe and explain what you observe, reporting plots of the solution only for a few representative cases.


In [12]:
from sklearn.tree import DecisionTreeClassifier

# A stump is a tree limited to a single split (max_depth=1); try both
# impurity criteria and both splitters.
crit = ['gini', 'entropy']
split = ['best', 'random']

for c in crit:
    for s in split:
        clf = DecisionTreeClassifier(criterion=c, splitter=s, random_state=0, max_depth=1, presort=False)
        clf.fit(X_train, Y_train)
        acc_test = clf.score(X_test, Y_test)
        print("Test Accuracy = %f" % acc_test)
        plot_classifier(clf, X_train, Y_train, X_test, Y_test, 'tree')


Test Accuracy = 0.659000
Test Accuracy = 0.676000
Test Accuracy = 0.659000
Test Accuracy = 0.676000

Again, with only one level the tree makes a single split of the input space, resulting in a linear (axis-aligned) boundary which, as seen previously, cannot separate the data exactly.
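
To make the single-split structure explicit, one can inspect the fitted stump directly through sklearn's tree_ attribute (a sketch, assuming the last clf from the loop above is still in scope):

In [ ]:
# Node 0 is the root; a stump has exactly one internal node.
feature = clf.tree_.feature[0]      # index of the feature split on (0 or 1)
threshold = clf.tree_.threshold[0]  # learned split threshold
print("Stump splits on x[%d] <= %.3f" % (feature, threshold))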

e) Demonstrate experimentally that a multi-level classification tree can satisfactorily solve the problem under study. You may use whatever criterion and split function you prefer, but you can safely use the hyper-parameters given as a reference in the example code. Vary the number of levels Nt allowed in the tree between 2 and 20, plotting the training and test error as a function of Nt. Describe and explain what you observe. Use the plot_classifier function designed earlier to build plots of the solution in a few representative cases.


In [21]:
crit = ['gini', 'entropy']

scores_dt = []

# Trees of depth 2 to 4; depth 4 already separates the classes almost
# perfectly, so deeper trees are not needed here.
for n in range(2, 5):
    for c in crit:
        clf = DecisionTreeClassifier(criterion=c, splitter='best', random_state=0, max_depth=n)
        clf.fit(X_train, Y_train)
        acc_train = clf.score(X_train, Y_train)
        acc_test = clf.score(X_test, Y_test)
        scores_dt.append((clf, acc_train, acc_test))

scores_dt


Out[21]:
[(DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=2,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, presort=False, random_state=0,
              splitter='best'), 0.876, 0.85499999999999998),
 (DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=2,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, presort=False, random_state=0,
              splitter='best'), 0.876, 0.85499999999999998),
 (DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=3,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, presort=False, random_state=0,
              splitter='best'), 0.94199999999999995, 0.91700000000000004),
 (DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=3,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, presort=False, random_state=0,
              splitter='best'), 0.94199999999999995, 0.91700000000000004),
 (DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=4,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, presort=False, random_state=0,
              splitter='best'), 1.0, 0.996),
 (DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=4,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, presort=False, random_state=0,
              splitter='best'), 1.0, 0.996)]

In [23]:
def plot_scores_dt(scores,n=2,title="Scores"):
    count = len(scores)
    models = []
    
    for x in scores[:n]: models.append(x) 
    for x in scores[int(count/2):int(count/2)+n]: models.append(x) 
    for x in scores[-n:]: models.append(x) 
    
    N = len(models)
    ind = np.arange(N)  # the x locations for the groups
    width = 0.45         # the width of the bars

    fig, ax = plt.subplots()

    train_acc = list(m[1] for m in models)
    test_acc = list(m[2] for m in models)

    rects1 = ax.bar(ind, train_acc, width, color='#00b1ff')
    rects1b = ax.bar(ind+width, test_acc, width, color='#0070ff')
    
    ax.set_ylabel('Score')
    ax.set_title(title)
    ax.set_xticks(ind + width/2)
    ax.set_xticklabels((m[0].criterion+" "+str(m[0].max_depth) for m in models),fontsize='xx-small')

    ax.legend((rects1[0], rects1b[0]), ('Train', 'Test'), loc='lower right')

    def autolabel(rects):
        for rect in rects:
            height = rect.get_height()
            ax.text(rect.get_x() + rect.get_width()/2., height,
                    '%.1f%%' % (100*height),
                    ha='center', va='bottom', fontsize='x-small')

    autolabel(rects1)
    autolabel(rects1b)
    fig.set_dpi(170)

    plt.show()
    
    for m in models:
        plot_classifier(m[0],X_train,Y_train,X_test,Y_test,"DecisionTree")
        
scores_dt = sorted(scores_dt, key=(lambda x: x[1]), reverse=False)
plot_scores_dt(scores_dt,n=1)


As the plots show, with only 4 levels the tree can 'enclose' the inner class, solving the problem with an excellent score.
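
The statement asks for the full range $N_t = 2, \dots, 20$; a minimal sketch that sweeps it and plots the train/test error as a function of depth (the cells above stop at depth 4, where accuracy already saturates):

In [ ]:
depths = list(range(2, 21))
tr_err, te_err = [], []
for d in depths:
    t = DecisionTreeClassifier(criterion='gini', random_state=0, max_depth=d)
    t.fit(X_train, Y_train)
    tr_err.append(1 - t.score(X_train, Y_train))
    te_err.append(1 - t.score(X_test, Y_test))

plt.plot(depths, tr_err, 'o-', label='train error')
plt.plot(depths, te_err, 's--', label='test error')
plt.xlabel('$N_t$ (max depth)')
plt.ylabel('error (1 - accuracy)')
plt.legend()
plt.show()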

**(f)** Since this problem has already been shown experimentally to be linearly inseparable, you are now asked to try another alternative: project the data onto a new space (manifold) in which their non-linear patterns become recognizable, so that they can be handled with linear boundaries. Use PCA with the help of a Gaussian kernel to extract the highest-variance directions of the resulting infinite-dimensional feature space.


In [26]:
from sklearn.decomposition import KernelPCA

# Kernel PCA with an RBF (Gaussian) kernel: fit on the training data and
# project both sets onto the top 2 kernel principal components.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=5)
kpca = kpca.fit(X_train)
Xkpca_train = kpca.transform(X_train)
Xkpca_test = kpca.transform(X_test)
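
To see why linear methods should now work, a quick sketch that scatters the projected training points by class (in the kernel-PCA space the two circles should become approximately linearly separable):

In [ ]:
plt.scatter(Xkpca_train[:, 0], Xkpca_train[:, 1], c=Y_train, s=15, cmap='cool')
plt.xlabel('1st kernel principal component')
plt.ylabel('2nd kernel principal component')
plt.show()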

**(g)** Fit a learning algorithm with linear boundaries to the data projected into this new space, which captures their non-linear components, and show graphically that the problem can now be solved with these methods. Report metrics to evaluate the performance, comment, and conclude.


In [27]:
Tree = DecisionTreeClassifier(max_leaf_nodes=3, random_state=0)
Tree.fit(Xkpca_train, Y_train)
plot_classifier(Tree,Xkpca_train,Y_train,Xkpca_test,Y_test,"DecisionTree")



In [28]:
LR = LogisticRegression(C=100, penalty='l2', tol=0.01)
LR.fit(Xkpca_train, Y_train)
plot_classifier(LR,Xkpca_train,Y_train,Xkpca_test,Y_test,"LogisticRegression")



In [29]:
model_SVM = SVM(kernel='linear')
model_SVM.fit(Xkpca_train,Y_train)
plot_classifier(model_SVM,Xkpca_train,Y_train,Xkpca_test,Y_test,"SVM")
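
For the requested metrics, a minimal sketch scoring the three models fitted on the projected data (assuming Tree, LR and model_SVM are the models from the cells above):

In [ ]:
for name, m in [("DecisionTree", Tree), ("LogisticRegression", LR), ("SVM (linear)", model_SVM)]:
    print("%s: train acc = %.3f, test acc = %.3f"
          % (name, m.score(Xkpca_train, Y_train), m.score(Xkpca_test, Y_test)))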



In [ ]: