Keras Tutorial

http://keras.io

This tutorial is a simplified version of the tutorial available at: https://github.com/MLIME/Frameworks/tree/master/Keras

What is Keras?

Keras is a high-level neural networks API, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

This tutorial is divided into three parts:

  1. Keras basics
  2. A Deep Feedforward Network example
  3. A Convolutional Neural Network example

1. Keras basics

Backends

  • Theano or TensorFlow (on CPU or GPU)

Layer types

  • Core layers: Dense, Activation, Dropout, Flatten
  • Convolutional layers: ConvXD, CroppingXD, UpSamplingXD
  • Pooling Layers: MaxPoolingXD, AveragePoolingXD
  • Custom layers can be created

Loss functions

  • categorical_crossentropy
  • sparse_categorical_crossentropy
  • binary_crossentropy
  • mean_squared_error
  • mean_absolute_error
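For intuition, a couple of these losses are easy to write down directly. A minimal NumPy sketch with toy values (not the Keras implementations, which also handle batching and numerical clipping):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred):
    # -sum(y * log(p)); for a one-hot y_true this is -log(p[true class])
    return -np.sum(y_true * np.log(y_pred))

def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([0.0, 0.0, 1.0])  # one-hot target: class 2
y_pred = np.array([0.1, 0.2, 0.7])  # predicted probabilities

print(round(categorical_crossentropy(y_true, y_pred), 4))  # 0.3567, i.e. -log(0.7)
```

The cross-entropy only penalizes the probability assigned to the true class, which is why it pairs naturally with a softmax output layer.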

Optimizers

  • SGD
  • RMSprop
  • Adagrad
  • Adadelta
  • Adam
  • Adamax

Activations

  • softmax
  • elu
  • relu
  • tanh
  • sigmoid
  • hard_sigmoid
  • linear
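A few of these are simple enough to sketch in NumPy (reference implementations for intuition, not the ones Keras actually calls):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()         # output is a probability distribution

x = np.array([-1.0, 0.0, 2.0])
print(relu(x))                     # [0. 0. 2.]
print(round(softmax(x).sum(), 6))  # 1.0
```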

Initializers

  • Zeros
  • RandomNormal
  • RandomUniform
  • TruncatedNormal
  • VarianceScaling
  • Orthogonal
  • Identity
  • lecun_uniform
  • glorot_normal
  • glorot_uniform
  • he_normal
  • he_uniform
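These schemes differ mainly in how they scale the initial weights to the layer's fan-in and fan-out. Two of the scaling rules, sketched in NumPy (formulas as described in the Keras documentation):

```python
import numpy as np

def glorot_uniform_limit(fan_in, fan_out):
    # glorot_uniform draws from U(-limit, limit)
    return np.sqrt(6.0 / (fan_in + fan_out))

def he_normal_stddev(fan_in):
    # he_normal draws from a normal with this stddev; suited to relu layers
    return np.sqrt(2.0 / fan_in)

# For a Dense(128) layer fed 28*28 = 784 inputs, as in the DFN below:
print(round(glorot_uniform_limit(784, 128), 4))  # 0.0811
print(round(he_normal_stddev(784), 4))           # 0.0505
```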

Setup

We import the libraries and load the data.


In [ ]:
import util
import numpy as np
import keras
from keras.utils import np_utils

X_train, y_train, X_test, y_test = util.load_mnist_dataset()
y_train_labels = np.array(util.get_label_names(y_train))

# Convert labels to one-hot for training
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)

# Show a few sample images
examples = np.random.randint(0, X_train.shape[0] - 9, 9)
image_shape = (X_train.shape[2], X_train.shape[3])
util.plot9images(X_train[examples], y_train_labels[examples], image_shape)
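The to_categorical call above turns integer class labels into one-hot rows, which is what categorical_crossentropy expects. A minimal NumPy equivalent:

```python
import numpy as np

def to_one_hot(labels, num_classes):
    # each row gets a single 1 at the position of its class index
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

print(to_one_hot(np.array([0, 3, 1]), 4))
# [[1. 0. 0. 0.]
#  [0. 0. 0. 1.]
#  [0. 1. 0. 0.]]
```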

2. Building DFNs with Keras

Reshaping MNIST data


In [ ]:
# Flatten each image into a vector
X_train = X_train.reshape(X_train.shape[0], np.prod(X_train.shape[1:]))
X_test = X_test.reshape(X_test.shape[0], np.prod(X_test.shape[1:]))

In [ ]:
# Sequential is the API that lets us build a model by incrementally adding layers
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import SGD

DFN = Sequential()
DFN.add(Dense(128, input_shape=(28*28,), activation='relu'))
DFN.add(Dense(128, activation='relu'))
DFN.add(Dense(128, activation='relu'))
DFN.add(Dense(10, activation='softmax'))

# optim = SGD(lr=0.01) - you can construct the optimizer explicitly to set its parameters

DFN.compile(loss='categorical_crossentropy', 
              optimizer='sgd',  # or just pass the name to use default parameters
              metrics=['accuracy'])

DFN.fit(X_train, y_train, batch_size=32, epochs=2,
          validation_split=0.2, 
          verbose=1)

print('\nAccuracy: %.2f' % DFN.evaluate(X_test, y_test, verbose=1)[1])
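Note that validation_split=0.2 holds out the tail of the training arrays (Keras slices before shuffling), so 20% of the samples are never used for gradient updates. A sketch of that split logic:

```python
import numpy as np

X = np.arange(10).reshape(10, 1)  # stand-in for X_train
split = int(len(X) * (1 - 0.2))   # validation_split=0.2 keeps the last 20%
X_fit, X_val = X[:split], X[split:]
print(len(X_fit), len(X_val))     # 8 2
```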

3. Building CNNs with Keras

Reshaping MNIST data


In [ ]:
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

Compiling and fitting the CNN


In [ ]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, MaxPooling2D, Conv2D

CNN = Sequential()
CNN.add(Conv2D(32, (3, 3), padding='same', activation='relu',
               input_shape=(28, 28, 1)))
CNN.add(MaxPooling2D(pool_size=(2, 2)))
CNN.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
CNN.add(MaxPooling2D(pool_size=(2, 2)))
CNN.add(Dropout(0.25))
CNN.add(Flatten())
CNN.add(Dense(256, activation='relu'))
CNN.add(Dropout(0.5))
CNN.add(Dense(10, activation='softmax'))

CNN.compile(loss='categorical_crossentropy',
              optimizer='sgd', 
              metrics=['accuracy'])

CNN.fit(X_train, y_train, batch_size=32, epochs=2,
          validation_split=0.2, 
          verbose=1)

print('\nAccuracy: %.2f' % CNN.evaluate(X_test, y_test, verbose=1)[1])

We compare the results:


In [ ]:
cnn_pred = CNN.predict(X_test, verbose=1)
dfn_pred = DFN.predict(X_test.reshape((X_test.shape[0], np.prod(X_test.shape[1:]))), verbose=1)

cnn_pred = np.argmax(cnn_pred, axis=1)
dfn_pred = np.argmax(dfn_pred, axis=1)
y_pred   = np.argmax(y_test, axis=1)  # true labels, recovered from the one-hot encoding
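With the predictions collapsed to class indices, the accuracy that evaluate reports is just the fraction of matching labels. A toy example:

```python
import numpy as np

y_true = np.array([1, 0, 2, 2, 1])
pred   = np.array([1, 0, 1, 2, 1])  # one of five predictions is wrong
print(np.mean(pred == y_true))      # 0.8
```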

In [ ]:
util.plotconfusion(util.get_label_names(y_pred), util.get_label_names(dfn_pred))

In [ ]:
util.plotconfusion(util.get_label_names(y_pred), util.get_label_names(cnn_pred))

Let's look at some misclassified examples:


In [ ]:
cnn_missed = cnn_pred != y_pred
dfn_missed = dfn_pred != y_pred
cnn_and_dfn_missed = np.logical_and(dfn_missed, cnn_missed)

In [ ]:
util.plot_missed_examples(X_test, y_pred, dfn_missed, dfn_pred)

In [ ]:
util.plot_missed_examples(X_test, y_pred, cnn_missed, cnn_pred)

In [ ]:
util.plot_missed_examples(X_test, y_pred, cnn_and_dfn_missed)
