In [0]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Hyperparameter tuning, or hyperparameter search, is the process of choosing the parameters of a deep learning model that are not learned during training, in order to obtain the best possible performance for a given architecture. It is often described as a black box, or as an art. There are quite a few tools that do a decent job of tuning hyperparameters, but in the author's experience few are as straightforward and robust as Keras-Tuner.
This notebook shows how the hyperparameters can be set manually and how they can be tuned with Keras-Tuner. But first, here is a quick look at a few of the available tools:
Hyperopt
: a popular Python library for optimizing over all sorts of complex search spaces (including real values such as the learning rate, or discrete values such as the number of layers).

Hyperas, kopt or Talos
: libraries for optimizing the hyperparameters of Keras models (the first two are based on Hyperopt).

Scikit-Optimize (skopt)
: a general-purpose optimization library. The BayesSearchCV class performs Bayesian optimization using an interface similar to GridSearchCV.

Spearmint
: a Bayesian optimization library.

Sklearn-Deap
: a hyperparameter optimization library based on evolutionary algorithms, also with a GridSearchCV-like interface.

keras-tuner
: a tuning library that supports Bayesian optimization as well as random search, known as "Hypertuning for humans".
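To make the idea behind these tools concrete, here is a minimal sketch of random search in plain Python. It is not how any of the libraries above are implemented; `random_search`, `toy_objective` and the small search space are hypothetical names invented for illustration, and the objective is a stand-in for the validation accuracy that a real tuner would obtain by training a model.

```python
import random


def random_search(objective, space, max_trials, seed=0):
    """Sample `max_trials` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        # Draw one value per hyperparameter, independently and uniformly.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search space: each hyperparameter maps to its candidate values.
space = {
    "learning_rate": [0.001, 0.0005, 0.0001],
    "num_layers": [1, 2, 3],
}


def toy_objective(config):
    # Stand-in for validation accuracy (assumption: higher is better).
    # A real objective would build, train and evaluate a model.
    lr_penalty = abs(config["learning_rate"] - 0.0005) * 1000
    depth_penalty = abs(config["num_layers"] - 2)
    return -(lr_penalty + depth_penalty)


best_config, best_score = random_search(toy_objective, space, max_trials=20)
print(best_config, best_score)
```

Bayesian optimizers differ mainly in how the next configuration is chosen: instead of sampling blindly, they use the scores observed so far to decide where to look next.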
In [0]:
import tensorflow as tf
assert tf.__version__.startswith('2')
print(f'{tf.__version__}')
In [0]:
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV
from tensorflow.keras.regularizers import l2
from tensorflow.keras.layers import Dense, Dropout, Conv2D, Flatten, Activation
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
In [0]:
mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
In [0]:
X_train = tf.cast(np.reshape(X_train, (X_train.shape[0], X_train.shape[1], X_train.shape[2], 1)), tf.float64)
X_test = tf.cast(np.reshape(X_test, (X_test.shape[0], X_test.shape[1], X_test.shape[2], 1)), tf.float64)
In [0]:
y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)
In [0]:
model = tf.keras.models.Sequential()
model.add(Conv2D(32, (3,3), activation='relu', kernel_initializer='he_uniform', input_shape=(28,28,1)))
model.add(Conv2D(64, (3,3), activation='relu', kernel_initializer='he_uniform'))
model.add(Flatten())
model.add(Dense(20))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(0.001), metrics=['accuracy'])
model.summary()
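The parameter counts reported by `model.summary()` can be checked by hand. The sketch below recomputes them for the model above, assuming the Keras defaults for `Conv2D` (stride 1 and `'valid'` padding, so each 3x3 convolution shrinks the image by 2 pixels per side). The helper names are hypothetical and exist only for this illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One weight per kernel position, input channel and filter, plus one bias per filter.
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    # One weight per input-unit pair, plus one bias per unit.
    return inputs * units + units


def valid_conv_output(size, kernel):
    # Output side length for stride 1 and 'valid' padding.
    return size - kernel + 1


side = 28
conv1 = conv2d_params(3, 3, 1, 32)    # 320
side = valid_conv_output(side, 3)     # 26
conv2 = conv2d_params(3, 3, 32, 64)   # 18496
side = valid_conv_output(side, 3)     # 24
flat = side * side * 64               # 36864 values after Flatten
dense1 = dense_params(flat, 20)       # 737300
dense2 = dense_params(20, 10)         # 210

# Flatten and Dropout have no trainable parameters.
total = conv1 + conv2 + dense1 + dense2
print(total)  # 756326
```

Almost all of the parameters sit in the first dense layer, which is why the width of that layer and the number of filters feeding it are natural candidates for tuning.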
In [0]:
model.fit(X_train, y_train, epochs=5, batch_size=128)
Although this works, choosing hyperparameters by hand relies on a mix of luck and expertise. The rest of this notebook uses Keras-Tuner to automate the search.
NOTE: Do not install the PyPI version of keras-tuner. Follow the steps in the cell below to install it from source instead.
In [0]:
# use pip install keras-tuner once https://github.com/keras-team/keras-tuner/issues/71 is fixed in the pip package
!pip install -q git+https://github.com/keras-team/keras-tuner
In [0]:
import kerastuner
from kerastuner.tuners import RandomSearch
In [0]:
# Step 1: Wrap model in a function
def model_fn(hp):
    # Step 2: Define the hyper-parameters
    LR = hp.Choice('learning_rate', [0.001, 0.0005, 0.0001])
    # step=0.1 is an assumption: the original passed 5 as the step, which is
    # larger than the whole 0.0-0.5 range.
    DROPOUT_RATE = hp.Float('dropout_rate', 0.0, 0.5, step=0.1)
    NUM_DIMS = hp.Int('num_dims', 8, 32, 8)
    NUM_LAYERS = hp.Int('num_layers', 1, 3)
    L2_NUM_FILTERS = hp.Int('l2_num_filters', 8, 64, 8)
    L1_NUM_FILTERS = hp.Int('l1_num_filters', 8, 64, 8)
    # Step 3: Replace static values with hyper-parameters
    model = tf.keras.models.Sequential()
    model.add(Conv2D(L1_NUM_FILTERS, (3,3), activation='relu', kernel_initializer='he_uniform', input_shape=(28,28,1)))
    model.add(Conv2D(L2_NUM_FILTERS, (3,3), activation='relu', kernel_initializer='he_uniform'))
    model.add(Flatten())
    for _ in range(NUM_LAYERS):
        model.add(Dense(NUM_DIMS))
        model.add(Dropout(DROPOUT_RATE))
    model.add(Dense(10, activation='softmax'))
    # Use the tuned learning rate instead of a hard-coded 0.001
    model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(LR), metrics=['accuracy'])
    return model
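It helps to know how large the search space defined in `model_fn` is before choosing a trial budget. The sketch below counts the combinations of the discrete hyperparameters only; `dropout_rate` is left out because it is a float range. It assumes that the bounds of `hp.Int` are inclusive, and `int_range_size` is a hypothetical helper written for this illustration.

```python
from math import prod


def int_range_size(min_value, max_value, step=1):
    # Number of values in min_value, min_value + step, ..., up to max_value (inclusive).
    return (max_value - min_value) // step + 1


sizes = {
    "learning_rate": 3,                           # three explicit choices
    "num_dims": int_range_size(8, 32, 8),         # 8, 16, 24, 32
    "num_layers": int_range_size(1, 3),           # 1, 2, 3
    "l1_num_filters": int_range_size(8, 64, 8),   # 8, 16, ..., 64
    "l2_num_filters": int_range_size(8, 64, 8),   # 8, 16, ..., 64
}

total_combinations = prod(sizes.values())
print(sizes)
print(total_combinations)  # 2304
```

With thousands of combinations even before dropout is considered, an exhaustive grid search would be expensive, which is why a sampling strategy such as random search is a reasonable starting point.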
In [0]:
tuner = RandomSearch(
    model_fn,
    objective='val_accuracy',
    max_trials=5,
    executions_per_trial=3,
    directory='temp_dir')
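The cost of a search follows directly from these settings: every trial is one hyperparameter configuration, and each configuration is trained `executions_per_trial` times to average out the noise from random initialization. A rough budget, using the values from this notebook (`epochs=5` is the value passed to the search call), can be worked out as follows. This is simple arithmetic, not a keras-tuner API.

```python
max_trials = 5             # distinct hyperparameter configurations
executions_per_trial = 3   # repeated trainings per configuration
epochs = 5                 # epochs per training run in this notebook

training_runs = max_trials * executions_per_trial
total_epochs = training_runs * epochs

print(training_runs)  # 15
print(total_epochs)   # 75
```

Raising `max_trials` explores more of the search space, while raising `executions_per_trial` makes each score more reliable; both multiply the total training time.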
In [0]:
tuner.search_space_summary()
In [0]:
tuner.search(X_train, y_train, epochs=5, validation_data=(X_test, y_test))
In [0]:
models = tuner.get_best_models(num_models=3)
In [0]:
tuner.results_summary()