In [1]:
import matchzoo as mz
train_raw = mz.datasets.toy.load_data('train')
dev_raw = mz.datasets.toy.load_data('dev')
test_raw = mz.datasets.toy.load_data('test')
A couple of things are needed by the tuner: a model's parameters (with hyper-spaces), preprocessed training data, and preprocessed evaluation data. Since MatchZoo models have pre-defined hyper-spaces, the tuner can start tuning right away once you have the data ready.
In [2]:
preprocessor = mz.models.DenseBaseline.get_default_preprocessor()
train = preprocessor.fit_transform(train_raw, verbose=0)
dev = preprocessor.transform(dev_raw, verbose=0)
test = preprocessor.transform(test_raw, verbose=0)
In [3]:
model = mz.models.DenseBaseline()
model.params['input_shapes'] = preprocessor.context['input_shapes']
model.params['task'] = mz.tasks.Ranking()
In [4]:
tuner = mz.auto.Tuner(
    params=model.params,
    train_data=train,
    test_data=dev,
    num_runs=5
)
results = tuner.tune()
In [5]:
results['best']
Out[5]:
In [6]:
results['best']['params'].to_frame()
Out[6]:
model.params.hyper_space represents the model's hyper-parameter search space, which is the cross-product of the individual hyper-parameters' hyper-spaces. When a Tuner builds a model, for each hyper-parameter in model.params, if the hyper-parameter has a hyper-space, a sample is taken from that space; otherwise, the hyper-parameter's default value is used.
In [7]:
model.params.hyper_space
Out[7]:
In a DenseBaseline model, only mlp_num_units, mlp_num_layers, and mlp_num_fan_out have pre-defined hyper-spaces. In other words, only these hyper-parameters will change values during tuning. Other hyper-parameters, like mlp_activation_func, are fixed and will not change.
In [8]:
def sample_and_build(params):
    sample = mz.hyper_spaces.sample(params.hyper_space)
    print('if sampled:', sample, '\n')
    params.update(sample)
    print('the built model will have:\n')
    print(params, '\n\n\n')

for _ in range(3):
    sample_and_build(model.params)
This is similar to how a tuner samples model hyper-parameters, but with one key difference: a tuner's sampling is suggestive. It is not truly random but skewed: scores of past samples affect future choices, so a tuner with more runs knows its hyper-space better and takes samples in a way that is more likely to yield good scores.
For more details, consult the tuner's backend, hyperopt, and the search algorithm it uses, the Tree-structured Parzen Estimator (TPE).
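To get a feel for this, here is a small standalone sketch that uses hyperopt's TPE directly, outside of MatchZoo. The toy objective and the search-space bounds below are made up purely for illustration; the point is that fmin with tpe.suggest feeds the scores of past trials back into the choice of the next sample, unlike plain random sampling.
from hyperopt import fmin, tpe, hp, Trials

def toy_objective(args):
    # Made-up score: pretend mlp_num_units around 300 works best; hyperopt minimizes this.
    return abs(args['mlp_num_units'] - 300) / 300

toy_space = {'mlp_num_units': hp.quniform('mlp_num_units', 16, 512, 1)}
trials = Trials()
best = fmin(fn=toy_objective, space=toy_space, algo=tpe.suggest,
            max_evals=30, trials=trials)
print(best)  # later trials cluster around the well-scoring region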
Hyper-spaces can also be represented in a human-readable format.
In [9]:
print(model.params.get('mlp_num_units').hyper_space)
In [10]:
model.params.to_frame()[['Name', 'Hyper-Space']]
Out[10]:
What if I want the tuner to choose optimizer among adam, adagrad, and rmsprop?
In [11]:
model.params.get('optimizer').hyper_space = mz.hyper_spaces.choice(['adam', 'adagrad', 'rmsprop'])
In [12]:
for _ in range(10):
    print(mz.hyper_spaces.sample(model.params.hyper_space))
What about setting mlp_num_layers to a fixed value of 2?
In [13]:
model.params['mlp_num_layers'] = 2
model.params.get('mlp_num_layers').hyper_space = None
In [14]:
for _ in range(10):
    print(mz.hyper_spaces.sample(model.params.hyper_space))
To save the model during the tuning process, use mz.auto.tuner.callbacks.SaveModel.
In [15]:
tuner.num_runs = 2
tuner.callbacks.append(mz.auto.tuner.callbacks.SaveModel())
results = tuner.tune()
This will save all built models to your mz.USER_TUNED_MODELS_DIR, and a saved model can be loaded by:
In [16]:
best_model_id = results['best']['model_id']
mz.load_model(mz.USER_TUNED_MODELS_DIR.joinpath(best_model_id))
Out[16]:
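As a quick sanity check, the reloaded model can be evaluated on the preprocessed test split prepared earlier. This sketch assumes the usual MatchZoo DataPack.unpack() and model.evaluate() calls; the exact metric values will differ from run to run.
loaded_model = mz.load_model(mz.USER_TUNED_MODELS_DIR.joinpath(best_model_id))
test_x, test_y = test.unpack()
print(loaded_model.evaluate(test_x, test_y))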
To load a pre-trained embedding matrix into a built model's embedding layer during the tuning process, use mz.auto.tuner.callbacks.LoadEmbeddingMatrix.
In [17]:
toy_embedding = mz.datasets.toy.load_embedding()
preprocessor = mz.models.DUET.get_default_preprocessor()
train = preprocessor.fit_transform(train_raw, verbose=0)
dev = preprocessor.transform(dev_raw, verbose=0)
params = mz.models.DUET.get_default_params()
params['task'] = mz.tasks.Ranking()
params.update(preprocessor.context)
params['embedding_output_dim'] = toy_embedding.output_dim
In [18]:
embedding_matrix = toy_embedding.build_matrix(preprocessor.context['vocab_unit'].state['term_index'])
load_embedding_matrix_callback = mz.auto.tuner.callbacks.LoadEmbeddingMatrix(embedding_matrix)
In [19]:
tuner = mz.auto.tuner.Tuner(
    params=params,
    train_data=train,
    test_data=dev,
    num_runs=1
)
tuner.callbacks.append(load_embedding_matrix_callback)
results = tuner.tune()
To build your own callbacks, inherit mz.auto.tuner.callbacks.Callback and override the corresponding methods.
A run proceeds in the following way:
- run start (callback)
- build model
- build end (callback)
- fit and evaluate model
- collect result
- run end (callback)
This process is repeated num_runs times in a tuner.
For example, say I want to verify if my embedding matrix is correctly loaded.
In [20]:
import numpy as np
class ValidateEmbedding(mz.auto.tuner.callbacks.Callback):
    def __init__(self, embedding_matrix):
        self._matrix = embedding_matrix

    def on_build_end(self, tuner, model):
        loaded_matrix = model.get_embedding_layer().get_weights()[0]
        if np.isclose(self._matrix, loaded_matrix).all():
            print("Yes! The embedding matrix is correctly loaded!")
In [21]:
validate_embedding_matrix_callback = ValidateEmbedding(embedding_matrix)
In [22]:
tuner = mz.auto.tuner.Tuner(
    params=params,
    train_data=train,
    test_data=dev,
    num_runs=1,
    callbacks=[load_embedding_matrix_callback, validate_embedding_matrix_callback]
)
results = tuner.tune()