NMT-Keras tutorial

3. Decoding with a trained Neural Machine Translation Model

In this tutorial, we'll load a trained Neural Machine Translation (NMT) model from disk and use it to translate new text. In this case, we want to translate the 'test' split of our dataset.

This tutorial assumes that you have followed the two previous tutorials.

As before, let's import the required modules and load the dataset instance.


In [2]:
from config import load_parameters
from data_engine.prepare_data import keep_n_captions
from keras_wrapper.cnn_model import loadModel
from keras_wrapper.dataset import loadDataset
params = load_parameters()
dataset = loadDataset('datasets/Dataset_tutorial_dataset.pkl')


[30/11/2016 17:16:59] <<< Loading Dataset instance from datasets/Dataset_tutorial_dataset.pkl ... >>>
[30/11/2016 17:16:59] <<< Dataset instance loaded >>>

Since we want to translate a new data split ('test'), we must add it to the dataset instance, just as we did in the first tutorial. If we also have the references of the test split and want to evaluate on it, we can add them to the dataset as well. Note that this is not mandatory: we could just predict without evaluating.


In [3]:
dataset.setInput('examples/EuTrans/DATA/test.es',
            'test',
            type='text',
            id='source_text',
            pad_on_batch=True,
            tokenization='tokenize_none',
            fill='end',
            max_text_len=30,
            min_occ=0)

dataset.setInput(None,
            'test',
            type='ghost',
            id='state_below',
            required=False)


[30/11/2016 17:17:04] Loaded "test" set inputs of type "text" with id "source_text" and length 2996.
[30/11/2016 17:17:04] Loaded "test" set inputs of type "ghost" with id "state_below" and length 2996.

Now, let's load the translation model. Suppose we want to load the model saved at the end of epoch 4:


In [4]:
params['INPUT_VOCABULARY_SIZE'] = dataset.vocabulary_len[params['INPUTS_IDS_DATASET'][0]]
params['OUTPUT_VOCABULARY_SIZE'] = dataset.vocabulary_len[params['OUTPUTS_IDS_DATASET'][0]]

# Load model
nmt_model = loadModel('trained_models/tutorial_model', 4)


[30/11/2016 17:17:07] <<< Loading model from trained_models/tutorial_model/epoch_4_Model_Wrapper.pkl ... >>>
[30/11/2016 17:17:10] <<< Model loaded in 2.7996 seconds. >>>
[30/11/2016 17:17:10] Preparing optimizer and compiling.

Once the model is loaded, we just have to invoke the sampling method (in this case, the Beam Search algorithm) on the 'test' split:


In [5]:
params_prediction = {'max_batch_size': 50,
                     'n_parallel_loaders': 8,
                     'predict_on_sets': ['test'],
                     'beam_size': 12,
                     'maxlen': 50,
                     'model_inputs': ['source_text', 'state_below'],
                     'model_outputs': ['target_text'],
                     'dataset_inputs': ['source_text', 'state_below'],
                     'dataset_outputs': ['target_text'],
                     'normalize': True,
                     'alpha_factor': 0.6
                     }
predictions = nmt_model.predictBeamSearchNet(dataset, params_prediction)['test']


[30/11/2016 17:17:21] <<< Predicting outputs of test set >>>
Sampling 2996/2996  -  ETA: 0s Total cost of the translations: 159.292059 	 Average cost of the translations: 0.053168
The sampling took: 682.227991 secs (Speed: 0.227713 sec/sample)
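
To get an intuition for what 'beam_size', 'maxlen' and a length-normalization factor do, here is a minimal, self-contained sketch of beam search over a toy model. It is not the keras_wrapper implementation: the function names, the toy probability table, the choice of token 0 as end-of-sequence and the GNMT-style length penalty controlled by `alpha` are all assumptions made for illustration.

```python
import math


def beam_search(next_log_probs, beam_size, maxlen, eos, alpha=0.0):
    """Return (tokens, score) for the best hypothesis found.

    next_log_probs(prefix) must return a dict {token_id: log_prob}
    describing the next-token distribution given the prefix.
    """
    def length_penalty(length):
        # GNMT-style penalty (an assumption); alpha=0 disables it.
        return ((5.0 + length) / 6.0) ** alpha

    beams = [([], 0.0)]  # live hypotheses: (tokens, summed log-prob)
    finished = []
    for _ in range(maxlen):
        # Expand every live hypothesis with every possible next token.
        candidates = []
        for tokens, logp in beams:
            for tok, tok_logp in next_log_probs(tokens).items():
                candidates.append((tokens + [tok], logp + tok_logp))
        # Keep only the beam_size best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, logp in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, logp / length_penalty(len(tokens))))
            else:
                beams.append((tokens, logp))
        if not beams:
            break
    # Hypotheses that never produced eos within maxlen are scored as they are.
    finished.extend((t, s / length_penalty(len(t))) for t, s in beams)
    return max(finished, key=lambda c: c[1])


# Hypothetical toy model: token 0 is end-of-sequence.
TOY_TABLE = {
    (): {1: 0.6, 2: 0.4},
    (1,): {0: 0.55, 1: 0.45},
    (2,): {0: 0.9, 1: 0.1},
}


def toy_next_log_probs(prefix):
    probs = TOY_TABLE.get(tuple(prefix), {0: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy_tokens, _ = beam_search(toy_next_log_probs, beam_size=1, maxlen=5, eos=0)
beam_tokens, beam_score = beam_search(toy_next_log_probs, beam_size=2, maxlen=5, eos=0)
```

With a beam of size 1 the search commits to the locally best first token, while a beam of size 2 keeps the second-best first token alive and ends up with a more probable complete hypothesis. Larger beams explore more alternatives at a higher computational cost, which is why sampling with 'beam_size': 12 takes noticeably longer than greedy decoding would.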

At this point, the variable 'predictions' contains the word indices of the hypotheses. We must decode them into words. To do this, we'll use the dictionary stored in the dataset object:


In [6]:
from keras_wrapper.utils import decode_predictions_beam_search
vocab = dataset.vocabulary['target_text']['idx2words']
predictions = decode_predictions_beam_search(predictions,
                                             vocab,
                                             verbose=params['VERBOSE'])


[30/11/2016 17:28:57] Decoding beam search prediction ...
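
Conceptually, this decoding step is a lookup in an index-to-word dictionary. The sketch below illustrates the idea on a toy vocabulary. It is not the code of decode_predictions_beam_search: the function name, the use of index 0 as end-of-sequence marker and the '<unk>' fallback are assumptions.

```python
def decode_indices(hypotheses, idx2word, eos_idx=0):
    """Map each list of word indices to a space-joined sentence."""
    sentences = []
    for hyp in hypotheses:
        words = []
        for idx in hyp:
            if idx == eos_idx:  # assumed end-of-sequence marker: stop here
                break
            # Indices missing from the vocabulary fall back to '<unk>'.
            words.append(idx2word.get(idx, '<unk>'))
        sentences.append(' '.join(words))
    return sentences


# Hypothetical toy vocabulary, standing in for the 'idx2words' dictionary.
toy_vocab = {0: '<pad>', 1: '<unk>', 2: 'the', 3: 'room', 4: 'is', 5: 'free'}
decoded = decode_indices([[2, 3, 4, 5, 0, 0], [4, 2, 3, 5, 0]], toy_vocab)
```

Everything after the first end-of-sequence index is discarded, so padded hypotheses of different lengths decode to clean sentences.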

Finally, we store the system hypotheses:


In [9]:
from keras_wrapper.extra.read_write import list2file
filepath = nmt_model.model_path + '/' + 'test' + '_sampling.pred'  # results file
list2file(filepath, predictions)
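
The resulting file is expected to contain one hypothesis per line, which is the usual format for machine translation outputs. As a rough plain-Python equivalent (a sketch assuming that one-sentence-per-line layout; `write_hypotheses` and `read_hypotheses` are hypothetical helpers, not part of keras_wrapper):

```python
import os
import tempfile


def write_hypotheses(path, sentences):
    """Write one hypothesis per line, UTF-8 encoded."""
    with open(path, 'w', encoding='utf-8') as f:
        for sentence in sentences:
            f.write(sentence + '\n')


def read_hypotheses(path):
    """Read the file back into a list of sentences."""
    with open(path, encoding='utf-8') as f:
        return [line.rstrip('\n') for line in f]


# Round trip in a temporary directory, so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'test_sampling.pred')
    write_hypotheses(demo_path, ['the room is free', 'is the room free'])
    restored = read_hypotheses(demo_path)
```

Building the path with `os.path.join` avoids hard-coding the '/' separator.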

If we have the references of this split, we can also evaluate the performance of our system on it. First, we must add them to the dataset object:


In [10]:
# In case we had the references of this split, we could also load the split and evaluate on it
dataset.setOutput('examples/EuTrans/DATA/test.en',
             'test',
             type='text',
             id='target_text',
             pad_on_batch=True,
             tokenization='tokenize_none',
             sample_weights=True,
             max_text_len=30,
             max_words=0)
keep_n_captions(dataset, repeat=1, n=1, set_names=['test'])


[30/11/2016 17:29:42] Loaded "test" set outputs of type "text" with id "target_text" and length 2996.
[30/11/2016 17:29:42] Keeping 1 captions per input on the test set.
[30/11/2016 17:29:43] Samples reduced to 2996 in test set.

Next, we call the evaluation system, the COCO package. Although it is mainly used for multimodal captioning, we can also use it for machine translation:


In [19]:
from keras_wrapper.extra.evaluation import select
metric = 'coco'
# Set up the extra variables required by the evaluation function
extra_vars = dict()
extra_vars['tokenize_f'] = dataset.tokenize_none
extra_vars['language'] = params['TRG_LAN']
extra_vars['test'] = dict()
extra_vars['test']['references'] = dataset.extra_variables['test']['target_text']
# Compute the metrics for the decoded hypotheses of the 'test' split
metrics = select[metric](pred_list=predictions,
                         verbose=1,
                         extra_vars=extra_vars,
                         split='test')


[30/11/2016 17:33:48] Computing coco scores on the test split...
[30/11/2016 17:33:48] Bleu_1: 0.991317697799
[30/11/2016 17:33:48] Bleu_2: 0.987540341905
[30/11/2016 17:33:48] Bleu_3: 0.983835279345
[30/11/2016 17:33:48] Bleu_4: 0.980146481838
[30/11/2016 17:33:48] CIDEr: 9.70615051823
[30/11/2016 17:33:48] ROUGE_L: 0.990315105909
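
To make the Bleu_n scores above more concrete, the following sketch computes a corpus-level BLEU from scratch, with a single reference per sentence, whitespace tokenization and no smoothing. It is a simplified illustration rather than the COCO package's scorer, so its values may differ slightly from the ones reported in the log; we assume that Bleu_4 corresponds to `max_n=4` here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus-level BLEU with one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


perfect = corpus_bleu(['the room is free today'], ['the room is free today'])
partial = corpus_bleu(['the room is free now'], ['the room is free today'])
```

A perfect match scores 1.0, and changing a single word out of five already lowers the score substantially, because that word affects every n-gram order. Scores as high as the ones obtained above therefore indicate that most hypotheses match their references almost exactly.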