Encoder-Decoder Analysis

Model Architecture


In [1]:
report_file = '/Users/bking/IdeaProjects/LanguageModelRNN/experiment_results/encdec_noing10_200_512_04dra/encdec_noing10_200_512_04dra.json'
log_file = '/Users/bking/IdeaProjects/LanguageModelRNN/experiment_results/encdec_noing10_200_512_04dra/encdec_noing10_200_512_04dra_logs.json'

import json
import matplotlib.pyplot as plt
with open(report_file) as f:
    report = json.load(f)
with open(log_file) as f:
    logs = json.load(f)
print 'Encoder: \n\n', report['architecture']['encoder']
print 'Decoder: \n\n', report['architecture']['decoder']


Encoder: 

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> output]
  (1): nn.LookupTable
  (2): nn.LSTM(200 -> 512)
  (3): nn.Dropout(0.400000)
}
Decoder: 

nn.gModule

Perplexity on Each Dataset


In [2]:
print('Train Perplexity: ', report['train_perplexity'])
print('Valid Perplexity: ', report['valid_perplexity'])
print('Test Perplexity: ', report['test_perplexity'])


('Train Perplexity: ', 71.711428171036)
('Valid Perplexity: ', 413.59050936172)
('Test Perplexity: ', 440.71114299039)

Loss vs. Epoch


In [3]:
%matplotlib inline
for k in logs.keys():
    plt.plot(logs[k][0], logs[k][1], label=str(k) + ' (train)')
    plt.plot(logs[k][0], logs[k][2], label=str(k) + ' (valid)')
plt.title('Loss v. Epoch')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()


Perplexity vs. Epoch


In [4]:
%matplotlib inline
for k in logs.keys():
    plt.plot(logs[k][0], logs[k][3], label=str(k) + ' (train)')
    plt.plot(logs[k][0], logs[k][4], label=str(k) + ' (valid)')
plt.title('Perplexity v. Epoch')
plt.xlabel('Epoch')
plt.ylabel('Perplexity')
plt.legend()
plt.show()


Generations


In [5]:
def print_sample(sample, best_bleu=None):
    enc_input = ' '.join([w for w in sample['encoder_input'].split(' ') if w != '<pad>'])
    gold = ' '.join([w for w in sample['gold'].split(' ') if w != '<mask>'])
    print('Input: '+ enc_input + '\n')
    print('Gend: ' + sample['generated'] + '\n')
    print('True: ' + gold + '\n')
    if best_bleu is not None:
        cbm = ' '.join([w for w in best_bleu['best_match'].split(' ') if w != '<mask>'])
        print('Closest BLEU Match: ' + cbm + '\n')
        print('Closest BLEU Score: ' + str(best_bleu['best_score']) + '\n')
    print('\n')

In [6]:
for i, sample in enumerate(report['train_samples']):
    print_sample(sample, report['best_bleu_matches_train'][i] if 'best_bleu_matches_train' in report else None)


Input:  5 - minute healthy strawberry frozen yogurt

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  preheat oven to 350 <step> brown beef in skillet & add taco seasoning . cook as directed on seasoning packet <step> spray an <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  coffee cake in a mug with cinnamon oatmeal struesel topping

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  to make the apple - cranberry relish : peel , core , and chop the apples into 1 / 4 - inch <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  french fry stuffed chili enchiladas

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  start your rice , i use a rice cooker . when nearly finished ( or actually finished ) <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  pizza zacineti breadsticks

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  using a mixer with a paddle , cream together the butter , sugar and eggs until smooth . add the vanilla and zest and combine

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  hidden valley pinwheel sandwiches

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  chop your green pepper , red pepper , sweet onion , and carrots up . put your carrots off to the side . <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  hidden valley pinwheel sandwiches

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  chop your green pepper , red pepper , sweet onion , and carrots up . put your carrots off to the side . <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  savory french omelet

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  combine the cornmeal , flour , sugar , mustard , baking powder and salt , mixing well . <step> add the milk , egg <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0




In [7]:
for i, sample in enumerate(report['valid_samples']):
    print_sample(sample, report['best_bleu_matches_valid'][i] if 'best_bleu_matches_valid' in report else None)


Input:  5 - minute healthy strawberry frozen yogurt

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  add the frozen strawberries , agave nectar ( or honey ) , yogurt and lemon juice to the <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  coffee cake in a mug with cinnamon oatmeal struesel topping

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  place 1 tablespoon of room temperature butter in mug . if cold , place in microwave <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  french fry stuffed chili enchiladas

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  preheat oven and bake french fries as directed on bag . <step> take your flour tortillas and heat <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  pizza zacineti breadsticks

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  in a bowl , dissolve the yeast with lukewarm and sugar for about 5 minutes , then add flour , salt and <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  hidden valley pinwheel sandwiches

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  mix cream cheese , dressing mix and onions until blended . spread on tortillas . blot <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  hidden valley pinwheel sandwiches

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  mix cream cheese , dressing mix and onions until blended . spread on tortillas . blot <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  savory french omelet

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  in a large skillet , fry bacon until crisp . crumble cooked bacon ; . <step> <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0




In [8]:
for i, sample in enumerate(report['test_samples']):
    print_sample(sample, report['best_bleu_matches_test'][i] if 'best_bleu_matches_test' in report else None)


Input:  smoked salmon , avocado , dill and parsley mayo sandwich

Gend:  <beg> preheat oven to a . . . . . . . . . . . . . . . . . . . . .

True:  mash the lemon juice and avocado with dill and parsley mayo . spoon over a slice of bread and <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  fancy hot dogs

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  <step> 1 melt butter in a large skillet ( cast iron works well for this purpose <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  healthy oatmeal cookies

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  preheat oven to 350 degrees . in a medium bowl , whisk together flours and baking powder ; set aside . <step> in <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  mexican hummus

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  position the knife blade in a food processor bowl . drop the garlic through the food chute with the processor running ; process 3 seconds

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  saute ? ed mushrooms

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  1 . cook shiitake mushrooms in a single layer in 1 1 / 2 tbsp . hot oil in a 10 - to <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  saute ? ed mushrooms

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  1 . cook shiitake mushrooms in a single layer in 1 1 / 2 tbsp . hot oil in a 10 - to <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



Input:  marty 's loosemeat sandwich

Gend:  <beg> . . . . . . . . . . . . . . . . . . . . . . . . .

True:  in a medium skillet over medium heat , cook the ground beef until evenly browned ; drain <end>

Closest BLEU Match:  heat the oil in a skillet over medium heat . <step> in a bowl , combine the coconut flour , <end>

Closest BLEU Score: 0



BLEU Analysis


In [9]:
def print_bleu(bleu_struct):
    # Print the overall BLEU score, then the per-order n-gram scores.
    print 'Overall Score: ', bleu_struct['score'], '\n'
    print '1-gram Score: ', bleu_struct['components']['1']
    print '2-gram Score: ', bleu_struct['components']['2']
    print '3-gram Score: ', bleu_struct['components']['3']
    print '4-gram Score: ', bleu_struct['components']['4']

In [10]:
# Training Set BLEU Scores
print_bleu(report['train_bleu'])


Overall Score:  0 

1-gram Score:  4.4
2-gram Score:  0
3-gram Score:  0
4-gram Score:  0

In [11]:
# Validation Set BLEU Scores
print_bleu(report['valid_bleu'])


Overall Score:  0 

1-gram Score:  4.4
2-gram Score:  0
3-gram Score:  0
4-gram Score:  0

In [12]:
# Test Set BLEU Scores
print_bleu(report['test_bleu'])


Overall Score:  0 

1-gram Score:  4.9
2-gram Score:  0
3-gram Score:  0
4-gram Score:  0

In [13]:
# All Data BLEU Scores
print_bleu(report['combined_bleu'])


Overall Score:  0 

1-gram Score:  4.6
2-gram Score:  0
3-gram Score:  0
4-gram Score:  0

N-pairs BLEU Analysis

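This analysis randomly samples 1000 pairs of texts from the same set (generations, or ground truths) and scores each pair with BLEU, treating one text as the translation and the other as the reference. We expect very low scores for ground-truth pairs. High scores among generations expose hyper-common generations, where the model produces nearly the same output regardless of its input.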
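To make the idea concrete, here is a minimal sketch that samples random pairs from a list of texts and averages their clipped n-gram precisions. It is not the code that produced the report. The function names, the whitespace tokenization, the per-pair averaging, and the omission of BLEU's brevity penalty and geometric mean are all assumptions made to keep the example short. It should run under both Python 2.7 and Python 3.

```python
import random
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of candidate against reference, in percent."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return 100.0 * matched / total


def n_pairs_precisions(texts, num_pairs=1000, max_n=4, seed=0):
    """Average 1..max_n-gram precision over randomly sampled text pairs.

    Hypothetical helper: assumes at least two texts and whitespace tokens.
    """
    rng = random.Random(seed)
    totals = [0.0] * max_n
    for _ in range(num_pairs):
        a, b = rng.sample(texts, 2)  # two distinct positions in the list
        for n in range(1, max_n + 1):
            totals[n - 1] += ngram_precision(a.split(), b.split(), n)
    return [t / num_pairs for t in totals]
```

When every text in the list is identical, each sampled pair matches at all n-gram orders and every precision comes out at 100, which is the pattern a collapsed generator produces.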


In [14]:
# Training Set BLEU n-pairs Scores
print_bleu(report['n_pairs_bleu_train'])


Overall Score:  100 

1-gram Score:  100
2-gram Score:  100
3-gram Score:  100
4-gram Score:  100

In [15]:
# Validation Set n-pairs BLEU Scores
print_bleu(report['n_pairs_bleu_valid'])


Overall Score:  100 

1-gram Score:  100
2-gram Score:  100
3-gram Score:  100
4-gram Score:  100

In [16]:
# Test Set n-pairs BLEU Scores
print_bleu(report['n_pairs_bleu_test'])


Overall Score:  94.66 

1-gram Score:  95.8
2-gram Score:  94.5
3-gram Score:  94.3
4-gram Score:  94

In [17]:
# Combined n-pairs BLEU Scores
print_bleu(report['n_pairs_bleu_all'])


Overall Score:  98.19 

1-gram Score:  98.6
2-gram Score:  98.1
3-gram Score:  98.1
4-gram Score:  98

In [18]:
# Ground Truth n-pairs BLEU Scores
print_bleu(report['n_pairs_bleu_gold'])


Overall Score:  9.89 

1-gram Score:  24.8
2-gram Score:  10.6
3-gram Score:  6.7
4-gram Score:  5.5

Alignment Analysis

This analysis computes the average Smith-Waterman alignment score between generations. The intuition is the same as for N-pairs BLEU: we expect low scores in the ground truth, and hyper-common generations raise the scores.
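For reference, a minimal sketch of a token-level Smith-Waterman score and its average over all pairs of texts follows. The scoring values (match +2, mismatch -1, gap -1), the whitespace tokenization, the all-pairs averaging, and the function names are assumptions chosen for illustration. The report does not state which parameters or pairing scheme were used, so the numbers this produces are not expected to match the ones printed in this notebook.

```python
from itertools import combinations


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between token lists a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def average_alignment(texts):
    """Mean Smith-Waterman score over all unordered pairs of texts.

    Hypothetical helper: assumes at least two texts.
    """
    pairs = list(combinations(texts, 2))
    scores = [smith_waterman(x.split(), y.split()) for x, y in pairs]
    return float(sum(scores)) / len(scores)
```

Under this scoring, two identical texts of k tokens score 2k, so a set of near-identical generations averages far higher than a set of varied ground-truth texts.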


In [19]:
print 'Average (Train) Generated Score: ', report['average_alignment_train']
print 'Average (Valid) Generated Score: ', report['average_alignment_valid']
print 'Average (Test) Generated Score: ', report['average_alignment_test']
print 'Average (All) Generated Score: ', report['average_alignment_all']
print 'Average Gold Score: ', report['average_alignment_gold']


Average (Train) Generated Score:  56
Average (Valid) Generated Score:  56
Average (Test) Generated Score:  52
Average (All) Generated Score:  54.6666666667
Average Gold Score:  21.5428571429