Lab 6.1 - Keras for RNN

In this lab we will use the Keras deep learning library to construct a simple recurrent neural network (RNN) that can learn linguistic structure from a piece of text, and use that knowledge to generate new text passages. To review the general RNN architecture, specific RNN variants such as the LSTM networks we'll be using here, and the other concepts behind this type of machine learning, you should consult the following resources:

This code is an adaptation of these two examples:

You can consult the original sites for more information and documentation.

Let's start by importing some of the libraries we'll be using in this lab:


In [1]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils

from time import gmtime, strftime
import os
import re
import pickle
import random
import sys


Using TensorFlow backend.

The first thing we need to do is generate our training data set. In this case we will use a recent article written by Barack Obama for The Economist newspaper. Make sure you have the obama.txt file in the /data folder within the /week-6 folder in your repository.


In [2]:
# load ascii text from file
filename = "data/obama.txt"
raw_text = open(filename).read()

# get rid of any characters other than letters, numbers, 
# and a few special characters
raw_text = re.sub('[^\nA-Za-z0-9 ,.:;?!-]+', '', raw_text)

# convert all text to lowercase
raw_text = raw_text.lower()

n_chars = len(raw_text)
print "length of text:", n_chars
print "text preview:", raw_text[:500]


length of text: 18312
text preview: wherever i go these days, at home or abroad, people ask me the same question: what is happening in the american political system? how has a country that has benefitedperhaps more than any otherfrom immigration, trade and technological innovation suddenly developed a strain of anti-immigrant, anti-innovation protectionism? why have some on the far left and even more on the far right embraced a crude populism that promises a return to a past that is not possible to restoreand that, for most americ

Next, we use Python's set() function to generate a list of all unique characters in the text. This will form our 'vocabulary' of characters, which is similar to the set of categories found in typical ML classification problems.

Since neural networks work with numerical data, we also need to create a mapping between each character and a unique integer value. To do this we create two dictionaries: one which has characters as keys and the associated integers as the value, and one which has integers as keys and the associated characters as the value. These dictionaries will allow us to do translation both ways.


In [3]:
# extract all unique characters in the text
chars = sorted(list(set(raw_text)))
n_vocab = len(chars)
print "number of unique characters found:", n_vocab

# create mapping of characters to integers and back
char_to_int = dict((c, i) for i, c in enumerate(chars))
int_to_char = dict((i, c) for i, c in enumerate(chars))

# test our mapping
print 'a', "- maps to ->", char_to_int["a"]
print 25, "- maps to ->", int_to_char[25]


number of unique characters found: 44
a - maps to -> 18
25 - maps to -> h

Now we need to define the training data for our network. With RNNs, the training data usually takes the shape of a three-dimensional matrix, with the size of each dimension representing:

[# of training sequences, # of training samples per sequence, # of features per sample]

  • The training sequences are the sets of data presented to the RNN at each training step. As with all neural networks, these training sequences are fed to the network in small batches during training.
  • Each training sequence is composed of some number of training samples, one per time step. The number of samples in each sequence dictates how far back in the data stream the algorithm can learn, and sets the depth of the unrolled RNN layer.
  • Each training sample within a sequence is composed of some number of features. This is the data that the RNN layer is learning from at each time step. In our example, the training samples and targets will use one-hot encoding, so each sample will have a feature for each possible character, with the actual character represented by 1 and all others by 0 (a minimal sketch of this layout follows this list).
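
Here is that sketch, using a made-up two-character vocabulary rather than our actual data:

toy = np.zeros((2, 3, 2), dtype=np.bool)  # 2 sequences, 3 samples each, 2 features
toy[0, 0, 0] = 1    # sequence 0, time step 0 is character 0 ('a')
toy[0, 1, 1] = 1    # sequence 0, time step 1 is character 1 ('b')
toy[0, 2, 0] = 1    # sequence 0, time step 2 is character 0 ('a')
print toy.shape     # --> (2, 3, 2)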

To prepare the data, we first set the length of training sequences we want to use. In this case we will set the sequence length to 100, meaning the RNN layer will be able to predict future characters based on the 100 characters that came before.

We will then slide this 100-character 'window' over the entire text to create input and output arrays. Each entry in the input array contains 100 characters from the text, and each entry in the output array contains the single character that came immediately after.


In [5]:
# prepare the dataset of input to output pairs encoded as integers
seq_length = 100

inputs = []
outputs = []

for i in range(0, n_chars - seq_length, 1):
    inputs.append(raw_text[i:i + seq_length])
    outputs.append(raw_text[i + seq_length])
    
n_sequences = len(inputs)
print "Total sequences: ", n_sequences


Total sequences:  18212

Now let's shuffle both the input and output data so that we can later have Keras split it automatically into a training and test set. To make sure the two lists are shuffled the same way (maintaining correspondence between inputs and outputs), we create a separate shuffled list of indices, and use these indices to reorder both lists.


In [9]:
indices = range(len(inputs))
random.shuffle(indices)

inputs = [inputs[x] for x in indices]
outputs = [outputs[x] for x in indices]

Let's visualize one of these sequences to make sure we are getting what we expect:


In [10]:
print inputs[0], "-->", outputs[0]


ms to our criminal-justice system and improvements to re-entry into the workforce that have won bipa --> r

Next we will prepare the actual numpy datasets which will be used to train our network. We first initialize two empty numpy arrays with the proper dimensions:

  • X --> [# of training sequences, # of training samples, # of features]
  • y --> [# of training sequences, # of features]

We then iterate over the arrays we generated in the previous step and fill the numpy arrays with the proper data. Since all character data is formatted using one-hot encoding, we initialize both data sets with zeros. As we iterate over the data, we use the char_to_int dictionary to map each character to its integer index, and use that index to set the corresponding value in the data set to 1.


In [11]:
# create two empty numpy arrays with the proper dimensions
X = np.zeros((n_sequences, seq_length, n_vocab), dtype=np.bool)
y = np.zeros((n_sequences, n_vocab), dtype=np.bool)

# iterate over the data and build up the X and y data sets
# by setting the appropriate indices to 1 in each one-hot vector
for i, example in enumerate(inputs):
    for t, char in enumerate(example):
        X[i, t, char_to_int[char]] = 1
    y[i, char_to_int[outputs[i]]] = 1
    
print 'X dims -->', X.shape
print 'y dims -->', y.shape


X dims --> (18212, 100, 44)
y dims --> (18212, 44)
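
As an optional sanity check (not part of the original example), we can decode the first one-hot sequence back into text and confirm it matches the input/output pair we printed earlier:

# decode the first one-hot sequence back to characters
decoded = ''.join(int_to_char[np.argmax(step)] for step in X[0])
print decoded, "-->", int_to_char[np.argmax(y[0])]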

Next, we define our RNN model in Keras. This is very similar to how we defined the CNN model, except now we use the LSTM() function to create an LSTM layer with an internal memory of 128 neurons. LSTM is a special type of RNN layer which solves the unstable gradient problem seen in basic RNNs. Along with LSTM layers, Keras also supports basic RNN layers and GRU layers, which are similar to LSTM. You can find full details on recurrent layers in Keras' documentation.

As before, we need to explicitly define the input shape for the first layer. We also need to tell Keras whether the LSTM layer should pass its full sequence of outputs or only its final output to the next layer. If you are connecting the LSTM layer to a fully connected layer, as we do in this case, you should set the return_sequences parameter to False so the layer passes only the final value of its hidden neurons. If you are stacking multiple LSTM layers, you should set the parameter to True in all but the last layer, so that subsequent layers can learn from the full sequence of outputs of the previous layers.
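
For example, a stacked variant of our model (a hypothetical alternative, not the network we train below) would set return_sequences=True on the first LSTM layer so that the second layer receives the full sequence of outputs:

# hypothetical stacked-LSTM sketch -- not the model used in this lab
stacked = Sequential()
stacked.add(LSTM(128, return_sequences=True, input_shape=(seq_length, n_vocab)))
stacked.add(LSTM(128, return_sequences=False))
stacked.add(Dense(n_vocab, activation='softmax'))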

We will use dropout with a probability of 50% to regularize the network and prevent overfitting on our training data. The output of the network will be a fully connected layer with one neuron for each character in the vocabulary. The softmax function will convert this output to a probability distribution across all characters.


In [12]:
# define the LSTM model
model = Sequential()
model.add(LSTM(128, return_sequences=False, input_shape=(X.shape[1], X.shape[2])))
model.add(Dropout(0.50))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')

Next, we define two helper functions: one to select a character based on a probability distribution, and one to generate a sequence of predicted characters based on an input (or 'seed') list of characters.

The sample() function will take in a probability distribution generated by the softmax() function, and select a character based on the 'temperature' input. The temperature (also often called the 'diversity') affects how strictly the probability distribution is sampled.

  • Lower values (closer to zero) output more confident predictions, but are also more conservative. In our case, if the model has overfit the training data, lower values are likely to give back exactly what is found in the text.
  • Higher values (1 and above) introduce more diversity and randomness into the results. This can lead the model to generate novel information not found in the training data. However, you are also likely to see more errors such as grammatical or spelling mistakes. (A short worked example of this temperature scaling follows.)
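
The example below (illustrative only, with a made-up three-character distribution) uses the same log/exp math as the sample() function defined next:

p = np.array([0.6, 0.3, 0.1])
for T in [0.2, 1.0, 2.0]:
    q = np.exp(np.log(p) / T)
    q = q / np.sum(q)
    print T, '-->', q
# T=0.2 sharpens the distribution toward the most likely character,
# T=1.0 leaves it unchanged, and T=2.0 flattens it toward uniform.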

In [13]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    # rescale the distribution in log space according to the temperature
    preds = np.log(preds) / temperature
    # re-normalize back into a valid probability distribution (softmax)
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # draw one sample from the rescaled distribution
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

The generate() function will take in:

  • input sentence ('seed')
  • number of characters to generate
  • and target diversity or temperature

and print the resulting sequence of characters to the screen.


In [15]:
def generate(sentence, prediction_length=50, diversity=0.35):
    print '----- diversity:', diversity 

    generated = sentence
    sys.stdout.write(generated)

    # iterate over number of characters requested
    for i in range(prediction_length):
        
        # build up sequence data from current sentence
        x = np.zeros((1, X.shape[1], X.shape[2]))
        for t, char in enumerate(sentence):
            x[0, t, char_to_int[char]] = 1.

        # use trained model to return probability distribution
        # for next character based on input sequence
        preds = model.predict(x, verbose=0)[0]
        
        # use sample() function to sample next character 
        # based on probability distribution and desired diversity
        next_index = sample(preds, diversity)
        
        # convert integer to character
        next_char = int_to_char[next_index]

        # add new character to generated text
        generated += next_char
        
        # delete the first character from the beginning of the sentence, 
        # and add the new character to the end. This will form the 
        # input sequence for the next predicted character.
        sentence = sentence[1:] + next_char

        # print results to screen
        sys.stdout.write(next_char)
        sys.stdout.flush()
    print

Next, we define a checkpoint callback that tells Keras to save our model's parameters to a local file after each epoch in which it achieves an improvement in the overall loss. This will allow us to reuse the trained model at a later time without having to retrain it from scratch. This is useful for recovering models in case your computer crashes, or if you want to stop the training early.


In [16]:
filepath="-basic_LSTM.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=0, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
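
If you later want to reuse a saved checkpoint instead of retraining, you can rebuild the same architecture and load the weights back in. A minimal sketch, assuming the checkpoint file created above exists:

# restore the saved parameters into an identically defined model
model.load_weights(filepath)
model.compile(loss='categorical_crossentropy', optimizer='adam')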

Now we are finally ready to train the model. We want to train the model over 50 epochs, but we also want to output some generated text after each epoch to see how our model is doing.

To do this we create our own loop to iterate over each epoch. Within the loop we first train the model for one epoch. Since all parameters are stored within the model, training one epoch at a time has exactly the same effect as training over a longer series of epochs. We also use the validation_split parameter of the fit() function to tell Keras to automatically split the data into 80% training data and 20% test data for validation. Remember to always shuffle your data if you will be using validation!

After each epoch is trained, we use the raw_text data to extract a new sequence of 100 characters as the 'seed' for our generated text. Finally, we use our generate() helper function to generate text using two different diversity settings.

Warning: because of their large depth (remember that an RNN trained on sequences of length 100 effectively has 100 layers!), these networks typically take a much longer time to train than traditional multi-layer ANNs and CNNs. You should expect these models to train overnight on the virtual machine, but you should be able to see enough progress after the first few epochs to know whether it is worth training a model to the end. For more complex RNN models with larger data sets in your own work, you should consider a native installation, along with a dedicated GPU if possible.


In [17]:
epochs = 50
prediction_length = 100

for iteration in range(epochs):
    
    print 'epoch:', iteration + 1, '/', epochs
    model.fit(X, y, validation_split=0.2, batch_size=256, nb_epoch=1, callbacks=callbacks_list)
    
    # get random starting point for seed
    start_index = random.randint(0, len(raw_text) - seq_length - 1)
    # extract seed sequence from raw text
    seed = raw_text[start_index: start_index + seq_length]
    
    print '----- generating with seed:', seed
    
    for diversity in [0.5, 1.2]:
        generate(seed, prediction_length, diversity)


epoch: 1 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 103s - loss: 3.2195 - val_loss: 2.9803
----- generating with seed:  economy also depends on meaningful opportunities for work for everyone who wants a job. however, am
----- diversity: 0.5
 economy also depends on meaningful opportunities for work for everyone who wants a job. however, am  iot u rn o ea   n eientee   ane a  r  i  r  gaec te  rie i e     ai  gie  eg sn   ee   ee  aa  n  
----- diversity: 1.2
 economy also depends on meaningful opportunities for work for everyone who wants a job. however, amgeddrilwcaonii2mtcly  ehed h u1i  epe oaomjjkns,ntdw0rs1uetidtmhrrsyge rlcnv nj 5o c7 idmrt1sruo oot
epoch: 2 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 102s - loss: 3.0235 - val_loss: 2.9320
----- generating with seed: des in the making. while i am proud of what my administration has accomplished these past eight year
----- diversity: 0.5
des in the making. while i am proud of what my administration has accomplished these past eight yearse  aane   o c    eeeee lssoin ttt  e s e is te oot s ar   iee ete  r e  ialiate oe ie  aiec ro eere
----- diversity: 1.2
des in the making. while i am proud of what my administration has accomplished these past eight year 1nlyc1-poscgs, mal eueyaesr ntde dh7 ltedejt  ls.la evntet 1rx aiaiclit fluhleeltfhxsfnfeiieevlvnui
epoch: 3 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 106s - loss: 2.9538 - val_loss: 2.8611
----- generating with seed: ven by fears that are not fundamentally economic. the anti-immigrant, anti-mexican, anti-muslim and 
----- diversity: 0.5
ven by fears that are not fundamentally economic. the anti-immigrant, anti-mexican, anti-muslim and reo ar ono ts nee rtomott degot gnttme e bire nt hwre eeenr so   ant loo  eni adin  sse  ate iin ns 
----- diversity: 1.2
ven by fears that are not fundamentally economic. the anti-immigrant, anti-mexican, anti-muslim and o tol,aonlans,;narudlc hdertofihtenmkent.wqhrtsese ged,ohcsst  eo wre a3ids mtre hdhe av e  ooes est
epoch: 4 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 104s - loss: 2.8533 - val_loss: 2.7604
----- generating with seed: d restore past glory if they just got some group or idea that was threatening america under control.
----- diversity: 0.5
d restore past glory if they just got some group or idea that was threatening america under control. eey ohoaalantangro rsel e ace aere t se a ten ctoutenl un ee see tesee ee  eeee int s ea iss areni 
----- diversity: 1.2
d restore past glory if they just got some group or idea that was threatening america under control.seah iblshhiwenvnas  oiwrevrkny. cxer.lsuse by -tainmestlaa9ri1dunn alruyrux ncnesciwrolgnautwneliul
epoch: 5 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 106s - loss: 2.7348 - val_loss: 2.6332
----- generating with seed: ce for the common good, driving businesses to create products that consumers rave about or motivatin
----- diversity: 0.5
ce for the common good, driving businesses to create products that consumers rave about or motivatins goith  aree eranatoe  ad roor an tan eneree ae  aai s an enin nle are tee is al in aea ae theri uo
----- diversity: 1.2
ce for the common good, driving businesses to create products that consumers rave about or motivatinitsdoesaincary hoj uk4 cc taq vigaf ie  tioe o5o ctor ydtea aine hoceenwtsi d orevibfwaa  ha sambrr

epoch: 6 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 127s - loss: 2.6257 - val_loss: 2.5449
----- generating with seed: fully restoring faith in an economy where hardworking americans can get ahead requires addressing fo
----- diversity: 0.5
fully restoring faith in an economy where hardworking americans can get ahead requires addressing fo tece tis an aat erate the panscorcim ghe theiin ist onn rres thout  and nt eevener te taait esenore
----- diversity: 1.2
fully restoring faith in an economy where hardworking americans can get ahead requires addressing fonmelenstw hchotmconm air oor ony rbuoctk ao oach1yt ns mnvk nayt 0iinnd;r bleyser oestoti cirbaf
mis
epoch: 7 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 93s - loss: 2.5512 - val_loss: 2.4841
----- generating with seed: ic research and development. policies focused on education are critical both for increasing economic
----- diversity: 0.5
ic research and development. policies focused on education are critical both for increasing economice the ges ror an than tee tout the the we por on eere the ane fore the a the ges cous con ore to pet
----- diversity: 1.2
ic research and development. policies focused on education are critical both for increasing economicsetge e lare peprimiy pgfit be mn ohrls altime  vetfw1 ld aasgicnn mhatf a:titga.gu0bh at ola ireocn
epoch: 8 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 99s - loss: 2.4992 - val_loss: 2.4373
----- generating with seed: cent vote to leave the european union and the rise of populist parties around the world.

much of th
----- diversity: 0.5
cent vote to leave the european union and the rise of populist parties around the world.

much of the the ant ye that anse enth be and wer rabe the drong ronas an the th an that an ton es and in e the
----- diversity: 1.2
cent vote to leave the european union and the rise of populist parties around the world.

much of tha-s
eoltiger andam sura fshes bh gheantxyckdy icr euacims. ngor  oe vaacgmsoomles pndice top apollgn
epoch: 9 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 92s - loss: 2.4521 - val_loss: 2.4028
----- generating with seed: ystem so partisan that previously bipartisan ideas like bridge and airport upgrades are nonstarters.
----- diversity: 0.5
ystem so partisan that previously bipartisan ideas like bridge and airport upgrades are nonstarters. she anc ameres and she ore an okeronchat and an porese the the ans porat ouves oad cont he the the 
----- diversity: 1.2
ystem so partisan that previously bipartisan ideas like bridge and airport upgrades are nonstarters.hay amt onk tollssy ox fold apobom an anm gatijl. th far8yowiigs1s ,s o pevorentingicschaled cprefri
epoch: 10 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 92s - loss: 2.4198 - val_loss: 2.3790
----- generating with seed: to our citizens is vital. curbs to entitlement growth that build on the affordable care acts progres
----- diversity: 0.5
to our citizens is vital. curbs to entitlement growth that build on the affordable care acts progress the and thece wrele pores an the polat  aot  he the te the thos resse the bulle gof the le wes in 
----- diversity: 1.2
to our citizens is vital. curbs to entitlement growth that build on the affordable care acts progresek fet 6ham umtin ceut ann ralm creon ord sveusait mey efte ,0ee opeofeoig ganxt; boet-e punti, wmir
epoch: 11 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 93s - loss: 2.3895 - val_loss: 2.3492
----- generating with seed: satlantic trade and investment partnership with the eu. these agreements, and stepped-up trade enfor
----- diversity: 0.5
satlantic trade and investment partnership with the eu. these agreements, and stepped-up trade enforele enstigcen wan for and onceres tho te to ereres an he ant wat the tall an for te on outing are th
----- diversity: 1.2
satlantic trade and investment partnership with the eu. these agreements, and stepped-up trade enfore.bt2fon endgeste7 fed yed uirksioty rmmore7nwadd cnalel, lir ont -at9ed hho  aum3ers rethowe.drewtl
epoch: 12 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 92s - loss: 2.3512 - val_loss: 2.3296
----- generating with seed: artisan that previously bipartisan ideas like bridge and airport upgrades are nonstarters.

we could
----- diversity: 0.5
artisan that previously bipartisan ideas like bridge and airport upgrades are nonstarters.

we could and forte the the bes ag be the ame an core to and parderiscang th go and as to re theve tha past i
----- diversity: 1.2
artisan that previously bipartisan ideas like bridge and airport upgrades are nonstarters.

we could,f. gintund 1g andt phort od wevtmwar dates, afd calnopee lporate thace , inergf,qdeb and sotus, a6i
epoch: 13 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 93s - loss: 2.3297 - val_loss: 2.3033
----- generating with seed: d help finding new jobs would assist. so would making unemployment insurance available to more worke
----- diversity: 0.5
d help finding new jobs would assist. so would making unemployment insurance available to more workeis and the the the tho perser oud hes furlinan the atiing thes the ans and tho werenor the rale gon 
----- diversity: 1.2
d help finding new jobs would assist. so would making unemployment insurance available to more workes ol suiniun fondoscamey
.
rorhes waniln coess
emfathoal horle-pvrinhuitxunto xxibles fretfivinl. in
epoch: 14 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 93s - loss: 2.3006 - val_loss: 2.2888
----- generating with seed: in any previous administration since at least 1960.

even these efforts fall well short. in the futu
----- diversity: 0.5
in any previous administration since at least 1960.

even these efforts fall well short. in the future for the wes reste an an wuring tha  he rus prerees an ing thal ane for enconethan the to hed the 
----- diversity: 1.2
in any previous administration since at least 1960.

even these efforts fall well short. in the futut woncebtect bus annes pod, th irkwr
es, 4osr nepraly wcovs  moalm, uld cra.sinsont me lun0 y a8s er
epoch: 15 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 93s - loss: 2.2782 - val_loss: 2.2697
----- generating with seed: ny potential physicists and engineers spend their careers shifting money around in the financial sec
----- diversity: 0.5
ny potential physicists and engineers spend their careers shifting money around in the financial secon the erang bet and racees ans an aming on the  aop in tor anceuticing ans an poriling se porings a
----- diversity: 1.2
ny potential physicists and engineers spend their careers shifting money around in the financial secomist ad creresbbling mat qvaelec00, is ofnuplns as sopguf the 2hoty meawint 0namw bamis or lesfofd 
epoch: 16 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 93s - loss: 2.2522 - val_loss: 2.2511
----- generating with seed: ir share, tax changes enacted during my administration have increased the share of income received b
----- diversity: 0.5
ir share, tax changes enacted during my administration have increased the share of income received butily fof the proutid an aricingre at anu mer and the the f rathe nop onte ins ard ane tog the parse
----- diversity: 1.2
ir share, tax changes enacted during my administration have increased the share of income received be blpwongelns m?eocusitdyont rziansst?
lofewore ecbo dower? yvor canlcune at ak ueclinity nsingrsta

epoch: 17 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 92s - loss: 2.2339 - val_loss: 2.2320
----- generating with seed: ts at the expense of the deferred maintenance bills we are passing to our children, particularly for
----- diversity: 0.5
ts at the expense of the deferred maintenance bills we are passing to our children, particularly fore serenom ion abitite bo the to al miged and than the the an the the the pions cors the wort and the
----- diversity: 1.2
ts at the expense of the deferred maintenance bills we are passing to our children, particularly for ant1 aledybgithe  ow be tholoript, times s,ucste-stroerty enidly thoncecs fomite mese lominc
-8rs p
epoch: 18 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 97s - loss: 2.1989 - val_loss: 2.2167
----- generating with seed: d investment partnership with the eu. these agreements, and stepped-up trade enforcement, will level
----- diversity: 0.5
d investment partnership with the eu. these agreements, and stepped-up trade enforcement, will level ancoming thal cenal to protte the gromer the the reane for perer bout pronged furt conte for the an
----- diversity: 1.2
d investment partnership with the eu. these agreements, and stepped-up trade enforcement, will levelings.r.n wgod riesroaven b5am robe perlless wfs erdiey wh fofaly sobre borespty eluevibat edtouv-lus
epoch: 19 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.1913 - val_loss: 2.2038
----- generating with seed: ould not bear the full burden of stabilising our economy. unfortunately, good economics can be overr
----- diversity: 0.5
ould not bear the full burden of stabilising our economy. unfortunately, good economics can be overrent bome insiste foul aad ar in and tor and icenome thig thaly the ing one the the coned th aced and
----- diversity: 1.2
ould not bear the full burden of stabilising our economy. unfortunately, good economics can be overr as reeswet  ncarme leye pyecite bl. wir-terciny tuts id in tpexge?ts, mat4rivp,imigp.
titaat s oply
epoch: 20 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.1732 - val_loss: 2.1920
----- generating with seed: only thrives when there are rules to guard against systemic failure and ensure fair competition.

po
----- diversity: 0.5
only thrives when there are rules to guard against systemic failure and ensure fair competition.

pore be the thal cendenting the fin the resest the hand fint and avery and wrale the the ther an the h
----- diversity: 1.2
only thrives when there are rules to guard against systemic failure and ensure fair competition.

pohrcat eorusciner avereftr1ee we thamod punc9ty. iss thm uschall sobmev ow w.
toced, wal lparde troos
epoch: 21 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.1504 - val_loss: 2.1841
----- generating with seed: reaks for the most fortunate can address long-term fiscal challenges without sacrificing investments
----- diversity: 0.5
reaks for the most fortunate can address long-term fiscal challenges without sacrificing investments-ealle the more cance and wome conanged the prast on eromucing ain sore the the partirest and wall g
----- diversity: 1.2
reaks for the most fortunate can address long-term fiscal challenges without sacrificing investments, ils ufkems moud, sourc as in tezs buf xfrnnvame iwriggorpasisle hes -yprhmuhlegho goviln, ferveng 
epoch: 22 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.1322 - val_loss: 2.1795
----- generating with seed: re and introducing new rules cutting emissions from vehicles and power plants.

the results are clea
----- diversity: 0.5
re and introducing new rules cutting emissions from vehicles and power plants.

the results are cleasing the pard mat an wer sount are mality the dones and in the pald and weve the pord cod the the al
----- diversity: 1.2
re and introducing new rules cutting emissions from vehicles and power plants.

the results are cleass, mene to amal whos, ivevlly gechamely tht ospra-itos tuxthbil forthderctaltieg tox a?ouvicrfustto
epoch: 23 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.1118 - val_loss: 2.1563
----- generating with seed: conomy when needed and to meet our long-term obligations to our citizens is vital. curbs to entitlem
----- diversity: 0.5
conomy when needed and to meet our long-term obligations to our citizens is vital. curbs to entitlem rowg the sering me tut ingreatising for the the the the whe- sopurto that doprenting our the to per
----- diversity: 1.2
conomy when needed and to meet our long-term obligations to our citizens is vital. curbs to entitlemamingitn meaqqutiinpist dare ditl, acsy 1odme d0vosumed by.e toos ivinl ad evem, nalh-rove phor chen
epoch: 24 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.0961 - val_loss: 2.1577
----- generating with seed: orts helped lead us out of the recession. american firms that export pay their workers up to 18 more
----- diversity: 0.5
orts helped lead us out of the recession. american firms that export pay their workers up to 18 more wore the ar and bebulle sore the mece and inecano thar in and that the leono less and the vered for
----- diversity: 1.2
orts helped lead us out of the recession. american firms that export pay their workers up to 18 more sf toee casale momt ancuur incaners erower balt nxaul ons mnce con ntrete-
:al of thepe fythang oug
epoch: 25 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 99s - loss: 2.0785 - val_loss: 2.1470
----- generating with seed:  many is a threat to all. economies are more successful when we close the gap between rich and poor 
----- diversity: 0.5
 many is a threat to all. economies are more successful when we close the gap between rich and poor the tas in al wat the the the past the be to jor antiingrting the thes maris somer chat the incont i
----- diversity: 1.2
 many is a threat to all. economies are more successful when we close the gap between rich and poor inod ctaninw pakiming inalme bat fcoll.. 
ancon. toqdpuilial, rnce andq aowte novatiomy thet avthigt
epoch: 26 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.0547 - val_loss: 2.1430
----- generating with seed: highest aspirations. so where does my successor go from here?

further progress requires recognising
----- diversity: 0.5
highest aspirations. so where does my successor go from here?

further progress requires recognising and intore thas busting the thas ancoun the forem for ande tove to fore reathous and insinis sores 
----- diversity: 1.2
highest aspirations. so where does my successor go from here?

further progress requires recognising.
youl theo gof jatr.
paasinm, ts is indvating0ritvjsisunsy nemetsls jod anbrestian, fus nhe reorof 
epoch: 27 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.0406 - val_loss: 2.1399
----- generating with seed: expense of the deferred maintenance bills we are passing to our children, particularly for infrastru
----- diversity: 0.5
expense of the deferred maintenance bills we are passing to our children, particularly for infrastruthe than the fand meverle and aticing and iccang the for whan sout whond wation son men the and the 
----- diversity: 1.2
expense of the deferred maintenance bills we are passing to our children, particularly for infrastrun melt a-sagemem comrtos fore thitleft. cobneticam eramcasty lya:es csituuk8 coly. save meper paady4
epoch: 28 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 2.0176 - val_loss: 2.1323
----- generating with seed: y have delivered in the past centuries.

this paradox of progress and peril has been decades in the 
----- diversity: 0.5
y have delivered in the past centuries.

this paradox of progress and peril has been decades in the we the and were to in will growth and anlicing and the in the the racing enoming the gre the the peo
----- diversity: 1.2
y have delivered in the past centuries.

this paradox of progress and peril has been decades in the lene-im aor 1hagk-inte n1t..
3is sfouc-ing the ma boml-smwrequsarstlon of ry meret red thic turihs f
epoch: 29 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 99s - loss: 2.0040 - val_loss: 2.1216
----- generating with seed: r workers were constrained by a greater degree of social interaction between employees at all levels
----- diversity: 0.5
r workers were constrained by a greater degree of social interaction between employees at all levels an to nepress and the pand and enoresing the and andendency and in for paiting ly and anderengore t
----- diversity: 1.2
r workers were constrained by a greater degree of social interaction between employees at all levelsinca. o nogt.
en aldratr aif nodingithou fhrleg7 hes nhpettconly avedbyon tee thans, alenous mile se
epoch: 30 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 1.9818 - val_loss: 2.1155
----- generating with seed: rans-pacific partnership and to conclude a transatlantic trade and investment partnership with the e
----- diversity: 0.5
rans-pacific partnership and to conclude a transatlantic trade and investment partnership with the erdass, more that the than the the best camer of and tor the on wer ureting to the forming the proste
----- diversity: 1.2
rans-pacific partnership and to conclude a transatlantic trade and investment partnership with the ero0e rom puruncus of perainime amongetruingb meat to hes. paidy ulming sopat sorfse. jasb aid tmam. 
epoch: 31 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 1.9695 - val_loss: 2.1184
----- generating with seed: y policy should not bear the full burden of stabilising our economy. unfortunately, good economics c
----- diversity: 0.5
y policy should not bear the full burden of stabilising our economy. unfortunately, good economics can the worl dowe the graty the that fan the reversing and icrange in the the the gat efore tha to pe
----- diversity: 1.2
y policy should not bear the full burden of stabilising our economy. unfortunately, good economics cpaniny , s-e t; wirt it cout, tif wiag sust anf 2617, 9ngo, daligien acchatutiacies th iverome wos p
epoch: 32 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 1.9429 - val_loss: 2.1134
----- generating with seed: an address long-term fiscal challenges without sacrificing investments in growth and opportunity.

f
----- diversity: 0.5
an address long-term fiscal challenges without sacrificing investments in growth and opportunity.

for the arod workest an the serans ans fould the gromen comenting thas promeven for coul the detican 
----- diversity: 1.2
an address long-term fiscal challenges without sacrificing investments in growth and opportunity.

forpcrualial undcmtaimicans cesthmyde camenared ast-omperiness as ise on pabpear grovit jeat souckud 
epoch: 33 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 98s - loss: 1.9253 - val_loss: 2.1190
----- generating with seed: level the playing field for workers and businesses alike.

second, alongside slowing productivity, i
----- diversity: 0.5
level the playing field for workers and businesses alike.

second, alongside slowing productivity, inconomiming the worker and ingresting the word unting tore the the pare of enalicy sor and the abe r
----- diversity: 1.2
level the playing field for workers and businesses alike.

second, alongside slowing productivity, ingare. gaed hive nous fpart ang eclatle. 
nordimine chdusicc oun sbamings cegssy apto toblty ruca- p
epoch: 34 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 117s - loss: 1.9242 - val_loss: 2.1054
----- generating with seed: ildrens schools, in civic organisations. thats why ceos took home about 20- to 30-times as much as t
----- diversity: 0.5
ildrens schools, in civic organisations. thats why ceos took home about 20- to 30-times as much as the lowing and prouting sithan the the lost cencestre the enore pars the mare ther and insstunting th
----- diversity: 1.2
ildrens schools, in civic organisations. thats why ceos took home about 20- to 30-times as much as thew sallra th-tugend-putit got lest of lros fraoscurdsin, fy 3ogus whon the must nonkentstrestsuwtru
epoch: 35 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 96s - loss: 1.8959 - val_loss: 2.1082
----- generating with seed: it is related to a devastating rise of opioid abuse and an associated increase in overdose deaths an
----- diversity: 0.5
it is related to a devastating rise of opioid abuse and an associated increase in overdose deaths and the more the fromurs ald the reas se whe dalle in hamit jon that pol enon the whan . jogre the pro
----- diversity: 1.2
it is related to a devastating rise of opioid abuse and an associated increase in overdose deaths and gat eherogtpingriggeviveanhtre l7ouy incire cos, of  wohl bedilise pa. whht nvitiabs the ppavilee 
epoch: 36 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 111s - loss: 1.8768 - val_loss: 2.1101
----- generating with seed: gap between rich and poor and growth is broadly based. a world in which 1 of humanity controls as mu
----- diversity: 0.5
gap between rich and poor and growth is broadly based. a world in which 1 of humanity controls as muth red beed and pporting for ald eronoming the lest on aperingres prowthe sore in of and deaten the 
----- diversity: 1.2
gap between rich and poor and growth is broadly based. a world in which 1 of humanity controls as muate ifimiriss fofeimicisminy the specbligg fatioy encayy 205t fmond,isngt bfourd fort dpaintive ammo
epoch: 37 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 97s - loss: 1.8572 - val_loss: 2.1079
----- generating with seed: d expanding high-quality job training.

lifting productivity and wages also depends on creating a gl
----- diversity: 0.5
d expanding high-quality job training.

lifting productivity and wages also depends on creating a globit sere the fore thet thar constinct and beot cinl and a come the predem no worn for and belted an
----- diversity: 1.2
d expanding high-quality job training.

lifting productivity and wages also depends on creating a glecer cevenoby bem ibestrlty what workcod eestur fuchiss cresinines hatghress buch of that cnange s e
epoch: 38 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 92s - loss: 1.8489 - val_loss: 2.1060
----- generating with seed: . progress in america also helped catalyse the historic paris climate agreement, which presents the 
----- diversity: 0.5
. progress in america also helped catalyse the historic paris climate agreement, which presents the prober for comering wor incont the former worke sore the to part destorsis sicce pas be an brester a
----- diversity: 1.2
. progress in america also helped catalyse the historic paris climate agreement, which presents the joptecings storamy wholl.e to and dotabrss mone bal 8ssite that exol dovens thes iatp by peoiny sheo
epoch: 39 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 93s - loss: 1.8316 - val_loss: 2.1200
----- generating with seed: eers shifting money around in the financial sector, instead of applying their talents to innovating 
----- diversity: 0.5
eers shifting money around in the financial sector, instead of applying their talents to innovating the faring the furede the werkers ac tople the prostent to ingerest at or parices and averest peasti
----- diversity: 1.2
eers shifting money around in the financial sector, instead of applying their talents to innovating to coma teputribitr ivesice uctind y-cruwitingts natiuls, mechmingvelils avstint. betwmus te ahus be
epoch: 40 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 92s - loss: 1.8121 - val_loss: 2.1204
----- generating with seed: estore past glory if they just got some group or idea that was threatening america under control. we
----- diversity: 0.5
estore past glory if they just got some group or idea that was threatening america under control. we dane we pant the the prowst tor buting we have cand could contes in furting leget ana merepportitis
----- diversity: 1.2
estore past glory if they just got some group or idea that was threatening america under control. wecreophatss,r the, desing, daveustdrl. dedactifanonesen copsotmemslquorl egreastverod dbies: te cbabu
epoch: 41 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 96s - loss: 1.7967 - val_loss: 2.1127
----- generating with seed: er the past decade, america has enjoyed the fastest productivity growth in the g7, but it has slowed
----- diversity: 0.5
er the past decade, america has enjoyed the fastest productivity growth in the g7, but it has slowed the dabing in instering that challe cant reconons in bustreing in thet more to mere reanes in the t
----- diversity: 1.2
er the past decade, america has enjoyed the fastest productivity growth in the g7, but it has slowed ancspiodtining iget ontived fotfaly, oe oxlorge nitheslide veredubino thead grows-trhachever gro , 
epoch: 42 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 95s - loss: 1.7806 - val_loss: 2.1269
----- generating with seed: trans-pacific partnership and to conclude a transatlantic trade and investment partnership with the 
----- diversity: 0.5
trans-pacific partnership and to conclude a transatlantic trade and investment partnership with the and semicing ay reat ore ant of the merer for efor coule by the one bustermin sertuce and and arenti
----- diversity: 1.2
trans-pacific partnership and to conclude a transatlantic trade and investment partnership with the ald rewstreist 5o ae imirics 19e. wjorits -ofredy.

nforges bore 3avo can, setveran a6 uice cos thal
epoch: 43 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 107s - loss: 1.7664 - val_loss: 2.1214
----- generating with seed:  threatening america under control. we overcame those fears and we will again.

but some of the disc
----- diversity: 0.5
 threatening america under control. we overcame those fears and we will again.

but some of the discen and werker worl and at come and the fal dent reen fore the reas of the bestrecin secting to-e rad
----- diversity: 1.2
 threatening america under control. we overcame those fears and we will again.

but some of the discono
poust and br libber, ande mipalels, a nislvinelut. acdiincition, anchallod anomd-tintarns, im lo
epoch: 44 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 97s - loss: 1.7472 - val_loss: 2.1364
----- generating with seed:  has accomplished these past eight years, i have always acknowledged that the work of perfecting our
----- diversity: 0.5
 has accomplished these past eight years, i have always acknowledged that the work of perfecting our the pering in icalles, the prosten that and inst mere wan proved to in the tha  of entaticne to bat
----- diversity: 1.2
 has accomplished these past eight years, i have always acknowledged that the work of perfecting our dovito has nmomend gutiant eaulde crasimeg5rasiw whol acloole. sofrt dfunaves coresicts, waph the s
epoch: 45 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 96s - loss: 1.7340 - val_loss: 2.1265
----- generating with seed:  12. in 1999, 23 of prime-age women were out of the labour force. today, it is 26. people joining or
----- diversity: 0.5
 12. in 1999, 23 of prime-age women were out of the labour force. today, it is 26. people joining or the ard be wowe farto pertsure and or gatine shered and medery and the fore and cament, an in enont
----- diversity: 1.2
 12. in 1999, 23 of prime-age women were out of the labour force. today, it is 26. people joining or ths doguttessisind of 1hateoalt dingcomel the l79iss wo phod comulicas fof al om revecfis netarito 
epoch: 46 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 97s - loss: 1.7213 - val_loss: 2.1454
----- generating with seed: quality: technology, education, globalisation, declining unions and a falling minimum wage. there is
----- diversity: 0.5
quality: technology, education, globalisation, declining unions and a falling minimum wage. there is comented for enorestorminis thit. in the americhall fored by the cenamies an thame the progrest con
----- diversity: 1.2
quality: technology, education, globalisation, declining unions and a falling minimum wage. there is ubnore to ealy beomicien brevoarmy ac to kert ineteningecunow thear ancerecaald mecmitisg, a vonear
epoch: 47 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 97s - loss: 1.6945 - val_loss: 2.1496
----- generating with seed: er known.

over the past 25 years, the proportion of people living in extreme poverty has fallen fro
----- diversity: 0.5
er known.

over the past 25 years, the proportion of people living in extreme poverty has fallen from the lobsen wo leve of the indemticas and with heve patsthe fin more of rangibatition the tor fanco
----- diversity: 1.2
er known.

over the past 25 years, the proportion of people living in extreme poverty has fallen frome inve the gas. beemeg ov userugdersse bo lignsa cof atd sel io. is ctausuledet he womn, arcinaus, 
epoch: 48 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 100s - loss: 1.6889 - val_loss: 2.1433
----- generating with seed: is more prosperous than ever before and yet our societies are marked by uncertainty and unease. so w
----- diversity: 0.5
is more prosperous than ever before and yet our societies are marked by uncertainty and unease. so whe for economy and we hand resters to the economy sion the tor that and best acle thes the the the e
----- diversity: 1.2
is more prosperous than ever before and yet our societies are marked by uncertainty and unease. so whill  hofturiticca, icpfontoul a posunctisttrons rofart o7 ythe wof tom soweldtilnte more ofrmutsina
epoch: 49 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 97s - loss: 1.6820 - val_loss: 2.1403
----- generating with seed:  a capitalism shaped by the few and unaccountable to the many is a threat to all. economies are more
----- diversity: 0.5
 a capitalism shaped by the few and unaccountable to the many is a threat to all. economies are more andiciss and in boult nor the whone that the and thet giting the to proste cuncont alicale the more
----- diversity: 1.2
 a capitalism shaped by the few and unaccountable to the many is a threat to all. economies are more kurdw breab it catding that domesy nol gzent that of otomryge-ceraandt, on sakite the sserankentb i
epoch: 50 / 50
Train on 14569 samples, validate on 3643 samples
Epoch 1/1
14569/14569 [==============================] - 96s - loss: 1.6574 - val_loss: 2.1470
----- generating with seed:  all after-tax income. by 2007, that share had more than doubled to 17. this challenges the very ess
----- diversity: 0.5
 all after-tax income. by 2007, that share had more than doubled to 17. this challenges the very essured for enory tho forres and ertiting the our hast werl meat peen in serses and patichos seatse and
----- diversity: 1.2
 all after-tax income. by 2007, that share had more than doubled to 17. this challenges the very essrated. ans pfikifig thvaligg of thad propregrtitiasidl-ommore snoliles thay mure whree shorpe, ratic

That looks pretty good! You can see that the RNN has learned a lot of the linguistic structure of the original writing, including typical word lengths, where to put spaces, and basic punctuation with commas and periods. Many words are still misspelled but seem almost reasonable, and it is pretty amazing that the model is able to learn this much in only 50 epochs of training.

You can see that the loss is still going down after 50 epochs, so the model can definitely benefit from longer training. If you're curious, you can try training for more epochs, but as the error decreases be careful to monitor the output to make sure that the model is not overfitting. As with other neural network models, you can monitor the difference between training and validation loss to see if overfitting might be occurring. In this case, since we're using the model to generate new information, we can also get a sense of overfitting from the material it generates.

A good indication of overfitting is if the model outputs exactly what is in the original text when given a seed from the text, but gibberish when given a seed that is not in the original text. Remember, we don't want the model to learn how to reproduce the original text exactly, but to learn its style so it can generate new text. As with other models, regularization methods such as dropout and limiting model complexity can be used to avoid overfitting. A quick way to run this check is sketched below.
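
This sketch (not part of the original lab; the out-of-text seed is made up) feeds generate() a sequence that does not appear in the training text and compares the result against a seed drawn directly from the text:

# hypothetical out-of-text seed, repeated and trimmed to seq_length characters
novel_seed = ('the future of our shared prosperity depends on all of us ' * 3)[:seq_length]
generate(novel_seed, prediction_length=100, diversity=0.5)

# compare against a seed taken directly from the training text
generate(raw_text[0:seq_length], prediction_length=100, diversity=0.5)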

Finally, let's save our training data and character to integer mapping dictionaries to an external file so we can reuse it with the model at a later time.


In [18]:
pickle_file = '-basic_data.pickle'

try:
    f = open(pickle_file, 'wb')
    save = {
        'X': X,
        'y': y,
        'int_to_char': int_to_char,
        'char_to_int': char_to_int,
    }
    pickle.dump(save, f, pickle.HIGHEST_PROTOCOL)
    f.close()
except Exception as e:
    print 'Unable to save data to', pickle_file, ':', e
    raise
    
statinfo = os.stat(pickle_file)
print 'Saved data to', pickle_file
print 'Compressed pickle size:', statinfo.st_size


Saved data to -basic_data.pickle
Compressed pickle size: 80934860
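
To reload this data in a later session, you could use something like the following sketch (assuming the pickle file created above):

# reload the training data and character mappings
with open('-basic_data.pickle', 'rb') as f:
    save = pickle.load(f)

X = save['X']
y = save['y']
char_to_int = save['char_to_int']
int_to_char = save['int_to_char']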