Wayne H Nixalo - 13 June 2017

Practical Deep Learning I

Lesson 5 - RNNs, NLP

Code-along of char-rnn.ipynb


In [1]:
import theano


/home/wnixalo/miniconda3/envs/FAI/lib/python2.7/site-packages/theano/gpuarray/dnn.py:135: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to version 5.1.
  warnings.warn("Your cuDNN version is more recent than "
Using cuDNN version 6021 on context None
Mapped name None to device cuda: GeForce GTX 870M (0000:01:00.0)

In [2]:
%matplotlib inline
from __future__ import print_function, division   # must precede any other statement in the cell
import os, sys
sys.path.insert(1, 'utils')
import utils; reload(utils)
from utils import *


Using Theano backend.

In [3]:
from keras.layers import TimeDistributed, Activation
# https://keras.io/layers/wrappers/
# [Doc:TimeDistributed] this wrapper lets you apply a layer to every temporal slice of an input
# https://keras.io/activations/
# [Doc:Activation] activations can be used through an Activation layer
from numpy.random import choice

Setup

We haven't really looked into the details of how this works yet, so this is provided for self-study for those who are interested. We'll look at it closely next week.


In [4]:
path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = open(path).read().lower()
print('corpus length:', len(text))


corpus length: 600901

In [5]:
!tail -n25 {path}


are thinkers who believe in the saints.


144

It stands to reason that this sketch of the saint, made upon the model
of the whole species, can be confronted with many opposing sketches that
would create a more agreeable impression. There are certain exceptions
among the species who distinguish themselves either by especial
gentleness or especial humanity, and perhaps by the strength of their
own personality. Others are in the highest degree fascinating because
certain of their delusions shed a particular glow over their whole
being, as is the case with the founder of christianity who took himself
for the only begotten son of God and hence felt himself sinless; so that
through his imagination--that should not be too harshly judged since the
whole of antiquity swarmed with sons of god--he attained the same goal,
the sense of complete sinlessness, complete irresponsibility, that can
now be attained by every individual through science.--In the same manner
I have viewed the saints of India who occupy an intermediate station
between the christian saints and the Greek philosophers and hence are
not to be regarded as a pure type. Knowledge and science--as far as they
existed--and superiority to the rest of mankind by logical discipline
and training of the intellectual powers were insisted upon by the
Buddhists as essential to sanctity, just as they were denounced by the
christian world as the indications of sinfulness.

In [6]:
# unique characters
chars = sorted(list(set(text)))
vocab_size = len(chars) + 1    # +1 makes room for the null char inserted at index 0 below
print('total chars:', vocab_size)


total chars: 60

In [7]:
chars.insert(0, "\0")

In [8]:
# the unique characters (a mixed-case corpus would add another 26)
''.join(chars[1:-6])


Out[8]:
'\n !"\'(),-.0123456789:;=?[]_abcdefghijklmnopqrstuvwxyz'

In [9]:
# map each char to the index at which it appears in the unique-char list
char_indices = dict((c, i) for i, c in enumerate(chars))
# the inverse map, from index back to char
indices_char = dict((i, c) for i, c in enumerate(chars))

^ This lets us take the text and convert it into a list of numbers, where each number is the index at which the char appears in the unique-char list.
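
For instance, a quick round trip through the two dicts (a minimal sanity check, not in the original notebook):

s = 'truth'
encoded = [char_indices[c] for c in s]                 # string -> list of indices
decoded = ''.join(indices_char[i] for i in encoded)    # indices -> string
assert decoded == s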


In [10]:
idx = [char_indices[c] for c in text]
idx[:10]


Out[10]:
[43, 45, 32, 33, 28, 30, 32, 1, 1, 1]

In [11]:
''.join(indices_char[i] for i in idx[:70])


Out[11]:
'preface\n\n\nsupposing that truth is a woman--what then? is there not gro'

Preprocess and create model


In [12]:
maxlen = 40
sentences = []
next_chars = []
for i in xrange(0, len(idx) - maxlen + 1):
    sentences.append(idx[i: i + maxlen])
    next_chars.append(idx[i + 1: i + maxlen + 1])
print('nb sequences:', len(sentences))


nb sequences: 600862
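
To see the input/target relationship concretely, each target window is just its input window shifted one character ahead (a minimal sketch using the lists built above; repr keeps the newlines visible):

print(repr(''.join(indices_char[i] for i in sentences[0])))
print(repr(''.join(indices_char[i] for i in next_chars[0])))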

In [13]:
# stack into 2D int arrays, dropping the last two windows
# (the final target window runs past the end of the corpus and is one char short)
sentences = np.concatenate([[np.array(o)] for o in sentences[:-2]])
next_chars = np.concatenate([[np.array(o)] for o in next_chars[:-2]])

In [14]:
sentences.shape, next_chars.shape


Out[14]:
((600860, 40), (600860, 40))

In [15]:
n_fac = 24

In Lesson 6 this is improved on: an RNN feeding into an RNN. See the lecture at ~1:10:00.


In [16]:
model = Sequential([
            # map each char id to a dense n_fac-dimensional embedding
            Embedding(vocab_size, n_fac, input_length=maxlen),
            LSTM(512, input_dim=n_fac, return_sequences=True, dropout_U=0.2, dropout_W=0.2,
                 consume_less='gpu'),
            Dropout(0.2),
            LSTM(512, return_sequences=True, dropout_U=0.2, dropout_W=0.2,
                 consume_less='gpu'),
            Dropout(0.2),
            # a softmax over the vocab at every timestep
            TimeDistributed(Dense(vocab_size)),
            Activation('softmax')
        ])

In [17]:
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
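
Sparse categorical cross-entropy is used because the targets stay as integer char ids rather than one-hot vectors; Keras just needs a trailing singleton axis on them, which is why every fit call below passes np.expand_dims(next_chars, -1). A quick shape check (a minimal sketch, not in the original notebook):

ys = np.expand_dims(next_chars, -1)
print(next_chars.shape, '->', ys.shape)    # (600860, 40) -> (600860, 40, 1)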

Train


In [18]:
def print_example():
    seed_string="ethics is a basic foundation of all that"
    for i in range(320):
        # encode the last maxlen (40) chars as the model input
        x=np.array([char_indices[c] for c in seed_string[-40:]])[np.newaxis,:]
        # take the predicted distribution at the final timestep
        preds = model.predict(x, verbose=0)[0][-1]
        # renormalize so float32 rounding doesn't trip np.random.choice
        preds = preds/np.sum(preds)
        next_char = choice(chars, p=preds)
        seed_string = seed_string + next_char
    print(seed_string)
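
Note that generation is stochastic: choice(chars, p=preds) samples the next char from the predicted distribution, which is why repeated calls (e.g. In [31] vs In [32] below) produce different text. For comparison, a greedy variant would always pick the argmax (a hypothetical sketch, not in the original notebook):

def print_example_greedy():
    # same loop as print_example, but deterministically take the most probable char
    seed_string = "ethics is a basic foundation of all that"
    for i in range(320):
        x = np.array([char_indices[c] for c in seed_string[-40:]])[np.newaxis,:]
        preds = model.predict(x, verbose=0)[0][-1]
        seed_string += indices_char[np.argmax(preds)]
    print(seed_string)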

In [19]:
model.fit(sentences, np.expand_dims(next_chars, -1), batch_size=64, nb_epoch=1)


Epoch 1/1
600860/600860 [==============================] - 1481s - loss: 1.5205  
Out[19]:
<keras.callbacks.History at 0x7f32e347c310>

In [20]:
print_example()


ethics is a basic foundation of all that has hitherto uncreated the one resemblous, indeed, be hidained preferred to new very acclustomed and a'tist method
that
the world is no doubt which our own
theology is not that experience bid to make them as yet question. is the "woman and his e knowledge," and the preacher his own choos: it would be twent ethic. with

In [21]:
model.fit(sentences, np.expand_dims(next_chars, -1), batch_size=64, nb_epoch=1)


Epoch 1/1
600860/600860 [==============================] - 1477s - loss: 1.2873  
Out[21]:
<keras.callbacks.History at 0x7f32ac5f5450>

In [22]:
print_example()


ethics is a basic foundation of all that makes himself pain he arouses a mysterious epicurean work of philosophy, and fundamentally being far most enduring vucgar and at daints and sounds to which a series of his life
called face of being. he would be
rendered with this condition that did
not
believe that has very said at any exactly suicide, who could just 

In [23]:
model.optimizer.lr.set_value(1e-3)    # set_value changes the rate inside the compiled train function; plain attribute assignment is a silent no-op
model.fit(sentences, np.expand_dims(next_chars, -1), batch_size=128, nb_epoch=1)


Epoch 1/1
600860/600860 [==============================] - 1071s - loss: 1.2375  
Out[23]:
<keras.callbacks.History at 0x7f32aa2d6050>

In [24]:
print_example()


ethics is a basic foundation of all that feelings;
on its strength which was more modest and excess of the purpose is a guiltlessness
of children (faculty)"--we immoral: and if they were difficult
to be case, no consequence in physics.=--love almost everything which we should not perceive idea; as a pessimist
matter, he requires
coarse, without
"slave-moral 

In [25]:
model.optimizer.lr.set_value(1e-4)
model.fit(sentences, np.expand_dims(next_chars, -1), batch_size=256, nb_epoch=1)


Epoch 1/1
600860/600860 [==============================] - 938s - loss: 1.2087   
Out[25]:
<keras.callbacks.History at 0x7f32aa2d6710>

In [26]:
print_example()


ethics is a basic foundation of all that great here taken such a theologians, or of the scholar in germany; now to power--it
must not be as it were one.

140. the seriousness of the commonplace of enlightenment such a
way also dream-timely to an
advance of the germans?--it will still be a feeling, of the experience, then they be sure, upon the degree of mora

In [27]:
%mkdir -p 'data/char_rnn/'
model.save_weights('data/char_rnn.h5')

In [28]:
model.optimizer.lr.set_value(1e-5)
model.fit(sentences, np.expand_dims(next_chars, -1), batch_size=256, nb_epoch=1)


Epoch 1/1
600860/600860 [==============================] - 938s - loss: 1.1957   
Out[28]:
<keras.callbacks.History at 0x7f32e42ff790>

In [29]:
print_example()


ethics is a basic foundation of all that decides no longer discredited
with severity, would not wish to say to up into matters and where throughout men who--we extent that they have done the appearances of arward, through the desire for richard
wagner's consciousness and intellectual value to the habit that has hitherto played its
shadow and rapidly deemed r

In [30]:
model.fit(sentences, np.expand_dims(next_chars, -1), batch_size=128, nb_epoch=1)


Epoch 1/1
600860/600860 [==============================] - 1070s - loss: 1.1957  
Out[30]:
<keras.callbacks.History at 0x7f32e34749d0>

In [31]:
print_example()


ethics is a basic foundation of all that has been most eusoleant
proceeding) over convuction, and also somebody; a philosophy of those hand, that of the form of all the
honours, and depression, of the perspective! very instinct implies: a stronger ethics, the "man" is not yet bound for the spiritual power of all absolutely believed that the fact that
the mor

In [32]:
print_example()


ethics is a basic foundation of all that example, renound to bear against "winding" is perhaps false judgment, during
all the extent of all self connections and man and attain hod men taken roff good europeans
how product
to certain man of the "taste of truth." let
us not yet be consusted for the punishment--he should
venture your nor perhaps to the
impulse 

In [33]:
model.save_weights('data/char_rnn.h5')

In [34]:
def print_example(seed_string=''):
    # NB: seed chars must all exist in char_indices (the corpus was lowercased),
    # otherwise the lookup below raises a KeyError
    if not seed_string:
        seed_string="ethics is a basic foundation of all that"
    for i in range(320):
        x=np.array([char_indices[c] for c in seed_string[-40:]])[np.newaxis,:]
        preds = model.predict(x, verbose=0)[0][-1]
        preds = preds/np.sum(preds)
        next_char = choice(chars, p=preds)
        seed_string = seed_string + next_char
    print(seed_string)

In [40]:
seed = 'so um first i was afraid i was petrified'    # use a fresh name: reassigning text would clobber the corpus
print_example(seed)


so um first i was afraid i was petrified, and with achievement and
deeper synthesis, in short, a yea and the
great historical problem of things.

35. there is neither a hammer of few of the suffering of cheerful mediocrity is comforted to the "principle"!--it is the last great spoctise only even in him. he discovers much flung part of its claim, the outside 
