In [ ]:
from neon.backends import gen_backend
be = gen_backend(backend='gpu', batch_size=1)
print(be)
We also define a few parameters and then load the vocabulary. The vocab is a 1:1 mapping of words to numbers. The file imdb.vocab can be downloaded from https://s3-us-west-1.amazonaws.com/nervana-course/imdb.vocab and placed in the data directory.
In [ ]:
import pickle as pkl
sentence_length = 128
vocab_size = 20000
# we have some special codes
pad_char = 0 # padding character
start = 1 # marker for start of review
oov = 2 # when the word is out of the vocab
index_from = 3 # index of first word in vocab
# load the vocab
with open('data/imdb.vocab', 'rb') as f:
    vocab, rev_vocab = pkl.load(f)
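To make the word-to-number mapping concrete, here is a minimal sketch using a made-up four-word vocabulary. The names toy_vocab and toy_rev_vocab are hypothetical, and we only assume that the second object in the pickle is the reverse (number-to-word) mapping; the real contents of imdb.vocab may be organized differently.

```python
import io
import pickle as pkl

# Hypothetical toy vocabulary; the real imdb.vocab is much larger.
toy_vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
# Assumed reverse mapping (number -> word), built by inverting toy_vocab.
toy_rev_vocab = {idx: word for word, idx in toy_vocab.items()}

# Round-trip the pair through pickle, using an in-memory buffer instead of a file.
buf = io.BytesIO()
pkl.dump((toy_vocab, toy_rev_vocab), buf)
buf.seek(0)
loaded_vocab, loaded_rev_vocab = pkl.load(buf)

# Because the mapping is 1:1, looking a word up and reversing it returns the word.
word = "movie"
round_trip = loaded_rev_vocab[loaded_vocab[word]]
print(round_trip)
```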
In [ ]:
from neon.models import Model
model = Model('imdb_lstm.pkl')
# we initialize the model, passing in the size of the input data.
model.initialize(dataset=(sentence_length, 1))
We first generate some buffers on both the host (CPU) and the device (GPU) to hold the input data that we would like to pass to the model for inference. Below, the variable be is the backend that we created with gen_backend earlier in the code. Our backend supports numpy-like functions for allocating buffers on the compute device.
In [ ]:
import numpy as np
input_device = be.zeros((sentence_length, 1), dtype=np.int32) # `be` is the backend that we created earlier in the code.
input_numpy = np.zeros((sentence_length, 1), dtype=np.int32)
Now we write our new movie review. We've included a few samples here, but feel free to write your own and see how well the model responds.
POSITIVE:
"The pace is steady and constant, the characters full and engaging, the relationships and interactions natural showing that you do not need floods of tears to show emotion, screams to show fear, shouting to show dispute or violence to show anger. Naturally Joyce's short story lends the film a ready made structure as perfect as a polished diamond, but the small changes Huston makes such as the inclusion of the poem fit in neatly. It is truly a masterpiece of tact, subtlety and overwhelming beauty."
NEGATIVE:
"Beautiful attracts excellent idea, but ruined with a bad selection of the actors. The main character is a loser and his woman friend and his friend upset viewers. Apart from the first episode all the other become more boring and boring. First, it considers it illogical behavior. No one normal would not behave the way the main character behaves. It all represents a typical Halmark way to endear viewers to the reduced amount of intelligence. Does such a scenario, or the casting director and destroy this question is on Halmark producers. Cat is the main character is wonderful. The main character behaves according to his friend selfish."
NEUTRAL:
"The characters voices were very good. I was only really bothered by Kanga. The music, however, was twice as loud in parts than the dialog, and incongruous to the film. As for the story, it was a bit preachy and militant in tone. Overall, I was disappointed, but I would go again just to see the same excitement on my child's face. I liked Lumpy's laugh..."
In [ ]:
line = """Beautiful attracts excellent idea, but ruined with a bad selection of the actors. The main character is
a loser and his woman friend and his friend upset viewers. Apart from the first episode all the other become
more boring and boring. First, it considers it illogical behavior. No one normal would not behave the way the
main character behaves. It all represents a typical Halmark way to endear viewers to the reduced amount of
intelligence. Does such a scenario, or the casting director and destroy this question is on Halmark
producers. Cat is the main character is wonderful. The main character behaves according to
his friend selfish."""
Before we send the data to the model, we need to convert the string to a sequence of numbers, with each number representing a word, using the vocab that we loaded earlier in the code. If a word is not in our vocab, we use a special out-of-vocab number.
In [ ]:
from neon.data.text_preprocessing import clean_string
tokens = clean_string(line).strip().split()
sent = [len(vocab) + 1 if t not in vocab else vocab[t] for t in tokens]
sent = [start] + [w + index_from for w in sent]
sent = [oov if w >= vocab_size else w for w in sent]
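The interplay of the special codes is easier to see on a tiny example. The sketch below repeats the three steps above on a made-up vocabulary. The names toy_vocab, toy_vocab_size and encode_tokens are hypothetical and exist only for illustration, and the tokens are passed in directly rather than produced by clean_string.

```python
# Special codes, with the same values as defined earlier in the notebook.
start = 1       # marker for start of review
oov = 2         # out-of-vocab code
index_from = 3  # index of first word in vocab

# Hypothetical toy vocabulary and size cap (stand-ins for vocab and vocab_size).
toy_vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
toy_vocab_size = 6

def encode_tokens(tokens, vocab, vocab_size):
    # Unknown words get an index just past the end of the vocab.
    ids = [len(vocab) + 1 if t not in vocab else vocab[t] for t in tokens]
    # Shift every index past the special codes and prepend the start marker.
    ids = [start] + [w + index_from for w in ids]
    # Anything at or beyond the size cap collapses to the out-of-vocab code.
    return [oov if w >= vocab_size else w for w in ids]

encoded = encode_tokens(["the", "movie", "was", "dull"], toy_vocab, toy_vocab_size)
print(encoded)  # [1, 3, 4, 5, 2]
```

In this toy setup the unknown word "dull" ends up as the out-of-vocab code 2. A known word whose shifted index reaches the cap, such as "great" here (3 + 3 = 6), is also mapped to 2.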
The text data is now converted to a list of integers:
In [ ]:
print(sent)
We truncate the input to sentence_length=128 words, keeping the last 128 words of the review. If the text has fewer than 128 words, we pad the front with zeros. The result is stored in the numpy array named input_numpy.
In [ ]:
trunc = sent[-sentence_length:] # take the last sentence_length words
input_numpy[:] = 0 # fill with zeros
input_numpy[-len(trunc):, 0] = trunc # place the input into the numpy array
print(input_numpy.T)
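The truncate-and-pad step can also be sketched on plain Python lists, which makes the left padding easy to inspect. The helper pad_or_truncate is hypothetical and not part of neon; it only mirrors the slicing used above, with a short length of 5 in place of sentence_length.

```python
def pad_or_truncate(ids, length, pad=0):
    # Keep only the last `length` entries of the sequence.
    trunc = ids[-length:]
    # Pad on the left so the words sit at the end of the buffer.
    return [pad] * (length - len(trunc)) + trunc

short_review = pad_or_truncate([1, 5, 6], 5)
long_review = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], 5)
print(short_review)  # [0, 0, 1, 5, 6]
print(long_review)   # [3, 4, 5, 6, 7]
```

Note that for a review longer than the limit, the start marker at the front of the sequence is among the entries that get cut off.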
In [ ]:
input_device.set(input_numpy) # copy the numpy array to device
y_pred = model.fprop(input_device, inference=True) # run the forward pass through the model
print("Predicted sentiment: {}".format(y_pred.get()[1])) # print the estimated sentiment
In [ ]:
def sentiment(line, model):
    # allocate the device and host buffers for a single review
    input_device = be.zeros((sentence_length, 1), dtype=np.int32)
    input_numpy = np.zeros((sentence_length, 1), dtype=np.int32)

    # convert the review text to a list of word indices
    tokens = clean_string(line).strip().split()
    sent = [len(vocab) + 1 if t not in vocab else vocab[t] for t in tokens]
    sent = [start] + [w + index_from for w in sent]
    sent = [oov if w >= vocab_size else w for w in sent]

    trunc = sent[-sentence_length:]  # take the last sentence_length words
    input_numpy[:] = 0  # fill with zeros
    input_numpy[-len(trunc):, 0] = trunc  # place the input into the numpy array

    input_device.set(input_numpy)  # copy the numpy array to device
    y_pred = model.fprop(input_device, inference=True)  # run the forward pass through the model
    return y_pred.get()[1]
Now you can easily enter your own review and get the result. Here we use the more neutral review from above:
In [ ]:
line = """The characters voices were very good. I was only really bothered by Kanga. The music, however, was twice
as loud in parts than the dialog, and incongruous to the film. As for the story, it was a bit preachy and
militant in tone. Overall, I was disappointed, but I would go again just to see the same excitement on my
child's face. I liked Lumpy's laugh..."""
result = sentiment(line, model)
print("Sentiment: {}".format(result))