Sentiment Analysis with an RNN

In this notebook, you'll implement a recurrent neural network that performs sentiment analysis. Using an RNN rather than a feedforward network is more accurate, since we can include information about the sequence of words. Here we'll use a dataset of movie reviews, accompanied by sentiment labels.

The architecture for this network is shown below.

Here, we'll pass words into an embedding layer. We need an embedding layer because we have tens of thousands of words, so we need a more efficient representation for our input data than one-hot encoded vectors. You should have seen this before in the word2vec lesson. You could actually train up an embedding with word2vec and use it here, but it's good enough to just have an embedding layer and let the network learn the embedding table on its own.

From the embedding layer, the new representations will be passed to LSTM cells. These add recurrent connections to the network, so we can include information about the sequence of words in the data. Finally, the LSTM cells feed into a sigmoid output layer. We're using a sigmoid because we're trying to predict whether this text has positive or negative sentiment, so the output layer is just a single unit with a sigmoid activation function.

We don't care about the sigmoid outputs except for the very last one; we can ignore the rest. We'll calculate the cost from the output of the last step and the training label.
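
As a rough sketch of how tensor shapes will flow through this network (the names anticipate the code below; embed_size and lstm_size are hyperparameters we'll define later):

inputs_        : [batch_size, seq_len]              integer word IDs
embed          : [batch_size, seq_len, embed_size]  after the embedding lookup
outputs        : [batch_size, seq_len, lstm_size]   LSTM output at every time step
outputs[:, -1] : [batch_size, lstm_size]            the last time step only
predictions    : [batch_size, 1]                    single sigmoid unit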


In [71]:
import numpy as np
import tensorflow as tf

In [72]:
with open('reviews.txt', 'r') as f:
    reviews = f.read()
with open('labels.txt', 'r') as f:
    labels = f.read()

In [73]:
reviews[:2000]


Out[73]:
'bromwell high is a cartoon comedy . it ran at the same time as some other programs about school life  such as  teachers  . my   years in the teaching profession lead me to believe that bromwell high  s satire is much closer to reality than is  teachers  . the scramble to survive financially  the insightful students who can see right through their pathetic teachers  pomp  the pettiness of the whole situation  all remind me of the schools i knew and their students . when i saw the episode in which a student repeatedly tried to burn down the school  i immediately recalled . . . . . . . . . at . . . . . . . . . . high . a classic line inspector i  m here to sack one of your teachers . student welcome to bromwell high . i expect that many adults of my age think that bromwell high is far fetched . what a pity that it isn  t   \nstory of a man who has unnatural feelings for a pig . starts out with a opening scene that is a terrific example of absurd comedy . a formal orchestra audience is turned into an insane  violent mob by the crazy chantings of it  s singers . unfortunately it stays absurd the whole time with no general narrative eventually making it just too off putting . even those from the era should be turned off . the cryptic dialogue would make shakespeare seem easy to a third grader . on a technical level it  s better than you might think with some good cinematography by future great vilmos zsigmond . future stars sally kirkland and frederic forrest can be seen briefly .  \nhomelessness  or houselessness as george carlin stated  has been an issue for years but never a plan to help those on the street that were once considered human who did everything from going to school  work  or vote for the matter . most people think of the homeless as just a lost cause while worrying about things such as racism  the war on iraq  pressuring kids to succeed  technology  the elections  inflation  or worrying if they  ll be next to end up on the streets .  br    br   but what if y'

Data preprocessing

The first step when building a neural network model is getting your data into the proper form to feed into the network. Since we're using embedding layers, we'll need to encode each word with an integer. We'll also want to clean it up a bit.

You can see an example of the reviews data above. We'll want to get rid of those periods. Also, you might notice that the reviews are delimited with newlines \n. To deal with those, I'm going to split the text into individual reviews using \n as the delimiter. Then I can combine all the reviews back together into one big string.

First, let's remove all punctuation. Then get all the text without the newlines and split it into individual words.


In [74]:
from string import punctuation
all_text = ''.join([c for c in reviews if c not in punctuation])
reviews = all_text.split('\n')

all_text = ' '.join(reviews)
words = all_text.split()

In [75]:
all_text[:2000]


Out[75]:
'bromwell high is a cartoon comedy  it ran at the same time as some other programs about school life  such as  teachers   my   years in the teaching profession lead me to believe that bromwell high  s satire is much closer to reality than is  teachers   the scramble to survive financially  the insightful students who can see right through their pathetic teachers  pomp  the pettiness of the whole situation  all remind me of the schools i knew and their students  when i saw the episode in which a student repeatedly tried to burn down the school  i immediately recalled          at           high  a classic line inspector i  m here to sack one of your teachers  student welcome to bromwell high  i expect that many adults of my age think that bromwell high is far fetched  what a pity that it isn  t    story of a man who has unnatural feelings for a pig  starts out with a opening scene that is a terrific example of absurd comedy  a formal orchestra audience is turned into an insane  violent mob by the crazy chantings of it  s singers  unfortunately it stays absurd the whole time with no general narrative eventually making it just too off putting  even those from the era should be turned off  the cryptic dialogue would make shakespeare seem easy to a third grader  on a technical level it  s better than you might think with some good cinematography by future great vilmos zsigmond  future stars sally kirkland and frederic forrest can be seen briefly    homelessness  or houselessness as george carlin stated  has been an issue for years but never a plan to help those on the street that were once considered human who did everything from going to school  work  or vote for the matter  most people think of the homeless as just a lost cause while worrying about things such as racism  the war on iraq  pressuring kids to succeed  technology  the elections  inflation  or worrying if they  ll be next to end up on the streets   br    br   but what if you were given a bet to live on the st'

In [76]:
words[:100]


Out[76]:
['bromwell',
 'high',
 'is',
 'a',
 'cartoon',
 'comedy',
 'it',
 'ran',
 'at',
 'the',
 'same',
 'time',
 'as',
 'some',
 'other',
 'programs',
 'about',
 'school',
 'life',
 'such',
 'as',
 'teachers',
 'my',
 'years',
 'in',
 'the',
 'teaching',
 'profession',
 'lead',
 'me',
 'to',
 'believe',
 'that',
 'bromwell',
 'high',
 's',
 'satire',
 'is',
 'much',
 'closer',
 'to',
 'reality',
 'than',
 'is',
 'teachers',
 'the',
 'scramble',
 'to',
 'survive',
 'financially',
 'the',
 'insightful',
 'students',
 'who',
 'can',
 'see',
 'right',
 'through',
 'their',
 'pathetic',
 'teachers',
 'pomp',
 'the',
 'pettiness',
 'of',
 'the',
 'whole',
 'situation',
 'all',
 'remind',
 'me',
 'of',
 'the',
 'schools',
 'i',
 'knew',
 'and',
 'their',
 'students',
 'when',
 'i',
 'saw',
 'the',
 'episode',
 'in',
 'which',
 'a',
 'student',
 'repeatedly',
 'tried',
 'to',
 'burn',
 'down',
 'the',
 'school',
 'i',
 'immediately',
 'recalled',
 'at',
 'high']

In [77]:
reviews[:10]


Out[77]:
['bromwell high is a cartoon comedy  it ran at the same time as some other programs about school life  such as  teachers   my   years in the teaching profession lead me to believe that bromwell high  s satire is much closer to reality than is  teachers   the scramble to survive financially  the insightful students who can see right through their pathetic teachers  pomp  the pettiness of the whole situation  all remind me of the schools i knew and their students  when i saw the episode in which a student repeatedly tried to burn down the school  i immediately recalled          at           high  a classic line inspector i  m here to sack one of your teachers  student welcome to bromwell high  i expect that many adults of my age think that bromwell high is far fetched  what a pity that it isn  t   ',
 'story of a man who has unnatural feelings for a pig  starts out with a opening scene that is a terrific example of absurd comedy  a formal orchestra audience is turned into an insane  violent mob by the crazy chantings of it  s singers  unfortunately it stays absurd the whole time with no general narrative eventually making it just too off putting  even those from the era should be turned off  the cryptic dialogue would make shakespeare seem easy to a third grader  on a technical level it  s better than you might think with some good cinematography by future great vilmos zsigmond  future stars sally kirkland and frederic forrest can be seen briefly   ',
 'homelessness  or houselessness as george carlin stated  has been an issue for years but never a plan to help those on the street that were once considered human who did everything from going to school  work  or vote for the matter  most people think of the homeless as just a lost cause while worrying about things such as racism  the war on iraq  pressuring kids to succeed  technology  the elections  inflation  or worrying if they  ll be next to end up on the streets   br    br   but what if you were given a bet to live on the streets for a month without the luxuries you once had from a home  the entertainment sets  a bathroom  pictures on the wall  a computer  and everything you once treasure to see what it  s like to be homeless  that is goddard bolt  s lesson   br    br   mel brooks  who directs  who stars as bolt plays a rich man who has everything in the world until deciding to make a bet with a sissy rival  jeffery tambor  to see if he can live in the streets for thirty days without the luxuries if bolt succeeds  he can do what he wants with a future project of making more buildings  the bet  s on where bolt is thrown on the street with a bracelet on his leg to monitor his every move where he can  t step off the sidewalk  he  s given the nickname pepto by a vagrant after it  s written on his forehead where bolt meets other characters including a woman by the name of molly  lesley ann warren  an ex  dancer who got divorce before losing her home  and her pals sailor  howard morris  and fumes  teddy wilson  who are already used to the streets  they  re survivors  bolt isn  t  he  s not used to reaching mutual agreements like he once did when being rich where it  s fight or flight  kill or be killed   br    br   while the love connection between molly and bolt wasn  t necessary to plot  i found  life stinks  to be one of mel brooks  observant films where prior to being a comedy  it shows a tender side compared to his slapstick work such as blazing saddles  young frankenstein  or spaceballs for the matter  to show what it  s like having something valuable before losing it the next day or on the other hand making a stupid bet like all rich people do when they don  t know what to do with their money  maybe they should give it to the homeless instead of using it like monopoly money   br    br   or maybe this film will inspire you to help others   ',
 'airport    starts as a brand new luxury    plane is loaded up with valuable paintings  such belonging to rich businessman philip stevens  james stewart  who is flying them  a bunch of vip  s to his estate in preparation of it being opened to the public as a museum  also on board is stevens daughter julie  kathleen quinlan   her son  the luxury jetliner takes off as planned but mid  air the plane is hi  jacked by the co  pilot chambers  robert foxworth   his two accomplice  s banker  monte markham   wilson  michael pataki  who knock the passengers  crew out with sleeping gas  they plan to steal the valuable cargo  land on a disused plane strip on an isolated island but while making his descent chambers almost hits an oil rig in the ocean  loses control of the plane sending it crashing into the sea where it sinks to the bottom right bang in the middle of the bermuda triangle  with air in short supply  water leaking in  having flown over    miles off course the problems mount for the survivor  s as they await help with time fast running out     br    br   also known under the slightly different tile airport     this second sequel to the smash  hit disaster thriller airport       was directed by jerry jameson  while once again like it  s predecessors i can  t say airport    is any sort of forgotten classic it is entertaining although not necessarily for the right reasons  out of the three airport films i have seen so far i actually liked this one the best  just  it has my favourite plot of the three with a nice mid  air hi  jacking  then the crashing  didn  t he see the oil rig    sinking of the     maybe the makers were trying to cross the original airport with another popular disaster flick of the period the poseidon adventure         submerged is where it stays until the end with a stark dilemma facing those trapped inside  either suffocate when the air runs out or drown as the    floods or if any of the doors are opened  it  s a decent idea that could have made for a great little disaster flick but bad unsympathetic character  s  dull dialogue  lethargic set  pieces  a real lack of danger or suspense or tension means this is a missed opportunity  while the rather sluggish plot keeps one entertained for    odd minutes not that much happens after the plane sinks  there  s not as much urgency as i thought there should have been  even when the navy become involved things don  t pick up that much with a few shots of huge ships  helicopters flying about but there  s just something lacking here  george kennedy as the jinxed airline worker joe patroni is back but only gets a couple of scenes  barely even says anything preferring to just look worried in the background   br    br   the home video  theatrical version of airport    run    minutes while the us tv versions add an extra hour of footage including a new opening credits sequence  many more scenes with george kennedy as patroni  flashbacks to flesh out character  s  longer rescue scenes  the discovery or another couple of dead bodies including the navigator  while i would like to see this extra footage i am not sure i could sit through a near three hour cut of airport     as expected the film has dated badly with horrible fashions  interior design choices  i will say no more other than the toy plane model effects aren  t great either  along with the other two airport sequels this takes pride of place in the razzie award  s hall of shame although i can think of lots of worse films than this so i reckon that  s a little harsh  the action 
scenes are a little dull unfortunately  the pace is slow  not much excitement or tension is generated which is a shame as i reckon this could have been a pretty good film if made properly   br    br   the production values are alright if nothing spectacular  the acting isn  t great  two time oscar winner jack lemmon has said since it was a mistake to star in this  one time oscar winner james stewart looks old  frail  also one time oscar winner lee grant looks drunk while sir christopher lee is given little to do  there are plenty of other familiar faces to look out for too   br    br   airport    is the most disaster orientated of the three airport films so far  i liked the ideas behind it even if they were a bit silly  the production  bland direction doesn  t help though  a film about a sunken plane just shouldn  t be this boring or lethargic  followed by the concorde    airport            ',
 'brilliant over  acting by lesley ann warren  best dramatic hobo lady i have ever seen  and love scenes in clothes warehouse are second to none  the corn on face is a classic  as good as anything in blazing saddles  the take on lawyers is also superb  after being accused of being a turncoat  selling out his boss  and being dishonest the lawyer of pepto bolt shrugs indifferently  i  m a lawyer  he says  three funny words  jeffrey tambor  a favorite from the later larry sanders show  is fantastic here too as a mad millionaire who wants to crush the ghetto  his character is more malevolent than usual  the hospital scene  and the scene where the homeless invade a demolition site  are all  time classics  look for the legs scene and the two big diggers fighting  one bleeds   this movie gets better each time i see it  which is quite often    ',
 'this film lacked something i couldn  t put my finger on at first charisma on the part of the leading actress  this inevitably translated to lack of chemistry when she shared the screen with her leading man  even the romantic scenes came across as being merely the actors at play  it could very well have been the director who miscalculated what he needed from the actors  i just don  t know   br    br   but could it have been the screenplay  just exactly who was the chef in love with  he seemed more enamored of his culinary skills and restaurant  and ultimately of himself and his youthful exploits  than of anybody or anything else  he never convinced me he was in love with the princess   br    br   i was disappointed in this movie  but  don  t forget it was nominated for an oscar  so judge for yourself   ',
 'this is easily the most underrated film inn the brooks cannon  sure  its flawed  it does not give a realistic view of homelessness  unlike  say  how citizen kane gave a realistic view of lounge singers  or titanic gave a realistic view of italians you idiots   many of the jokes fall flat  but still  this film is very lovable in a way many comedies are not  and to pull that off in a story about some of the most traditionally reviled members of society is truly impressive  its not the fisher king  but its not crap  either  my only complaint is that brooks should have cast someone else in the lead  i love mel as a director and writer  not so much as a lead    ',
 'sorry everyone    i know this is supposed to be an  art  film   but wow  they should have handed out guns at the screening so people could blow their brains out and not watch  although the scene design and photographic direction was excellent  this story is too painful to watch  the absence of a sound track was brutal  the loooonnnnng shots were too long  how long can you watch two people just sitting there and talking  especially when the dialogue is two people complaining  i really had a hard time just getting through this film  the performances were excellent  but how much of that dark  sombre  uninspired  stuff can you take  the only thing i liked was maureen stapleton and her red dress and dancing scene  otherwise this was a ripoff of bergman  and i  m no fan f his either  i think anyone who says they enjoyed     hours of this is   well  lying   ',
 'this is not the typical mel brooks film  it was much less slapstick than most of his movies and actually had a plot that was followable  leslie ann warren made the movie  she is such a fantastic  under  rated actress  there were some moments that could have been fleshed out a bit more  and some scenes that could probably have been cut to make the room to do so  but all in all  this is worth the price to rent and see it  the acting was good overall  brooks himself did a good job without his characteristic speaking to directly to the audience  again  warren was the best actor in the movie  but  fume  and  sailor  both played their parts well   ',
 'when i was little my parents took me along to the theater to see interiors  it was one of many movies i watched with my parents  but this was the only one we walked out of  since then i had never seen interiors until just recently  and i could have lived out the rest of my life without it  what a pretentious  ponderous  and painfully boring piece of    s wine and cheese tripe  woody allen is one of my favorite directors but interiors is by far the worst piece of crap of his career  in the unmistakable style of ingmar berman  allen gives us a dark  angular  muted  insight in to the lives of a family wrought by the psychological damage caused by divorce  estrangement  career  love  non  love  halitosis  whatever  the film  intentionally  has no comic relief  no music  and is drenched in shadowy pathos  this film style can be best defined as expressionist in nature  using an improvisational method of dialogue to illicit a  more pronounced depth of meaning and truth   but woody allen is no ingmar bergman  the film is painfully slow and dull  but beyond that  i simply had no connection with or sympathy for any of the characters  instead i felt only contempt for this parade of shuffling  whining  nicotine stained  martyrs in a perpetual quest for identity  amid a backdrop of cosmopolitan affluence and baked brie intelligentsia the story looms like a fart in the room  everyone speaks in affected platitudes and elevated language between cigarettes  everyone is  lost  and  struggling   desperate to find direction or understanding or whatever and it just goes on and on to the point where you just want to slap all of them  it  s never about resolution  it  s only about interminable introspective babble  it is nothing more than a psychological drama taken to an extreme beyond the audience  s ability to connect  woody allen chose to make characters so immersed in themselves we feel left out  and for that reason i found this movie painfully self indulgent and spiritually draining  i see what he was going for but his insistence on promoting his message through prozac prose and distorted film techniques jettisons it past the point of relevance  i highly recommend this one if you  re feeling a little too happy and need something to remind you of death  otherwise  let  s just pretend this film never happened   ']

Encoding the words

The embedding lookup requires that we pass in integers to our network. The easiest way to do this is to create dictionaries that map the words in the vocabulary to integers. Then we can convert each of our reviews into integers so they can be passed into the network.

Exercise: Now you're going to encode the words with integers. Build a dictionary that maps words to integers. Later we're going to pad our input vectors with zeros, so make sure the integers start at 1, not 0. Also, convert the reviews to integers and store the reviews in a new list called reviews_ints.


In [78]:
from collections import Counter
counter = Counter(words)
#print(counter.most_common())
vocabs = set(counter.keys())
vocab_size = len(vocabs)
print(vocab_size)


74072

In [81]:
# Create your dictionary that maps vocab words to integers here
# Start the integers at 1; 0 is reserved for padding later
vocab_to_int = {word: ii + 1 for ii, word in enumerate(vocabs)}

# Convert the reviews to integers, same shape as reviews list, but with integers
reviews_ints = []
for review in reviews:
    reviews_ints.append([vocab_to_int[word] for word in review.split()])

reviews_ints[0]


Out[81]:
[33763,
 3104,
 19407,
 67410,
 51917,
 24488,
 10671,
 65200,
 12505,
 840,
 4908,
 19247,
 72388,
 58242,
 62421,
 63531,
 70470,
 68036,
 3696,
 38402,
 72388,
 68648,
 47498,
 35417,
 39477,
 840,
 40092,
 63191,
 64210,
 55788,
 41912,
 22962,
 39574,
 33763,
 3104,
 70692,
 54923,
 19407,
 67436,
 2475,
 41912,
 27970,
 13237,
 19407,
 68648,
 840,
 41387,
 41912,
 4663,
 58478,
 840,
 16373,
 8108,
 8082,
 37092,
 42809,
 24350,
 33541,
 7481,
 14590,
 68648,
 27484,
 840,
 5092,
 29180,
 840,
 60007,
 39536,
 3841,
 39814,
 55788,
 29180,
 840,
 29480,
 48709,
 15205,
 44962,
 7481,
 8108,
 1513,
 48709,
 54046,
 840,
 50521,
 39477,
 14164,
 67410,
 16170,
 72466,
 22327,
 41912,
 41935,
 73669,
 840,
 68036,
 48709,
 8610,
 66206,
 12505,
 3104,
 67410,
 16746,
 40067,
 21661,
 48709,
 63505,
 21377,
 41912,
 22469,
 41319,
 29180,
 26414,
 68648,
 16170,
 21727,
 41912,
 33763,
 3104,
 48709,
 40897,
 39574,
 27722,
 44267,
 29180,
 47498,
 57402,
 25224,
 39574,
 33763,
 3104,
 19407,
 34306,
 40743,
 58888,
 67410,
 64327,
 39574,
 10671,
 16575,
 67652]
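
To sanity-check the encoding, you can build the reverse mapping and decode a review back into words (a quick sketch using the vocab_to_int dictionary from above):

int_to_vocab = {ii: word for word, ii in vocab_to_int.items()}
print(' '.join(int_to_vocab[ii] for ii in reviews_ints[0][:10]))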

In [93]:
print(reviews[0])
print(len(reviews[0].split(' ')))
print(len(reviews_ints[0]))
print(reviews_ints[:2])


bromwell high is a cartoon comedy  it ran at the same time as some other programs about school life  such as  teachers   my   years in the teaching profession lead me to believe that bromwell high  s satire is much closer to reality than is  teachers   the scramble to survive financially  the insightful students who can see right through their pathetic teachers  pomp  the pettiness of the whole situation  all remind me of the schools i knew and their students  when i saw the episode in which a student repeatedly tried to burn down the school  i immediately recalled          at           high  a classic line inspector i  m here to sack one of your teachers  student welcome to bromwell high  i expect that many adults of my age think that bromwell high is far fetched  what a pity that it isn  t   
185
140
[[33763, 3104, 19407, 67410, 51917, 24488, 10671, 65200, 12505, 840, 4908, 19247, 72388, 58242, 62421, 63531, 70470, 68036, 3696, 38402, 72388, 68648, 47498, 35417, 39477, 840, 40092, 63191, 64210, 55788, 41912, 22962, 39574, 33763, 3104, 70692, 54923, 19407, 67436, 2475, 41912, 27970, 13237, 19407, 68648, 840, 41387, 41912, 4663, 58478, 840, 16373, 8108, 8082, 37092, 42809, 24350, 33541, 7481, 14590, 68648, 27484, 840, 5092, 29180, 840, 60007, 39536, 3841, 39814, 55788, 29180, 840, 29480, 48709, 15205, 44962, 7481, 8108, 1513, 48709, 54046, 840, 50521, 39477, 14164, 67410, 16170, 72466, 22327, 41912, 41935, 73669, 840, 68036, 48709, 8610, 66206, 12505, 3104, 67410, 16746, 40067, 21661, 48709, 63505, 21377, 41912, 22469, 41319, 29180, 26414, 68648, 16170, 21727, 41912, 33763, 3104, 48709, 40897, 39574, 27722, 44267, 29180, 47498, 57402, 25224, 39574, 33763, 3104, 19407, 34306, 40743, 58888, 67410, 64327, 39574, 10671, 16575, 67652], [36319, 29180, 67410, 66139, 8082, 40177, 12060, 57688, 67537, 67410, 61622, 32303, 16110, 46935, 67410, 26925, 6119, 39574, 19407, 67410, 53677, 40710, 29180, 17336, 24488, 67410, 31182, 8559, 38730, 19407, 6213, 9785, 63642, 24521, 65321, 41653, 43948, 840, 11413, 16130, 29180, 10671, 70692, 50652, 10622, 10671, 44493, 17336, 840, 60007, 19247, 46935, 41555, 70103, 31688, 815, 49248, 10671, 37603, 49909, 66038, 29446, 59743, 46626, 23950, 840, 27470, 6951, 46968, 6213, 66038, 840, 54627, 51118, 8039, 12617, 33962, 43822, 4412, 41912, 67410, 28548, 70629, 73153, 67410, 54547, 69771, 10671, 70692, 35495, 13237, 21530, 61806, 25224, 46935, 58242, 26080, 29554, 43948, 63063, 22841, 21643, 61611, 63063, 52403, 28486, 53510, 44962, 48089, 64206, 37092, 46968, 10905, 28873]]

Encoding the labels

Our labels are "positive" or "negative". To use these labels in our network, we need to convert them to 0 and 1.

Exercise: Convert labels from positive and negative to 1 and 0, respectively.


In [94]:
def get_target_for_label(label):
    if label == 'positive':
        return 1
    else:
        return 0

# Convert labels to 1s and 0s for 'positive' and 'negative'
labels_ints = [get_target_for_label(text_label) for text_label in labels.split('\n')]
# Store as a NumPy array so we can slice it with y[:, None] during training
labels = np.array(labels_ints)

If you built labels correctly, you should see the next output.


In [102]:
review_lens = Counter([len(x) for x in reviews_ints])
print("Zero-length reviews: {}".format(review_lens[0]))
print("Maximum review length: {}".format(max(review_lens)))


Zero-length reviews: 0
Maximum review length: 2514

Okay, a couple of issues here. We seem to have one review with zero length. And the maximum review length is way too many steps for our RNN. Let's truncate to 200 steps. For reviews shorter than 200, we'll pad with 0s. For reviews longer than 200, we'll truncate them to the first 200 words.

Exercise: First, remove the review with zero length from the reviews_ints list, and remove the corresponding label so the data and labels stay aligned.


In [103]:
# Filter out that review with 0 length, keeping the labels aligned
non_zero_idx = [ii for ii, review in enumerate(reviews_ints) if len(review) > 0]
reviews_ints = [reviews_ints[ii] for ii in non_zero_idx]
labels = labels[non_zero_idx]

Exercise: Now, create an array features that contains the data we'll pass to the network. The data should come from reviews_ints, since we want to feed integers to the network. Each row should be 200 elements long. For reviews shorter than 200 words, left pad with 0s. That is, if the review is ['best', 'movie', 'ever'], [117, 18, 128] as integers, the row will look like [0, 0, 0, ..., 0, 117, 18, 128]. For reviews longer than 200, use only the first 200 words as the feature vector.

This isn't trivial and there are a bunch of ways to do this. But, if you're going to be building your own deep learning networks, you're going to have to get used to preparing your data.


In [149]:
seq_len = 200

reviews_ints_optimized = []
for review_ints in reviews_ints:
    if len(review_ints) > seq_len:
        truncated = review_ints[0:seq_len]
        reviews_ints_optimized.append(truncated)
    else:
        padded = [0] * (seq_len - len(review_ints)) + review_ints
        reviews_ints_optimized.append(padded)

features = np.array(reviews_ints_optimized)
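
Since there are a bunch of ways to do this, here's an equivalent, more vectorized sketch: start from an all-zeros array and copy each (truncated) review into the right-hand end of its row.

features = np.zeros((len(reviews_ints), seq_len), dtype=int)
for i, row in enumerate(reviews_ints):
    row = row[:seq_len]
    features[i, -len(row):] = row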

In [148]:
len(reviews_ints_optimized[1130])


Out[148]:
200

In [151]:
features[9,:100]


Out[151]:
array([ 1513, 48709, 65002, 37021, 47498, 70156, 43727, 55788, 63125,
       41912,   840, 14846, 41912, 42809,   551, 10671, 65002, 41319,
       29180, 27722, 10488, 48709, 41187, 46935, 47498, 70156, 71020,
       55811, 65002,   840,  1337, 41319, 31137, 60502, 16110, 29180,
       58412, 17330, 48709,  2044, 26542, 10905,   551, 62876, 37603,
       29031, 44962, 48709,  6136, 23089,  6861, 16110,   840, 32311,
       29180, 47498,  3696, 26906, 10671, 58888, 67410, 63591, 40460,
       44962, 36737, 36542, 28145, 29180, 70692,  7346, 44962, 47318,
       48458, 42556, 58741, 19407, 41319, 29180, 47498, 12537, 62802,
       71020,   551, 19407, 43948, 34306,   840, 59085, 28145, 29180,
       35703, 29180,   748, 44166, 39477,   840,   678, 35662, 29180, 67958])

If you built features correctly, it should look like the cell output below.


In [13]:
features[:10,:100]


Out[13]:
array([[    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0, 21282,   308,     6,
            3,  1050,   207,     8,  2143,    32,     1,   171,    57,
           15,    49,    81,  5832,    44,   382,   110,   140,    15,
         5236,    60,   154,     9,     1,  5014,  5899,   475,    71,
            5,   260,    12, 21282,   308,    13,  1981,     6,    74,
         2396],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,    63,     4,     3,   125,
           36,    47,  7487,  1397,    16,     3,  4212,   505,    45,
           17],
       [23249,    42, 55982,    15,   707, 17288,  3398,    47,    77,
           35,  1822,    16,   154,    19,   114,     3,  1308,     5,
          336,   147,    22,     1,   857,    12,    70,   281,  1170,
          399,    36,   120,   283,    38,   169,     5,   382,   158,
           42,  2278,    16,     1,   541,    90,    78,   102,     4,
            1,  3256,    15,    43,     3,   407,  1069,   136,  8165,
           44,   182,   140,    15,  3054,     1,   321,    22,  4827,
        28571,   346,     5,  3093,  2094,     1, 18970, 18062,    42,
         8165,    46,    33,   236,    29,   370,     5,   130,    56,
           22,     1,  1928,     7,     7,    19,    48,    46,    21,
           70,   345,     3,  2102,     5,   408,    22,     1,  1928,
           16],
       [ 4504,   505,    15,     3,  3352,   162,  8369,  1655,     6,
         4860,    56,    17,  4513,  5629,   140, 11938,     5,   996,
         4969,  2947,  4464,   566,  1202,    36,     6,  1520,    96,
            3,   744,     4, 28265,    13,     5,    27,  3464,     9,
        10794,     4,     8,   111,  3024,     5,     1,  1027,    15,
            3,  4400,    82,    22,  2058,     6,  4464,   538,  2773,
         7123, 41932,    41,   463,     1,  8369, 62989,   302,   123,
           15,  4230,    19,  1671,   923,     1,  1655,     6,  6166,
        20692,    34,     1,   980,  1751, 22804,   646, 25292,    27,
          106, 11901,    13, 14278, 15496, 18701,  2459,   466, 21189,
           36,  3267,     1,  6436,  1020,    45,    17,  2700,  2500,
           33],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,   520,   119,   113,    34,
        16673,  1817,  3744,   117,   885, 22019,   721,    10,    28,
          124,   108,     2,   115,   137,     9,  1626,  7742,    26,
          330,     5,   590,     1,  6165,    22,   386,     6,     3,
          349,    15,    50,    15,   231,     9,  7484, 11646,     1,
          191,    22,  8994,     6,    82,   881,   101,   111,  3590,
            4],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
           11,    20,  3662,   141,    10,   423,    23,   272,    60,
         4362,    22,    32,    84,  3305,    22,     1,   172,     4,
            1,   952,   507,    11,  4996,  5387,     5,   574,     4,
         1154,    54,    53,  5328,     1,   261,    17,    41,   952,
          125,    59,     1,   712,   137,   379,   627,    15,   111,
         1511],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,    11,     6,   692,     1,    90,
         2158,    20, 11793,     1,  2818,  5218,   249,    92,  3007,
            8,   126,    24,   201,     3,   803,   634,     4, 23249,
         1002],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,   786,   295,    10,   122,    11,     6,   418,
            5,    29,    35,   482,    20,    19,  1285,    33,   142,
           28,  2667,    45,  1844,    32,     1,  2790,    37,    78,
           97,  2439,    67,  3952,    45,     2,    24,   105,   256,
            1,   134,  1572,     2, 12612,   452,    14,   319,    11,
           63,     6,    98,  1322,     5,   105,     1,  3766,     4,
            3],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,    11,     6,
           24,     1,   779,  3708,  2818,    20,     8,    14,    74,
          325,  2739,    73,    90,     4,    27,    99,     2,   165,
           68],
       [   54,    10,    14,   116,    60,   798,   552,    71,   364,
            5,     1,   731,     5,    66,  8089,     8,    14,    30,
            4,   109,    99,    10,   293,    17,    60,   798,    19,
           11,    14,     1,    64,    30,    69,  2506,    45,     4,
          234,    93,    10,    68,   114,   108,  8089,   363,    43,
         1009,     2,    10,    97,    28,  1431,    45,     1,   357,
            4,    60,   110,   205,     8,    48,     3,  1929, 11029,
            2,  2127,   354,   412,     4,    13,  6676,     2,  2975,
         5174,  2125,  1371,     6,    30,     4,    60,   502,   876,
           19,  8089,     6,    34,   227,     1,   247,   412,     4,
          582,     4,    27,   599,     9,     1, 13829,   395,     4,
        14175]])

Training, Validation, Test

With our data in nice shape, we'll split it into training, validation, and test sets.

Exercise: Create the training, validation, and test sets here. You'll need to create sets for the features and the labels, train_x and train_y for example. Define a split fraction, split_frac, as the fraction of data to keep in the training set. Usually this is set to 0.8 or 0.9. The rest of the data will be split in half to create the validation and testing data.


In [160]:
split_frac = 0.8

def split_data(data, frac):
    first_group_size = int(round(len(data) * frac))
    group_a, group_b = data[:first_group_size], data[first_group_size:]
    return group_a, group_b

train_x, val_x = split_data(features, split_frac)
train_y, val_y = split_data(labels, split_frac)

val_x, test_x = split_data(val_x, 0.5)
val_y, test_y = split_data(val_y, 0.5)

print("\t\t\tFeature Shapes:")
print("Train set: \t\t{}".format(train_x.shape),
      "\nValidation set: \t{}".format(val_x.shape),
      "\nTest set: \t\t{}".format(test_x.shape))


			Feature Shapes:
Train set: 		(20000, 200) 
Validation set: 	(2500, 200) 
Test set: 		(2500, 200)

With train, validation, and test fractions of 0.8, 0.1, 0.1, the final shapes should look like:

                    Feature Shapes:
Train set:          (20000, 200) 
Validation set:     (2500, 200) 
Test set:           (2500, 200)

Build the graph

Here, we'll build the graph. First up, defining the hyperparameters.

  • lstm_size: Number of units in the hidden layers in the LSTM cells. Usually larger is better performance-wise. Common values are 128, 256, 512, etc.
  • lstm_layers: Number of LSTM layers in the network. I'd start with 1, then add more if I'm underfitting.
  • batch_size: The number of reviews to feed the network in one training pass. Typically this should be set as high as you can go without running out of memory.
  • learning_rate: Learning rate for the optimizer.

In [161]:
lstm_size = 256
lstm_layers = 1
batch_size = 500
learning_rate = 0.001

For the network itself, we'll be passing in our 200-element-long review vectors. Each batch will be batch_size vectors. We'll also be using dropout on the LSTM layer, so we'll make a placeholder for the keep probability.

Exercise: Create the inputs_, labels_, and dropout keep_prob placeholders using tf.placeholder. labels_ needs to be two-dimensional to work with some functions later. Since keep_prob is a scalar (a 0-dimensional tensor), you shouldn't provide a size to tf.placeholder.


In [165]:
n_words = len(vocab_to_int) + 1  # +1 because we reserved 0 for padding; word IDs start at 1
print(n_words)
# Create the graph object
graph = tf.Graph()
# Add nodes to the graph
with graph.as_default():
    inputs_ = tf.placeholder(tf.int32, [None, None], name='inputs')
    labels_ = tf.placeholder(tf.int32, [None, 1], name='labels')
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')


74073

Embedding

Now we'll add an embedding layer. We need to do this because there are about 74,000 words in our vocabulary, and it is massively inefficient to one-hot encode them. You should remember dealing with this problem from the word2vec lesson. Instead of one-hot encoding, we can have an embedding layer and use that layer as a lookup table. You could train an embedding layer using word2vec, then load it here. But it's fine to just make a new layer and let the network learn the weights.

Exercise: Create the embedding lookup matrix as a tf.Variable. Use that embedding matrix to get the embedded vectors to pass to the LSTM cell with tf.nn.embedding_lookup. This function takes the embedding matrix and an input tensor, such as the review vectors. Then it'll return another tensor with the embedded vectors. So, if the embedding layer has 200 units, the function will return a tensor with size [batch_size, seq_len, 200].


In [166]:
# Size of the embedding vectors (number of units in the embedding layer)
embed_size = 300 

with graph.as_default():
    # Create the embedding weight matrix, initialized uniformly in [-1, 1)
    embedding = tf.Variable(tf.random_uniform([n_words, embed_size], minval=-1, maxval=1))
    # Use tf.nn.embedding_lookup to get the hidden layer output
    embed = tf.nn.embedding_lookup(embedding, inputs_)
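
As a quick sanity check of the shape claim above (assuming the graph built so far, with embed_size = 300):

with graph.as_default():
    print(embed.get_shape())  # (?, ?, 300) -> [batch_size, seq_len, embed_size]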

LSTM cell

Next, we'll create our LSTM cells to use in the recurrent network (TensorFlow documentation). Here we are just defining what the cells look like. This isn't actually building the graph, just defining the type of cells we want in our graph.

To create a basic LSTM cell for the graph, you'll want to use tf.contrib.rnn.BasicLSTMCell. Looking at the function documentation:

tf.contrib.rnn.BasicLSTMCell(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=<function tanh at 0x109f1ef28>)

you can see it takes a parameter called num_units, the number of units in the cell, called lstm_size in this code. So then, you can write something like

lstm = tf.contrib.rnn.BasicLSTMCell(num_units)

to create an LSTM cell with num_units. Next, you can add dropout to the cell with tf.contrib.rnn.DropoutWrapper. This just wraps the cell in another cell, but with dropout added to the inputs and/or outputs. It's a really convenient way to make your network better with almost no effort! So you'd do something like

drop = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=keep_prob)

Most of the time, your network will have better performance with more layers. That's sort of the magic of deep learning: adding more layers allows the network to learn really complex relationships. Again, there is a simple way to create multiple layers of LSTM cells with tf.contrib.rnn.MultiRNNCell:

cell = tf.contrib.rnn.MultiRNNCell([drop] * lstm_layers)

Here, [drop] * lstm_layers creates a list of cells (drop) that is lstm_layers long. The MultiRNNCell wrapper builds this into multiple layers of RNN cells, one for each cell in the list.

So the final cell you're using in the network is actually multiple (or just one) LSTM cells with dropout. But it all works the same from an architectural viewpoint; it's just a more complicated graph in the cell.

Exercise: Below, use tf.contrib.rnn.BasicLSTMCell to create an LSTM cell. Then, add dropout to it with tf.contrib.rnn.DropoutWrapper. Finally, create multiple LSTM layers with tf.contrib.rnn.MultiRNNCell.

Here is a tutorial on building RNNs that will help you out.


In [170]:
with graph.as_default():
    # Your basic LSTM cell
    lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
    
    # Add dropout to the cell
    drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
    
    # Stack up multiple LSTM layers, for deep learning
    cell = tf.contrib.rnn.MultiRNNCell([drop] * lstm_layers)
    
    # Getting an initial state of all zeros
    initial_state = cell.zero_state(batch_size, tf.float32)
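
One caveat: in later TensorFlow 1.x releases, passing the same cell object to MultiRNNCell multiple times raises a variable-reuse error. If you hit that, construct a fresh cell per layer instead, for example (inside the same with graph.as_default(): block):

def lstm_cell():
    lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
    return tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)

cell = tf.contrib.rnn.MultiRNNCell([lstm_cell() for _ in range(lstm_layers)])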

RNN forward pass

Now we need to actually run the data through the RNN nodes. You can use tf.nn.dynamic_rnn to do this. You'd pass in the RNN cell you created (our multiple layered LSTM cell for instance), and the inputs to the network.

outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)

Above I created an initial state, initial_state, to pass to the RNN. This is the cell state that is passed from one time step to the next. tf.nn.dynamic_rnn takes care of most of the work for us. We pass in our cell and the input to the cell, then it does the unrolling and everything else for us. It returns the outputs for each time step and the final_state of the hidden layer.

Exercise: Use tf.nn.dynamic_rnn to add the forward pass through the RNN. Remember that we're actually passing in vectors from the embedding layer, embed.


In [174]:
with graph.as_default():
    outputs, final_state = tf.nn.dynamic_rnn(cell, embed, initial_state=initial_state)



Output

We only care about the final output; we'll be using that as our sentiment prediction. So we need to grab the last output with outputs[:, -1], then calculate the cost from that and labels_.


In [36]:
with graph.as_default():
    predictions = tf.contrib.layers.fully_connected(outputs[:, -1], 1, activation_fn=tf.sigmoid)
    cost = tf.losses.mean_squared_error(labels_, predictions)
    
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
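
Note that we're using mean squared error on a single sigmoid unit here, which works for this binary task; sigmoid cross-entropy (e.g., tf.nn.sigmoid_cross_entropy_with_logits applied to the pre-activation output) would be the more conventional choice, and you could swap it in.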

Validation accuracy

Here we can add a few nodes to calculate the accuracy, which we'll use in the validation pass.


In [37]:
with graph.as_default():
    correct_pred = tf.equal(tf.cast(tf.round(predictions), tf.int32), labels_)
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

Batching

This is a simple function for returning batches from our data. First it removes data such that we only have full batches. Then it iterates through the x and y arrays and yields slices of those arrays with size [batch_size].


In [38]:
def get_batches(x, y, batch_size=100):
    
    n_batches = len(x)//batch_size
    x, y = x[:n_batches*batch_size], y[:n_batches*batch_size]
    for ii in range(0, len(x), batch_size):
        yield x[ii:ii+batch_size], y[ii:ii+batch_size]
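
For example, a quick check of the first training batch (a sketch; the shapes assume seq_len = 200):

x_batch, y_batch = next(get_batches(train_x, train_y, batch_size=500))
print(x_batch.shape, y_batch.shape)  # (500, 200) (500,)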

Training

Below is the typical training code. If you'd rather write this yourself, feel free to delete all of it and implement your own. Before you run this, make sure the checkpoints directory exists.


In [ ]:
epochs = 10

with graph.as_default():
    saver = tf.train.Saver()

with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    iteration = 1
    for e in range(epochs):
        state = sess.run(initial_state)
        
        for ii, (x, y) in enumerate(get_batches(train_x, train_y, batch_size), 1):
            feed = {inputs_: x,
                    labels_: y[:, None],
                    keep_prob: 0.5,
                    initial_state: state}
            loss, state, _ = sess.run([cost, final_state, optimizer], feed_dict=feed)
            
            if iteration%5==0:
                print("Epoch: {}/{}".format(e, epochs),
                      "Iteration: {}".format(iteration),
                      "Train loss: {:.3f}".format(loss))

            if iteration%25==0:
                val_acc = []
                val_state = sess.run(cell.zero_state(batch_size, tf.float32))
                for x, y in get_batches(val_x, val_y, batch_size):
                    feed = {inputs_: x,
                            labels_: y[:, None],
                            keep_prob: 1,
                            initial_state: val_state}
                    batch_acc, val_state = sess.run([accuracy, final_state], feed_dict=feed)
                    val_acc.append(batch_acc)
                print("Val acc: {:.3f}".format(np.mean(val_acc)))
            iteration +=1
    saver.save(sess, "checkpoints/sentiment.ckpt")

Testing


In [ ]:
test_acc = []
with tf.Session(graph=graph) as sess:
    saver.restore(sess, tf.train.latest_checkpoint('checkpoints'))
    test_state = sess.run(cell.zero_state(batch_size, tf.float32))
    for ii, (x, y) in enumerate(get_batches(test_x, test_y, batch_size), 1):
        feed = {inputs_: x,
                labels_: y[:, None],
                keep_prob: 1,
                initial_state: test_state}
        batch_acc, test_state = sess.run([accuracy, final_state], feed_dict=feed)
        test_acc.append(batch_acc)
    print("Test accuracy: {:.3f}".format(np.mean(test_acc)))