In [1]:

    
import graphlab

Read product review data



In [20]:

    
products = graphlab.SFrame('amazon_baby.gl/')



In [21]:

    
products.head()









    Out[21]:





    
+----------------------------------------------+--------------------------------------------------------+--------+
| name                                         | review                                                 | rating |
+----------------------------------------------+--------------------------------------------------------+--------+
| Planetwise Flannel Wipes                     | These flannel wipes are OK, but in my opinion ...      | 3.0    |
| Planetwise Wipe Pouch                        | it came early and was not disappointed. i love ...     | 5.0    |
| Annas Dream Full Quilt with 2 Shams ...      | Very soft and comfortable and warmer than it ...       | 5.0    |
| Stop Pacifier Sucking without tears with ... | This is a product well worth the purchase.  I ...      | 5.0    |
| Stop Pacifier Sucking without tears with ... | All of my kids have cried non-stop when I tried to ... | 5.0    |
| Stop Pacifier Sucking without tears with ... | When the Binky Fairy came to our house, we didn't ...  | 5.0    |
| A Tale of Baby's Days with Peter Rabbit ...  | Lovely book, it's bound tightly so you may no ...      | 4.0    |
| Baby Tracker® - Daily Childcare Journal, ... | Perfect for new parents. We were able to keep ...      | 5.0    |
| Baby Tracker® - Daily Childcare Journal, ... | A friend of mine pinned this product on Pinte ...      | 5.0    |
| Baby Tracker® - Daily Childcare Journal, ... | This has been an easy way for my nanny to record ...   | 4.0    |
+----------------------------------------------+--------------------------------------------------------+--------+

[10 rows x 3 columns]

Build word count vector



In [22]:

    
products['word_count'] = graphlab.text_analytics.count_words(products['review'])
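
To make the word_count column concrete, here is a minimal plain-Python sketch of a bag-of-words count. It assumes lowercasing and whitespace splitting with punctuation left attached, which is consistent with keys such as 'parents!!' and 'puppet.' in the output of the next cell. `count_words_sketch` is a hypothetical helper for illustration, not GraphLab's implementation.

```python
from collections import Counter


def count_words_sketch(text):
    """Return a {word: count} dict for one review (illustrative only)."""
    # Lowercase, then split on whitespace; punctuation stays attached.
    return dict(Counter(text.lower().split()))


example = count_words_sketch("I love it and my son LOVES it too.")
# 'it' is counted twice; 'love' and 'loves' remain separate keys,
# and the final token keeps its period ('too.').
```

Because 'love' and 'loves' (or 'great' and 'great!') end up as different keys, counts for a single selected word understate how often the underlying sentiment word appears.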



In [23]:

    
products.head()









    Out[23]:





    
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+
| name                                         | review                                                 | rating | word_count                                           |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+
| Planetwise Flannel Wipes                     | These flannel wipes are OK, but in my opinion ...      | 3.0    | {'and': 5, 'stink': 1, 'because': 1, 'ordered': ...  |
| Planetwise Wipe Pouch                        | it came early and was not disappointed. i love ...     | 5.0    | {'and': 3, 'love': 1, 'it': 2, 'highly': 1, ...      |
| Annas Dream Full Quilt with 2 Shams ...      | Very soft and comfortable and warmer than it ...       | 5.0    | {'and': 2, 'quilt': 1, 'it': 1, 'comfortable': ...   |
| Stop Pacifier Sucking without tears with ... | This is a product well worth the purchase.  I ...      | 5.0    | {'ingenious': 1, 'and': 3, 'love': 2, ...            |
| Stop Pacifier Sucking without tears with ... | All of my kids have cried non-stop when I tried to ... | 5.0    | {'and': 2, 'parents!!': 1, 'all': 2, 'puppet.': ...  |
| Stop Pacifier Sucking without tears with ... | When the Binky Fairy came to our house, we didn't ...  | 5.0    | {'and': 2, 'cute': 1, 'help': 2, 'doll': 1, ...      |
| A Tale of Baby's Days with Peter Rabbit ...  | Lovely book, it's bound tightly so you may no ...      | 4.0    | {'shop': 1, 'be': 1, 'is': 1, 'it': 1, 'as': ...     |
| Baby Tracker® - Daily Childcare Journal, ... | Perfect for new parents. We were able to keep ...      | 5.0    | {'feeding,': 1, 'and': 2, 'all': 1, 'right': 1, ...  |
| Baby Tracker® - Daily Childcare Journal, ... | A friend of mine pinned this product on Pinte ...      | 5.0    | {'and': 1, 'help': 1, 'give': 1, 'is': 1, ...        |
| Baby Tracker® - Daily Childcare Journal, ... | This has been an easy way for my nanny to record ...   | 4.0    | {'journal.': 1, 'all': 1, 'standarad': 1, ...        |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+

[10 rows x 4 columns]



In [24]:

    
graphlab.canvas.set_target('ipynb')



In [25]:

    
products['name'].show()

Explore Vulli Sophie



In [26]:

    
giraffe_reviews = products[products['name'] == 'Vulli Sophie the Giraffe Teether']



In [27]:

    
len(giraffe_reviews)









    Out[27]:





785



In [28]:

    
giraffe_reviews.head()









    Out[28]:





    
+--------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+
| name                                 | review                                                 | rating | word_count                                           |
+--------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+
| Vulli Sophie the Giraffe Teether ... | He likes chewing on all the parts especially the ...   | 5.0    | {'and': 1, 'all': 1, 'because': 1, 'it': 1, ...      |
| Vulli Sophie the Giraffe Teether ... | My son loves this toy and fits great in the diaper ... | 5.0    | {'and': 1, 'right': 1, 'help': 1, 'just': 1, ...     |
| Vulli Sophie the Giraffe Teether ... | There really should be a large warning on the  ...     | 1.0    | {'and': 2, 'all': 1, 'latex.': 1, 'being': 1, ...    |
| Vulli Sophie the Giraffe Teether ... | All the moms in my moms' group got Sophie for ...      | 5.0    | {'and': 2, 'one!': 1, 'all': 1, 'love': 1, ...       |
| Vulli Sophie the Giraffe Teether ... | I was a little skeptical on whether Sophie was ...     | 5.0    | {'and': 3, 'all': 1, 'old': 1, 'her.': 1, ...        |
| Vulli Sophie the Giraffe Teether ... | I have been reading about Sophie and was going  ...    | 5.0    | {'and': 6, 'seven': 1, 'already': 1, 'love': 1, ...  |
| Vulli Sophie the Giraffe Teether ... | My neice loves her sophie and has spent hours ...      | 5.0    | {'and': 4, 'drooling,': 1, 'love': 1, 'her.': 1, ... |
| Vulli Sophie the Giraffe Teether ... | What a friendly face! And those mesmerizing ...        | 5.0    | {'and': 3, 'chew': 1, "don't": 1, 'is': 1, ...       |
| Vulli Sophie the Giraffe Teether ... | We got this just for my son to chew on instea ...      | 5.0    | {'chew': 2, 'because': 1, 'just': 2, 'what': 1, ...  |
| Vulli Sophie the Giraffe Teether ... | My baby seems to like this toy, but I could ...        | 3.0    | {'and': 2, 'already': 1, 'in': 1, 'some': 1, ' ...   |
+--------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+

[10 rows x 4 columns]



In [29]:

    
giraffe_reviews['rating'].show(view='Categorical')

Build a sentiment classifier



In [30]:

    
products['rating'].show(view='Categorical')

Define positive and negative sentiment



In [31]:

    
# ignore reviews with a 3-star rating (neither clearly positive nor negative)
products = products[products['rating']!=3]



In [32]:

    
# positive sentiment (1): 4- and 5-star ratings; the remaining 1- and 2-star ratings are negative (0)
products['sentiment'] = products['rating'] >= 4
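
The two labeling steps above can be sketched on a plain list of ratings. This is only an illustration of the rule (drop 3 stars, treat 4 and 5 as positive); `label_sentiment` is a hypothetical helper and does not use the SFrame API.

```python
def label_sentiment(ratings):
    """Drop 3-star ratings and pair each remaining rating with a 0/1 label."""
    return [(r, 1 if r >= 4 else 0) for r in ratings if r != 3]


labeled = label_sentiment([5.0, 3.0, 1.0, 4.0, 2.0])
# The 3.0 rating is removed; 5.0 and 4.0 get label 1, 1.0 and 2.0 get label 0.
```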



In [33]:

    
products.head()









    Out[33]:





    
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+
| name                                         | review                                                 | rating | word_count                                           | sentiment |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+
| Planetwise Wipe Pouch                        | it came early and was not disappointed. i love ...     | 5.0    | {'and': 3, 'love': 1, 'it': 2, 'highly': 1, ...      | 1         |
| Annas Dream Full Quilt with 2 Shams ...      | Very soft and comfortable and warmer than it ...       | 5.0    | {'and': 2, 'quilt': 1, 'it': 1, 'comfortable': ...   | 1         |
| Stop Pacifier Sucking without tears with ... | This is a product well worth the purchase.  I ...      | 5.0    | {'ingenious': 1, 'and': 3, 'love': 2, ...            | 1         |
| Stop Pacifier Sucking without tears with ... | All of my kids have cried non-stop when I tried to ... | 5.0    | {'and': 2, 'parents!!': 1, 'all': 2, 'puppet.': ...  | 1         |
| Stop Pacifier Sucking without tears with ... | When the Binky Fairy came to our house, we didn't ...  | 5.0    | {'and': 2, 'cute': 1, 'help': 2, 'doll': 1, ...      | 1         |
| A Tale of Baby's Days with Peter Rabbit ...  | Lovely book, it's bound tightly so you may no ...      | 4.0    | {'shop': 1, 'be': 1, 'is': 1, 'it': 1, 'as': ...     | 1         |
| Baby Tracker® - Daily Childcare Journal, ... | Perfect for new parents. We were able to keep ...      | 5.0    | {'feeding,': 1, 'and': 2, 'all': 1, 'right': 1, ...  | 1         |
| Baby Tracker® - Daily Childcare Journal, ... | A friend of mine pinned this product on Pinte ...      | 5.0    | {'and': 1, 'help': 1, 'give': 1, 'is': 1, ...        | 1         |
| Baby Tracker® - Daily Childcare Journal, ... | This has been an easy way for my nanny to record ...   | 4.0    | {'journal.': 1, 'all': 1, 'standarad': 1, ...        | 1         |
| Baby Tracker® - Daily Childcare Journal, ... | I love this journal and our nanny uses it ...          | 4.0    | {'all': 1, 'forget': 1, 'just': 1, "daughter's": ... | 1         |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+

[10 rows x 5 columns]

Training



In [34]:

    
train_data, test_data = products.random_split(0.8, seed=0)
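
For intuition about a seeded 80/20 split, here is a plain-Python sketch. It assumes each row is assigned independently with probability 0.8, so the split sizes are only approximately 80/20, and the fixed seed makes the assignment repeatable. `random_split_sketch` is a hypothetical helper; how SFrame.random_split works internally is not shown in this notebook.

```python
import random


def random_split_sketch(rows, fraction, seed=0):
    """Send each row to the first part with probability `fraction`."""
    rng = random.Random(seed)  # private generator, so the split is repeatable
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second


train, test = random_split_sketch(list(range(1000)), 0.8, seed=0)
train_again, test_again = random_split_sketch(list(range(1000)), 0.8, seed=0)
```

Re-running with the same seed returns the same partition, which matters later in the notebook when the data is split a second time with seed=0.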



In [35]:

    
sentiment_mode = graphlab.logistic_classifier.create(train_data,
                                                    target='sentiment',
                                                    features=['word_count'],
                                                    validation_set=test_data)









    



PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 133448
PROGRESS: Number of classes           : 2
PROGRESS: Number of feature columns   : 1
PROGRESS: Number of unpacked features : 219217
PROGRESS: Number of coefficients    : 219218
PROGRESS: Starting L-BFGS
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 5        | 0.000002  | 2.509592     | 0.841481          | 0.839989            |
PROGRESS: | 2         | 9        | 3.000000  | 3.988028     | 0.947425          | 0.894877            |
PROGRESS: | 3         | 10       | 3.000000  | 4.608776     | 0.923768          | 0.866232            |
PROGRESS: | 4         | 11       | 3.000000  | 5.185821     | 0.971779          | 0.912743            |
PROGRESS: | 5         | 12       | 3.000000  | 5.760209     | 0.975511          | 0.908900            |
PROGRESS: | 6         | 13       | 3.000000  | 6.320578     | 0.899991          | 0.825967            |
PROGRESS: | 10        | 18       | 1.000000  | 8.766595     | 0.988715          | 0.916256            |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+

Evaluate the sentiment model



In [36]:

    
sentiment_mode.evaluate(test_data, metric='roc_curve')









    Out[36]:





{'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 
 Rows: 1001
 
 Data:
 +------------------+----------------+------------------+-------+------+
 |    threshold     |      fpr       |       tpr        |   p   |  n   |
 +------------------+----------------+------------------+-------+------+
 |       0.0        | 0.224034495688 | 0.00368028013006 | 27987 | 5334 |
 | 0.0010000000475  | 0.775965504312 |  0.99631971987   | 27987 | 5334 |
 | 0.00200000009499 | 0.736032995876 |  0.995247793618  | 27987 | 5334 |
 | 0.00300000002608 | 0.713348331459 |  0.994640368743  | 27987 | 5334 |
 | 0.00400000018999 | 0.697787776528 |  0.994175867367  | 27987 | 5334 |
 | 0.00499999988824 | 0.686539182602 |  0.993818558617  | 27987 | 5334 |
 | 0.00600000005215 |  0.6767904012  |  0.993282595491  | 27987 | 5334 |
 | 0.00700000021607 | 0.666291713536 |  0.99292528674   | 27987 | 5334 |
 | 0.00800000037998 | 0.655980502437 |  0.992675170615  | 27987 | 5334 |
 | 0.00899999961257 | 0.648668916385 |  0.992353592739  | 27987 | 5334 |
 +------------------+----------------+------------------+-------+------+
 [1001 rows x 5 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}
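
Each row of the ROC table is one (threshold, fpr, tpr) point, with p and n the number of positive and negative test reviews. A minimal sketch of how one such point can be computed from labels and predicted probabilities follows. The `>=` convention at the threshold is an assumption, `roc_point` is a hypothetical helper, and the toy data is made up for illustration.

```python
def roc_point(labels, probabilities, threshold):
    """Return (fpr, tpr) when predicting positive for probability >= threshold."""
    tp = fp = p = n = 0
    for y, prob in zip(labels, probabilities):
        predicted_positive = prob >= threshold
        if y == 1:
            p += 1
            tp += predicted_positive
        else:
            n += 1
            fp += predicted_positive
    # fpr: share of true negatives predicted positive; tpr: share of true positives found
    return fp / float(n), tp / float(p)


labels = [1, 1, 1, 0, 0]
probs = [0.9, 0.8, 0.3, 0.6, 0.1]
fpr, tpr = roc_point(labels, probs, 0.5)
```

Sweeping the threshold from 0 to 1 and collecting the points traces out the curve.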



In [37]:

    
sentiment_mode.show(view='Evaluation')

Apply the model to understand sentiment for Giraffe



In [40]:

    
giraffe_reviews['predicted_sentiment'] = sentiment_mode.predict(giraffe_reviews, output_type='probability')



In [41]:

    
giraffe_reviews.head()









    Out[41]:





    
+--------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+---------------------+
| name                                 | review                                                 | rating | word_count                                           | predicted_sentiment |
+--------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+---------------------+
| Vulli Sophie the Giraffe Teether ... | He likes chewing on all the parts especially the ...   | 5.0    | {'and': 1, 'all': 1, 'because': 1, 'it': 1, ...      | 0.999513023521      |
| Vulli Sophie the Giraffe Teether ... | My son loves this toy and fits great in the diaper ... | 5.0    | {'and': 1, 'right': 1, 'help': 1, 'just': 1, ...     | 0.999320678306      |
| Vulli Sophie the Giraffe Teether ... | There really should be a large warning on the  ...     | 1.0    | {'and': 2, 'all': 1, 'latex.': 1, 'being': 1, ...    | 0.013558811687      |
| Vulli Sophie the Giraffe Teether ... | All the moms in my moms' group got Sophie for ...      | 5.0    | {'and': 2, 'one!': 1, 'all': 1, 'love': 1, ...       | 0.995769474148      |
| Vulli Sophie the Giraffe Teether ... | I was a little skeptical on whether Sophie was ...     | 5.0    | {'and': 3, 'all': 1, 'old': 1, 'her.': 1, ...        | 0.662374415673      |
| Vulli Sophie the Giraffe Teether ... | I have been reading about Sophie and was going  ...    | 5.0    | {'and': 6, 'seven': 1, 'already': 1, 'love': 1, ...  | 0.999997148186      |
| Vulli Sophie the Giraffe Teether ... | My neice loves her sophie and has spent hours ...      | 5.0    | {'and': 4, 'drooling,': 1, 'love': 1, 'her.': 1, ... | 0.989190989536      |
| Vulli Sophie the Giraffe Teether ... | What a friendly face! And those mesmerizing ...        | 5.0    | {'and': 3, 'chew': 1, "don't": 1, 'is': 1, ...       | 0.999563518413      |
| Vulli Sophie the Giraffe Teether ... | We got this just for my son to chew on instea ...      | 5.0    | {'chew': 2, 'because': 1, 'just': 2, 'what': 1, ...  | 0.970160542725      |
| Vulli Sophie the Giraffe Teether ... | My baby seems to like this toy, but I could ...        | 3.0    | {'and': 2, 'already': 1, 'in': 1, 'some': 1, ' ...   | 0.195367644588      |
+--------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+---------------------+

[10 rows x 5 columns]

Sort the reviews and explore based on predicted sentiment



In [42]:

    
giraffe_reviews = giraffe_reviews.sort('predicted_sentiment', ascending=False)



In [43]:

    
giraffe_reviews.head()









    Out[43]:





    
+--------------------------------------+--------------------------------------------------------+--------+----------------------------------------------------+---------------------+
| name                                 | review                                                 | rating | word_count                                         | predicted_sentiment |
+--------------------------------------+--------------------------------------------------------+--------+----------------------------------------------------+---------------------+
| Vulli Sophie the Giraffe Teether ... | Sophie, oh Sophie, your time has come. My ...          | 5.0    | {'giggles': 1, 'all': 1, "violet's": 2, 'food' ... | 1.0                 |
| Vulli Sophie the Giraffe Teether ... | I'm not sure why Sophie is such a hit with the ...     | 4.0    | {'peace': 1, 'month': 1, 'bright': 1, 'softer' ... | 0.999999999703      |
| Vulli Sophie the Giraffe Teether ... | I'll be honest...I bought this toy because all the ... | 4.0    | {'all': 2, 'pops': 1, 'existence.': 1, ...         | 0.999999999392      |
| Vulli Sophie the Giraffe Teether ... | We got this little giraffe as a gift from a ...        | 5.0    | {'all': 2, "don't": 1, '(literally).so': 1, ...    | 0.99999999919       |
| Vulli Sophie the Giraffe Teether ... | As a mother of 16month old twins; I bought ...         | 5.0    | {'cute': 1, 'all': 1, 'reviews.': 2, 'just' ...    | 0.999999998657      |
| Vulli Sophie the Giraffe Teether ... | Sophie the Giraffe is the perfect teething toy. ...    | 5.0    | {'just': 2, 'both': 1, 'month': 1, 'ears,': 1, ... | 0.999999997108      |
| Vulli Sophie the Giraffe Teether ... | Sophie la giraffe is absolutely the best toy ...       | 5.0    | {'and': 5, 'the': 1, 'all': 1, 'that': 2, ...      | 0.999999995589      |
| Vulli Sophie the Giraffe Teether ... | My 5-mos old son took to this immediately. The ...     | 5.0    | {'just': 1, 'shape': 2, 'mutt': 1, '"dog': 1, ...  | 0.999999995573      |
| Vulli Sophie the Giraffe Teether ... | My nephews and my four kids all had Sophie in ...      | 5.0    | {'and': 4, 'chew': 1, 'all': 1, 'perfect;': 1, ... | 0.999999989527      |
| Vulli Sophie the Giraffe Teether ... | Never thought I'd see my son French kissing a ...      | 5.0    | {'giggles': 1, 'all': 1, 'out,': 1, 'over': 1, ... | 0.999999985069      |
+--------------------------------------+--------------------------------------------------------+--------+----------------------------------------------------+---------------------+

[10 rows x 5 columns]



In [44]:

    
giraffe_reviews[0]['review']









    Out[44]:





"Sophie, oh Sophie, your time has come. My granddaughter, Violet is 5 months old and starting to teeth. What joy little Sophie brings to Violet. Sophie is made of a very pliable rubber that is sturdy but not tough. It is quite easy for Violet to twist Sophie into unheard of positions to get Sophie into her mouth. The little nose and hooves fit perfectly into small mouths, and the drooling has purpose. The paint on Sophie is food quality.Sophie was born in 1961 in France. The maker had wondered why there was nothing available for babies and made Sophie from the finest rubber, phthalate-free on St Sophie's Day, thus the name was born. Since that time millions of Sophie's populate the world. She is soft and for babies little hands easy to grasp. Violet especially loves the bumpy head and horns of Sophie. Sophie has a long neck that easy to grasp and twist. She has lovely, sizable spots that attract Violet's attention. Sophie has happy little squeaks that bring squeals of delight from Violet. She is able to make Sophie squeak and that brings much joy. Sophie's smooth skin is soothing to Violet's little gums. Sophie is 7 inches tall and is the exact correct size for babies to hold and love.As you well know the first thing babies grasp, goes into their mouths- how wonderful to have a toy that stimulates all of the senses and helps with the issue of teething. Sophie is small enough to fit into any size pocket or bag. Sophie is the perfect find for babies from a few months to a year old. How wonderful to hear the giggles and laughs that emanate from babies who find Sophie irresistible. Viva La Sophie!Highly Recommended.  prisrob 12-11-09"



In [45]:

    
giraffe_reviews[1]['review']









    Out[45]:





"I'm not sure why Sophie is such a hit with the little ones, but my 7 month old baby girl is one of her adoring fans.  The rubber is softer and more pleasant to handle, and my daughter has enjoyed chewing on her legs and the nubs on her head even before she started teething.  She also loves the squeak that Sophie makes when you squeeze her.  Not sure what it is but if Sophie is amongst a pile of her other toys, my daughter will more often than not reach for Sophie.  And I have the peace of mind of knowing that only edible and safe paints and materials have been used to make Sophie, as opposed to Bright Starts and other baby toys made in China.  Now that the research is out on phthalates and other toxic substances in baby toys, I think it's more important than ever to find good quality toys that are also safe for our babies to handle and put in their mouths.  Sophie is a must-have for every new mom in my opinion.  Even if your kid is one of the few that can take or leave her, it's worth a try.  Vulli, the makers of Sophie, also make natural rubber teething rings that my daughter loves as well."



In [46]:

    
giraffe_reviews[-1]['review']









    Out[46]:





"My son (now 2.5) LOVED his Sophie, and I bought one for every baby shower I've gone to. Now, my daughter (6 months) just today nearly choked on it and I will never give it to her again. Had I not been within hearing range it could have been fatal. The strange sound she was making caught my attention and when I went to her and found the front curved leg shoved well down her throat and her face a purply/blue I panicked. I pulled it out and she vomited all over the carpet before screaming her head off. I can't believe how my opinion of this toy has changed from a must-have to a must-not-use. Please don't disregard any of the choking hazard comments, they are not over exaggerated!"



In [47]:

    
giraffe_reviews[-2]['review']









    Out[47]:





"This children's toy is nostalgic and very cute. However, there is a distinct rubber smell and a very odd taste, yes I tried it, that my baby did not enjoy. Also, if it is soiled it is extremely difficult to clean as the rubber is a kind of porus material and does not clean well. The final thing is the squeaking device inside which stopped working after the first couple of days. I returned this item feeling I had overpaid for a toy that was defective and did not meet my expectations. Please do not be swayed by the cute packaging and hype surounding it as I was. One more thing, I was given a full refund from Amazon without any problem."



In [48]:

    
selected_words = ['awesome', 'great', 'fantastic', 'amazing', 'love', 'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']



In [49]:

    
selected_words









    Out[49]:





['awesome',
 'great',
 'fantastic',
 'amazing',
 'love',
 'horrible',
 'bad',
 'terrible',
 'awful',
 'wow',
 'hate']



In [52]:

    
products.head(n=2)









    Out[52]:





    
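+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+
| name                                         | review                                                 | rating | word_count                                           | sentiment |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+
| Planetwise Wipe Pouch                        | it came early and was not disappointed. i love ...     | 5.0    | {'and': 3, 'love': 1, 'it': 2, 'highly': 1, ...      | 1         |
| Annas Dream Full Quilt with 2 Shams ...      | Very soft and comfortable and warmer than it ...       | 5.0    | {'and': 2, 'quilt': 1, 'it': 1, 'comfortable': ...   | 1         |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+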

[2 rows x 5 columns]



In [53]:

    
def awesome_count(word_count):
    if 'awesome' in word_count:
        return word_count['awesome']
    else:
        return 0



In [57]:

    
myprodawesome = products['word_count'].apply(awesome_count)



In [71]:

    
myprodawesome.sum()









    Out[71]:





2002



In [72]:

    
products['awesome'] = products['word_count'].apply(awesome_count)



In [74]:

    
selected_words









    Out[74]:





['awesome',
 'great',
 'fantastic',
 'amazing',
 'love',
 'horrible',
 'bad',
 'terrible',
 'awful',
 'wow',
 'hate']



In [92]:

    
def awesome_count(word_count):
    # Despite its name, this version counts 'hate'. The cell was apparently
    # edited and re-run once per entry in selected_words (Out[94] shows a
    # column for each of the eleven words); this is the final variant.
    if 'hate' in word_count:
        return word_count['hate']
    else:
        return 0



In [93]:

    
products['hate'] = products['word_count'].apply(awesome_count)



In [94]:

    
products.head(n=1)









    Out[94]:





    
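+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+---------+-------+
| name                                         | review                                                 | rating | word_count                                           | sentiment | awesome | great |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+---------+-------+
| Planetwise Wipe Pouch                        | it came early and was not disappointed. i love ...     | 5.0    | {'and': 3, 'love': 1, 'it': 2, 'highly': 1, ...      | 1         | 0       | 0     |
+----------------------------------------------+--------------------------------------------------------+--------+------------------------------------------------------+-----------+---------+-------+

+-----------+---------+------+----------+-----+----------+-------+-----+------+
| fantastic | amazing | love | horrible | bad | terrible | awful | wow | hate |
+-----------+---------+------+----------+-----+----------+-------+-----+------+
| 0         | 0       | 1    | 0        | 0   | 0        | 0     | 0   | 0    |
+-----------+---------+------+----------+-----+----------+-------+-----+------+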

[1 rows x 16 columns]



In [95]:

    
selected_words









    Out[95]:





['awesome',
 'great',
 'fantastic',
 'amazing',
 'love',
 'horrible',
 'bad',
 'terrible',
 'awful',
 'wow',
 'hate']



In [96]:

    
for wrd in selected_words:
    print wrd, " : ", products[wrd].sum()









    



awesome  :  2002
great  :  42420
fantastic  :  873
amazing  :  1305
love  :  40277
horrible  :  659
bad  :  3197
terrible  :  3197
awful  :  345
wow  :  131
hate  :  1057
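
The totals for 'bad' and 'terrible' above are identical (3197), which may indicate that the hand-edited counting function was not updated between those two runs. A loop over selected_words avoids that kind of slip. The sketch below uses a toy list of dicts in place of the word_count column; `make_counter` is a hypothetical helper.

```python
selected_words = ['awesome', 'great', 'fantastic', 'amazing', 'love',
                  'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']


def make_counter(word):
    """Build a function that looks up `word` in a word-count dict."""
    def count(word_count):
        return word_count.get(word, 0)
    return count


# Toy stand-in for the 'word_count' column: one dict per review.
word_counts = [{'love': 2, 'it': 1}, {'bad': 1, 'terrible': 3}, {'the': 4}]

totals = {}
for word in selected_words:
    counter = make_counter(word)
    totals[word] = sum(counter(wc) for wc in word_counts)
```

With the SFrame, the loop body would presumably become `products[word] = products['word_count'].apply(make_counter(word))`, reusing the `apply` call already shown in this notebook.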



In [99]:

    
selected_word_model = graphlab.logistic_classifier.create(train_data,
                                                    target='sentiment',
                                                    features=selected_words,
                                                    validation_set=test_data)









    



PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 133448
PROGRESS: Number of classes           : 2
PROGRESS: Number of feature columns   : 11
PROGRESS: Number of unpacked features : 11
PROGRESS: Number of coefficients    : 12
PROGRESS: Starting Newton Method
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 2        | 0.985888     | 0.843175          | 0.841911            |
PROGRESS: | 2         | 3        | 1.721230     | 0.843137          | 0.841941            |
PROGRESS: | 3         | 4        | 2.497520     | 0.843302          | 0.842241            |
PROGRESS: | 4         | 5        | 3.299918     | 0.843302          | 0.842241            |
PROGRESS: | 5         | 6        | 3.969235     | 0.843302          | 0.842241            |
PROGRESS: | 6         | 7        | 4.543724     | 0.843302          | 0.842241            |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+



In [98]:

    
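# Re-split after adding the selected-word columns so that train_data and test_data contain them.
# The execution counts show this cell (In [98]) ran before the model was trained in In [99].
train_data,test_data = products.random_split(.8, seed=0)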



In [104]:

    
selected_word_model['coefficients'].sort('value',ascending=False).print_rows(num_rows=15)









    



+-------------+-------+-------+-----------------+
|     name    | index | class |      value      |
+-------------+-------+-------+-----------------+
|     love    |  None |   1   |  1.39620863453  |
| (intercept) |  None |   1   |  1.35597524881  |
|   awesome   |  None |   1   |  1.05525711479  |
|   amazing   |  None |   1   |  0.902067324413 |
|  fantastic  |  None |   1   |  0.886492700437 |
|    great    |  None |   1   |  0.880504439057 |
|     wow     |  None |   1   | -0.046766898143 |
|     bad     |  None |   1   | -0.499535549882 |
|   terrible  |  None |   1   | -0.499535549882 |
|     hate    |  None |   1   |  -1.44886716719 |
|    awful    |  None |   1   |  -1.79812112705 |
|   horrible  |  None |   1   |  -2.01376938905 |
+-------------+-------+-------+-----------------+
[12 rows x 4 columns]



In [105]:

    
selected_word_model.evaluate(test_data)









    Out[105]:





{'accuracy': 0.8422411722315638, 'confusion_matrix': Columns:
 	target_label	int
 	predicted_label	int
 	count	int
 
 Rows: 4
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |      0       |        0        |  177  |
 |      1       |        0        |  103  |
 |      0       |        1        |  5151 |
 |      1       |        1        | 27873 |
 +--------------+-----------------+-------+
 [4 rows x 3 columns]}
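
The accuracy can be recomputed from the four confusion-matrix counts above, along with the per-class recall, which shows where the selected-words model goes wrong. This is plain arithmetic on the printed counts; the variable names are illustrative.

```python
# Counts copied from the confusion matrix above: (target, predicted) -> count
confusion = {(0, 0): 177, (1, 0): 103, (0, 1): 5151, (1, 1): 27873}

total = sum(confusion.values())
accuracy = (confusion[(0, 0)] + confusion[(1, 1)]) / float(total)

# Of the truly negative reviews, the share predicted negative
negative_recall = confusion[(0, 0)] / float(confusion[(0, 0)] + confusion[(0, 1)])
# Of the truly positive reviews, the share predicted positive
positive_recall = confusion[(1, 1)] / float(confusion[(1, 1)] + confusion[(1, 0)])
```

Only 177 of the 5328 negative reviews are predicted as negative, so the model labels almost every review positive.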



In [106]:

    
sentiment_mode.evaluate(test_data)









    Out[106]:





{'accuracy': 0.916256305548883, 'confusion_matrix': Columns:
 	target_label	int
 	predicted_label	int
 	count	int
 
 Rows: 4
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |      1       |        0        |  1461 |
 |      0       |        1        |  1328 |
 |      0       |        0        |  4000 |
 |      1       |        1        | 26515 |
 +--------------+-----------------+-------+
 [4 rows x 3 columns]}



In [107]:

    
diaper_champ_reviews = products[products['name'] == 'Baby Trend Diaper Champ']



In [108]:

    
diaper_champ_reviews['predicted_sentiment'] = sentiment_mode.predict(diaper_champ_reviews, output_type='probability')
diaper_champ_reviews = diaper_champ_reviews.sort('predicted_sentiment', ascending=False)



In [109]:

    
diaper_champ_reviews.head()









    Out[109]:





    
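+-------------------------+------------------------------------------------------+--------+------------------------------------------------------+-----------+---------+
| name                    | review                                               | rating | word_count                                           | sentiment | awesome |
+-------------------------+------------------------------------------------------+--------+------------------------------------------------------+-----------+---------+
| Baby Trend Diaper Champ | Baby Luke can turn a clean diaper to a dirty ...     | 5.0    | {'all': 1, 'less': 1, "friend's": 1, '(which': ...   | 1         | 0       |
| Baby Trend Diaper Champ | I LOOOVE this diaper pail!  Its the easies ...       | 5.0    | {'just': 1, 'over': 1, 'rweek': 1, 'sooo': 1, ...    | 1         | 0       |
| Baby Trend Diaper Champ | We researched all of the different types of di ...   | 4.0    | {'all': 2, 'just': 4, "don't": 2, 'one,': 1, ...     | 1         | 0       |
| Baby Trend Diaper Champ | My baby is now 8 months and the can has been ...     | 5.0    | {"don't": 1, 'when': 1, 'over': 1, 'soon': 1, ...    | 1         | 0       |
| Baby Trend Diaper Champ | This is absolutely, by far, the best diaper  ...     | 5.0    | {'just': 3, 'money': 1, 'not': 2, 'mechanism' ...    | 1         | 0       |
| Baby Trend Diaper Champ | Diaper Champ or Diaper Genie? That was my ...        | 5.0    | {'all': 1, 'bags.': 1, 'son,': 1, '(i': 1, ...       | 1         | 0       |
| Baby Trend Diaper Champ | Wow!  This is fabulous. It was a toss-up between ... | 5.0    | {'and': 4, '"genie".': 1, 'since': 1, 'garbage' ...  | 1         | 0       |
| Baby Trend Diaper Champ | I originally put this item on my baby registry ...   | 5.0    | {'lysol': 1, 'all': 2, 'bags.': 1, 'feedback': ...   | 1         | 0       |
| Baby Trend Diaper Champ | Two girlfriends and two family members put me ...    | 5.0    | {'just': 1, 'when': 1, 'both': 1, 'results': 1, ...  | 1         | 0       |
| Baby Trend Diaper Champ | I am one of those super-critical shoppers who ...    | 5.0    | {'taller': 1, 'bags.': 1, 'just': 1, "don't": 4, ... | 1         | 0       |
+-------------------------+------------------------------------------------------+--------+------------------------------------------------------+-----------+---------+

+-------+-----------+---------+------+----------+-----+----------+-------+-----+------+---------------------+
| great | fantastic | amazing | love | horrible | bad | terrible | awful | wow | hate | predicted_sentiment |
+-------+-----------+---------+------+----------+-----+----------+-------+-----+------+---------------------+
| 0     | 0         | 0       | 0    | 0        | 0   | 0        | 0     | 0   | 0    | 0.999999937267      |
| 0     | 0         | 0       | 1    | 0        | 0   | 0        | 0     | 0   | 0    | 0.999999917406      |
| 0     | 0         | 0       | 0    | 0        | 1   | 1        | 0     | 0   | 0    | 0.999999899509      |
| 2     | 0         | 0       | 0    | 0        | 1   | 1        | 0     | 0   | 0    | 0.999999836182      |
| 0     | 0         | 0       | 2    | 0        | 0   | 0        | 0     | 0   | 0    | 0.999999824745      |
| 0     | 0         | 0       | 0    | 0        | 0   | 0        | 0     | 0   | 0    | 0.999999759315      |
| 0     | 0         | 0       | 0    | 0        | 0   | 0        | 0     | 0   | 0    | 0.999999692111      |
| 0     | 0         | 0       | 0    | 0        | 0   | 0        | 0     | 0   | 0    | 0.999999642488      |
| 0     | 0         | 0       | 0    | 1        | 0   | 0        | 0     | 0   | 0    | 0.999999604504      |
| 0     | 0         | 0       | 1    | 0        | 0   | 0        | 0     | 0   | 0    | 0.999999486804      |
+-------+-----------+---------+------+----------+-----+----------+-------+-----+------+---------------------+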

[10 rows x 17 columns]
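
The per-word columns in this table (great, fantastic, amazing, and so on) can plausibly be derived from each review's word_count dictionary. The following is a minimal sketch in plain Python, assuming each column is simply a lookup of that word in the dictionary with a default of 0. The names `selected_words` and `count_selected` are illustrative, the word list contains only the column names visible here, and the sample dictionary is made up rather than taken from the dataset.

```python
# Column names visible in the table above; the notebook's full list may differ.
selected_words = ['great', 'fantastic', 'amazing', 'love', 'horrible',
                  'bad', 'terrible', 'awful', 'wow', 'hate']


def count_selected(word_count, words=selected_words):
    """Return {word: count} for each selected word, defaulting to 0."""
    return {word: word_count.get(word, 0) for word in words}


# Made-up word_count dictionary, not a row from the dataset.
example = {'and': 4, 'great': 2, 'bad': 1, 'terrible': 1}
counts = count_selected(example)
print('%d %d %d' % (counts['great'], counts['bad'], counts['love']))
```

Because the word_count keys keep punctuation attached (keys such as 'bags.' appear in the output above), a lookup like this only counts exact tokens, so 'great' and 'great.' would be counted separately.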



In [111]:

    
# Score only the first row of diaper_champ_reviews with the selected-words model
selected_word_model.predict(diaper_champ_reviews[0:1], output_type='probability')









    Out[111]:





dtype: float
Rows: 1
[0.7951047915052162]



In [113]:

    
# Element-wise comparison: 1 where the review is labeled positive, 0 otherwise (wrapped in a list)
[test_data['sentiment'] == 1]









    Out[113]:





[dtype: int
 Rows: 33304
 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, ... ]]



In [114]:

    
len(test_data)









    Out[114]:





33304



In [115]:

    
test_data['sentiment'].sum()









    Out[115]:





27976



In [117]:

    
# Fraction of positive reviews in test_data: sum of sentiment / number of rows
float(27976)/33304









    Out[117]:





0.8400192169108815
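
The ratio above is the share of positive reviews in test_data, which is also the accuracy a classifier would reach by always predicting the positive class. Below is a small sketch of that baseline in plain Python, using the counts shown in Out[114] and Out[115]. `majority_class_accuracy` is a hypothetical helper written for illustration, not a GraphLab function.

```python
def majority_class_accuracy(num_positive, num_total):
    """Accuracy of always predicting the more frequent of two classes."""
    if num_total <= 0:
        raise ValueError('num_total must be positive')
    num_negative = num_total - num_positive
    return max(num_positive, num_negative) / float(num_total)


# Counts taken from the cells above: 27976 positive reviews out of 33304.
baseline = majority_class_accuracy(27976, 33304)
print('%.4f' % baseline)
```

Any trained sentiment model is worth comparing against this number, since an accuracy near 0.84 could be reached without looking at the review text at all.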



In [ ]:
