This tutorial is going to introduce some simple tools for detecting sentiment in Tweets. We will be using a set of tools called the Natural Language Toolkit (NLTK). This is a collection of software written in the Python programming language. An important design goal behind Python is that it should be easy to read and fun to use, making it well-suited for beginners. A similar motivation inspired NLTK: it should make complex tasks easy to carry out, and it should be written in a way that allows users to inspect and understand the code.
Why is this relevant? Well, a lot of software these days is built to be easy to use, but hard to inspect. For example, smartphones have a lot of slick apps on them, but very few people have the expertise to look under the hood to find out how they work. NLTK has quite the opposite approach: you are actively encouraged to discover how the code works. However, your level of understanding will depend heavily on how far you get to grips with Python itself.
This tutorial is written using the IPython framework. This allows text to be interspersed with fragments of code, occurring in special "cells". Just below is a cell where we use Python to do a simple calculation:
In [100]:
3 + 4
Out[100]:
7
Some of the cells will contain snippets of code that are necessary for the overall story to work, but which you don't need to understand. We'll try to make it clear when it's important for you to pay attention to one of the cells.
As you know, people are tweeting all the time. The rate varies, with about 6,000 per second being the average, but when I last checked, the rate was over 10,000 Tweets per second. So, a lot. Twitter kindly allows people to tap into a small sample of this stream — unless you're able to pay, the sample is at most 1% of the total stream.
Here's a tiny snapshot of Tweets, reflecting the Twitter public stream at the point this tutorial was last executed. By using the keywords 'love, hate', we restrict our sample to just those Tweets containing one or both of those words.
In [68]:
import nltk # load up the NLTK library
from nltk.twitter import Twitter
tw = Twitter() # start a new client that connects to Twitter
tw.tweets(keywords='love, hate', limit=25)  # filter Tweets from the public stream
You too can sample Tweets in this way, but you'll need to set up your Twitter API keys according to these instructions, and also install NLTK (and IPython if you want) on your own computer. Since this is a bit of a hassle, for the rest of this tutorial we'll focus our attention on a sample of 20,000 English-language Tweets that were collected at the end of April 2015. In order to focus on Tweets about the UK general election, the public stream was filtered with the following set of terms:
david cameron, miliband, milliband, sturgeon, clegg, farage, tory, tories, ukip, snp, libdem
The following code cell allows us to get hold of this collection, and prints out the text of the first 20 Tweets. You don't need to worry about the details of how this happens.
In [4]:
from nltk.corpus import twitter_samples
strings = twitter_samples.strings('tweets.20150430-223406.json')
In [5]:
for string in strings[:20]:
    print(string)
When we talk about understanding natural language, we often focus on 'who did what to whom'. Yet in many situations, we are more interested in attitudes and opinions. When someone writes about a movie, did they like it or hate it? Is a product review for a water bottle on Amazon positive or negative? Is a Tweet about the US President supportive or critical? We might also care about the intensity of the views expressed: "this is a fine movie" is different from "WOW! This movie is soooooo great!!!!" even though both are positive.
Sentiment analysis (or opinion mining) is a broad term for a range of techniques that try to identify the subjective views expressed in texts. Many organisations care deeply about public opinion — whether these concern commercial products, creative works, or political parties and policies — and have consequently turned to sentiment analysis as a way of gleaning valuable insights from voluminous bodies of online text. This in turn has stimulated much activity in the area, ranging from academic research to commercial applications and industry-focussed conferences.
However, it's worth saying at the outset that sentiment analysis is hard. Although it is designed to work with written text, the way in which people express their feelings often goes far beyond what they literally say. In spoken language, intonation will be important. And of course we often express emotion using no words at all, as illustrated in this picture from Darwin's book The Expression of the Emotions.
Let's say that we want to classify a sentence into one of three categories: positive, negative or neutral. Each of these can be illustrated by posts on Twitter collected during the UK General Election in 2015.
The term polarity is often used to refer to whether a piece of text is judged to be positive or negative.
The easiest approach to classifying examples like these is to get hold of two lists of words: positive ones such as good, excellent, fine, triumph, well, succeed, ... and negative ones such as bad, poor, dismal, lying, fail, disaster, .... We then figure out an overall polarity score based on the ratio of positive tokens to negative ones in a given string. A sentence with neither positive nor negative tokens (or possibly an equal number of each) will be categorised as neutral. This simple approach is likely to yield roughly correct results for the Twitter examples above.
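To make this concrete, here is a minimal sketch of such a word-list classifier. The two word lists are just the example words from the previous paragraph, and the function simple_polarity is our own illustration; a real system would use much larger lexicons.
In [ ]:
from nltk.tokenize import wordpunct_tokenize

# Illustrative word lists, taken from the examples above; real lexicons are much larger
POSITIVE = set(['good', 'excellent', 'fine', 'triumph', 'well', 'succeed'])
NEGATIVE = set(['bad', 'poor', 'dismal', 'lying', 'fail', 'disaster'])

def simple_polarity(text):
    """Classify text as 'positive', 'negative' or 'neutral' by comparing
    the number of positive and negative tokens it contains."""
    toks = [t.lower() for t in wordpunct_tokenize(text)]
    pos = sum(1 for t in toks if t in POSITIVE)
    neg = sum(1 for t in toks if t in NEGATIVE)
    if pos > neg:
        return 'positive'
    elif neg > pos:
        return 'negative'
    return 'neutral'    # no sentiment tokens, or an equal number of each

simple_polarity("The result was a triumph, and the campaign went well")   # 'positive'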
Things become more complicated when negation enters the picture. The next example is mildly positive (at least in British English), so we need to ensure that not reverses the polarity of bad in appropriate contexts:
Given Miliband personal ratings still 20 points behind Cameron, I'd say that not a bad margin for Labour leader https://t.co/ILQP93VYLF
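A crude way to handle this, sketched below, is to flip the polarity of a sentiment word whenever a negation word occurs among the two tokens immediately before it. The two-token window and the tiny NEGATIONS set are our own simplifications, building on the simple_polarity sketch above; real systems track negation scope much more carefully.
In [ ]:
NEGATIONS = set(['not', 'no', 'never'])

def polarity_with_negation(text):
    """Like simple_polarity, but a negation word within the two preceding
    tokens flips the polarity of a sentiment word."""
    toks = [t.lower() for t in wordpunct_tokenize(text)]
    pos = neg = 0
    for i, t in enumerate(toks):
        # is there a negation word in the two-token window before this one?
        negated = len(NEGATIONS & set(toks[max(0, i - 2):i])) > 0
        if t in POSITIVE:
            if negated:
                neg += 1
            else:
                pos += 1
        elif t in NEGATIVE:
            if negated:
                pos += 1
            else:
                neg += 1
    if pos > neg:
        return 'positive'
    elif neg > pos:
        return 'negative'
    return 'neutral'

polarity_with_negation("not a bad margin for the Labour leader")   # 'positive'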
VADER is a system for determining the sentiment of texts which has been incorporated into NLTK. It is based on the idea of looking for positive and negative words, but adds two important new elements. First, it uses a lexicon of 7,500 items which have been manually annotated for both polarity and intensity. Second, the overall score for an input text is computed using a complex set of rules that take into account not just words (and negation), but also the boosting effect of devices like capitalisation and punctuation.
In [69]:
from nltk.sentiment import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()
In [80]:
sia.polarity_scores("I REALLY adore Starwars!!!!! :-)")
Out[80]:
In [6]:
full_tweets = twitter_samples.docs('tweets.20150430-223406.json')
In the next example, we are going to create a table of Tweets using the pandas library. We will use the name data to refer to this table.
In [105]:
import pandas as pd
from numpy import nan
data = pd.DataFrame()
data['text'] = [t['text'] for t in full_tweets] # add a column corresponding to the text of each Tweet
Next, we will try to add labels for political parties and party leaders in a way that corresponds to the text of the Tweets. However, in some cases it may not be possible or appropriate to add a label, and instead we want to have a 'blank cell' that will be ignored by pandas. We'll do this by inserting the value NaN (Not a Number).
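As a quick illustration of why NaN makes a good 'blank cell', pandas skips NaN values when computing summary statistics. The Series below is a made-up example:
In [ ]:
import pandas as pd
from numpy import nan

s = pd.Series([0.5, nan, -0.5, nan, 1.0])
s.count()   # 3 -- the two NaN cells are ignored
s.mean()    # 0.333... -- the mean of 0.5, -0.5 and 1.0 only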
In [108]:
parties = {}
parties['conservative'] = set(['osborne', 'portillo', 'pickles', 'tory', 'tories',
'torie', 'voteconservative', 'conservative', 'conservatives', 'bullingdon', 'telegraph'])
parties['labour'] = set(['uklabour', 'scottishlabour', 'labour', 'lab', 'murphy'])
parties['libdem'] = set(['libdem', 'libdems', 'dems', 'alexander'])
parties['ukip'] = set(['ukip', 'davidcoburnukip'])
parties['snp'] = set(['salmond', 'snp', 'snpwin', 'votesnp', 'snpbecause', 'scotland',
'scotlands', 'scottish', 'indyref', 'independence', 'celebs4indy'])
leaders = {}
leaders['cameron'] = set(['cameron', 'david_cameron', 'davidcameron','dave', 'davecamm'])
leaders['miliband'] = set(['miliband', 'ed_miliband', 'edmiliband', 'edm', 'milliband', 'ed', 'edforchange', 'edforpm', 'milifandom'])
leaders['clegg'] = set(['clegg'])
leaders['farage'] = set(['farage', 'nigel_farage', 'nsegel', 'askfarage', 'asknigelfarage', 'asknigelfar'])
leaders['sturgeon'] = set(['sturgeon', 'nicola_sturgeon', 'nicolasturgeon', 'nicola'])
from nltk.tokenize import wordpunct_tokenize
import operator

def tweet_classify(text, keywords):
    """Return the keyword-group label whose terms best match the text,
    or NaN if no group matches at all."""
    label = nan
    toks = wordpunct_tokenize(text)          # split the Tweet into tokens
    toks_lower = [t.lower() for t in toks]   # normalise to lower case
    d = {}
    for k in keywords:
        # count how many of this group's terms occur in the Tweet
        d[k] = len(keywords[k] & set(toks_lower))
    # pick the group with the highest count of matching terms
    best = max(d.items(), key=operator.itemgetter(1))
    if best[1] > 0:
        label = best[0]
    return label
data['party'] = data['text'].apply(lambda t: tweet_classify(t, parties))
data['leader'] = data['text'].apply(lambda t: tweet_classify(t, leaders))
data.head(25)
Out[108]:
To add a sentiment column, we will use the polarity_scores() method from VADER that we briefly described earlier. We'll only look at the overall 'compound' polarity score.
In [109]:
data['sentiment'] = data['text'].apply(lambda t: sia.polarity_scores(t)['compound'])
In [48]:
data.describe() # summarise the table
Out[48]:
Let's inspect the 25 most positive Tweets:
In [57]:
data.sort_values(by="sentiment", ascending=False).head(25)
Out[57]:
Let's print out the text of the Tweet in row 15079.
In [63]:
print(data.iloc[15079]['text'])
Now let's have a peek at the 25 most negative Tweets.
In [110]:
data.sort_values(by="sentiment").head(25)
Out[110]:
And here is the text of the Tweet at row 5069:
In [84]:
print(data.iloc[5069]['text'])
In the next few examples, we group the Tweets together either by leader or by party, and then look at some summary statistics.
In [34]:
grouped_leader = data['sentiment'].groupby(data['leader'])
grouped_leader.mean()
Out[34]:
In [30]:
grouped_party = data['sentiment'].groupby(data['party'])
grouped_party.mean()
Out[30]:
In [35]:
grouped_leader.count()
Out[35]:
In [36]:
grouped_party.count()
Out[36]:
In [86]:
grouped_party.max()
Out[86]:
In [87]:
grouped_leader.max()
Out[87]:
It's not hard to find examples where something close to full natural language understanding is required to determine the correct polarity.
In [113]:
sia.polarity_scores("David Cameron doesn't seem to have done too badly until now." +
"Otherwise #milifandom and #cleggers would be attacking him for these bad things.")
Out[113]:
A further challenge in sentiment analysis is deciding on the right level of granularity for the topic under discussion. Often we can agree on the overall polarity of a sentence (or even of larger texts) because there is a single dominant topic. But in a list-like construction such as the following, different sentiments are associated with different entities, and there is no sensible way of aggregating them into a combined polarity score for the text as a whole:
<i>@hugorifkind Audience - good. Mili - bad. Clegg - a bit sad. Cam - unscathed</i>
Finally, as we have already seen, current approaches to language processing struggle with sarcasm, irony and satire, since these devices (intentionally) lead to polarity reversals.