Analyzing convention speeches with Google Language API

This notebook accompanies my Medium article: Is this presidential election more negative then years past? Yes.

API Key

To repeat my analysis, you need a Google Cloud Platform account (use the free trial). Then, visit API console, choose "Credentials" on the left-hand menu. Choose "API Key" and generate a server key for your application. You should probably restrict it by IP address to prevent abuse, but for now, just leave that field blank and delete the API key after trying out this demo.

Copy-paste your API Key here:


In [2]:
APIKEY="AIzaSyBNa0Hw5_SZpmQP2-iXgUfchVHa4Ot956M"

Note: Make sure you got an API Key and pasted it above. Mine won't work for you

From the same API console, choose "Dashboard" on the left-hand menu and "Enable API".

Finally, because we are calling the APIs from Python (clients in many other languages are available), let's install the Python package (it's not installed by default on Datalab)


In [3]:
# Copyright 2016 Google Inc.
# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
!pip install --upgrade google-api-python-client


Requirement already up-to-date: google-api-python-client in /usr/local/lib/python2.7/dist-packages
Requirement already up-to-date: six<2,>=1.6.1 in /usr/local/lib/python2.7/dist-packages (from google-api-python-client)
Requirement already up-to-date: uritemplate<1,>=0.6 in /usr/local/lib/python2.7/dist-packages (from google-api-python-client)
Requirement already up-to-date: httplib2<1,>=0.8 in /usr/local/lib/python2.7/dist-packages (from google-api-python-client)
Collecting oauth2client (from google-api-python-client)
  Downloading oauth2client-3.0.0.tar.gz (77kB)
    100% |################################| 81kB 660kB/s 
Requirement already up-to-date: simplejson>=2.5.0 in /usr/local/lib/python2.7/dist-packages (from uritemplate<1,>=0.6->google-api-python-client)
Requirement already up-to-date: pyasn1>=0.1.7 in /usr/local/lib/python2.7/dist-packages (from oauth2client->google-api-python-client)
Requirement already up-to-date: pyasn1-modules>=0.0.5 in /usr/local/lib/python2.7/dist-packages (from oauth2client->google-api-python-client)
Requirement already up-to-date: rsa>=3.1.4 in /usr/local/lib/python2.7/dist-packages (from oauth2client->google-api-python-client)
Building wheels for collected packages: oauth2client
  Running setup.py bdist_wheel for oauth2client ... - done
  Stored in directory: /root/.cache/pip/wheels/b3/44/e9/e56c5a2ca8869305f38254df012fb16b2807eb9d5d55291e8b
Successfully built oauth2client
Installing collected packages: oauth2client
  Found existing installation: oauth2client 2.2.0
    Uninstalling oauth2client-2.2.0:
      Successfully uninstalled oauth2client-2.2.0
Successfully installed oauth2client-3.0.0
You are using pip version 8.1.1, however version 8.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Sentiment analysis with Language API

Let's evaluate the sentiment of nomination acceptance speeches for the last 5 election cycles. The text of the speeches will be downloaded from U. California, Santa Barbara.


In [4]:
from googleapiclient.discovery import build
import pandas as pd
import numpy as np
import urllib2

lservice = build('language', 'v1beta1', developerKey=APIKEY)
speeches = [
  ['Hillary Clinton', 'D', 2016, 'http://www.presidency.ucsb.edu/ws/index.php?pid=118051'],
  ['Donald Trump', 'R', 2016, 'http://www.presidency.ucsb.edu/ws/index.php?pid=117935'],
  ['Barack Obama', 'D', 2012, 'http://www.presidency.ucsb.edu/ws/index.php?pid=101968'],
  ['Mitt Romney', 'R', 2012, 'http://www.presidency.ucsb.edu/ws/index.php?pid=101966'],
  ['Barack Obama', 'D', 2008, 'http://www.presidency.ucsb.edu/ws/index.php?pid=78284'],
  ['John McCain', 'R', 2008, 'http://www.presidency.ucsb.edu/ws/index.php?pid=78576'],
  ['John Kerry', 'D', 2004, 'http://www.presidency.ucsb.edu/ws/index.php?pid=25971'],
  ['George W Bush', 'R', 2004, 'http://www.presidency.ucsb.edu/ws/index.php?pid=72727'],
  ['Al Gore', 'D', 2000, 'http://www.presidency.ucsb.edu/ws/index.php?pid=25963'],
  ['George W Bush', 'R', 2000, 'http://www.presidency.ucsb.edu/ws/index.php?pid=25954']
]

sentiment = []
for (speaker, party, year, url) in speeches:
  text_of_speech = urllib2.urlopen(url).read()
  response = lservice.documents().analyzeSentiment(
    body={
      'document': {
         'type': 'HTML',
         'content': unicode(text_of_speech, errors='ignore')
      }
    }).execute()
  polarity = response['documentSentiment']['polarity']
  magnitude = response['documentSentiment']['magnitude']
  print('POLARITY=%s MAGNITUDE=%s SPEAKER=%s' % (polarity, magnitude, speaker))
  sentiment.extend([speaker, party, year, float(polarity), float(magnitude)])


POLARITY=0.2 MAGNITUDE=37.4 SPEAKER=Hillary Clinton
POLARITY=0 MAGNITUDE=33.5 SPEAKER=Donald Trump
POLARITY=0.3 MAGNITUDE=40.7 SPEAKER=Barack Obama
POLARITY=0.4 MAGNITUDE=31.7 SPEAKER=Mitt Romney
POLARITY=0.1 MAGNITUDE=28.1 SPEAKER=Barack Obama
POLARITY=0.3 MAGNITUDE=39.7 SPEAKER=John McCain
POLARITY=0.4 MAGNITUDE=39.8 SPEAKER=John Kerry
POLARITY=0.3 MAGNITUDE=37.8 SPEAKER=George W Bush
POLARITY=0.4 MAGNITUDE=33.2 SPEAKER=Al Gore
POLARITY=0.4 MAGNITUDE=40.7 SPEAKER=George W Bush

In [6]:
df = pd.DataFrame(data=np.array(sentiment).reshape(10,5),
                  columns=['speaker', 'party', 'year', 'polarity', 'magnitude'])
for col in ['year', 'polarity', 'magnitude']:
  df[col] = pd.to_numeric(df[col])
print df


           speaker party  year  polarity  magnitude
0  Hillary Clinton     D  2016       0.2       37.4
1     Donald Trump     R  2016       0.0       33.5
2     Barack Obama     D  2012       0.3       40.7
3      Mitt Romney     R  2012       0.4       31.7
4     Barack Obama     D  2008       0.1       28.1
5      John McCain     R  2008       0.3       39.7
6       John Kerry     D  2004       0.4       39.8
7    George W Bush     R  2004       0.3       37.8
8          Al Gore     D  2000       0.4       33.2
9    George W Bush     R  2000       0.4       40.7

In [7]:
df = df.sort_values('year')
df.plot(x='speaker',  y='polarity', kind='bar')


Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fa538eaefd0>

In [8]:
df.plot(x='speaker',  y='magnitude', kind='bar')


Out[8]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fa53833fc50>

In [11]:
df[df['party'] == 'D'].mean()


Out[11]:
year         2008.00
polarity        0.28
magnitude      35.84
dtype: float64

In [12]:
df[df['party'] == 'R'].mean()


Out[12]:
year         2008.00
polarity        0.28
magnitude      36.68
dtype: float64

In [ ]: