Assignment 4

  • Using data from this FiveThirtyEight post, write code to calculate the correlation of the responses from the poll.
  • Respond to the story in your PR. Is this a good example of data journalism? Why or why not?

In [11]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf  # not used in the cells shown here

plt.style.use('ggplot')  # ggplot-style theme for all figures

In [2]:
df = pd.read_excel("obama.xlsx")

In [3]:
df.head()


Out[3]:
  category  Obama  Favor Iran Deal  Confident in adm
0      Dem     78               60                78
1      Rep     10               34                17
2      Ind     37               44                44
3      Men     41               46                45
4    Women     47               47                52

In [4]:
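# Correlate only the numeric columns; the text column 'category' is excluded.
df.select_dtypes(include='number').corr()['Obama'].sort_values(ascending=False)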


Out[4]:
Obama               1.000000
Confident in adm    0.991553
Favor Iran Deal     0.913868
Name: Obama, dtype: float64
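
As a rough illustration of what `corr()` computes by default (the Pearson correlation coefficient), here is a minimal pure-Python sketch. It assumes only the five rows shown by `df.head()` above, so its results will not match the output of In [4], which uses every row in the spreadsheet. The helper name `pearson` and the three lists are hypothetical and not part of the original notebook.

```python
import math

# Assumption: only the five rows visible in df.head(), not the full spreadsheet.
obama = [78, 10, 37, 41, 47]
iran_deal = [60, 34, 44, 46, 47]
confident = [78, 17, 44, 45, 52]


def pearson(x, y):
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


print("Obama vs Confident in adm:", pearson(obama, confident))
print("Obama vs Favor Iran Deal:", pearson(obama, iran_deal))
```

With so few groups, and groups that overlap (men and women also belong to a party), a high coefficient here says little about individual respondents.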

In [5]:
# pandas.tools.plotting was removed from pandas; scatter_matrix now lives in pandas.plotting.
from pandas.plotting import scatter_matrix

In [15]:
scatter_matrix(df[['Obama', 'Favor Iran Deal', 'Confident in adm']],
               alpha=1, figsize=(10, 10), diagonal='kde')


Out[15]:
[Figure: 3x3 scatter matrix of Obama, Favor Iran Deal, and Confident in adm, with kernel density estimates on the diagonal]