Using data from this FiveThirtyEight post (http://fivethirtyeight.com/datalab/opinions-about-the-iran-deal-are-more-about-obama-than-iran/) , write code to calculate the correlation of the responses from the poll. Respond to the story in your PR. Is this a good example of data journalism? Why or why not?


In [11]:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

In [12]:
df=pd.read_csv('foxnewspoll.csv')

In [13]:
lm = smf.ols(formula="Favor_IranDeal~Approve_Obama",data=df).fit()

In [14]:
intercept, slope = lm.params

In [15]:
fig, ax = plt.subplots(figsize=(7,7))
plt.style.use('fivethirtyeight')

ax = df.plot(ax = ax, kind='scatter', x="Approve_Obama", y="Favor_IranDeal")
plt.plot(df['Approve_Obama'],slope*df['Approve_Obama']+intercept, color="red", linewidth=2)

ax.set_xlim(0,1)
ax.set_ylim(0,1)

ax.set_ylabel("Favor Iran deal")
ax.set_xlabel("Approve of Obama")


Out[15]:
<matplotlib.text.Text at 0x10e2270f0>

In [16]:
df['Confident_IranNeg'] = df["Very confident_IranNeg"] + df['Somewhat confident_IranNeg']
lm2 = smf.ols(formula="Confident_IranNeg~Approve_Obama",data=df).fit()

In [17]:
intercept, slope = lm2.params

In [18]:
fig, ax = plt.subplots(figsize=(7,7))
plt.style.use('fivethirtyeight')

ax = df.plot(ax = ax, kind='scatter', x="Approve_Obama", y="Confident_IranNeg")
plt.plot(df['Approve_Obama'],slope*df['Approve_Obama']+intercept, color="red", linewidth=2)

ax.set_xlim(0,1)
ax.set_ylim(0,1)

ax.set_ylabel("Confident in administration's Iran negotiations")
ax.set_xlabel("Approve of Obama")


Out[18]:
<matplotlib.text.Text at 0x10e26b048>