In [10]:
# xlrd 2.0+ reads only .xls files; pandas uses openpyxl for .xlsx
!pip install xlrd openpyxl
!pip install matplotlib
import matplotlib.pyplot as plt
In [11]:
import pandas as pd
%matplotlib inline
In [12]:
df = pd.read_excel("2013_NYC_CD_MedianIncome_Recycle.xlsx")
In [20]:
df
Out[20]:
In [14]:
recycling_rate = df['RecycleRate']
recycling_rate.head()
Out[14]:
In [15]:
median_income = df['MdHHIncE']
median_income.head()
Out[15]:
In [16]:
# numeric_only=True skips text columns, which pandas 2.x no longer drops silently
df.corr(numeric_only=True)
Out[16]:
In [18]:
plt.style.use('fivethirtyeight')
In [1]:
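# Scatter plot of recycling rate against median household income.
# The Axes object is named ax so it is not confused with the x= keyword.
ax = df.plot(kind='scatter', x='MdHHIncE', y='RecycleRate', color='orange', figsize=(7, 5))
ax.set_ylim([0.05, 0.35])
ax.set_xlim([10000, 140000])
ax.set_title("The more people earn, the more they recycle")
ax.set_xlabel("Median Income")
ax.set_ylabel("Recycle Rate")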
The correlation coefficient is 0.88. I titled my graph 'The more people earn, the more they recycle' because that is what the data shows, but now that I am writing it for a second time it sounds too general, and it makes it seem as if earning more money causes people to recycle (causation). Recycling does not depend directly on how much money someone makes. Still, median income and recycling rates are strongly correlated in this data, which could have something to do with what people who earn more know about recycling and how they view it.
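To read a single coefficient without scanning the whole correlation matrix, the two columns can be compared directly. This is a minimal sketch, not the cell I ran: `pearson_r` is a hypothetical helper name, and the `toy` numbers are placeholders made up for illustration, not the NYC data. With the real file, `df` would be passed in place of `toy`.

```python
import pandas as pd

def pearson_r(frame, x_col, y_col):
    """Return the Pearson correlation coefficient between two columns."""
    # Series.corr uses method='pearson' by default
    return frame[x_col].corr(frame[y_col])

# Placeholder values for illustration only (NOT the real NYC data)
toy = pd.DataFrame({
    "MdHHIncE": [20000, 40000, 60000, 80000, 100000],
    "RecycleRate": [0.10, 0.14, 0.15, 0.22, 0.24],
})

r = pearson_r(toy, "MdHHIncE", "RecycleRate")
print(f"r = {r:.2f}, r squared = {r ** 2:.2f}")
```

The coefficient is a number between -1 and 1 rather than a percentage, and it is symmetric: swapping the two columns gives the same value. That symmetry is one more reason it cannot say which variable, if either, drives the other.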