Calculate the correlation between the recycling rate and the median income. Discuss your findings in your PR.


In [10]:
!pip install xlrd
!pip install matplotlib
import matplotlib.pyplot as plt


Requirement already satisfied (use --upgrade to upgrade): xlrd in /usr/local/lib/python3.5/site-packages
Requirement already satisfied (use --upgrade to upgrade): matplotlib in /usr/local/lib/python3.5/site-packages
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.6 in /usr/local/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): python-dateutil in /usr/local/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): pytz in /usr/local/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): pyparsing!=2.0.0,!=2.0.4,>=1.5.6 in /usr/local/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): cycler in /usr/local/lib/python3.5/site-packages (from matplotlib)
Requirement already satisfied (use --upgrade to upgrade): six>=1.5 in /usr/local/lib/python3.5/site-packages (from python-dateutil->matplotlib)

In [11]:
import pandas as pd
%matplotlib inline

In [12]:
df = pd.read_excel("2013_NYC_CD_MedianIncome_Recycle.xlsx")

In [20]:
df


Out[20]:
    CD_Name                                           MdHHIncE  RecycleRate
 0  Battery Park City, Greenwich Village & Soho         119596     0.286771
 1  Battery Park City, Greenwich Village & Soho         119596     0.264074
 2  Chinatown & Lower East Side                          40919     0.156485
 3  Chelsea, Clinton & Midtown Business Distric          92583     0.235125
 4  Chelsea, Clinton & Midtown Business Distric          92583     0.246725
 5  Murray Hill, Gramercy & Stuyvesant Town             101769     0.222046
 6  Upper West Side & West Side                          96009     0.256809
 7  Upper East Side                                     104602     0.253719
 8  Hamilton Heights, Manhattanville & West Harlem       41736     0.155888
 9  Central Harlem                                       36468     0.133018
10  East Harlem                                          30335     0.140438
11  Washington Heights, Inwood & Marble Hill             37685     0.149605
12  Hunts Point, Longwood & Melrose                      21318     0.104569
13  Hunts Point, Longwood & Melrose                      21318     0.103643
14  Belmont, Crotona Park East & East Tremont            22343     0.119219
15  Concourse, Highbridge & Mount Eden                   25745     0.103573
16  Morris Heights, Fordham South & Mount Hope           24517     0.119646
17  Belmont, Crotona Park East & East Tremont            22343     0.110713
18  Bedford Park, Fordham North & Norwood                30541     0.136455
19  Riverdale, Fieldston & Kingsbridge                   56877     0.221890
20  Castle Hill, Clason Point & Parkchester              34779     0.105807
21  Co-op City, Pelham Bay & Schuylerville               54685     0.214509
22  Pelham Parkway, Morris Park & Laconia                43503     0.163576
23  Wakefield, Williamsbridge & Woodlawn                 43541     0.182580
24  Greenpoint & Williamsburg                            50778     0.141621
25  Brooklyn Heights & Fort Greene                       73290     0.237205
26  Bedford-Stuyvesant                                   36528     0.125818
27  Bushwick                                             38274     0.132463
28  East New York & Starrett City                        33700     0.114030
29  Park Slope, Carroll Gardens & Red Hook               93969     0.302798
30  Sunset Park & Windsor Terrace                        43351     0.197697
31  Crown Heights North & Prospect Heights               41075     0.156241
32  Crown Heights South, Prospect Lefferts & Wingate     41095     0.115119
33  Bay Ridge & Dyker Heights                            57006     0.220855
34  Bensonhurst & Bath Beach                             48252     0.183393
35  Borough Park, Kensington & Ocean Parkway             38215     0.156080
36  Brighton Beach & Coney Island                        30159     0.134260
37  Flatbush & Midwood                                   41681     0.145995
38  Sheepshead Bay, Gerritsen Beach & Homecrest          49392     0.193802
39  Brownsville & Ocean Hill                             27772     0.091464
40  East Flatbush, Farragut & Rugby                      45954     0.134002
41  Canarsie & Flatlands                                 63106     0.174876
42  Astoria & Long Island City                           50716     0.215254
43  Sunnyside & Woodside                                 54136     0.198388
44  Jackson Heights & North Corona                       47555     0.137919
45  Elmhurst & South Corona                              45661     0.130604
46  Ridgewood, Glendale & Middle Village                 54924     0.214185
47  Forest Hills & Rego Park                             64372     0.210247
48  Flushing, Murray Hill & Whitestone                   51251     0.192124
49  Briarwood, Fresh Meadows & Hillcrest                 59124     0.194293
50  Richmond Hill & Woodhaven                            58578     0.187987
51  Howard Beach & Ozone Park                            60828     0.183898
52  Bayside, Douglaston & Little Neck                    74960     0.253064
53  Jamaica, Hollis & St. Albans                         51251     0.157345
54  Queens Village, Cambria Heights & Rosedale           76002     0.196679
55  Far Rockaway, Breezy Point & Broad Channel           46944     0.123351
56  Port Richmond, Stapleton & Mariner's Harbor          57975     0.196748
57  New Springville & South Beach                        71925     0.211485
58  Tottenville, Great Kills & Annadale                  84670     0.210379

In [14]:
recycling_rate = df['RecycleRate']
recycling_rate.head()


Out[14]:
0    0.286771
1    0.264074
2    0.156485
3    0.235125
4    0.246725
Name: RecycleRate, dtype: float64

In [15]:
median_income = df['MdHHIncE']
median_income.head()


Out[15]:
0    119596
1    119596
2     40919
3     92583
4     92583
Name: MdHHIncE, dtype: int64
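
The two Series extracted above can also be correlated with each other directly. The following is a minimal sketch, assuming a small stand-in frame named `sample` (a hypothetical name) built from six rows of the table above, so the printed value will differ from the result on the full data.

```python
import pandas as pd

# Hypothetical stand-in: six rows copied from the table above.
# In the notebook itself, df already holds all 59 rows.
sample = pd.DataFrame({
    "MdHHIncE": [119596, 40919, 92583, 101769, 21318, 56877],
    "RecycleRate": [0.286771, 0.156485, 0.235125, 0.222046, 0.104569, 0.221890],
})

median_income = sample["MdHHIncE"]
recycling_rate = sample["RecycleRate"]

# Series.corr computes Pearson's r by default and is symmetric in its arguments.
r = recycling_rate.corr(median_income)
print(round(r, 3))
```

With the full data, the same call on `recycling_rate` and `median_income` should agree with the off-diagonal entry of the correlation matrix.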

In [16]:
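# Select the numeric columns explicitly; recent pandas versions raise an error on the text column CD_Name.
df[['MdHHIncE', 'RecycleRate']].corr()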


Out[16]:
             MdHHIncE  RecycleRate
MdHHIncE     1.000000     0.884783
RecycleRate  0.884783     1.000000
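
To make the number in this matrix less of a black box, here is a rough sketch of Pearson's r written out from its definition: the sum of the products of the deviations from the mean, divided by the square root of the product of the summed squared deviations. The helper name `pearson_r` and the six-row sample are assumptions for illustration only; the rows are copied from the table above, so the result differs from the full-data value.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r computed from deviations around the mean (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Rows 0, 2, 3, 5, 6 and 7 of the table above (a sample, not the full data).
income = [119596, 40919, 92583, 101769, 96009, 104602]
rate = [0.286771, 0.156485, 0.235125, 0.222046, 0.256809, 0.253719]

r = pearson_r(income, rate)
print(round(r, 3))
```

The value always lies between -1 and 1, and it should match what `np.corrcoef` returns for the same inputs.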

In [18]:
plt.style.use('fivethirtyeight')

In [1]:
# Requires df from the pd.read_excel cell above; in a fresh kernel this cell
# fails with "NameError: name 'df' is not defined".
ax = df.plot(kind='scatter', x='MdHHIncE', y='RecycleRate', color='orange', figsize=(7, 5))
ax.set_ylim([0.05, 0.35])
ax.set_xlim([10000, 140000])
ax.set_title("The more people earn, the more they recycle")
ax.set_xlabel("Median Income")
ax.set_ylabel("Recycle Rate")
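
A least-squares trend line can make the relationship in the scatter plot easier to see. The sketch below is one possible way to add it, assuming a hypothetical stand-in frame `sample` with eight rows copied from the table above; in the notebook the same calls would use `df`. The line is fitted with `np.polyfit`, which is a choice made here and not part of the original analysis.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in: eight rows copied from the table above.
sample = pd.DataFrame({
    "MdHHIncE": [119596, 40919, 92583, 101769, 21318, 56877, 36468, 73290],
    "RecycleRate": [0.286771, 0.156485, 0.235125, 0.222046,
                    0.104569, 0.221890, 0.133018, 0.237205],
})

# A degree-1 polyfit returns the slope and intercept of the least-squares line.
slope, intercept = np.polyfit(sample["MdHHIncE"], sample["RecycleRate"], 1)

ax = sample.plot(kind="scatter", x="MdHHIncE", y="RecycleRate",
                 color="orange", figsize=(7, 5))

# Draw the fitted line across the observed income range.
xs = np.linspace(sample["MdHHIncE"].min(), sample["MdHHIncE"].max(), 100)
ax.plot(xs, slope * xs + intercept, color="gray", linewidth=2)
ax.set_xlabel("Median Income")
ax.set_ylabel("Recycle Rate")
```

The slope is expressed in recycling-rate units per dollar of median income, so it is a very small number even when the relationship is strong.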

The correlation coefficient is about 0.88 (Pearson's r), which indicates a strong positive linear relationship between median household income and recycling rate across these community districts. I titled my graph 'The more people earn, the more they recycle' because that is the pattern in the data, but now that I am writing it out a second time, the title sounds too general and makes it seem as if earning more money makes people recycle (causation). The data do not show that recycling depends on how much money someone makes; they only show that median income and recycling rates are correlated. That correlation could have something to do with what people in higher-income districts know about recycling and how they view it.
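
As a small worked example of how to read the coefficient: r is not a percentage, but in a simple linear fit its square gives the share of the variance in one variable that the other accounts for. The sketch below only squares the value reported in the correlation matrix above; it is an illustration, not an additional analysis.

```python
# Pearson's r as reported by the correlation matrix above.
r = 0.884783

# In a simple linear fit, r squared is the share of variance explained.
r_squared = r ** 2

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")  # r = 0.885, r^2 = 0.783
```

Read this way, median income accounts for roughly 78% of the variation in recycling rate in a linear fit, which still says nothing about causation.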