Pandas is a very rich library for working with tabular data. It's especially good at dealing with time series and hierarchical indexes. Full documentation for version 0.15.2, the version used here, is available on the pandas website.
Note that you'll need to run git annex get data/2015-01-04-carto-export.csv before this will work!
In [1]:
# This is a not-very-pleasant way to look at the data
!head ../data/2015-01-04-carto-export.csv
post_id,post_title,post_date,post_excerpt,lat,lng,location_id,location_name,location_notes,street_address,city,state,zip,county,total_cost,start_year,completion_year,new_deal_agencies,new_deal_categories,artists,contractors,designers,status,menu_order,image
49119,"Snohomish County Drainage Improvements - Monroe WA",2014/10/25,""In Snohomish County, farm land conditions will be improved by a drainage system affecting seven sections of land between the cities of Snohomish and Monroe. Cooperating with the State of...",47.876538,-122.034003,,,"General marker for area within Snohomish County",,Monroe,WA,,Snohomish,,1937,,"Works Progress Administration (WPA)","Flood erosion and control, Utilities and Infrastructure, Water disposal",,,,,,
36443,"Plum Bayou Resettlement Project - Plum Bayou AR",2014/02/03,"Plum Bayou was the first settlement in Arkansas and in the United States (Arkansas Historic Preservation Program). Resettlement Administrator Rexford G. Tugwell, was present at the opening dedication...",34.295303,-91.91833,,,"Location marker approximate. Exact coordinates needed.",,"Plum Bayou",Arkansas,72182,Jefferson,,1935,1936,"Resettlement Administration (RA)","Resettlement Communities",,,,,,http://livingnewdeal.org/wp-content/uploads/2014/02/plum_bayou1_f-300x218.jpg
14921,"Temple Street Bridge - Los Angeles CA",2014/01/21,"The PWA built this large concrete bridge over Figueroa St.",34.0599094,-118.24851030000002,,,,"765-799 W Temple St","Los Angeles ",CA,90012,,,,1939,"Public Works Administration (PWA)","Roads, highways and bridges, Utilities and Infrastructure",,,,Marked,,http://livingnewdeal.org/wp-content/uploads/2013/07/Temple04-289x225.jpg
49364,"Husky Stadium Expansion - Seattle WA",2014/10/27,"The University of Washington's Husky Stadium was expanded during the 1930s as a result of WPA funding assistance and efforts. A WPA press release from Dec. 1937 announced $23,345 in funds for...",47.6503,-122.3015,2934,"University of Washington - Seattle WA",,,Seattle,WA,,,,,,"Works Progress Administration (WPA)","Parks and recreation, Stadiums",,,,,,
40496,"Nelson W. Aldrich High School - Warwick RI",2014/05/11,"A long, low Colonial Revival school with a portico and pediment. One of the last major commissions of its architects, William R. Walker & Son. Has served as a junior high school since...",41.754381,-71.41475400000002,,,,"789 Post Road",Warwick,RI,02888,Kent,,1934,1935,"Works Progress Administration (WPA)","Education, Schools",,,"William R. Walker & Son",,,
40497,"Oakland Beach School - Warwick RI",2014/05/11,"A mundane Colonial Revival structure serving the Oakland Beach neighborhood of Warwick. The architects were William R. Walker & Son of...",41.698839,-71.39910299999997,,,,"383 Oakland Beach Avenue",Warwick,RI,02889,Kent,,1933,1934,"Works Progress Administration (WPA)","Education, Schools",,,"William R. Walker & Son",,,
40498,"Municipal Utility Improvements - Auburn ME",2014/05/11,"According to an article in the Lewiston Evening Journal of January 3, 1935 by Gerald Reed, extensive utility work was undertaken in the city by a combination of the CWA, FERA, & ERA agencies....",44.0976659,-70.232664,,,,"268 Court St.",Auburn,ME,04210,,,1933,,"Civil Works Administration (CWA), Federal Emergency Relief Administration (FERA)","Public utilities and sanitation, Utilities and Infrastructure, Water disposal",,,,,,http://livingnewdeal.org/wp-content/uploads/2014/05/AuburnWS-300x214.jpg
40509,"Municipal improvements - Auburn ME",2014/05/11,"The Lewiston Evening Journal reported that by 1935, a combination of the CWA, FERA, and ERA had completed numerous work projects in Auburn Maine: A two mile hiking trail along the Little...",44.0978509,-70.23116549999997,,,"General marker for city of Auburn.",,Auburn,ME,04210,,,1933,1935,"Civil Works Administration (CWA), Federal Emergency Relief Administration (FERA)","Education, New Deal Work Site, Parks and recreation, Public buildings, Schools, Stadiums, Trails",,,,,,
40501,"Suburban Parkway Landscaping - Warwick RI",2014/05/11,"By 1940, the tracks of the former Warwick Railroad had been removed from the center of Suburban Parkway in Oakland Beach. As a WPA project, this center strip was landscaped.",41.6868847,-71.39776840000002,,,,"Suburban Parkway",Warwick,RI,02889,Kent,,1940,,"Works Progress Administration (WPA)","Roads, highways and bridges, Utilities and Infrastructure",,,,,,
In [9]:
import pandas
# I always print these versions in my notebooks to make sure things haven't changed on me...
print(pandas.__version__)
0.15.2
In [2]:
# This is much nicer!
carto_df = pandas.read_csv('../data/2015-01-04-carto-export.csv')
carto_df.head()
Out[2]:
| | post_id | post_title | post_date | post_excerpt | lat | lng | location_id | location_name | location_notes | street_address | ... | start_year | completion_year | new_deal_agencies | new_deal_categories | artists | contractors | designers | status | menu_order | image |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49119 | Snohomish County Drainage Improvements - Monro... | 2014/10/25 | "In Snohomish County, farm land condition... | 47.876538 | -122.034003 | NaN | NaN | General marker for area within Snohomish County | NaN | ... | 1937 | NaN | Works Progress Administration (WPA) | Flood erosion and control, Utilities and Infra... | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 36443 | Plum Bayou Resettlement Project - Plum Bayou AR | 2014/02/03 | Plum Bayou was the first settlement in Arkansa... | 34.295303 | -91.91833 | NaN | NaN | Location marker approximate. Exact coordinates... | NaN | ... | 1935 | 1936 | Resettlement Administration (RA) | Resettlement Communities | NaN | NaN | NaN | NaN | NaN | http://livingnewdeal.org/wp-content/uploads/20... |
| 2 | 14921 | Temple Street Bridge - Los Angeles CA | 2014/01/21 | The PWA built this large concrete bridge over ... | 34.0599094 | -118.24851030000002 | NaN | NaN | NaN | 765-799 W Temple St | ... | NaN | 1939 | Public Works Administration (PWA) | Roads, highways and bridges, Utilities and Inf... | NaN | NaN | NaN | Marked | NaN | http://livingnewdeal.org/wp-content/uploads/20... |
| 3 | 49364 | Husky Stadium Expansion - Seattle WA | 2014/10/27 | The University of Washington's Husky Stad... | 47.6503 | -122.3015 | 2934 | University of Washington - Seattle WA | NaN | NaN | ... | NaN | NaN | Works Progress Administration (WPA) | Parks and recreation, Stadiums | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | 40496 | Nelson W. Aldrich High School - Warwick RI | 2014/05/11 | A long, low Colonial Revival school with a por... | 41.754381 | -71.41475400000002 | NaN | NaN | NaN | 789 Post Road | ... | 1934 | 1935 | Works Progress Administration (WPA) | Education, Schools | NaN | NaN | William R. Walker & Son | NaN | NaN | NaN |

5 rows × 25 columns
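The output above elides the middle columns with "...". As a minimal sketch of how that truncation is controlled, the example below uses the display.max_columns option on a made-up 25-column frame; the frame and its column names (c0 to c24) are hypothetical stand-ins, not the real export.

```python
import pandas

# Hypothetical wide frame: 25 made-up columns named c0..c24, one row of zeros.
wide_df = pandas.DataFrame([[0] * 25], columns=['c%d' % i for i in range(25)])

# Temporarily limit the display to 4 columns: the middle ones are elided.
with pandas.option_context('display.max_columns', 4):
    truncated_text = repr(wide_df)

# None removes the limit, so every column is printed (wrapped across lines).
with pandas.option_context('display.max_columns', None):
    full_text = repr(wide_df)

print(truncated_text)
print(full_text)
```

In a notebook, calling pandas.set_option('display.max_columns', None) once before carto_df.head() should have the same effect for the rest of the session.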
In [5]:
# This gives summary statistics, but only for the columns pandas parsed as numeric
carto_df.describe()
Out[5]:
| | post_id | location_id | menu_order |
|---|---|---|---|
| count | 8943.000000 | 4143.000000 | 983.000000 |
| mean | 24169.356144 | 5141.478639 | 1.355036 |
| std | 17733.598439 | 2646.650055 | 2.486848 |
| min | 1.000000 | 730.000000 | 1.000000 |
| 25% | 7387.500000 | 2777.500000 | 1.000000 |
| 50% | 21266.000000 | 5045.000000 | 1.000000 |
| 75% | 41384.500000 | 6839.000000 | 1.000000 |
| max | 56469.000000 | 11345.000000 | 27.000000 |
That's not many columns! What's going on here?
In [6]:
carto_df.dtypes
Out[6]:
post_id int64
post_title object
post_date object
post_excerpt object
lat object
lng object
location_id float64
location_name object
location_notes object
street_address object
city object
state object
zip object
county object
total_cost object
start_year object
completion_year object
new_deal_agencies object
new_deal_categories object
artists object
contractors object
designers object
status object
menu_order float64
image object
dtype: object
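The dtypes listing explains the short describe() output: by default, describe() on a mixed frame summarizes only the numeric columns. Here is a small sketch of that behavior on a made-up frame; toy_df and its values are invented for illustration, and the lat column is forced to object dtype to mimic the export.

```python
import pandas

# Toy frame with made-up values that mirrors the mix of dtypes above.
toy_df = pandas.DataFrame({
    'post_id': [1, 2, 3],                                             # int64
    'lat': pandas.Series(['47.8', '34.2', 'unknown'], dtype=object),  # strings
    'menu_order': [1.0, None, 2.0],                                   # float64
})

# describe() keeps only the numeric columns, so 'lat' is missing here.
numeric_summary = toy_df.describe()
print(list(numeric_summary.columns))

# select_dtypes lists the columns pandas could not treat as numbers.
object_cols = toy_df.select_dtypes(include=['object']).columns.tolist()
print(object_cols)
```

Running the same select_dtypes call on carto_df would list the columns that still need attention.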
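So, pandas has stored most of the columns with the generic object dtype, which usually means strings. It doesn't know that columns like lat, lng, or start_year are meant to be numbers, which is why describe() skipped them. We have some data cleaning to do!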
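As a rough sketch of what that cleaning might look like, the example below converts string columns to numbers and dates on a tiny made-up frame. The values (including the bad 'not a number' entry) are invented, and pandas.to_numeric is an assumption about the environment: it was added in pandas 0.17, so it is not available in the 0.15.2 used in this notebook.

```python
import pandas

# Made-up rows shaped like a few columns of the export.
raw = pandas.DataFrame({
    'post_date': ['2014/10/25', '2014/02/03'],
    'lat': pandas.Series(['47.876538', 'not a number'], dtype=object),
    'start_year': pandas.Series(['1937', None], dtype=object),
})

cleaned = raw.copy()

# errors='coerce' turns anything unparseable into NaN instead of raising.
cleaned['lat'] = pandas.to_numeric(cleaned['lat'], errors='coerce')
cleaned['start_year'] = pandas.to_numeric(cleaned['start_year'], errors='coerce')

# The dates in the export look like YYYY/MM/DD, so parse with that format.
cleaned['post_date'] = pandas.to_datetime(cleaned['post_date'], format='%Y/%m/%d')

print(cleaned.dtypes)
```

Coercing bad values to NaN keeps the rows but hides the problem entries, so it is worth counting the NaN values before and after to see how much was lost.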
Content source: marwahaha/python-fundamentals