An overview of some of the data management tools in Python's Pandas package. Includes:

* Selecting observations
* Indexing
* Groupby
* Stacking
* Doubly indexed dataframes
* Combining dataframes (concat)

This notebook was written by Dave Backus for the NYU Stern course Data Bootcamp.
In [1]:
import pandas as pd
%matplotlib inline
In [37]:
data = {'countrycode': ['CHN', 'CHN', 'CHN', 'FRA', 'FRA', 'FRA', 'USA', 'USA', 'USA'],
        'pop': [1124.7939240000001, 1246.8400649999999, 1318.1701519999999, 58.183173999999994,
                60.764324999999999, 64.731126000000003, 253.33909699999998, 282.49630999999999,
                310.38394799999998],
        'rgdpe': [2611027.0, 4951485.0, 11106452.0, 1293837.0, 1752570.125, 2031723.25,
                  7964788.5, 11494606.0, 13151344.0],
        'year': [1990, 2000, 2010, 1990, 2000, 2010, 1990, 2000, 2010]}
pwt = pd.DataFrame(data)
pwt
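The tools listed at the top can be tried on a dataframe like `pwt`. The cell below is a minimal sketch, not the notebook's own worked solution: it rebuilds a smaller version of the data (population only, rounded to one decimal for brevity) and applies one call per topic. The variable names `small`, `recent`, `indexed`, `mean_pop`, `wide`, and `combined` are illustrative assumptions.

```python
import pandas as pd

# Smaller stand-in for pwt: population only, rounded to one decimal (an assumption for brevity)
small = pd.DataFrame({
    'countrycode': ['CHN', 'CHN', 'CHN', 'FRA', 'FRA', 'FRA', 'USA', 'USA', 'USA'],
    'pop': [1124.8, 1246.8, 1318.2, 58.2, 60.8, 64.7, 253.3, 282.5, 310.4],
    'year': [1990, 2000, 2010, 1990, 2000, 2010, 1990, 2000, 2010],
})

# Selecting observations: keep the rows where a Boolean condition is true
recent = small[small['year'] == 2010]

# Indexing: use two columns as the index, giving a doubly indexed dataframe
indexed = small.set_index(['countrycode', 'year'])

# Groupby: average population by country across the three years
mean_pop = small.groupby('countrycode')['pop'].mean()

# Stacking: unstack moves the 'year' index level into columns (long to wide);
# calling .stack() on the result would move it back into the index
wide = indexed['pop'].unstack('year')

# Combining dataframes: concat glues dataframes together row by row
combined = pd.concat([small[small['year'] == 1990], recent])

print(wide)
```

With the index set, a single observation can be looked up by a (country, year) pair, for example `indexed.loc[('CHN', 2000), 'pop']`.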
### UN Population Data
In [ ]:
In [ ]: