Python for Data Analysis Lightning Tutorials is a series of tutorials on data analysis, statistics, and graphics using Python. The Pandas Cookbook tutorials within it start with recipes for common tasks and then move on to more advanced topics in statistics and time series analysis.
Created by Alfred Essa, Dec 15th, 2013
Note: The IPython notebook and data files can be found on my GitHub site: http://github/alfredessa
The DataFrame data structure in Pandas is a two-dimensional labeled array: each row is identified by an index label and each column by a name.
Here's an example where we set the Date column to be the index, so that its values label the rows.
In [ ]:
import pandas as pd
import datetime
In [ ]:
# create a list containing dates from 12-01 to 12-07
dt = datetime.datetime(2013,12,1)
end = datetime.datetime(2013,12,8)
step = datetime.timedelta(days=1)
dates = []
In [ ]:
# populate the list, one day at a time (end is exclusive)
while dt < end:
    dates.append(dt.strftime('%m-%d'))
    dt += step
In [ ]:
dates
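The same list can also be built without an explicit loop. The sketch below is an alternative, not the approach used in this notebook: it assumes pandas' `date_range` and the `strftime` method on the resulting index, and the name `dates_alt` is introduced only for illustration.

```python
import pandas as pd

# Seven consecutive days starting 2013-12-01, formatted as 'MM-DD' strings.
# dates_alt is a hypothetical name used only in this sketch.
dates_alt = (
    pd.date_range(start='2013-12-01', periods=7, freq='D')
    .strftime('%m-%d')
    .tolist()
)
print(dates_alt)
```

Keeping the dates as a DatetimeIndex (that is, skipping the `strftime` step) would likely be the better choice for time series work, since string labels lose the date arithmetic.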
In [ ]:
d = {'Date': dates, 'Tokyo': [15, 19, 15, 11, 9, 8, 13], 'Paris': [-2, 0, 2, 5, 7, -5, -3], 'Mumbai': [20, 18, 23, 19, 25, 27, 23]}
In [ ]:
d
In [ ]:
temps = pd.DataFrame(d)
In [ ]:
# selecting a single column by name returns a Series
ntemp = temps['Mumbai']
In [ ]:
ntemp
In [ ]:
# use the Date column as the row labels; set_index returns a new DataFrame
temps = temps.set_index('Date')
In [ ]:
temps
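Once the Date values label the rows, individual rows and cells can be looked up by label. The following is a minimal sketch of that idea using `.loc`; it rebuilds the same temperature table under the hypothetical name `temps_demo` so that it runs on its own.

```python
import pandas as pd

# Rebuild the table from this tutorial so the sketch is self-contained.
# temps_demo is a hypothetical name used only here.
temps_demo = pd.DataFrame({
    'Date': ['12-01', '12-02', '12-03', '12-04', '12-05', '12-06', '12-07'],
    'Tokyo': [15, 19, 15, 11, 9, 8, 13],
    'Paris': [-2, 0, 2, 5, 7, -5, -3],
    'Mumbai': [20, 18, 23, 19, 25, 27, 23],
}).set_index('Date')

# One row by its label: the result is a Series keyed by city name
row = temps_demo.loc['12-03']

# One cell by row label and column label
paris_dec5 = temps_demo.loc['12-05', 'Paris']

# A label slice includes both endpoints, unlike a positional slice
tokyo_first_three = temps_demo.loc['12-01':'12-03', 'Tokyo']

print(row)
print(paris_dec5)
print(tokyo_first_three)
```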
In [ ]:
titanic = pd.read_csv('data/titanic.csv')
In [ ]:
titanic.Survived.value_counts()
In [ ]:
medals = pd.read_csv('data/olympicmedals.csv')
In [ ]:
medals.tail()
In [ ]:
medals.Sport.value_counts()
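The `value_counts()` calls above tally how often each distinct value occurs in a column. Since the CSV files may not be at hand, here is a small sketch on made-up values; the data below is hypothetical and is not taken from titanic.csv or olympicmedals.csv.

```python
import pandas as pd

# Hypothetical stand-in for a 0/1 column such as Survived
survived = pd.Series([0, 1, 1, 0, 0, 1, 0], name='Survived')

# Count of each distinct value, most frequent first
counts = survived.value_counts()

# The same tally expressed as proportions that sum to 1
shares = survived.value_counts(normalize=True)

print(counts)
print(shares)
```

The result is itself a Series whose index holds the distinct values, so `counts.loc[0]` looks up how many times 0 occurred.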