In [ ]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
I hope you actually tried the examples in
lesson 1
because you are required to use the sflights.h5
file you have created
as you read along the
lesson 1
notebook. When we use the sflights.h5
file, the data
DataFrame
looks as follows:
print(data.head())
month Day dTime aDelay dDelay depart arrive distance
0 1 3 1806 -3 -4 BWI CLT 361
1 1 4 1805 4 -5 BWI CLT 361
2 1 5 1821 23 11 BWI CLT 361
3 1 6 1807 10 -3 BWI CLT 361
4 1 7 1810 20 0 BWI CLT 361
[5 rows x 8 columns]
In [ ]:
data = pd.read_hdf('/data/database/sflights.h5', 'table')
print(data.head())
Write a function named pivot_table()
that takes a DataFrame.
It should return a DataFrame that has the *number count of flights that
month
and Day
.You should also pivot the data to make the pivot table
so that we can use the resulting DataFrame in sns.heatmap()
.
Week 12 lesson 2
shows you how to do this.
When I ran
pivoted = pivot_table(data)
print(pivoted)
I got
Day 1 2 3 4 5 6 7
month
1 1112 984 1157 599 1205 368 800
2 436 550 978 1097 1071 774 950
3 874 500 481 1317 1332 727 807
4 1147 678 903 1080 948 881 1077
5 1433 852 840 1126 1184 349 596
6 679 1410 1031 911 1110 853 651
7 1018 1296 1138 751 457 752 1683
8 673 733 1001 1866 1240 1199 822
9 588 331 373 447 772 655 1150
10 597 498 822 811 537 544 646
11 474 338 318 592 644 535 404
12 420 304 909 640 611 613 1035
[12 rows x 7 columns]
In [ ]:
def pivot_table(df):
'''
Takes a DataFrame and returns a DataFrame.
Parameters
----------
df: A Pandas DataFrame, e.g. pd.read_hdf('/data/database/sflights.h5', 'table')
Returns
-------
A Pandas DataFrame. It's a pivot table of delayed flihts
as a function of month and Day. See assignment 12.2 instructions.
'''
#### your code goes here ###
return result
Run the following code cell to test your function.
In [ ]:
pivoted = pivot_table(data)
print(pivoted)
In [ ]:
#### your code goes here ####