Pandas Aggregation Exercises

See also: Groupby: Split-Apply-Combine



In [ ]:
# 1. Import pandas as pd and numpy as np, then use pd.read_csv() to load the CFPB CSV and assign the result to 'df'.

In [ ]:
# 2. Use the dataframe's groupby() method to create a groupby object on 'Product'. Assign it to 'gb'.

In [ ]:
# 3. Look at your groupby object's .groups attribute. What type is it? Call this attribute's keys() method.

In [ ]:
# 4. Now call gb.groups.values(). What do those values represent?
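
As a hint for exercises 3 and 4, here is a minimal sketch on a small made-up dataframe (the 'toy' frame and its 'Amount' column are hypothetical, not the CFPB data). It shows one way to inspect what .groups maps from and to.

```python
import pandas as pd

# Toy data, made up for illustration (not the CFPB file)
toy = pd.DataFrame({
    "Product": ["Mortgage", "Debt collection", "Mortgage", "Credit card"],
    "Amount": [100.0, 50.0, 300.0, 20.0],
})
toy_gb = toy.groupby("Product")

# .groups maps each group label to the index labels of the rows in that group
group_labels = sorted(toy_gb.groups.keys())
mortgage_rows = list(toy_gb.groups["Mortgage"])

print(group_labels)
print(mortgage_rows)
```

The same inspection should carry over to 'gb' once 'df' is loaded.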

In [ ]:
# 5. Assign gb.groups.keys() to the variable 'groups'. Use the Python built-in list() to turn it into 'group_list'.

In [ ]:
# 6. Use the gb.get_group() method to pull up a dataframe of the 'Debt collection' items.

In [ ]:
# 7. Use the size() method to get the number of rows for each product.

In [ ]:
# 8. The gb object has columns like a dataframe. Get the max of each group for 'Amount Received'.

In [ ]:
# 9. Use gb's agg() method to apply the list of functions [np.mean, np.std, np.median] and create a dataframe.
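
A hedged sketch of the agg() pattern on made-up data follows. It selects a single numeric column first and passes the string names 'mean', 'std', and 'median' instead of the numpy functions. That is a substitution on my part, since recent pandas versions may warn when numpy callables are passed to agg(). The 'toy' frame and its 'Amount' column are hypothetical.

```python
import pandas as pd

# Toy data, made up for illustration (not the CFPB file)
toy = pd.DataFrame({
    "Product": ["A", "A", "B", "B"],
    "Amount": [1.0, 3.0, 10.0, 20.0],
})

# Passing a list to agg() yields one output column per function
stats = toy.groupby("Product")["Amount"].agg(["mean", "std", "median"])
print(stats)
```

The result is a dataframe indexed by group label, with one column per aggregation.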

In [ ]:
# 10. Make a new groupby from 'df' using the list ['Product', 'Sub-product'] and assign it to 'gb2'.

In [ ]:
# 11. Take gb2 and get the size(). Put the result in 'gb2_size'.

In [ ]:
# 12. Get the mean of gb2. Save it using the to_excel method of the dataframe.

In [ ]:
# 13. Call gb2_size.unstack(). Chaining fillna(0) and stack() is an easy way to fill missing combinations with zeros.
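
The unstack / fillna / stack round trip can be sketched on made-up data as below. The values are hypothetical, and only the column names mirror the exercise. The combination ('B', 'y') never occurs in the toy data, so it appears only after the round trip.

```python
import pandas as pd

# Toy data, made up for illustration (not the CFPB file)
toy = pd.DataFrame({
    "Product": ["A", "A", "B"],
    "Sub-product": ["x", "y", "x"],
})

sizes = toy.groupby(["Product", "Sub-product"]).size()  # only observed pairs
wide = sizes.unstack()            # Sub-product becomes columns; ('B', 'y') is NaN
filled = wide.fillna(0).stack()   # back to a Series, now with explicit zeros

print(filled)
```

Note that fillna(0) leaves the counts as floats. Appending astype(int) is one option if integer counts are preferred.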

In [ ]:
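# 14. Create a groupby object by passing groupby() a function (it is called on each index label) or a Series derived from a column attribute, such as a .str or .dt accessor.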
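
A minimal sketch of grouping by a function, assuming a made-up frame with a string index (the index labels and the 'Amount' column are hypothetical):

```python
import pandas as pd

# Toy data, made up for illustration (not the CFPB file)
toy = pd.DataFrame(
    {"Amount": [1.0, 2.0, 3.0, 4.0]},
    index=["apple", "avocado", "banana", "blueberry"],
)

# A function passed to groupby() is called once per index label;
# its return value becomes the group key (here, the first letter).
by_initial = toy.groupby(lambda label: label[0])["Amount"].sum()

print(by_initial)
```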

In [ ]:
# 15. Get average monthly results by grouping on df[DATE_COLUMN].dt.month, where DATE_COLUMN is the name of a datetime column in 'df'.
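
Grouping on a .dt accessor might look like the sketch below. The column names 'Date received' and 'Amount' are assumptions used for illustration, and the dates are made up. The .dt accessor requires a datetime column, so the strings are converted with pd.to_datetime first.

```python
import pandas as pd

# Toy data, made up for illustration (not the CFPB file)
toy = pd.DataFrame({
    "Date received": pd.to_datetime(["2015-01-05", "2015-01-20", "2015-02-10"]),
    "Amount": [10.0, 30.0, 5.0],
})

# Group by the month number (1-12) extracted from the datetime column
monthly_mean = toy.groupby(toy["Date received"].dt.month)["Amount"].mean()

print(monthly_mean)
```

When loading from CSV, passing parse_dates to pd.read_csv() is another way to get a datetime column up front.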