This notebook walks you through some exercises for practicing data manipulation with Pandas.
As you work through it, feel free to make ample use of the Pandas documentation, the Pandas questions on Stack Overflow, and your favorite search engine. For example, if you search for a phrase like "Pandas sum all columns", you're very likely to find an answer to the question you have in mind.
If you get stuck, note that solutions are available in the Git repository.
In [1]:
# Start with our normal batch of imports and settings
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# The following is optional: set plotting styles
# (set_theme() is the current name for the older seaborn.set() alias)
import seaborn as sns
sns.set_theme()
In [ ]:
In [ ]:
This is a bit tricky: you might be tempted to use a groupby and apply over the multiple indices ['year', 'gender', 'name'], but if you try this you'll find that it's very computationally intensive.
I'd suggest doing the following:
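As a minimal sketch of one vectorized approach (not necessarily the intended solution), the code below avoids a row-by-row `apply` by pivoting the genders into columns. It assumes a DataFrame called `names` with the columns `year`, `gender`, and `name` mentioned above, plus a count column that is hypothetically named `births`; the toy rows are made up purely for illustration.

```python
import pandas as pd

# Toy data, invented for illustration; the `births` column name is an assumption.
names = pd.DataFrame({
    "year":   [1950, 1950, 1950, 2000, 2000, 2000],
    "gender": ["F", "M", "M", "F", "M", "F"],
    "name":   ["Leslie", "Leslie", "John", "Leslie", "Leslie", "Mary"],
    "births": [100, 300, 500, 450, 50, 200],
})

# One row per (name, year), one column per gender.
# Combinations that never occur are filled with 0.
counts = names.pivot_table(index=["name", "year"], columns="gender",
                           values="births", aggfunc="sum", fill_value=0)

# Fraction of births recorded as female for each (name, year).
counts["frac_female"] = counts["F"] / (counts["F"] + counts["M"])

# Reshape so that each name is a row and each year is a column.
# Names with no records in a given year show up as NaN.
frac_by_year = counts["frac_female"].unstack("year")
print(frac_by_year)
```

Because the pivot and the division operate on whole columns at once, this kind of reshaping tends to scale much better than applying a Python function to every group. Comparing a name's female fraction across years is then a matter of ordinary column arithmetic on `frac_by_year`.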
Is a name more likely to transition from female to male, or from male to female?
In [ ]: