Topic: Help with extremely simple reshaping of dataframe:

I'm taking advantage of this time to try and understand pandas a little better.

I'm trying to do something that seems extremely simple. I have a dataframe with this fake data about a number of people (say, participants in a study). I want to make a frequency matrix like shown below. I've tried various grouping, reindexing, and pivoting with no luck.


In [1]:
from pandas import DataFrame

df = DataFrame({'gender':['F','M','F','M','M','F','M'],
                'major':['Biology','Chemistry','Biology','English','Biology'
                         ,'Chemistry','English']})
df


Out[1]:
gender major
0 F Biology
1 M Chemistry
2 F Biology
3 M English
4 M Biology
5 F Chemistry
6 M English

7 rows × 2 columns


In [2]:
df.groupby(['gender','major']).agg('count').unstack()['gender'].fillna(0)


Out[2]:
major Biology Chemistry English
gender
F 2 1 0
M 1 1 2

2 rows × 3 columns


In [3]:
df.pivot_table(rows='gender',cols='major', aggfunc='count')


Out[3]:
gender major
major Biology Chemistry English Biology Chemistry English
gender
F 2 1 NaN 2 1 NaN
M 1 1 2 1 1 2

2 rows × 6 columns


In [4]:
df.pivot_table(rows='gender',cols='major', aggfunc='count')['gender']


Out[4]:
major Biology Chemistry English
gender
F 2 1 NaN
M 1 1 2

2 rows × 3 columns


In [5]:
df.pivot_table(rows='gender',cols='major', aggfunc='count')['gender'].fillna(0)


Out[5]:
major Biology Chemistry English
gender
F 2 1 0
M 1 1 2

2 rows × 3 columns


In [9]:
# maybe a clearer solution if you use 'values' -- 
# which can be either 'major' or 'gender' in this case

df.pivot_table(values='major', rows='gender',cols='major', aggfunc='count').fillna(0)


Out[9]:
major Biology Chemistry English
gender
F 2 1 0
M 1 1 2

2 rows × 3 columns