In [ ]:
import pandas as pd
import numpy as np

In [ ]:
df = pd.read_excel('sales-funnel.xlsx')
df.head()

In [ ]:
df["Status"] = df["Status"].astype("category")
df["Status"].cat.set_categories(["won","pending","presented","declined"],inplace=True)
df.head()

In [ ]:
df.pivot_table(index="Name")  # change index and show aggregate mean of numerical features

In [ ]:
pivot1 = df.pivot_table(index=["Name", "Rep", "Manager"])   # can have multi index
pivot1

How pivot tables work

Sets the index to the desired features, then groupby the new multi index and aggregate and sort the features.


In [ ]:
df.set_index(["Name", "Rep", "Manager"])

In [ ]:
df2 = df.set_index(["Name", "Rep", "Manager"])
gr = df2.groupby(level=[0,1,2])
pivot2 = gr.agg('mean').sort_index(axis=1)
pivot2

In [ ]:
all(pivot2 == pivot1)

Can change the default aggregate function

Default is mean by can change that to something else.


In [ ]:
pd.pivot_table(df, index=["Manager","Rep"], values=["Price"], aggfunc='sum')

Can also provide apply a list of aggregate functions (though not a list of strings, nor a mix of strings and functions)


In [ ]:
pd.pivot_table(df, index=["Manager","Rep"], values=["Price"], aggfunc=[np.mean, np.sum])

Can provide a dict that maps features to one or more aggregate functions.


In [ ]:
table = pd.pivot_table(df,index=["Manager","Status"],columns=["Product"],values=["Quantity","Price"],
               aggfunc={"Quantity":len,"Price":[np.sum,np.mean]},fill_value=0)
table

In [ ]:
pd.pivot_table(df, index=["Manager","Rep"], aggfunc={"Quantity":len,"Price":[np.sum,np.mean]})

Can decompose aggregate values accross columns by another feature


In [ ]:
pd.pivot_table(df, index=["Manager","Rep"], aggfunc={"Quantity":len,"Price":[np.sum,np.mean]}, columns="Product")

Can replace the "scary" NaN values by something else.


In [ ]:
table = pd.pivot_table(df, index=["Manager","Rep"], aggfunc={"Quantity":len,"Price":[np.sum,np.mean]}, columns="Product", fill_value=0)

In [ ]: