Purchase frequency

In this notebook we create a table grouping transaction information by customers’ purchase frequency. This is done with functions from the pandas librairie such as df.groupby() and df.cut().



In [30]:

    
import pandas as pd
import contiamo

Query data into Contiamo



In [31]:

    
transactions = %contiamo query query:sql:48590597:411:g71GXzJjsx4Uvad11ouKjoYbQUNNPy-qRMKkBNZfyx4
customers = %contiamo query query:sql:48590597:441:MG5W2dMjXzYgsHsgdQYzmhv44dxEQX2Lodu5Uh2Hx_s
applications = %contiamo query query:sql:48590597:442:-gz3nbw1fdmtSXkD4zGNA-cVa7s6sQtRn8upCSn6uys

Select data from march 2017



In [32]:

    
transactions = transactions.loc[(transactions['Field transaction date date'] <= 20170331) & (transactions['Field transaction date date'] >= 20170301)]



In [27]:

    
df = pd.DataFrame({
        'customer_id' : transactions['Field customer id'],
        'revenue' : pd.to_numeric(transactions['Field app price']),
        'n_of_purchases' : [1]*len(transactions)
    })
df.head()









    Out[27]:






  
    
      
      customer_id
      n_of_purchases
      revenue
    
  
  
    
      0
      2F65fA7Cd10cbFFD
      1
      4.99
    
    
      1
      2F65fA7Cd10cbFFD
      1
      0.19
    
    
      2
      2F65fA7Cd10cbFFD
      1
      0.99
    
    
      5
      44dEeeb9eEF3ff64
      1
      14.99
    
    
      8
      7FeFbCBe6147a9ff
      1
      0.19

Create an intermediate table



In [28]:

    
total_users = df['customer_id'].nunique()

#Grouping by customer id
pf = df.groupby('customer_id').agg({
        'revenue' : 'sum',
        'n_of_purchases' : 'sum'
    }).reset_index()

#Selection of data to create 'No. of Purchases', 'No. of UAUs'and 'Total Revenue'columns 
x = pf.groupby('n_of_purchases').agg({
    'customer_id': 'size',
    'revenue' : 'sum',
})

x = x.reset_index()

#Renaming columns
x.rename(columns = {'customer_id': 'n_of_users'}, inplace=True)

#Creation of '% of Total UAUs' and 'Total transations'columns
x['percentage_of_users'] = 100 * x['n_of_users'] / total_users

x['total_transactions'] = x['n_of_purchases'] * x['n_of_users']

#Creation of 'No. of Purchases' buckets 
x['n_of_purchases'] = pd.cut(x['n_of_purchases'], bins=[0,1,2,3,5,8,13,21,34,55,float('inf')]) 


x.head()









    Out[28]:






  
    
      
      n_of_purchases
      n_of_users
      revenue
      percentage_of_users
      total_transactions
    
  
  
    
      0
      (0, 1]
      13656
      108827.53
      45.504832
      13656
    
    
      1
      (1, 2]
      4848
      67683.44
      16.154615
      9696
    
    
      2
      (2, 3]
      2697
      60800.95
      8.987004
      8091
    
    
      3
      (3, 5]
      1734
      40058.73
      5.778074
      6936
    
    
      4
      (3, 5]
      1621
      47674.62
      5.401533
      8105

Create final table



In [29]:

    
#Grouping data by 'No. of Purchases' buckets
y = x.groupby('n_of_purchases').agg({
    'revenue': 'sum',
    'n_of_users': 'sum',
    'percentage_of_users' : 'sum',
    'total_transactions': 'sum',
})

y = y.reset_index()


#Creation of 'Average revenue per user' and 'Average transaction price' colums
y['average_revenue_per_user'] = y['revenue'] / y['n_of_users']
y['average_transaction_price'] = y['revenue'] / y['total_transactions']


#Putting the colums in the right order
cols = ['n_of_purchases','n_of_users', 'percentage_of_users', 'revenue', 'average_revenue_per_user', 'total_transactions', 'average_transaction_price']
y = y[cols]

#Renaming the 'No. of Purchases' buckets
y['n_of_purchases']=y['n_of_purchases'].astype(str)
y = y.replace({'n_of_purchases': {
            '(0, 1]': '1',
            '(1, 2]' : '2',
            '(2, 3]' : '3', 
            '(3, 5]' : '4-5',
            '(5, 8]' : '6-8',
            '(8, 13]' : '9-13',
            '(13, 21]' : '14-21',
            '(21, 34]' : '22-34',
            '(34, 55]' : '35-55',
            '(55, inf]' : 'Above 55',
        }})

y









    Out[29]:






  
    
      
      n_of_purchases
      n_of_users
      percentage_of_users
      revenue
      average_revenue_per_user
      total_transactions
      average_transaction_price
    
  
  
    
      0
      1
      13656
      45.504832
      108827.53
      7.969210
      13656
      7.969210
    
    
      1
      2
      4848
      16.154615
      67683.44
      13.961106
      9696
      6.980553
    
    
      2
      3
      2697
      8.987004
      60800.95
      22.543919
      8091
      7.514640
    
    
      3
      4-5
      3355
      11.179607
      87733.35
      26.150030
      15041
      5.832947
    
    
      4
      6-8
      2160
      7.197601
      67805.61
      31.391486
      15094
      4.492223
    
    
      5
      9-13
      1501
      5.001666
      52535.05
      35.000033
      16605
      3.163809
    
    
      6
      14-21
      927
      3.088970
      43901.67
      47.358867
      16169
      2.715175
    
    
      7
      22-34
      575
      1.916028
      41079.43
      71.442487
      15946
      2.576159
    
    
      8
      35-55
      200
      0.666445
      23325.54
      116.627700
      9070
      2.571724
    
    
      9
      Above 55
      91
      0.303232
      43065.94
      473.252088
      20571
      2.093527

Interpretation: The maximum revenue is generated by people buying apps only once a month (low frequency buyers). We can now identify buying behaviours according to purchase frequency.

	customer_id	n_of_purchases	revenue
0	2F65fA7Cd10cbFFD	1	4.99
1	2F65fA7Cd10cbFFD	1	0.19
2	2F65fA7Cd10cbFFD	1	0.99
5	44dEeeb9eEF3ff64	1	14.99
8	7FeFbCBe6147a9ff	1	0.19

	n_of_purchases	n_of_users	revenue	percentage_of_users	total_transactions
0	(0, 1]	13656	108827.53	45.504832	13656
1	(1, 2]	4848	67683.44	16.154615	9696
2	(2, 3]	2697	60800.95	8.987004	8091
3	(3, 5]	1734	40058.73	5.778074	6936
4	(3, 5]	1621	47674.62	5.401533	8105