Prediction

If the average lifetime of a customer is 90 days since their first deposit :
build a model to predict the Lifetime Value of a customer given his first 15 days of activity. What other data could be pertinent to answer this question? Given the data provided is the assumption of 90 days valid?



In [115]:

    
import pandas as pd
import pandas_profiling
import matplotlib
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
from datetime import timedelta
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import RobustScaler
from sklearn.cluster import KMeans

pd.options.display.max_columns = None

matplotlib.style.use('ggplot')
%matplotlib inline



In [116]:

    
# get dataset
df_bc = pd.read_csv('data/dataset_betclic.csv')

# convert string date to date
df_bc['BirthDate'] = pd.to_datetime(df_bc['BirthDate'], format='%Y-%m-%dT%H:%M:%S.%fZ')
df_bc['FirstDepositDate'] = pd.to_datetime(df_bc['FirstDepositDate'], format='%Y-%m-%dT%H:%M:%S.%fZ')
df_bc['BetDate'] = pd.to_datetime(df_bc['BetDate'], format='%Y%m%d')

Variables

Lifetime

The lifetime of a customer is the difference between the date of the last bet and the first deposit date
$max(BetDate) - min(FirstDepositDate)$



In [117]:

    
df_max_BetDate = df_bc.groupby(['UserId'])['BetDate'].max().reset_index()
df_max_BetDate.columns = ['UserId', 'max_BetDate']

df_min_FirstDepositDate = df_bc.groupby(['UserId'])['FirstDepositDate'].min().reset_index()
df_min_FirstDepositDate.columns = ['UserId', 'min_FirstDepositDate']

df_ltv = pd.merge(df_bc, 
                  df_max_BetDate, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])

df_ltv = pd.merge(df_ltv, 
                  df_min_FirstDepositDate, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])

df_ltv['lifetime'] = df_ltv['max_BetDate'] - df_ltv['min_FirstDepositDate']

df_ltv.head()









    Out[117]:







  
    
      
      UserId
      Country
      Gender
      BirthDate
      PartnerType
      FirstDepositDate
      BetDate
      BetId
      Application
      ProductName
      IsLive
      BetAmount
      AmountWon
      Odds
      Cashout
      max_BetDate
      min_FirstDepositDate
      lifetime
    
  
  
    
      0
      10783501
      Portugal
      M
      1992-06-08
      PPC
      2017-01-03
      2017-01-03
      607023702
      BETCLIC.PT
      SportsBook
      False
      5.00
      0.0
      124.76
      0.0
      2017-09-10
      2017-01-03
      250 days
    
    
      1
      10782912
      Ivory Coast
      M
      1996-03-18
      Internal
      2017-01-03
      2017-01-03
      607023878
      BETCLIC.COM
      SportsBook
      False
      5.00
      23.0
      4.60
      0.0
      2017-10-21
      2017-01-03
      291 days
    
    
      2
      10780813
      France
      F
      1998-04-05
      PPC
      2017-01-01
      2017-01-03
      607024082
      BETCLIC.FR
      SportsBook
      False
      5.00
      0.0
      8.00
      0.0
      2017-09-04
      2017-01-01
      246 days
    
    
      3
      10783215
      Portugal
      M
      1976-12-17
      Internal
      2017-01-03
      2017-01-03
      607024088
      BETCLIC.PT
      SportsBook
      False
      1.00
      0.0
      41.08
      0.0
      2017-11-19
      2017-01-03
      320 days
    
    
      4
      10783401
      Poland
      M
      1995-04-16
      Affiliation
      2017-01-03
      2017-01-03
      607024113
      EXPEKT.COM
      SportsBook
      False
      5.67
      0.0
      3.75
      0.0
      2017-01-09
      2017-01-03
      6 days

We check if the average lifetime of a customer given by Betclick, 90 jours, is the same in our data :



In [118]:

    
df_ltv.drop_duplicates(subset='UserId')['lifetime'].mean()









    Out[118]:





Timedelta('98 days 01:44:26.035675')

We find an average at 98 days

Value

I consider the Value of a customer the amount he lost during his active lifetime on Betclic
$BetAmount - AmountWon$



In [119]:

    
df_bet_amount = df_bc.groupby(['UserId'])['BetAmount'].sum().reset_index()
df_bet_amount.columns = ['UserId', 'tot_bet_amount']

df_ltv = pd.merge(df_ltv, 
                  df_bet_amount, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])

df_amount_won = df_bc.groupby(['UserId'])['AmountWon'].sum().reset_index()
df_amount_won.columns = ['UserId', 'tot_amount_won']

df_ltv = pd.merge(df_ltv, 
                  df_amount_won, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])

df_ltv['value_brut'] = df_ltv['tot_bet_amount'] - df_ltv['tot_amount_won']

df_ltv.head()









    Out[119]:







  
    
      
      UserId
      Country
      Gender
      BirthDate
      PartnerType
      FirstDepositDate
      BetDate
      BetId
      Application
      ProductName
      IsLive
      BetAmount
      AmountWon
      Odds
      Cashout
      max_BetDate
      min_FirstDepositDate
      lifetime
      tot_bet_amount
      tot_amount_won
      value_brut
    
  
  
    
      0
      10783501
      Portugal
      M
      1992-06-08
      PPC
      2017-01-03
      2017-01-03
      607023702
      BETCLIC.PT
      SportsBook
      False
      5.00
      0.0
      124.76
      0.0
      2017-09-10
      2017-01-03
      250 days
      30.00
      0.00
      30.00
    
    
      1
      10782912
      Ivory Coast
      M
      1996-03-18
      Internal
      2017-01-03
      2017-01-03
      607023878
      BETCLIC.COM
      SportsBook
      False
      5.00
      23.0
      4.60
      0.0
      2017-10-21
      2017-01-03
      291 days
      1044.00
      900.16
      143.84
    
    
      2
      10780813
      France
      F
      1998-04-05
      PPC
      2017-01-01
      2017-01-03
      607024082
      BETCLIC.FR
      SportsBook
      False
      5.00
      0.0
      8.00
      0.0
      2017-09-04
      2017-01-01
      246 days
      966.71
      481.26
      485.45
    
    
      3
      10783215
      Portugal
      M
      1976-12-17
      Internal
      2017-01-03
      2017-01-03
      607024088
      BETCLIC.PT
      SportsBook
      False
      1.00
      0.0
      41.08
      0.0
      2017-11-19
      2017-01-03
      320 days
      503.30
      320.30
      183.00
    
    
      4
      10783401
      Poland
      M
      1995-04-16
      Affiliation
      2017-01-03
      2017-01-03
      607024113
      EXPEKT.COM
      SportsBook
      False
      5.67
      0.0
      3.75
      0.0
      2017-01-09
      2017-01-03
      6 days
      186.77
      169.85
      16.92

Lifetime Value

I consider the lifetime value as :
$\frac{Value}{Lifetime}$
I think this value is more interesting than :
$BetAmount - AmountWon$
as it allows us to compare two customers independantly of the time spend on the website.



In [120]:

    
# value/lifetime
df_ltv['lifetime_int'] = df_ltv['lifetime'].astype('<m8[D]')
# turn 1 the 0 lifetime
df_ltv['lifetime_int'] = np.where(df_ltv['lifetime_int'] == 0, 1, df_ltv['lifetime_int'])
df_ltv['lifetime_value'] = df_ltv['value_brut'] / df_ltv['lifetime_int']



In [121]:

    
df_ltv.head()









    Out[121]:







  
    
      
      UserId
      Country
      Gender
      BirthDate
      PartnerType
      FirstDepositDate
      BetDate
      BetId
      Application
      ProductName
      IsLive
      BetAmount
      AmountWon
      Odds
      Cashout
      max_BetDate
      min_FirstDepositDate
      lifetime
      tot_bet_amount
      tot_amount_won
      value_brut
      lifetime_int
      lifetime_value
    
  
  
    
      0
      10783501
      Portugal
      M
      1992-06-08
      PPC
      2017-01-03
      2017-01-03
      607023702
      BETCLIC.PT
      SportsBook
      False
      5.00
      0.0
      124.76
      0.0
      2017-09-10
      2017-01-03
      250 days
      30.00
      0.00
      30.00
      250.0
      0.120000
    
    
      1
      10782912
      Ivory Coast
      M
      1996-03-18
      Internal
      2017-01-03
      2017-01-03
      607023878
      BETCLIC.COM
      SportsBook
      False
      5.00
      23.0
      4.60
      0.0
      2017-10-21
      2017-01-03
      291 days
      1044.00
      900.16
      143.84
      291.0
      0.494296
    
    
      2
      10780813
      France
      F
      1998-04-05
      PPC
      2017-01-01
      2017-01-03
      607024082
      BETCLIC.FR
      SportsBook
      False
      5.00
      0.0
      8.00
      0.0
      2017-09-04
      2017-01-01
      246 days
      966.71
      481.26
      485.45
      246.0
      1.973374
    
    
      3
      10783215
      Portugal
      M
      1976-12-17
      Internal
      2017-01-03
      2017-01-03
      607024088
      BETCLIC.PT
      SportsBook
      False
      1.00
      0.0
      41.08
      0.0
      2017-11-19
      2017-01-03
      320 days
      503.30
      320.30
      183.00
      320.0
      0.571875
    
    
      4
      10783401
      Poland
      M
      1995-04-16
      Affiliation
      2017-01-03
      2017-01-03
      607024113
      EXPEKT.COM
      SportsBook
      False
      5.67
      0.0
      3.75
      0.0
      2017-01-09
      2017-01-03
      6 days
      186.77
      169.85
      16.92
      6.0
      2.820000



In [122]:

    
# age
now = pd.Timestamp(datetime.now())
df_ltv['age'] = (now - df_bc['BirthDate']).astype('<m8[Y]')

15 jours



In [123]:

    
df_ltv_15 = df_ltv[df_ltv['BetDate'] <= (df_ltv['FirstDepositDate'] + timedelta(days=15))]

Building feature

Features on bets



In [124]:

    
mean_bet_by_user = df_ltv_15.groupby('UserId')['BetAmount'].mean().reset_index()
mean_bet_by_user.columns = ['UserId', 'mean_bet']

min_bet_by_user = df_ltv_15.groupby('UserId')['BetAmount'].min().reset_index()
min_bet_by_user.columns = ['UserId', 'min_bet']

max_bet_by_user = df_ltv_15.groupby('UserId')['BetAmount'].max().reset_index()
max_bet_by_user.columns = ['UserId', 'max_bet']

median_bet_by_user = df_ltv_15.groupby('UserId')['BetAmount'].median().reset_index()
median_bet_by_user.columns = ['UserId', 'median_bet']

sum_bet_by_user = df_ltv_15.groupby('UserId')['BetAmount'].sum().reset_index()
sum_bet_by_user.columns = ['UserId', 'sum_bet']

count_bet_by_user = df_ltv_15.groupby('UserId')['BetId'].count().reset_index()
count_bet_by_user.columns = ['UserId', 'count_bet']

Feature on bet won



In [125]:

    
mean_bet_won_by_user = df_ltv_15.groupby('UserId')['AmountWon'].mean().reset_index()
mean_bet_won_by_user.columns = ['UserId', 'mean_bet_won']

min_bet_won_by_user = df_ltv_15.groupby('UserId')['AmountWon'].min().reset_index()
min_bet_won_by_user.columns = ['UserId', 'min_bet_won']

max_bet_won_by_user = df_ltv_15.groupby('UserId')['AmountWon'].max().reset_index()
max_bet_won_by_user.columns = ['UserId', 'max_bet_won']

median_bet_won_by_user = df_ltv_15.groupby('UserId')['AmountWon'].median().reset_index()
median_bet_won_by_user.columns = ['UserId', 'median_bet_won']

sum_bet_won_by_user = df_ltv_15.groupby('UserId')['AmountWon'].sum().reset_index()
sum_bet_won_by_user.columns = ['UserId', 'sum_bet_won']

count_bet_won_by_user = df_ltv_15[df_ltv_15['AmountWon']>0].groupby('UserId')['BetId'].count().reset_index()
count_bet_won_by_user.columns = ['UserId', 'count_bet_won']

Other feature



In [126]:

    
mean_ods_by_user = df_ltv_15.groupby('UserId')['Odds'].mean()



In [127]:

    
islive_by_user = df_ltv_15.groupby('UserId')['IsLive'].max()

Categorical features



In [128]:

    
application_used = pd.get_dummies(df_ltv_15[['UserId', 'Application']]).groupby('UserId').sum()

partnertype_used = pd.get_dummies(df_ltv_15[['UserId', 'PartnerType']]).groupby('UserId').sum()

gender_used = pd.get_dummies(df_ltv_15[['UserId', 'Gender']]).groupby('UserId').sum()
gender_used = gender_used / gender_used
gender_used = gender_used.fillna(0)

Country_used = pd.get_dummies(df_ltv_15[['UserId', 'Country']]).groupby('UserId').sum()
Country_used = Country_used / Country_used
Country_used = Country_used.fillna(0)

Merging all features



In [129]:

    
df_res = pd.merge(mean_bet_by_user, 
                  min_bet_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  max_bet_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  median_bet_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  sum_bet_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  count_bet_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  mean_bet_won_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  min_bet_won_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  max_bet_won_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  median_bet_won_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  sum_bet_won_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  count_bet_won_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  application_used, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  mean_ods_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  islive_by_user, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  partnertype_used, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  gender_used, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])
df_res = pd.merge(df_res, 
                  Country_used, 
                  how='left', 
                  left_on=['UserId'], 
                  right_on=['UserId'])

Adding interesting feature



In [130]:

    
df_res['ratio_win'] = df_res['count_bet_won'] / df_res['count_bet']

Final cleaning



In [131]:

    
df_res = df_res.fillna(0)



In [132]:

    
ltv_age = df_ltv[['UserId', 'age']].drop_duplicates()



In [133]:

    
df_final = pd.merge(df_res, 
                    ltv_age, 
                    how='left', 
                    left_on=['UserId'], 
                    right_on=['UserId'])



In [134]:

    
ltv_to_evaluation = df_ltv[['UserId', 'lifetime_value']].drop_duplicates()



In [135]:

    
df_final = pd.merge(df_res, 
                    ltv_to_evaluation, 
                    how='left', 
                    left_on=['UserId'], 
                    right_on=['UserId'])

Saving the dataset



In [136]:

    
df_final.to_csv('15_bc.csv', index=None, sep=';')

Clustering

N'etant pas satisfait de l'approche du probleme en regression (code dans le notebook regression.ipynb), je propose une autre facon de voir les choses.
En clusterisant les comportements, nous obtenons une vision des differents profils des clients et ainsi observer les differences de lifetime value de chaque groupe.
Le choix de 5 classes est arbitraire (ouvert à discussion et à une analyse plus poussée).

Kmeans #1



In [137]:

    
# compute kmeans
RANDOM_STATE = 50
NB_CLUSTERS = 5
X = df_final[list(set(df_final).difference({'UserId'}))]
kmeans = KMeans(n_clusters=NB_CLUSTERS,
                random_state=RANDOM_STATE, n_jobs=8)
kmeans.fit(X)
X['kmeans_cluster'] = kmeans.labels_



In [138]:

    
from matplotlib import colors as matplot_colors
import matplotlib.pyplot as plt

def background_gradient(row):
    cmap = 'YlGnBu'
    m = row.min()
    M = row.max()
    low = 0
    high = 0.2
    rng = M - m
    norm = matplot_colors.Normalize(m - (rng * low), M + (rng * high))
    normed = norm(list(row))
    c = [matplot_colors.rgb2hex(x) for x in plt.cm.get_cmap(cmap)(normed)]
    return ['background-color: %s' % color for color in c]


X.groupby('kmeans_cluster').mean().T.style.apply(background_gradient, axis=1)









    Out[138]:




            kmeans_cluster         0         1         2         3         4     

                
                        min_bet
                        20.9906
                        44.85
                        172.86
                        14.2054
                        17.2431
            
            
                        median_bet
                        26.2083
                        1118
                        898.082
                        129.817
                        68.6359
            
            
                        Country_France
                        0.525399
                        0
                        0.666667
                        0.521739
                        0.555046
            
            
                        Country_Netherlands
                        0.00178389
                        0
                        0
                        0
                        0.00458716
            
            
                        PartnerType_Ad Networks
                        0.0155878
                        0
                        0
                        0
                        0
            
            
                        PartnerType_Sponsoring
                        0.0133367
                        0
                        0
                        0
                        0
            
            
                        Application_BETCLIC.PT
                        6.49291
                        0
                        0
                        13.9565
                        17.4472
            
            
                        min_bet_won
                        10.9776
                        0
                        0
                        0
                        11.578
            
            
                        Country_Hungary
                        0.00552158
                        0
                        0
                        0
                        0.00688073
            
            
                        sum_bet_won
                        96.3705
                        178384
                        46857.9
                        12045.7
                        2421.86
            
            
                        Gender_F
                        0.140418
                        0
                        0
                        0.0652174
                        0.0733945
            
            
                        mean_bet
                        28.059
                        1197.34
                        1001.02
                        188.304
                        86.5676
            
            
                        PartnerType_Expekt
                        4.24737e-05
                        0
                        0
                        0
                        0
            
            
                        Country_United Kingdom 
                        0.0132943
                        0
                        0
                        0.0652174
                        0.0321101
            
            
                        Odds
                        68.6593
                        3.01406
                        1.58785
                        26.5616
                        30.1208
            
            
                        Country_Portugal
                        0.178432
                        0
                        0
                        0.0652174
                        0.130734
            
            
                        PartnerType_Referral
                        1.92903
                        0
                        0
                        14.9348
                        5.2156
            
            
                        lifetime_value
                        9.31106
                        2304.83
                        102.807
                        156.89
                        14.2549
            
            
                        PartnerType_betclic
                        0.00522426
                        0
                        0
                        0
                        0.25
            
            
                        Country_Norway
                        0.00522426
                        1
                        0
                        0.0217391
                        0.0321101
            
            
                        mean_bet_won
                        18.6237
                        991.02
                        950.463
                        170.221
                        97.9391
            
            
                        count_bet
                        19.7362
                        180
                        56.6667
                        161.174
                        87.3601
            
            
                        PartnerType_Emailing
                        0.009684
                        0
                        0
                        0
                        0
            
            
                        Country_Italy
                        0.0493969
                        0
                        0
                        0
                        0.0458716
            
            
                        ratio_win
                        0.235702
                        0.594444
                        0.686032
                        0.445024
                        0.436401
            
            
                        Country_Ivory Coast
                        0.0266735
                        0
                        0
                        0.0217391
                        0.0183486
            
            
                        PartnerType_Affiliation
                        4.63282
                        180
                        33.3333
                        19.5
                        26.344
            
            
                        Application_BETCLIC.COM
                        2.53772
                        0
                        33.3333
                        19.1522
                        17.617
            
            
                        Application_BETCLIC.IT
                        0.325433
                        0
                        0
                        0
                        2.47248
            
            
                        max_bet_won
                        46.2576
                        5645
                        3152.47
                        1847.01
                        520.848
            
            
                        max_bet
                        45.4971
                        3161.2
                        2925.82
                        1013.9
                        297.176
            
            
                        Application_BETCLIC.FR
                        8.90911
                        0
                        23.3333
                        86.6304
                        41.5688
            
            
                        sum_bet
                        130.297
                        215521
                        48678.7
                        12492.4
                        2589.42
            
            
                        Country_Finland 
                        0.00777268
                        0
                        0
                        0.0217391
                        0.0114679
            
            
                        Country_Sweden
                        0.0304111
                        0
                        0
                        0.0869565
                        0.0321101
            
            
                        PartnerType_Offline
                        0.00114679
                        0
                        0
                        0
                        0
            
            
                        PartnerType_SEO
                        0.000679579
                        0
                        0
                        0
                        0
            
            
                        Country_Germany
                        0.0109157
                        0
                        0
                        0
                        0.0321101
            
            
                        Country_Poland
                        0.143221
                        0
                        0.333333
                        0.195652
                        0.0917431
            
            
                        IsLive
                        0.507051
                        1
                        1
                        1
                        0.915138
            
            
                        PartnerType_Mobile
                        0.000594631
                        0
                        0
                        0
                        0
            
            
                        PartnerType_PPC
                        4.54634
                        0
                        0
                        65.1957
                        18.2248
            
            
                        Gender_M
                        0.859582
                        1
                        1
                        0.934783
                        0.926606
            
            
                        count_bet_won
                        5.35623
                        107
                        42
                        65.6739
                        33.9817
            
            
                        Country_Switzerland
                        0.00195379
                        0
                        0
                        0
                        0.00688073
            
            
                        PartnerType_Partnerships
                        0.257221
                        0
                        0
                        0
                        1.12615
            
            
                        PartnerType_Internal
                        8.3245
                        0
                        23.3333
                        61.5435
                        36.1995
            
            
                        Application_EXPEKT.COM
                        1.47103
                        180
                        0
                        41.4348
                        8.25459
            
            
                        median_bet_won
                        14.7678
                        193.94
                        751.463
                        32.1745
                        45.6529



In [139]:

    
X.groupby('kmeans_cluster').count()









    Out[139]:







  
    
      
      min_bet
      median_bet
      Country_France
      Country_Netherlands
      PartnerType_Ad Networks
      PartnerType_Sponsoring
      Application_BETCLIC.PT
      min_bet_won
      Country_Hungary
      sum_bet_won
      Gender_F
      mean_bet
      PartnerType_Expekt
      Country_United Kingdom
      Odds
      Country_Portugal
      PartnerType_Referral
      lifetime_value
      PartnerType_betclic
      Country_Norway
      mean_bet_won
      count_bet
      PartnerType_Emailing
      Country_Italy
      ratio_win
      Country_Ivory Coast
      PartnerType_Affiliation
      Application_BETCLIC.COM
      Application_BETCLIC.IT
      max_bet_won
      max_bet
      Application_BETCLIC.FR
      sum_bet
      Country_Finland
      Country_Sweden
      PartnerType_Offline
      PartnerType_SEO
      Country_Germany
      Country_Poland
      IsLive
      PartnerType_Mobile
      PartnerType_PPC
      Gender_M
      count_bet_won
      Country_Switzerland
      PartnerType_Partnerships
      PartnerType_Internal
      Application_EXPEKT.COM
      median_bet_won
    
    
      kmeans_cluster
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
    
  
  
    
      0
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
      23544
    
    
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
    
    
      2
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
      3
    
    
      3
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
      46
    
    
      4
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436
      436

Conclusion Kmeans #1

Les clusters sont beaucoup trop desequilibrés, les valeurs 'financieres' comportent des cas extremes qui polluent le bon placement des centroides.

Kmeans #2 : Robust scaler



In [146]:

    
X = df_final[list(set(df_final).difference({'UserId'}))]

transformer = RobustScaler().fit(X)
df_robust = pd.DataFrame(transformer.transform(X))
df_robust.columns = X.columns
df_robust#['mean_bet'].hist()

# compute kmeans
RANDOM_STATE = 50
NB_CLUSTERS = 5
X = df_robust[list(set(df_robust).difference({'UserId'}))]
kmeans = KMeans(n_clusters=NB_CLUSTERS,
                random_state=RANDOM_STATE, n_jobs=8)
kmeans.fit(X)
X['kmeans_cluster'] = kmeans.labels_
X.groupby('kmeans_cluster').mean().T.style.apply(background_gradient, axis=1)









    Out[146]:




            kmeans_cluster         0         1         2         3         4     

                
                        min_bet
                        0.801304
                        7.32414
                        4.4175
                        -0.05431
                        2.36657
            
            
                        median_bet
                        0.484519
                        4.81751
                        2.6296
                        -0.12915
                        1.19718
            
            
                        Country_France
                        -0.479141
                        -0.28169
                        -0.323034
                        -0.511111
                        -1
            
            
                        Country_Netherlands
                        0.00185697
                        0
                        0.00140449
                        0
                        0
            
            
                        PartnerType_Ad Networks
                        0.015849
                        0
                        0
                        0
                        0
            
            
                        PartnerType_Sponsoring
                        0.0135602
                        0
                        0
                        0
                        0
            
            
                        Application_BETCLIC.PT
                        6.92274
                        0.0140845
                        0.00561798
                        9
                        0
            
            
                        min_bet_won
                        1.18783
                        765.946
                        248.857
                        0
                        4429.5
            
            
                        Country_Hungary
                        0.00574365
                        0
                        0
                        0
                        0
            
            
                        sum_bet_won
                        1.23799
                        12.0724
                        4.01914
                        -0.00827777
                        37.9947
            
            
                        Application_EXPEKT.COM
                        1.7288
                        0.225352
                        0.289326
                        0.711111
                        1
            
            
                        Gender_F
                        0.1353
                        0.323944
                        0.251404
                        0.0666667
                        0
            
            
                        PartnerType_Expekt
                        4.31854e-05
                        0
                        0
                        0
                        0
            
            
                        Odds
                        4.11948
                        0.251596
                        -0.14996
                        763.074
                        9.06381
            
            
                        Country_Portugal
                        0.182631
                        0.0140845
                        0.00280899
                        0.322222
                        0
            
            
                        PartnerType_Referral
                        2.07717
                        0.140845
                        0.11236
                        2.1
                        0
            
            
                        lifetime_value
                        1.19338
                        -28.4483
                        -9.88006
                        0.252968
                        -407.97
            
            
                        PartnerType_betclic
                        0.00993263
                        0
                        0.00280899
                        0
                        0
            
            
                        Country_Norway
                        0.00583002
                        0.0140845
                        0.00421348
                        0
                        0
            
            
                        mean_bet_won
                        1.3362
                        120.902
                        40.8487
                        -0.0537441
                        654.835
            
            
                        count_bet
                        0.74395
                        -0.278169
                        -0.27816
                        1.18667
                        -0.3
            
            
                        PartnerType_Emailing
                        0.00984626
                        0
                        0
                        0
                        0
            
            
                        Country_Italy
                        0.0495336
                        0
                        0.0505618
                        0
                        0
            
            
                        ratio_win
                        0.12553
                        2.10923
                        2.0959
                        -0.344238
                        2.14286
            
            
                        Country_Ivory Coast
                        0.0269477
                        0
                        0.00421348
                        0.111111
                        0
            
            
                        PartnerType_Affiliation
                        5.23782
                        0.0704225
                        0.453652
                        1.36667
                        0
            
            
                        median_bet_won
                        4.82717
                        818.197
                        277.005
                        0
                        4429.5
            
            
                        Application_BETCLIC.COM
                        2.8989
                        0.084507
                        0.310393
                        11.7333
                        0
            
            
                        Application_BETCLIC.IT
                        0.374978
                        0
                        0.0800562
                        0
                        0
            
            
                        max_bet_won
                        0.94606
                        20.7541
                        6.99899
                        0.0390482
                        102.844
            
            
                        max_bet
                        0.297926
                        2.48993
                        1.00759
                        -0.0176593
                        0.297802
            
            
                        Application_BETCLIC.FR
                        1.27908
                        0.0160966
                        -0.0355136
                        1.18413
                        -0.142857
            
            
                        sum_bet
                        1.04566
                        5.94721
                        1.56802
                        -0.142439
                        -0.280992
            
            
                        Country_Finland 
                        0.0079461
                        0
                        0.00561798
                        0
                        1
            
            
                        Country_Sweden
                        0.0290206
                        0.211268
                        0.0646067
                        0.0111111
                        0
            
            
                        PartnerType_Offline
                        0.001166
                        0
                        0
                        0
                        0
            
            
                        PartnerType_SEO
                        0.000690966
                        0
                        0
                        0
                        0
            
            
                        Country_Germany
                        0.0110123
                        0.028169
                        0.0168539
                        0.0222222
                        0
            
            
                        Country_Poland
                        0.142382
                        0
                        0.171348
                        0.0333333
                        0
            
            
                        IsLive
                        -0.468777
                        -0.943662
                        -0.942416
                        -0.544444
                        -1
            
            
                        PartnerType_Mobile
                        0.000604595
                        0
                        0
                        0
                        0
            
            
                        PartnerType_PPC
                        5.06197
                        0.295775
                        0.217697
                        6.58889
                        0
            
            
                        Gender_M
                        -0.1353
                        -0.323944
                        -0.251404
                        -0.0666667
                        0
            
            
                        count_bet_won
                        1.0349
                        0.0507042
                        0.0651685
                        0.168889
                        0
            
            
                        Country_Switzerland
                        0.00211608
                        0
                        0
                        0
                        0
            
            
                        PartnerType_Partnerships
                        0.278675
                        0
                        0.00561798
                        1
                        0
            
            
                        PartnerType_Internal
                        2.29292
                        0.232394
                        0.161166
                        4.91944
                        0.25
            
            
                        mean_bet
                        0.421251
                        4.33345
                        2.27924
                        -0.130744
                        0.991596
            
            
                        Country_United Kingdom 
                        0.0141216
                        0.0140845
                        0.00140449
                        0.0111111
                        0



In [149]:

    
X.groupby('kmeans_cluster').count()









    Out[149]:







  
    
      
      min_bet
      median_bet
      Country_France
      Country_Netherlands
      PartnerType_Ad Networks
      PartnerType_Sponsoring
      Application_BETCLIC.PT
      min_bet_won
      Country_Hungary
      sum_bet_won
      Application_EXPEKT.COM
      Gender_F
      PartnerType_Expekt
      Odds
      Country_Portugal
      PartnerType_Referral
      lifetime_value
      PartnerType_betclic
      Country_Norway
      mean_bet_won
      count_bet
      PartnerType_Emailing
      Country_Italy
      ratio_win
      Country_Ivory Coast
      PartnerType_Affiliation
      median_bet_won
      Application_BETCLIC.COM
      Application_BETCLIC.IT
      max_bet_won
      max_bet
      Application_BETCLIC.FR
      sum_bet
      Country_Finland
      Country_Sweden
      PartnerType_Offline
      PartnerType_SEO
      Country_Germany
      Country_Poland
      IsLive
      PartnerType_Mobile
      PartnerType_PPC
      Gender_M
      count_bet_won
      Country_Switzerland
      PartnerType_Partnerships
      PartnerType_Internal
      mean_bet
      Country_United Kingdom
    
    
      kmeans_cluster
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
      
    
  
  
    
      0
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
      23156
    
    
      1
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
      71
    
    
      2
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
      712
    
    
      3
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
      90
    
    
      4
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1

Conlusion Kmeans #2

Amelioration du kmeans #1 mais toujours aussi desequilibré, essayons avec un scaler plus classique.

Kmeans #3



In [142]:

    
X = df_final[list(set(df_final).difference({'UserId'}))]

transformer = MinMaxScaler().fit(X)
df_robust = pd.DataFrame(transformer.transform(X))
df_robust.columns = X.columns
df_robust#['mean_bet'].hist()

# compute kmeans
RANDOM_STATE = 50
NB_CLUSTERS = 5
X = df_robust[list(set(df_robust).difference({'UserId'}))]
kmeans = KMeans(n_clusters=NB_CLUSTERS,
                random_state=RANDOM_STATE, n_jobs=8)
kmeans.fit(X)
df_final['kmeans_cluster'] = kmeans.labels_
df_final.groupby('kmeans_cluster').mean().T.style.apply(background_gradient, axis=1)









    



/home/tom/source/smart_click/betclic/lib/python3.5/site-packages/sklearn/preprocessing/data.py:323: DataConversionWarning: Data with input dtype bool, int64, float64 were all converted to float64 by MinMaxScaler.
  return self.partial_fit(X, y)






    Out[142]:




            kmeans_cluster         0         1         2         3         4     

                
                        UserId
                        1.06828e+07
                        1.07012e+07
                        1.07255e+07
                        1.0637e+07
                        1.07549e+07
            
            
                        mean_bet
                        47.9892
                        12.3854
                        26.7269
                        16.6599
                        42.6891
            
            
                        min_bet
                        39.1547
                        5.23521
                        20.646
                        4.98034
                        34.5676
            
            
                        max_bet
                        64.4959
                        31.2158
                        40.4294
                        59.5305
                        60.2931
            
            
                        median_bet
                        46.0023
                        10.6933
                        25.1792
                        12.9131
                        40.697
            
            
                        sum_bet
                        111.263
                        264.265
                        167.534
                        355.074
                        154.506
            
            
                        count_bet
                        6.62808
                        37.3089
                        13.2866
                        32.2382
                        14.7797
            
            
                        mean_bet_won
                        32.8917
                        8.09162
                        20.5327
                        10.6705
                        30.794
            
            
                        min_bet_won
                        22.7138
                        0.63661
                        10.6021
                        0.440544
                        20.6537
            
            
                        max_bet_won
                        57.3083
                        46.0442
                        46.7482
                        78.3702
                        62.097
            
            
                        median_bet_won
                        28.8674
                        3.11416
                        16.7909
                        3.16049
                        26.3678
            
            
                        sum_bet_won
                        81.6081
                        216.839
                        154.026
                        293.862
                        124.238
            
            
                        count_bet_won
                        1.3941
                        10.0048
                        4.51971
                        9.90222
                        4.05477
            
            
                        Application_BETCLIC.COM
                        1.04737
                        6.11771
                        7.30956
                        0
                        2.05447
            
            
                        Application_BETCLIC.FR
                        4.93233
                        0
                        0
                        32.2382
                        6.66627
            
            
                        Application_BETCLIC.IT
                        0.24766
                        1.13108
                        0
                        0
                        0.277761
            
            
                        Application_BETCLIC.PT
                        0
                        27.1704
                        0
                        0
                        4.42323
            
            
                        Application_EXPEKT.COM
                        0.40072
                        2.88971
                        5.97703
                        0
                        1.35798
            
            
                        Odds
                        73.4713
                        100.027
                        31.0029
                        62.6292
                        45.098
            
            
                        IsLive
                        0
                        0.843112
                        0.367501
                        1
                        0.398982
            
            
                        PartnerType_Ad Networks
                        0.0100792
                        0
                        0.000342818
                        0.0544018
                        0
            
            
                        PartnerType_Affiliation
                        1.69316
                        6.74545
                        7.02537
                        7.66348
                        3.42951
            
            
                        PartnerType_Emailing
                        0.000719942
                        0.006684
                        0.0589647
                        0.00275685
                        0
            
            
                        PartnerType_Expekt
                        0.000143988
                        0
                        0
                        0
                        0
            
            
                        PartnerType_Internal
                        2.84392
                        17.0472
                        5.37161
                        12.3404
                        6.07483
            
            
                        PartnerType_Mobile
                        0
                        0
                        0
                        0.00257306
                        0
            
            
                        PartnerType_Offline
                        0.00115191
                        0
                        0
                        0.00349201
                        0
            
            
                        PartnerType_PPC
                        1.27487
                        9.35815
                        0
                        8.83422
                        3.19066
            
            
                        PartnerType_Partnerships
                        0.0354212
                        0.598775
                        0.0294823
                        0.436133
                        0.184675
            
            
                        PartnerType_Referral
                        0.764003
                        3.47419
                        0.776826
                        2.90075
                        1.88896
            
            
                        PartnerType_SEO
                        0.00115191
                        0.00148533
                        0
                        0
                        0
            
            
                        PartnerType_Sponsoring
                        0
                        0.0527293
                        0
                        0
                        0.00897935
            
            
                        PartnerType_betclic
                        0.00345572
                        0.0243223
                        0.0239973
                        0
                        0.00209518
            
            
                        Gender_F
                        0
                        0
                        0
                        0
                        1
            
            
                        Gender_M
                        1
                        1
                        1
                        1
                        0
            
            
                        Country_Finland 
                        0.0151188
                        0.0135537
                        0
                        0
                        0.00329243
            
            
                        Country_France
                        0.758531
                        0
                        0
                        1
                        0.577372
            
            
                        Country_Germany
                        0.0165587
                        0.020609
                        0
                        0
                        0.013469
            
            
                        Country_Hungary
                        0.00446364
                        0.0142963
                        0
                        0
                        0.00748279
            
            
                        Country_Italy
                        0.0767459
                        0.0748236
                        0
                        0
                        0.07393
            
            
                        Country_Ivory Coast
                        0.0358531
                        0.0616413
                        0
                        0
                        0.0167614
            
            
                        Country_Netherlands
                        0.0024478
                        0.00352766
                        0
                        0
                        0.00239449
            
            
                        Country_Norway
                        0.00921526
                        0.012811
                        0
                        0
                        0.00179587
            
            
                        Country_Poland
                        0
                        0
                        1
                        0
                        0.151152
            
            
                        Country_Portugal
                        0
                        0.719829
                        0
                        0
                        0.114936
            
            
                        Country_Sweden
                        0.0578834
                        0.0447456
                        0
                        0
                        0.0272374
            
            
                        Country_Switzerland
                        0.0024478
                        0.00538433
                        0
                        0
                        0.000897935
            
            
                        Country_United Kingdom 
                        0.0207343
                        0.0287783
                        0
                        0
                        0.00927866
            
            
                        ratio_win
                        0.208766
                        0.23524
                        0.243643
                        0.276756
                        0.248235
            
            
                        lifetime_value
                        10.0264
                        9.04076
                        9.18415
                        8.72277
                        12.7769



In [152]:

    
df_final.groupby('kmeans_cluster').count()['UserId']









    Out[152]:





kmeans_cluster
0    6945
1    5386
2    2917
3    5441
4    3341
Name: UserId, dtype: int64

Conclusion kmeans #3

Les clusters sont bien equilibrés et il est possible de raconter une histoire sur le cluster ayant une plus forte lifetime value:
Les clients generant un $\text{lifetime value}$ haut parient des montants elevés en moyenne, jouent assez peu souvant, sont des femmes, francaise/italienne, sur des Odds assez bas, rarement en live et ont un assez bon ratio de gain.

Un tel modèle permet de predire, suivant les actions effectuées par le client en 15 jours, potentiel $\text{lifetime value}$ d'un client.

Un tel model est sujet a etre amelioré mais cette piste peut servir de base solide pour la suite.
En temps normal j'utilise la librairie "pickle" pour sauvegarder mes modeles mais pour ne pas generer de fichier suplementaire sur vos machines, j'ai retiré toute sauvegarde.
Tous les modeles ont une seed fixe pour etre reproductible.



In [ ]:



In [ ]:

	UserId	Country	Gender	BirthDate	PartnerType	FirstDepositDate	BetDate	BetId	Application	ProductName	IsLive	BetAmount	AmountWon	Odds	max_BetDate	min_FirstDepositDate	lifetime
0	10783501	Portugal	M	1992-06-08	PPC	2017-01-03	2017-01-03	607023702	BETCLIC.PT	SportsBook	False	5.00	0.0	124.76	2017-09-10	2017-01-03	250 days
1	10782912	Ivory Coast	M	1996-03-18	Internal	2017-01-03	2017-01-03	607023878	BETCLIC.COM	SportsBook	False	5.00	23.0	4.60	2017-10-21	2017-01-03	291 days
2	10780813	France	F	1998-04-05	PPC	2017-01-01	2017-01-03	607024082	BETCLIC.FR	SportsBook	False	5.00	0.0	8.00	2017-09-04	2017-01-01	246 days
3	10783215	Portugal	M	1976-12-17	Internal	2017-01-03	2017-01-03	607024088	BETCLIC.PT	SportsBook	False	1.00	0.0	41.08	2017-11-19	2017-01-03	320 days
4	10783401	Poland	M	1995-04-16	Affiliation	2017-01-03	2017-01-03	607024113	EXPEKT.COM	SportsBook	False	5.67	0.0	3.75	2017-01-09	2017-01-03	6 days

kmeans_cluster	0	1	2	3	4
min_bet	20.9906	44.85	172.86	14.2054	17.2431
median_bet	26.2083	1118	898.082	129.817	68.6359
Country_France	0.525399	0	0.666667	0.521739	0.555046
Country_Netherlands	0.00178389	0	0	0	0.00458716
PartnerType_Ad Networks	0.0155878	0	0	0	0
PartnerType_Sponsoring	0.0133367	0	0	0	0
Application_BETCLIC.PT	6.49291	0	0	13.9565	17.4472
min_bet_won	10.9776	0	0	0	11.578
Country_Hungary	0.00552158	0	0	0	0.00688073
sum_bet_won	96.3705	178384	46857.9	12045.7	2421.86
Gender_F	0.140418	0	0	0.0652174	0.0733945
mean_bet	28.059	1197.34	1001.02	188.304	86.5676
PartnerType_Expekt	4.24737e-05	0	0	0	0
Country_United Kingdom	0.0132943	0	0	0.0652174	0.0321101
Odds	68.6593	3.01406	1.58785	26.5616	30.1208
Country_Portugal	0.178432	0	0	0.0652174	0.130734
PartnerType_Referral	1.92903	0	0	14.9348	5.2156
lifetime_value	9.31106	2304.83	102.807	156.89	14.2549
PartnerType_betclic	0.00522426	0	0	0	0.25
Country_Norway	0.00522426	1	0	0.0217391	0.0321101
mean_bet_won	18.6237	991.02	950.463	170.221	97.9391
count_bet	19.7362	180	56.6667	161.174	87.3601
PartnerType_Emailing	0.009684	0	0	0	0
Country_Italy	0.0493969	0	0	0	0.0458716
ratio_win	0.235702	0.594444	0.686032	0.445024	0.436401
Country_Ivory Coast	0.0266735	0	0	0.0217391	0.0183486
PartnerType_Affiliation	4.63282	180	33.3333	19.5	26.344
Application_BETCLIC.COM	2.53772	0	33.3333	19.1522	17.617
Application_BETCLIC.IT	0.325433	0	0	0	2.47248
max_bet_won	46.2576	5645	3152.47	1847.01	520.848
max_bet	45.4971	3161.2	2925.82	1013.9	297.176
Application_BETCLIC.FR	8.90911	0	23.3333	86.6304	41.5688
sum_bet	130.297	215521	48678.7	12492.4	2589.42
Country_Finland	0.00777268	0	0	0.0217391	0.0114679
Country_Sweden	0.0304111	0	0	0.0869565	0.0321101
PartnerType_Offline	0.00114679	0	0	0	0
PartnerType_SEO	0.000679579	0	0	0	0
Country_Germany	0.0109157	0	0	0	0.0321101
Country_Poland	0.143221	0	0.333333	0.195652	0.0917431
IsLive	0.507051	1	1	1	0.915138
PartnerType_Mobile	0.000594631	0	0	0	0
PartnerType_PPC	4.54634	0	0	65.1957	18.2248
Gender_M	0.859582	1	1	0.934783	0.926606
count_bet_won	5.35623	107	42	65.6739	33.9817
Country_Switzerland	0.00195379	0	0	0	0.00688073
PartnerType_Partnerships	0.257221	0	0	0	1.12615
PartnerType_Internal	8.3245	0	23.3333	61.5435	36.1995
Application_EXPEKT.COM	1.47103	180	0	41.4348	8.25459
median_bet_won	14.7678	193.94	751.463	32.1745	45.6529

	min_bet	median_bet	Country_France	Country_Netherlands	PartnerType_Ad Networks	PartnerType_Sponsoring	Application_BETCLIC.PT	min_bet_won	Country_Hungary	sum_bet_won	Gender_F	mean_bet	PartnerType_Expekt	Country_United Kingdom	Odds	Country_Portugal	PartnerType_Referral	lifetime_value	PartnerType_betclic	Country_Norway	mean_bet_won	count_bet	PartnerType_Emailing	Country_Italy	ratio_win	Country_Ivory Coast	PartnerType_Affiliation	Application_BETCLIC.COM	Application_BETCLIC.IT	max_bet_won	max_bet	Application_BETCLIC.FR	sum_bet	Country_Finland	Country_Sweden	PartnerType_Offline	PartnerType_SEO	Country_Germany	Country_Poland	IsLive	PartnerType_Mobile	PartnerType_PPC	Gender_M	count_bet_won	Country_Switzerland	PartnerType_Partnerships	PartnerType_Internal	Application_EXPEKT.COM	median_bet_won
kmeans_cluster
0	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544	23544
1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1
2	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3	3
3	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46	46
4	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436	436

kmeans_cluster	0	1	2	3	4
min_bet	0.801304	7.32414	4.4175	-0.05431	2.36657
median_bet	0.484519	4.81751	2.6296	-0.12915	1.19718
Country_France	-0.479141	-0.28169	-0.323034	-0.511111	-1
Country_Netherlands	0.00185697	0	0.00140449	0	0
PartnerType_Ad Networks	0.015849	0	0	0	0
PartnerType_Sponsoring	0.0135602	0	0	0	0
Application_BETCLIC.PT	6.92274	0.0140845	0.00561798	9	0
min_bet_won	1.18783	765.946	248.857	0	4429.5
Country_Hungary	0.00574365	0	0	0	0
sum_bet_won	1.23799	12.0724	4.01914	-0.00827777	37.9947
Application_EXPEKT.COM	1.7288	0.225352	0.289326	0.711111	1
Gender_F	0.1353	0.323944	0.251404	0.0666667	0
PartnerType_Expekt	4.31854e-05	0	0	0	0
Odds	4.11948	0.251596	-0.14996	763.074	9.06381
Country_Portugal	0.182631	0.0140845	0.00280899	0.322222	0
PartnerType_Referral	2.07717	0.140845	0.11236	2.1	0
lifetime_value	1.19338	-28.4483	-9.88006	0.252968	-407.97
PartnerType_betclic	0.00993263	0	0.00280899	0	0
Country_Norway	0.00583002	0.0140845	0.00421348	0	0
mean_bet_won	1.3362	120.902	40.8487	-0.0537441	654.835
count_bet	0.74395	-0.278169	-0.27816	1.18667	-0.3
PartnerType_Emailing	0.00984626	0	0	0	0
Country_Italy	0.0495336	0	0.0505618	0	0
ratio_win	0.12553	2.10923	2.0959	-0.344238	2.14286
Country_Ivory Coast	0.0269477	0	0.00421348	0.111111	0
PartnerType_Affiliation	5.23782	0.0704225	0.453652	1.36667	0
median_bet_won	4.82717	818.197	277.005	0	4429.5
Application_BETCLIC.COM	2.8989	0.084507	0.310393	11.7333	0
Application_BETCLIC.IT	0.374978	0	0.0800562	0	0
max_bet_won	0.94606	20.7541	6.99899	0.0390482	102.844
max_bet	0.297926	2.48993	1.00759	-0.0176593	0.297802
Application_BETCLIC.FR	1.27908	0.0160966	-0.0355136	1.18413	-0.142857
sum_bet	1.04566	5.94721	1.56802	-0.142439	-0.280992
Country_Finland	0.0079461	0	0.00561798	0	1
Country_Sweden	0.0290206	0.211268	0.0646067	0.0111111	0
PartnerType_Offline	0.001166	0	0	0	0
PartnerType_SEO	0.000690966	0	0	0	0
Country_Germany	0.0110123	0.028169	0.0168539	0.0222222	0
Country_Poland	0.142382	0	0.171348	0.0333333	0
IsLive	-0.468777	-0.943662	-0.942416	-0.544444	-1
PartnerType_Mobile	0.000604595	0	0	0	0
PartnerType_PPC	5.06197	0.295775	0.217697	6.58889	0
Gender_M	-0.1353	-0.323944	-0.251404	-0.0666667	0
count_bet_won	1.0349	0.0507042	0.0651685	0.168889	0
Country_Switzerland	0.00211608	0	0	0	0
PartnerType_Partnerships	0.278675	0	0.00561798	1	0
PartnerType_Internal	2.29292	0.232394	0.161166	4.91944	0.25
mean_bet	0.421251	4.33345	2.27924	-0.130744	0.991596
Country_United Kingdom	0.0141216	0.0140845	0.00140449	0.0111111	0

	min_bet	median_bet	Country_France	Country_Netherlands	PartnerType_Ad Networks	PartnerType_Sponsoring	Application_BETCLIC.PT	min_bet_won	Country_Hungary	sum_bet_won	Application_EXPEKT.COM	Gender_F	PartnerType_Expekt	Odds	Country_Portugal	PartnerType_Referral	lifetime_value	PartnerType_betclic	Country_Norway	mean_bet_won	count_bet	PartnerType_Emailing	Country_Italy	ratio_win	Country_Ivory Coast	PartnerType_Affiliation	median_bet_won	Application_BETCLIC.COM	Application_BETCLIC.IT	max_bet_won	max_bet	Application_BETCLIC.FR	sum_bet	Country_Finland	Country_Sweden	PartnerType_Offline	PartnerType_SEO	Country_Germany	Country_Poland	IsLive	PartnerType_Mobile	PartnerType_PPC	Gender_M	count_bet_won	Country_Switzerland	PartnerType_Partnerships	PartnerType_Internal	mean_bet	Country_United Kingdom
kmeans_cluster
0	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156	23156
1	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71	71
2	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712	712
3	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90	90
4	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1

kmeans_cluster	0	1	2	3	4
UserId	1.06828e+07	1.07012e+07	1.07255e+07	1.0637e+07	1.07549e+07
mean_bet	47.9892	12.3854	26.7269	16.6599	42.6891
min_bet	39.1547	5.23521	20.646	4.98034	34.5676
max_bet	64.4959	31.2158	40.4294	59.5305	60.2931
median_bet	46.0023	10.6933	25.1792	12.9131	40.697
sum_bet	111.263	264.265	167.534	355.074	154.506
count_bet	6.62808	37.3089	13.2866	32.2382	14.7797
mean_bet_won	32.8917	8.09162	20.5327	10.6705	30.794
min_bet_won	22.7138	0.63661	10.6021	0.440544	20.6537
max_bet_won	57.3083	46.0442	46.7482	78.3702	62.097
median_bet_won	28.8674	3.11416	16.7909	3.16049	26.3678
sum_bet_won	81.6081	216.839	154.026	293.862	124.238
count_bet_won	1.3941	10.0048	4.51971	9.90222	4.05477
Application_BETCLIC.COM	1.04737	6.11771	7.30956	0	2.05447
Application_BETCLIC.FR	4.93233	0	0	32.2382	6.66627
Application_BETCLIC.IT	0.24766	1.13108	0	0	0.277761
Application_BETCLIC.PT	0	27.1704	0	0	4.42323
Application_EXPEKT.COM	0.40072	2.88971	5.97703	0	1.35798
Odds	73.4713	100.027	31.0029	62.6292	45.098
IsLive	0	0.843112	0.367501	1	0.398982
PartnerType_Ad Networks	0.0100792	0	0.000342818	0.0544018	0
PartnerType_Affiliation	1.69316	6.74545	7.02537	7.66348	3.42951
PartnerType_Emailing	0.000719942	0.006684	0.0589647	0.00275685	0
PartnerType_Expekt	0.000143988	0	0	0	0
PartnerType_Internal	2.84392	17.0472	5.37161	12.3404	6.07483
PartnerType_Mobile	0	0	0	0.00257306	0
PartnerType_Offline	0.00115191	0	0	0.00349201	0
PartnerType_PPC	1.27487	9.35815	0	8.83422	3.19066
PartnerType_Partnerships	0.0354212	0.598775	0.0294823	0.436133	0.184675
PartnerType_Referral	0.764003	3.47419	0.776826	2.90075	1.88896
PartnerType_SEO	0.00115191	0.00148533	0	0	0
PartnerType_Sponsoring	0	0.0527293	0	0	0.00897935
PartnerType_betclic	0.00345572	0.0243223	0.0239973	0	0.00209518
Gender_F	0	0	0	0	1
Gender_M	1	1	1	1	0
Country_Finland	0.0151188	0.0135537	0	0	0.00329243
Country_France	0.758531	0	0	1	0.577372
Country_Germany	0.0165587	0.020609	0	0	0.013469
Country_Hungary	0.00446364	0.0142963	0	0	0.00748279
Country_Italy	0.0767459	0.0748236	0	0	0.07393
Country_Ivory Coast	0.0358531	0.0616413	0	0	0.0167614
Country_Netherlands	0.0024478	0.00352766	0	0	0.00239449
Country_Norway	0.00921526	0.012811	0	0	0.00179587
Country_Poland	0	0	1	0	0.151152
Country_Portugal	0	0.719829	0	0	0.114936
Country_Sweden	0.0578834	0.0447456	0	0	0.0272374
Country_Switzerland	0.0024478	0.00538433	0	0	0.000897935
Country_United Kingdom	0.0207343	0.0287783	0	0	0.00927866
ratio_win	0.208766	0.23524	0.243643	0.276756	0.248235
lifetime_value	10.0264	9.04076	9.18415	8.72277	12.7769