Churn prediction is the task of identifying users who are likely to stop using a service, product, or website. In this notebook, you will learn how to train a churn prediction model, use it to make predictions, and evaluate it.
In [13]:
import graphlab as gl
import datetime
gl.canvas.set_target('ipynb') # make sure plots appear inline
In [4]:
interactions_ts = gl.TimeSeries("data/user_activity_data.ts/")
users = gl.SFrame("data/users.sf/")
We define churn to be no activity within a period of time (called the `churn_period`). Hence, a user/customer is said to have churned if a period of activity is followed by no activity for a `churn_period` (for example, 30 days).
<img src="https://dato.com/learn/userguide/churn_prediction/images/churn-illustration.png" align="left">
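To make the definition concrete, here is a minimal sketch in plain Python that labels users as churned or not. It is an illustration only, not GraphLab's implementation: the function `label_churn`, the `activity` dictionary, and the toy timestamps are all hypothetical, and it assumes that only users with some activity before the boundary are candidates for churn.

```python
import datetime


def label_churn(activity, time_boundary, churn_period):
    """Map each user seen before time_boundary to True (churned) or False.

    activity: dict of user id -> list of datetime.datetime activity timestamps
    (a hypothetical stand-in for the interaction time series).
    """
    window_end = time_boundary + churn_period
    labels = {}
    for user_id, timestamps in activity.items():
        # Assumption: only users with activity before the boundary can churn.
        if not any(t < time_boundary for t in timestamps):
            continue
        # Churned means no activity at all inside the churn period.
        active_in_window = any(time_boundary <= t < window_end for t in timestamps)
        labels[user_id] = not active_in_window
    return labels


boundary = datetime.datetime(2011, 10, 1)
toy_activity = {
    "a": [datetime.datetime(2011, 9, 3), datetime.datetime(2011, 10, 12)],
    "b": [datetime.datetime(2011, 8, 20)],
    "c": [datetime.datetime(2011, 10, 5)],  # first seen after the boundary
}
toy_labels = label_churn(toy_activity, boundary, datetime.timedelta(days=30))
```

In this toy data, user `a` returns within 30 days of the boundary, user `b` does not, and user `c` is skipped because there is no earlier activity to churn from.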
In [7]:
# Time boundary: the model predicts whether users churn in the period after 1 October 2011
churn_period_oct = datetime.datetime(year = 2011, month = 10, day = 1)
In [8]:
(train, valid) = gl.churn_predictor.random_split(interactions_ts, user_id = 'CustomerID', fraction = 0.9, seed = 12)
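The split above is done per user rather than per row, so that all of a user's activity ends up in the same partition. A rough sketch of that idea in plain Python follows; `split_by_user` and the toy rows are hypothetical, and GraphLab's own sampling may well differ in detail.

```python
import random


def split_by_user(rows, user_key, fraction, seed):
    """Split rows into (train, valid) so that each user lands in exactly one part."""
    # Sort the unique ids first so the shuffle is reproducible for a given seed.
    user_ids = sorted({row[user_key] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(user_ids)
    n_train = int(round(fraction * len(user_ids)))
    train_users = set(user_ids[:n_train])
    train_rows = [r for r in rows if r[user_key] in train_users]
    valid_rows = [r for r in rows if r[user_key] not in train_users]
    return train_rows, valid_rows


# 50 toy events spread over 10 hypothetical customers
toy_rows = [{"CustomerID": i % 10, "event": i} for i in range(50)]
toy_train, toy_valid = split_by_user(toy_rows, "CustomerID", 0.9, seed=12)
```

Splitting by row instead would leak a user's later activity into training while the same user is also used for validation, which would make the evaluation look better than it is.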
In [9]:
# Parenthesized form works under both Python 2 and Python 3
print("Users in the training dataset : %s" % len(train['CustomerID'].unique()))
print("Users in the validation dataset : %s" % len(valid['CustomerID'].unique()))
In [10]:
model = gl.churn_predictor.create(train, user_id='CustomerID',
                                  user_data = users, time_boundaries = [churn_period_oct])
In [11]:
model
Out[11]:
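Here the question to ask is: will a user churn after a given point in time (the time boundary)? To validate a prediction, we can check whether the user was active during the churn period that follows that boundary. Note that this is usage churn (no activity), not customer churn in the sense of a subscription reaching its expiration time, which is easy to confuse with it.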
In [12]:
predictions = model.predict(valid, user_data=users)
predictions
Out[12]:
In [15]:
predictions['probability'].show()
In [16]:
metrics = model.evaluate(valid, user_data=users, time_boundary=churn_period_oct)
metrics
Out[16]:
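To give a feel for what an evaluation of this kind measures, the sketch below computes precision and recall from predicted churn probabilities and observed outcomes at a fixed cutoff. It is a plain Python illustration with made-up numbers; `precision_recall`, the 0.5 cutoff, and the toy lists are assumptions, and the metrics actually returned by `model.evaluate` may be defined or reported differently.

```python
def precision_recall(probabilities, churned, cutoff=0.5):
    """Precision and recall when predicting churn for probability >= cutoff."""
    predicted = [p >= cutoff for p in probabilities]
    pairs = list(zip(predicted, churned))
    tp = sum(1 for p, y in pairs if p and y)        # predicted churn, did churn
    fp = sum(1 for p, y in pairs if p and not y)    # predicted churn, stayed
    fn = sum(1 for p, y in pairs if not p and y)    # predicted stay, did churn
    precision = tp / float(tp + fp) if (tp + fp) else 0.0
    recall = tp / float(tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical predictions and observed outcomes for four users
toy_probs = [0.9, 0.7, 0.4, 0.2]
toy_truth = [True, False, True, False]
toy_precision, toy_recall = precision_recall(toy_probs, toy_truth)
```

Raising the cutoff generally trades recall for precision, so the right value depends on how costly a missed churner is compared with a wasted retention effort.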
In [17]:
model.save('data/churn_model.mdl')