Practical Code for Calculating Customer Lifetime Value (CLV)

**Customer Lifetime Value (CLV)** is an estimate of the total net profit attributable to a single customer over the course of the relationship. It's an important metric because it tells a business how much is too much to spend on advertising to acquire that customer.

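```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

# Show up to 50 columns when displaying a DataFrame
pd.set_option('display.max_columns', 50)
mpl.rcParams['lines.linewidth'] = 2

# IPython/Jupyter magic: render plots inline in the notebook
%matplotlib inline
```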

Data Exploration

For this example we’ll calculate CLV from a dataset of roughly 4,200 transactions.

```python
# Load the transactions; adjust the path to wherever the CSV lives
data = pd.read_csv('/Users/crucker/Desktop/clv_transactions.csv')
data.head(6)
```

```python
data.tail(6)
```
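By default `read_csv` loads `TransactionDate` as plain text, which makes later grouping by year awkward. The following is a minimal sketch of parsing the column at load time, assuming the file has the three columns used in this notebook (`CustomerID`, `TransactionDate`, `Amount`). The inline rows are made up for illustration and stand in for the real CSV.

```python
import io
import pandas as pd

# Hypothetical sample rows standing in for clv_transactions.csv
sample_csv = io.StringIO(
    "CustomerID,TransactionDate,Amount\n"
    "1,2012-09-04,20.26\n"
    "2,2012-05-15,10.87\n"
    "1,2014-05-23,2.21\n"
)

# parse_dates converts the column to datetime64 instead of leaving strings
sample = pd.read_csv(sample_csv, parse_dates=['TransactionDate'])

# Basic sanity check on the expected columns before any analysis
expected_columns = {'CustomerID', 'TransactionDate', 'Amount'}
missing = expected_columns - set(sample.columns)
if missing:
    raise ValueError(f"missing columns: {sorted(missing)}")

print(sample.dtypes)
print(sample['TransactionDate'].dt.year.tolist())
```

With a real file, the same `parse_dates` argument can be passed alongside the file path.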

```python
# Total number of transactions (rows)
Transactions = data['CustomerID'].count()

# Number of distinct customers; nunique() does not rely on IDs running 1..N
Customers = data['CustomerID'].nunique()

# First and last transaction dates (ISO-formatted strings sort chronologically)
MinTransactionDate = data['TransactionDate'].min()
MaxTransactionDate = data['TransactionDate'].max()

# Total amount across all transactions
Amount = data['Amount'].sum()

summary = [Transactions, Customers, MinTransactionDate, MaxTransactionDate, round(Amount, 2)]
summary
```

As with any analysis, the first thing we’ll do is look at some basic summary statistics.

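```python
# Summary values laid out as a one-row table.
# A separate name is used so the `data` DataFrame is not overwritten.
summary_values = {'Transactions': [4181],
                  'Customers': [1000],
                  'MinTransactionDate': ['2010-01-04'],
                  'MaxTransactionDate': ['2015-12-31'],
                  'Amount': [33729.91]}
df = pd.DataFrame(summary_values, index=[''])
df
```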
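The values in the summary table are typed in by hand. As a hedged alternative, the same one-row table can be built directly from the transactions. This sketch uses a small made-up frame (`toy`, hypothetical values) with the notebook's column names, so the numbers differ from the real dataset.

```python
import pandas as pd

# Toy transactions (hypothetical values), same columns as the notebook's data
toy = pd.DataFrame({
    'CustomerID': [1, 1, 2, 3, 3, 3],
    'TransactionDate': ['2010-01-04', '2011-03-10', '2010-06-01',
                        '2012-02-14', '2013-07-07', '2015-12-31'],
    'Amount': [5.00, 7.50, 12.25, 3.10, 8.40, 9.99],
})

# One-row summary computed from the data rather than hard-coded
summary_row = pd.DataFrame({
    'Transactions': [len(toy)],
    'Customers': [toy['CustomerID'].nunique()],
    'MinTransactionDate': [toy['TransactionDate'].min()],
    'MaxTransactionDate': [toy['TransactionDate'].max()],
    'Amount': [round(toy['Amount'].sum(), 2)],
}, index=[''])
print(summary_row)
```

Computing the table this way keeps it in sync if the underlying file changes.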

```python
TransactionsPerCustomer = round(Transactions / Customers, 2)
TransactionsPerCustomer
```

```python
AmountPerTransaction = round(Amount / Transactions, 2)
AmountPerTransaction
```

```python
AmountPerCustomer = round(Amount / Customers, 2)
AmountPerCustomer
```

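```python
# Per-customer and per-transaction averages as a one-row table.
# A separate name is used so the `data` DataFrame is not overwritten.
metric_values = {'TransactionsPerCustomer': [4.18],  # 4181 / 1000
                 'AmountPerTransaction': [8.07],     # 33729.91 / 4181
                 'AmountPerCustomer': [33.73]}       # 33729.91 / 1000
df = pd.DataFrame(metric_values, index=[''])
df
```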

```python
more_summary = [TransactionsPerCustomer, AmountPerTransaction, AmountPerCustomer]
more_summary
```
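The averages above collapse every customer into a single number. A minimal sketch of looking at each customer individually, assuming the same column names and using made-up rows, groups the transactions by `CustomerID`. The mean of the per-customer totals is the same quantity as `Amount / Customers`.

```python
import pandas as pd

# Toy transactions (hypothetical values)
toy = pd.DataFrame({
    'CustomerID': [1, 1, 2, 3, 3, 3],
    'Amount': [5.00, 7.50, 12.25, 3.10, 8.40, 9.99],
})

# One row per customer: number of transactions and total amount spent
per_customer = toy.groupby('CustomerID')['Amount'].agg(
    transactions='count', total='sum')
print(per_customer)

# Mean of per-customer totals equals total amount / number of customers
amount_per_customer = per_customer['total'].mean()
print(round(amount_per_customer, 2))
```

The per-customer table also exposes the spread around the average, which the single `AmountPerCustomer` figure hides.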

```python
# Transactions of $29.99 or more
data.loc[data['Amount'] >= 29.99]
```

```python
import seaborn as sns
sns.set(color_codes=True)
```

Plotting Univariate Distributions

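```python
plt.title('Distribution of Transaction Amounts', fontsize=14, fontweight="bold")
# histplot replaces the deprecated sns.distplot; stat='density' with kde=True
# gives the same histogram-plus-density-curve view distplot drew by default
sns.histplot(data['Amount'], stat='density', kde=True, color='#3498db')
```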
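A plot is easier to read alongside a few numbers. As a rough sketch on made-up amounts (not the real dataset), quantiles summarize the same distribution, and a boolean mean gives the share of transactions at or above a threshold such as 29.99.

```python
import pandas as pd

# Hypothetical transaction amounts
amounts = pd.Series([2.21, 5.00, 7.50, 8.40, 9.99, 12.25, 29.99],
                    name='Amount')

# Quartiles of the transaction amounts
quantiles = amounts.quantile([0.25, 0.5, 0.75])
print(quantiles)

# Share of transactions at or above the threshold
share_high = (amounts >= 29.99).mean()
print(round(share_high, 3))
```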

Measuring Historic CLV

Now we need to consider the biggest source of error in the $34 average amount per customer, which is best read as a lower bound on CLV: some of the underlying customers are brand new, while others have been customers for almost six years. Newer customers will generally have spent less in total than older ones. So we need to separate customers into groups based on when they were acquired (e.g. customers acquired in 2010 vs. customers acquired in 2011, and so on).

Since the data spans six calendar years (2010 through 2015), let's separate customers into annual origin periods starting on 2010-01-01 and measure their purchases annually.
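The cohort code itself is not shown in this section, so the following is only one possible sketch of the idea. It assumes `TransactionDate` has been parsed as a date, defines each customer's origin period as the calendar year of their first transaction, and uses made-up rows. The names `Origin`, `Year`, `cohort_amounts`, and `cohort_cumulative` are illustrative, not the author's.

```python
import pandas as pd

# Toy transactions (hypothetical values)
toy = pd.DataFrame({
    'CustomerID': [1, 1, 2, 2, 3],
    'TransactionDate': pd.to_datetime(
        ['2010-02-01', '2011-05-20', '2011-03-15', '2012-08-09', '2012-11-30']),
    'Amount': [10.0, 5.0, 8.0, 4.0, 6.0],
})

# Origin period = calendar year of each customer's first transaction
toy['Origin'] = (toy.groupby('CustomerID')['TransactionDate']
                    .transform('min').dt.year)
toy['Year'] = toy['TransactionDate'].dt.year

# Rows: origin cohort; columns: purchase year; values: total amount spent
cohort_amounts = toy.pivot_table(index='Origin', columns='Year',
                                 values='Amount', aggfunc='sum', fill_value=0)
print(cohort_amounts)

# Running total per cohort across years
cohort_cumulative = cohort_amounts.cumsum(axis=1)
print(cohort_cumulative)
```

Dividing each row of the cumulative table by the number of customers in that cohort would give a historic amount per customer for each origin year.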
