Extending Development Patterns with Tails

Getting started

All exercises rely on chainladder v0.5.2 and later.


In [1]:
import pandas as pd
import numpy as np
import chainladder as cl
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline
cl.__version__


Out[1]:
'0.5.2'

Basic tail fitting

Tails are another class of transformers, and like the Development estimator they come with fit, transform, and fit_transform methods. Also like the Development estimator, you can define a tail in the absence of any data.


In [2]:
tail = cl.TailCurve()
tail


Out[2]:
TailCurve(curve='exponential', errors='ignore', extrap_periods=100,
     fit_period=slice(None, None, None))

Upon fitting data, we get updated cdf_ and ldf_ attributes that extend beyond the length of the triangle. Notice how our tail includes extra development periods (through age 147) beyond the end of the triangle (age 135), at which point an age-to-ultimate tail factor is applied.


In [3]:
quarterly = cl.load_dataset('quarterly')
tail.fit(quarterly)

print('Triangle latest', quarterly.development.max())
tail.fit(quarterly).ldf_['paid']


Triangle latest development    135
dtype: int64
Out[3]:
Origin 3-6 6-9 9-12 12-15 15-18 18-21 21-24 24-27 27-30 30-33 ... 120-123 123-126 126-129 129-132 132-135 135-138 138-141 141-144 144-147 147-Ult
(All) 8.5625 3.5547 2.7659 1.9332 1.6055 1.4011 1.3270 1.1658 1.1098 1.0780 ... 1.0000 1.0009 1.0000 1.0009 1.0000 1.0001 1.0001 1.0001 1.0001 1.0003

These extra twelve months (one year) of development patterns are included because it is typical to track IBNR run-off over a one-year time horizon from the valuation date. This extension is currently fixed at one year and cannot be lengthened, although a subsequent version of chainladder will look to address this limitation.

Curve fitting

Curve fitting takes selected development patterns and extrapolates them using either an exponential or inverse_power fit. In most cases, the inverse_power produces a thicker (more conservative) tail.
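The two curves differ in how the declining portion of the factors, ldf − 1, is assumed to decay with age. As a rough, standalone sketch (the coefficients below are invented for illustration and are not fit to the quarterly data), the exponential curve decays geometrically while the inverse power curve decays polynomially, which is why the latter usually leaves a thicker tail:

```python
import math

# Illustrative functional forms (not the library's internals):
# exponential:    ldf(t) - 1 = exp(a + b*t)          (b < 0, geometric decay)
# inverse_power:  ldf(t) - 1 = exp(a) * t**b         (b < 0, polynomial decay)

def exponential(t, a=0.5, b=-0.3):
    return 1 + math.exp(a + b * t)

def inverse_power(t, a=0.5, b=-1.2):
    return 1 + math.exp(a) * t ** b

# Far beyond the data, the power law decays much more slowly, so the
# remaining tail (the product of the extrapolated factors) is thicker.
exp_tail = math.prod(exponential(t) for t in range(20, 60))
inv_tail = math.prod(inverse_power(t) for t in range(20, 60))
print(exp_tail < inv_tail)  # True
```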


In [4]:
inv = cl.TailCurve(curve='inverse_power').fit(quarterly['paid']).ldf_
print('Inverse power tail 147-Ult:')
inv[inv.development==inv.development.iloc[-1]]


Inverse power tail 147-Ult:
Out[4]:
Origin 147-Ult
(All) 1.0178

In [5]:
exp = cl.TailCurve(curve='exponential').fit(quarterly['paid']).ldf_
print('Exponential tail 147-Ult:')
exp[exp.development==exp.development.iloc[-1]]


Exponential tail 147-Ult:
Out[5]:
Origin 147-Ult
(All) 1.0003

When fitting a tail, you have a choice of which development patterns you want to include in the curve fitting process, the fit_period. In addition, you can also specify how far beyond the triangle to project the tail factor before dropping down to a 1.0 factor, extrap_periods.

These come with defaults of fitting to all data and extrapolating the patterns 100 periods beyond the end of the triangle.

Note that even though you can extrapolate the curve many years beyond the end of the triangle for computational purposes, the resultant development factors compress all ldf_ beyond one year into a single age-to-ultimate factor.


In [6]:
cl.TailCurve(fit_period=slice(5,None), extrap_periods=50).fit(quarterly).ldf_['incurred']


Out[6]:
Origin 3-6 6-9 9-12 12-15 15-18 18-21 21-24 24-27 27-30 30-33 ... 120-123 123-126 126-129 129-132 132-135 135-138 138-141 141-144 144-147 147-Ult
(All) 3.5988 2.4768 2.7341 1.4683 1.2966 1.1825 1.2418 1.0451 1.0440 1.0365 ... 0.9996 1.0000 0.9982 1.0027 0.9991 1.0004 1.0003 1.0003 1.0002 1.0017

In this example, we ignore the first five development patterns for curve fitting, and we allow our tail extrapolation to go 50 quarters beyond the end of the triangle. Note that both fit_period and extrap_periods follow the development_grain of the triangle being fit.
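The slice semantics of fit_period can be illustrated on quarterly labels built by hand (these labels mimic the triangle's age-to-age columns and are not taken from the data):

```python
# Hand-built quarterly age-to-age labels, mimicking a quarterly triangle
labels = [f'{a}-{a + 3}' for a in range(3, 36, 3)]

# fit_period=slice(5, None) would use only these patterns in the curve fit,
# skipping the first five quarterly age-to-age factors
print(labels[slice(5, None)])
# ['18-21', '21-24', '24-27', '27-30', '30-33', '33-36']
```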

Chaining multiple transformers together

chainladder transformers take Triangle objects as input, but they also return Triangle objects from their transform method. To chain multiple transformers together, you must invoke the transform method on each transformer, similar to how sklearn approaches its own transformers.


In [7]:
try:
    cl.TailCurve().fit(cl.Development().fit(quarterly))
except Exception:
    print('This fails because we did not transform our triangle')

print('This passes because we transform our triangle')
cl.TailCurve().fit(cl.Development().fit_transform(quarterly))


This fails because we did not transform our triangle
This passes because we transform our triangle
Out[7]:
TailCurve(curve='exponential', errors='ignore', extrap_periods=100,
     fit_period=slice(None, None, None))

Nesting one transformed object inside another transformer chains two or more transformers together. Alternatively, we can rewrite this more cleanly as:


In [8]:
dev = cl.Development().fit_transform(quarterly)
tail = cl.TailCurve().fit(dev)
tail


Out[8]:
TailCurve(curve='exponential', errors='ignore', extrap_periods=100,
     fit_period=slice(None, None, None))

Chaining multiple transformers together is a very common pattern in chainladder. Like its inspiration sklearn, we can create an overall estimator known as a Pipeline that combines multiple transformers, and optionally predictors, into a single estimator.


In [9]:
steps=[('dev', cl.Development(average='simple')),
       ('tail', cl.TailCurve(curve='inverse_power'))]

pipe = cl.Pipeline(steps=steps).fit(quarterly)

Pipelines keep references to each step through their named_steps attribute.


In [10]:
print(pipe.named_steps.dev)
print(pipe.named_steps.tail)


Development(average='simple', drop=None, drop_high=None, drop_low=None,
      drop_valuation=None, n_periods=-1, sigma_interpolation='log-linear')
TailCurve(curve='inverse_power', errors='ignore', extrap_periods=100,
     fit_period=slice(None, None, None))

The Pipeline estimator is almost an exact replica of the sklearn Pipeline. To learn more about Pipeline, the comprehensive sklearn documentation is a good reference.

With a Triangle transformed to include development patterns and tails, we are now ready to start fitting our suite of IBNR models.