In [1]:
import pandas as pd
import numpy as np
import chainladder as cl
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline
cl.__version__
Out[1]:
In [2]:
tail = cl.TailCurve()
tail
Out[2]:
Upon fitting data, we get updated cdf_ and ldf_ attributes that extend beyond the length of the triangle. Notice how our tail includes extra development periods (age 147) beyond the end of the triangle (age 135), at which point an age-to-ultimate tail factor is applied.
In [3]:
quarterly = cl.load_dataset('quarterly')
tail.fit(quarterly)  # fit the tail estimator defined above to the quarterly sample
print('Triangle latest', quarterly.development.max())
tail.ldf_['paid']  # the fitted patterns now extend beyond the triangle's latest age
Out[3]:
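The cdf_ attribute is extended in the same way. As a quick sketch reusing the tail fitted above, the last entry of the paid cdf_ is the age-to-ultimate factor:
In [ ]:
# cdf_ carries the cumulative version of the same extended patterns (sketch).
tail.cdf_['paid']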
These extra twelve months (one year) of development patterns are included because it is typical to want to track IBNR run-off over a one-year time horizon from the valuation date. The extension is currently fixed at one year and cannot be lengthened, though a subsequent version of chainladder will look to address this.
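As a sketch of what that fixed one-year extension looks like on the quarterly grain, the last few labels of the fitted paid patterns should show four extra quarters past age 135 followed by the single age-to-ultimate factor:
In [ ]:
# Sketch: the final labels should read 135-138 through 144-147, then 147-Ult.
tail.ldf_['paid'].development.iloc[-5:]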
In [4]:
inv = cl.TailCurve(curve='inverse_power').fit(quarterly['paid']).ldf_
print('Inverse power tail 147-Ult:')
inv[inv.development==inv.development.iloc[-1]]
Out[4]:
In [5]:
exp = cl.TailCurve(curve='exponential').fit(quarterly['paid']).ldf_
print('Exponential tail 147-Ult:')
exp[exp.development==exp.development.iloc[-1]]
Out[5]:
When fitting a tail, you have a choice of which development patterns to include in the curve fitting process, the fit_period. In addition, you can specify how far beyond the triangle to project the tail factor before dropping down to a 1.0 factor, the extrap_periods. These come with defaults of fitting to all data and extrapolating the patterns 100 periods beyond the end of the triangle.
Note that even though you can extrapolate the curve many years beyond the end of the triangle for computational purposes, the resultant development factors will compress all ldf_ beyond one year into a single age-to-ultimate factor.
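That compression is easy to sketch: even with an extrapolation horizon well beyond the default 100 periods, the returned patterns still stop one year past the triangle, with everything further out rolled into the final age-to-ultimate factor (long_tail is just an illustrative name):
In [ ]:
# Extrapolate far beyond the triangle; the last pattern is still 147-Ult.
long_tail = cl.TailCurve(extrap_periods=200).fit(quarterly).ldf_['paid']
long_tail.development.iloc[-1]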
In [6]:
# Ignore the first five age-to-age factors when fitting the curve and
# extrapolate 50 quarters beyond the triangle before dropping to 1.0.
cl.TailCurve(fit_period=slice(5, None), extrap_periods=50).fit(quarterly).ldf_['incurred']
Out[6]:
In this example, we ignore the first five development patterns for curve fitting, and we allow our tail extrapolation to go 50 quarters beyond the end of the triangle. Note that both fit_period and extrap_periods follow the development_grain of the triangle being fit.
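Since the sample triangle is at a quarterly development grain, both parameters above are expressed in quarters; a quick check, assuming the Triangle exposes the development_grain attribute referenced above:
In [ ]:
# 'Q' indicates quarterly development periods.
quarterly.development_grain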
chainladder transformers take Triangle objects as input, but they also return Triangle objects from their transform method. To chain multiple transformers together, you must invoke the transform method on each transformer, similar to how sklearn approaches its own transformers.
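A minimal sketch of that pattern, assuming transform behaves like its sklearn counterpart (dev_fit is just an illustrative name): fit learns the development patterns, transform returns a Triangle carrying them, and that Triangle can feed the next estimator.
In [ ]:
# fit() learns the patterns; transform() returns a Triangle that carries them.
dev_fit = cl.Development().fit(quarterly)
cl.TailCurve().fit(dev_fit.transform(quarterly))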
In [7]:
try:
    cl.TailCurve().fit(cl.Development().fit(quarterly))
except Exception:
    print('This fails because we did not transform our triangle')

print('This passes because we transform our triangle')
cl.TailCurve().fit(cl.Development().fit_transform(quarterly))
Out[7]:
We can nest one transformed object inside another transformer to chain two or more transformers together. Alternatively, we can rewrite this more cleanly as:
In [8]:
dev = cl.Development().fit_transform(quarterly)
tail = cl.TailCurve().fit(dev)
tail
Out[8]:
Chaining multiple transformers together is a very common pattern in chainladder. Like its inspiration sklearn, chainladder provides an overall estimator known as a Pipeline that combines multiple transformers, and optionally predictors as well, into a single estimator.
In [9]:
steps = [('dev', cl.Development(average='simple')),
         ('tail', cl.TailCurve(curve='inverse_power'))]
pipe = cl.Pipeline(steps=steps).fit(quarterly)
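Since a Pipeline can also end in a predictor, here is a hedged sketch that appends the deterministic Chainladder estimator (one of the IBNR models discussed later) as a final step; full_pipe is just an illustrative name:
In [ ]:
# A Pipeline whose last step is a predictor rather than a transformer (sketch).
full_pipe = cl.Pipeline(
    steps=[('dev', cl.Development(average='simple')),
           ('tail', cl.TailCurve(curve='inverse_power')),
           ('model', cl.Chainladder())]).fit(quarterly)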
Pipelines keep references to each step in their named_steps attribute.
In [10]:
print(pipe.named_steps.dev)
print(pipe.named_steps.tail)
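Because each named step is the fitted estimator itself, its fitted attributes remain accessible; for example, a sketch pulling the tail-extended paid patterns back out of the pipeline:
In [ ]:
# The fitted TailCurve step still exposes its ldf_ attribute.
pipe.named_steps.tail.ldf_['paid']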
The Pipeline estimator is almost an exact replica of the sklearn Pipeline. The sklearn documentation is very comprehensive; to learn more about Pipeline, you can visit their docs.
With a Triangle transformed to include development patterns and tails, we are now ready to start fitting our suite of IBNR models.