We replicate the data example in Harnau and Nielsen (2017) in Section 6.
The work on this vignette was supported by the European Research Council, grant AdG 694262.
First, we import the package
In [1]:
import apc
In [2]:
# Turn off FutureWarnings
import warnings
warnings.simplefilter('ignore', FutureWarning)
Next, we create a model and attach the Taylor and Ashe (1983) data to it.
In [3]:
model = apc.Model()
model.data_from_df(apc.loss_TA(), data_format='CL')
In [4]:
model.fit_table('od_poisson_response')
model.deviance_table
Out[4]:
First, we see that the age-period-cohort deviance is an extremely unlikely draw from a $\chi^2_{28}$ distribution, so a Poisson model is clearly rejected. Thus, we look at the $F$-tests in the column F_vs_APC and the corresponding p-values. We limit ourselves to nested models that cannot be rejected at the 5% level.
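As a hedged illustration of how the $F$ statistics in F_vs_APC relate to the deviances, here is a sketch with made-up numbers (these are not values from the Taylor and Ashe deviance table), assuming the over-dispersed Poisson $F$-test compares the per-degree-of-freedom deviance increase of the sub-model to the per-degree-of-freedom deviance of the APC model:

```python
from scipy import stats

# Made-up deviances and degrees of freedom for an APC model and a sub-model;
# NOT the values from the Taylor & Ashe deviance table.
D_apc, df_apc = 140.0, 28
D_sub, df_sub = 180.0, 36

# F statistic: relative increase in deviance per lost degree of freedom,
# scaled by the APC deviance per degree of freedom.
F = ((D_sub - D_apc) / (df_sub - df_apc)) / (D_apc / df_apc)

# p-value from an F distribution with (df_sub - df_apc, df_apc) degrees of freedom
p_value = stats.f.sf(F, df_sub - df_apc, df_apc)
```

Small p-values reject the reduction to the sub-model; here the made-up numbers give $F = 1$, which is clearly insignificant.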
Remark: The nesting is nicely illustrated in the following figure, taken from Nielsen (2014, Figure 5):
Nielsen (2014) also discusses the individual sub-models and provides their specific parameterizations.
In [5]:
model.deviance_table[model.deviance_table['P>F'] > 0.05]
Out[5]:
The models not rejected at the 5% level include the age-period (AP), age-cohort (AC), age-drift (Ad) and age (A) models. Only the AP and AC models are immediately nested in the APC model, with the Ad and A models nested in both of them.
When it comes to forecasting, the age-cohort model has several advantages over the age-period model. Since it does not include a period effect, it does not require parameter extrapolation. Further, in a run-off triangle, the situation we have here, the age-cohort model replicates the chain-ladder point forecasts. Thus, we now take the age-cohort model as the primary model. We can then see what models we can reduce the age-cohort model to.
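For reference, the chain-ladder point forecasts mentioned above can be computed directly from the cumulative triangle. A minimal sketch on a hypothetical 3×3 incremental triangle (not the Taylor and Ashe data):

```python
import numpy as np

# Hypothetical 3x3 incremental run-off triangle (rows: cohorts/accident years,
# columns: development years); np.nan marks the unobserved lower triangle.
X = np.array([[100., 50., 25.],
              [110., 60., np.nan],
              [120., np.nan, np.nan]])
n = X.shape[0]
C = np.nancumsum(X, axis=1)  # cumulative claims; valid on the observed cells

# Chain-ladder development factors from the observed part of the triangle
f = [C[:n - 1 - j, j + 1].sum() / C[:n - 1 - j, j].sum() for j in range(n - 1)]

# Fill the lower triangle by rolling each cohort forward with the factors
U = C.copy()
for i in range(n):
    for j in range(n - i, n):
        U[i, j] = U[i, j - 1] * f[j - 1]

# Total reserve: forecast ultimates minus the latest observed cumulatives
reserve = sum(U[i, n - 1] - C[i, n - 1 - i] for i in range(n))
```

The age-cohort point forecasts for the unobserved cells coincide with the chain-ladder fill-in of the lower triangle.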
In [6]:
model.fit_table('od_poisson_response', reference_predictor='AC')
model.deviance_table
model.deviance_table[model.deviance_table['P>F'] > 0.05]
Out[6]:
The age-drift and age models are (still) the only feasible reductions.
Remark (not in paper): we can also consider the age-period model as the new primary model and see what reductions are feasible. This yields the same reductions:
In [7]:
model.fit_table('od_poisson_response', reference_predictor='AP')
model.deviance_table
model.deviance_table[model.deviance_table['P>F'] > 0.05]
Out[7]:
Next, we take the age-drift model as the primary model.
In [8]:
model.fit_table('od_poisson_response', reference_predictor='Ad')
model.deviance_table
model.deviance_table[model.deviance_table['P>F'] > 0.05]
Out[8]:
The age model is still not rejected, if only barely.
Taken together, these results replicate Table 2 in the paper.
First, we fit an over-dispersed Poisson age-period-cohort model
In [9]:
model.fit('od_poisson_response', 'APC')
As part of the estimation, the package attaches a parameter table to the model. This includes parameter estimates, standard errors, $t$ statistics and p-values computed from a $t$ distribution. We take a look at the first couple of rows before recreating Table 3 from the paper.
In [10]:
model.parameters.head()
Out[10]:
To recreate Table 3, we further need to estimate an over-dispersed Poisson age-cohort model as well as Poisson age-period-cohort and age-cohort models.
In [11]:
model_ac = model.clone() # this creates a model object with the data already attached
model_ac.fit('od_poisson_response', 'AC')
model_apc_pois = model.clone()
model_apc_pois.fit('poisson_response', 'APC')
model_ac_pois = model.clone()
model_ac_pois.fit('poisson_response', 'AC')
For a Poisson model, the parameter table includes $z$ scores and p-values computed from a normal rather than a $t$ distribution. We look at the first couple of rows of the Poisson age-period-cohort model.
In [12]:
model_apc_pois.parameters.head()
Out[12]:
Then we can combine the resulting parameter tables. We recall that the parameter estimates are identical for the over-dispersed Poisson and Poisson models.
Remark: The standard errors do not exactly match those in the paper but give the same impression. This is due to a former bug in the software.
In [13]:
import pandas as pd
pd.concat([
    pd.concat([
        model.parameters['coef'],
        model_apc_pois.parameters['std_err'].rename('se N'),
        model.parameters['std_err'].rename('se t')
    ], axis=1),
    pd.concat([
        model_ac.parameters['coef'],
        model_ac_pois.parameters['std_err'].rename('se N'),
        model_ac.parameters['std_err'].rename('se t')
    ], axis=1)
], axis=1, keys=['apc model', 'ac model'], sort=False)
Out[13]:
We can also plot the parameter estimates, replicating Figure 1.
In [14]:
model.plot_parameters(around_coef=False)
Besides plots for the double differences and the detrended version, the plots also include the level, for which there is no confidence band given the sampling scheme, and the trends. We point out that these trends relate to the detrended parameterization. Thus, they cannot be interpreted separately, in contrast to the detrended parameters.
Remark (not in paper): instead, we can also plot the double sums of double differences as shown in equation (3) in the paper. To do this, we merely need to add the argument plot_style='sum_sum' to plot_parameters. In this case, the trends are de-coupled and can be interpreted separately. However, the interpretation of the double sums is difficult.
First, we look at the $t$ forecast. If we do not supply the argument method to get_distribution_fc, $t$ forecasts will be generated.
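To get a sense of how the quantile forecasts in the tables below arise, here is a sketch with made-up numbers, assuming a $t$ forecast quantile is the point forecast plus the total standard error times the corresponding $t$ quantile:

```python
from scipy import stats

# Made-up ingredients, not the Taylor & Ashe estimates
point = 18_680.0  # point forecast for some aggregate
se = 2_500.0      # total standard error of that forecast
df = 34           # residual degrees of freedom, n - q

# e.g. a 99.5% quantile forecast: point forecast shifted by a t quantile
q_995 = point + se * stats.t.ppf(0.995, df)
```

With moderate degrees of freedom, the $t$ quantiles are somewhat heavier than normal quantiles, widening the forecast intervals.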
In [15]:
model_ac.forecast()
Forecasts by cell, age, period, cohort, and for the total are automatically generated. First, we look at the forecasts by period (calendar year).
In [16]:
model_ac.forecasts['Period'].round()
Out[16]:
The point forecast corresponds to the cash-flow by calendar year. Besides, the output includes quantile forecasts as well as the standard error and its components:

- se_total: $[\hat{\tau} \{D_1/(n-q)\}\{\hat{\pi}_\mathcal{A} + \hat{s}^2_\mathcal{A} + (\hat{\pi}_\mathcal{A})^2\}]^{1/2}$
- se_process: $[\hat{\tau} \{D_1/(n-q)\}\hat{\pi}_\mathcal{A}]^{1/2}$
- se_estimation_xi: $[\hat{\tau} \{D_1/(n-q)\} \hat{s}^2_\mathcal{A}]^{1/2}$
- se_estimation_tau: $[\hat{\tau} \{D_1/(n-q)\} (\hat{\pi}_\mathcal{A})^2]^{1/2}$

Similarly, we can look at forecasts by cohort
In [17]:
model_ac.forecasts['Cohort'].round()
Out[17]:
and for the total
In [18]:
model_ac.forecasts['Total'].round()
Out[18]:
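The four standard errors listed above satisfy a simple relation: the squared components add up to the squared total. A quick numeric sketch with made-up ingredient values (not estimates from the data):

```python
import math

# Made-up ingredient values, not estimates from the Taylor & Ashe data
tau_hat = 34_000_000.0   # fitted total
D1_over_df = 1.3         # over-dispersion estimate, D_1 / (n - q)
pi_A = 0.55              # process term for the forecast array A
s2_A = 0.021             # estimation-error term for xi
# the estimation-error term for tau is pi_A ** 2

scale = tau_hat * D1_over_df
se_process = math.sqrt(scale * pi_A)
se_estimation_xi = math.sqrt(scale * s2_A)
se_estimation_tau = math.sqrt(scale * pi_A ** 2)
se_total = math.sqrt(scale * (pi_A + s2_A + pi_A ** 2))
```

The decomposition makes it easy to see which source of uncertainty, process or estimation, dominates a given forecast.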
Next, we compute distribution forecasts based on the bootstrap by England and Verrall (1999) and England (2002). Since bootstrapping requires random sampling, the results differ somewhat from those in the paper. We note that the bootstrap does not have a solid theoretical foundation.
In [19]:
fc_bootstrap = apc.bootstrap_forecast(apc.loss_TA(), seed=1)
Just as for the $t$ forecast, this automatically computes forecasts by cell, age, period, cohort and for the total. The output for the bootstrap forecasts contains descriptive statistics over bootstrap draws:
In [20]:
fc_bootstrap['Period'].round()
Out[20]:
In contrast to the $t$ forecast, the bootstrap comes with a mean forecast that differs from the chain-ladder point forecast. Also, the reported bootstrap standard deviation differs from the bootstrapped chain-ladder standard error since it is computed around the bootstrap mean, not the chain-ladder point forecast.
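The distinction between the two dispersion measures can be illustrated with purely synthetic draws (this is not actual bootstrap output): the standard deviation around the draws' own mean differs from the root mean squared deviation around a fixed point forecast whenever the mean of the draws is biased away from that point forecast.

```python
import numpy as np

rng = np.random.default_rng(0)
point_forecast = 100.0  # stand-in for the chain-ladder point forecast
# Synthetic "bootstrap" draws whose mean is deliberately biased upward by 3
draws = point_forecast + rng.normal(3, 10, size=1000)

# Standard deviation around the draws' own mean (what the table reports)
sd_around_mean = draws.std(ddof=1)
# Root mean squared deviation around the fixed point forecast
rmse_around_point = np.sqrt(((draws - point_forecast) ** 2).mean())
```

The second quantity folds the bias of the bootstrap mean into the dispersion, so it exceeds the first.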
Just as before, we can look at forecasts aggregated by cohort and for the total.
In [21]:
fc_bootstrap['Cohort'].round()
Out[21]:
In [22]:
fc_bootstrap['Total'].round()
Out[22]:
Taken together, this replicates Table 4.
In the deviance analysis we found that we cannot reject a reduction to an age-drift or even an age model. Since the age-cohort model replicates the chain-ladder point forecasts, we have so far not considered forecasts resulting from the smaller models. However, this is easily done.
In [23]:
model_ad = model.clone()
model_ad.fit('od_poisson_response', 'Ad')
model_ad.forecast()
model_a = model.clone()
model_a.fit('od_poisson_response', 'A')
model_a.forecast()
We can now compare the forecasts of the three models. We look at the forecasts for the total, but we could just as easily look at other aggregates or forecasts by cells.
In [24]:
print('Age-Cohort Model')
model_ac.forecasts['Total'].round()
Out[24]:
In [25]:
print('Age-Drift Model')
model_ad.forecasts['Total'].round()
Out[25]:
In [26]:
print('Age Model')
model_a.forecasts['Total'].round()
Out[26]:
We can see that the age-cohort and age-drift models yield quite similar results, especially for the higher quantiles. In contrast, the age model produces substantially lower forecasts.