In this notebook, we are going to play around with extreme value theory (EVT), in particular in combination with regression problems, to unveil induced anomalies. Much as the central limit theorem describes the limiting behavior of sample means, EVT models the behavior of the extremes (minimum or maximum) of a sample. In [2], the authors showed that if the limiting distribution of the suitably normalized maxima (minima) is not degenerate (i.e., it does not place all of its probability mass on a single point), it has to follow one of three extreme value distributions (Weibull, Gumbel, Fréchet), which were later combined into the generalized extreme value distribution (GEVd) [1].
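To make the limiting statement concrete, here is a minimal sketch that draws many samples, keeps the maximum of each, and fits a GEVd to those maxima with `scipy.stats.genextreme`. The exponential parent distribution, the sample sizes, and all variable names are assumptions chosen for illustration only. For an exponential parent the maxima should fall in the Gumbel case, so the fitted shape should come out close to zero.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumed toy setup: 2000 samples of size 1000 from a unit exponential.
n_samples, sample_size = 2000, 1000
samples = rng.standard_exponential(size=(n_samples, sample_size))

# Keep one maximum per sample.
maxima = samples.max(axis=1)

# Fit the GEVd by maximum likelihood.
# Note: scipy's shape parameter c has the opposite sign of the usual xi
# (c = -xi): c < 0 is the Frechet case, c = 0 Gumbel, c > 0 Weibull.
c_hat, loc_hat, scale_hat = stats.genextreme.fit(maxima)

print(f"fitted: shape c = {c_hat:.3f}, loc = {loc_hat:.3f}, scale = {scale_hat:.3f}")
print(f"Gumbel limit for this setup: loc = {np.log(sample_size):.3f}, scale = 1")
```

The sign convention is worth keeping in mind when comparing the fitted shape against textbook formulas, which are usually written in terms of ξ.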
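Approaches to modelling extremes generally use one of two distinct techniques: block maxima, where the data is split into blocks and the maximum of each block is modelled with the GEVd, and peaks over threshold (POT), where the excesses over a high threshold are modelled with the generalized Pareto distribution [3], [4].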
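A minimal sketch of how block maxima and peaks over threshold (POT), the two techniques usually contrasted in EVT, select their extremes from the same data. The exponential toy data, the block size, and the 95% quantile threshold are arbitrary assumptions for illustration, not recommendations. For this parent distribution the excesses are again unit exponential, so the fitted generalized Pareto shape should be near zero and the scale near one.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.standard_exponential(100_000)  # assumed toy data

# Block maxima: split the series into equal blocks, keep the largest value of each.
block_size = 1_000
block_maxima = x.reshape(-1, block_size).max(axis=1)

# POT: keep every observation above a high threshold and model the excess.
threshold = np.quantile(x, 0.95)  # assumed threshold choice
excesses = x[x > threshold] - threshold

# Fit a generalized Pareto distribution to the excesses; the location is
# fixed at 0 because the excesses start at the threshold by construction.
xi_hat, _, sigma_hat = stats.genpareto.fit(excesses, floc=0)

print(f"{block_maxima.size} block maxima vs. {excesses.size} threshold excesses")
print(f"GPD fit: shape xi = {xi_hat:.3f}, scale sigma = {sigma_hat:.3f}")
```

POT typically keeps more observations than block maxima for the same data, at the cost of having to choose a threshold.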
[1] A. F. Jenkinson, “The frequency distribution of the annual maximum (or minimum) values of meteorological elements,” Q. J. R. Meteorol. Soc., vol. 81, no. 348, pp. 158–171, 1955.
[2] R. A. Fisher and L. H. C. Tippett, “Limiting forms of the frequency distribution of the largest or smallest member of a sample,” Math. Proc. Cambridge Philos. Soc., vol. 24, no. 2, pp. 180–190, 1928.
[3] J. Pickands III, “Statistical inference using extreme order statistics,” Ann. Stat., vol. 3, no. 1, pp. 119–131, 1975.
[4] A. A. Balkema and L. de Haan, “Residual Life Time at Great Age,” Ann. Probab., vol. 2, no. 5, pp. 792–804, 1974.
In [1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats, optimize
In [ ]:
def