Introduction to Extreme Value Theory (EVT)

In this notebook, we will experiment with EVT, in particular in combination with regression problems, to unveil induced anomalies. In EVT, much as the central limit theorem does for sample means, we are interested in modelling the behavior of the extremes (minimum or maximum) of a sample. In [2], the authors showed that if the distribution of suitably normalized maxima (minima) converges to a non-degenerate limit (that is, a limit that does not put all of its probability mass on a single point), this limit has to be one of the three extreme value distributions (Weibull, Gumbel, Fréchet), which were later combined into the generalized extreme value distribution (GEVd) [1].
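
For reference, the GEVd is commonly written in the following form. The symbols $\mu$ (location), $\sigma > 0$ (scale) and $\xi$ (shape) are the conventional notation and are an assumption here, since the text above does not fix any notation:

$$
G(x;\mu,\sigma,\xi) =
\begin{cases}
\exp\left\{-\left[1 + \xi\,\dfrac{x-\mu}{\sigma}\right]^{-1/\xi}\right\}, & \xi \neq 0,\quad 1 + \xi\,\dfrac{x-\mu}{\sigma} > 0,\\[2ex]
\exp\left\{-\exp\left(-\dfrac{x-\mu}{\sigma}\right)\right\}, & \xi = 0.
\end{cases}
$$

In this parametrization, $\xi > 0$ corresponds to the Fréchet type, $\xi = 0$ to the Gumbel type, and $\xi < 0$ to the (reversed) Weibull type.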

Approaches to modelling extremes generally use one of two distinct techniques:

  • the 'original' AM/BM (annual maxima/block maxima) approach, proposed by Fisher and Tippett (1928) [2]
  • the more recently proposed POT (peaks-over-threshold) approach by Balkema and de Haan (1974) [4] and Pickands (1975) [3], along with its modern variant PORT (peaks-over-random-threshold)

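The two techniques can be contrasted in a minimal sketch. Everything below is an illustrative assumption rather than part of the original notebook: the synthetic standard normal data, the block size of 365, the 98% quantile threshold, and all variable names. Note that `scipy.stats.genextreme` uses a shape parameter `c` with the opposite sign of the shape parameter commonly written ξ, while `scipy.stats.genpareto` uses `c` = ξ directly.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: 100 blocks ("years") of 365 i.i.d. standard normal values.
sample = rng.normal(size=(100, 365))

# Block maxima (BM): keep only the largest value of each block and fit a GEVd.
block_maxima = sample.max(axis=1)
c, loc, scale = stats.genextreme.fit(block_maxima)
xi_gev = -c  # scipy's sign convention is the opposite of the usual xi

# Peaks over threshold (POT): keep the excesses above a high threshold and
# fit a generalized Pareto distribution with its location fixed at zero.
flat = sample.ravel()
threshold = np.quantile(flat, 0.98)  # arbitrary choice for illustration
excesses = flat[flat > threshold] - threshold
xi_gpd, _, sigma = stats.genpareto.fit(excesses, floc=0)

print(f"BM  (GEVd): xi={xi_gev:.3f}, loc={loc:.3f}, scale={scale:.3f}")
print(f"POT (GPD):  xi={xi_gpd:.3f}, sigma={sigma:.3f}, n={excesses.size}")
```

BM discards all but one observation per block, whereas POT uses every observation above the threshold. In exchange, POT requires choosing a threshold, which trades bias (threshold too low) against variance (threshold too high).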

[1] A. F. Jenkinson, “The frequency distribution of the annual maximum (or minimum) values of meteorological elements,” Q. J. R. Meteorol. Soc., vol. 81, no. 348, pp. 158–171, 1955.

[2] R. A. Fisher and L. H. C. Tippett, “Limiting forms of the frequency distribution of the largest or smallest member of a sample,” Math. Proc. Cambridge Philos. Soc., vol. 24, no. 2, pp. 180–190, 1928.

[3] J. Pickands, “Statistical inference using extreme order statistics,” Ann. Stat., vol. 3, no. 1, pp. 119–131, 1975.

[4] A. A. Balkema and L. de Haan, “Residual Life Time at Great Age,” Ann. Probab., vol. 2, no. 5, pp. 792–804, 1974.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats, optimize
