Extreme Value Theory (EVT) is unique as a statistical discipline in that it develops techniques and models for describing the unusual rather than the usual, i.e., it focuses on the tail of the distribution.
By definition, extreme values are scarce, so estimates are often required for levels of a process far greater than any that have already been observed. This implies an extrapolation from a small set of observed levels to unobserved levels. Extreme Value Theory provides models that enable such extrapolation.
Applications of extreme value theory include predicting the probability distribution of extreme events in several engineering design processes and in fields such as:
Meteorology
Ocean engineering
Environmental sciences
Insurance industry
Financial industry
Material sciences
Telecommunications
Biology
...
One of the earliest books on the statistics of extreme values is E.J. Gumbel (1958). Research into extreme values as a subject in its own right began between 1920 and 1940, when work by E.L. Dodd, M. Fréchet, E.J. Gumbel, R. von Mises and L.H.C. Tippett investigated the asymptotic distribution of the largest order statistic. This led to the main theoretical result: the Extremal Types Theorem (also known as the Fisher–Tippett–Gnedenko theorem, the Fisher–Tippett theorem or the extreme value theorem), which was developed in stages by Fisher, Tippett and von Mises, and eventually proved in general by B. Gnedenko in 1943.
Until 1950, development was largely theoretical. In 1958, Gumbel began applying the theory to problems in engineering. In the 1970s, L. de Haan, Balkema and J. Pickands generalised the theoretical results (the second theorem in extreme value theory, the Pickands–Balkema–de Haan theorem), giving a better basis for statistical models.
Since the 1980s, methods for the application of Extreme Value Theory have become much more widespread.
There are two primary approaches to analyzing extremes of a dataset:
The first and more classical approach reduces the data considerably by taking maxima of long blocks of data, e.g., annual maxima. The generalized extreme value (GEV) distribution function has theoretical justification for fitting to block maxima of data.
The second approach is to analyze excesses over a high threshold; here the generalized Pareto (GP) distribution function has similar theoretical justification for fitting to the exceedances.
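As a concrete illustration of how the two approaches reduce a dataset, the sketch below (in Python) extracts block maxima and threshold excesses from a synthetic daily series. The synthetic data, the block length of 365 values and the 0.95 quantile threshold are assumptions made for the example only, not recommendations.

```python
import numpy as np

rng = np.random.default_rng(42)
# 50 "years" of synthetic daily observations (illustrative data only).
daily = rng.gumbel(loc=10.0, scale=2.0, size=365 * 50)

# Approach 1: block maxima -- one value per block, to be fitted with a GEV.
block_size = 365
n_blocks = daily.size // block_size
block_maxima = daily[: n_blocks * block_size].reshape(n_blocks, block_size).max(axis=1)

# Approach 2: threshold excesses -- all exceedances of a high threshold,
# to be fitted with a generalized Pareto distribution.
threshold = np.quantile(daily, 0.95)
excesses = daily[daily > threshold] - threshold

print(f"{n_blocks} block maxima, {excesses.size} threshold excesses")
```

The threshold approach typically retains more of the extreme observations than the block-maxima approach, at the cost of having to choose the threshold.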
The generalized extreme value (GEV) family of distribution functions has theoretical support for fitting to block maxima, provided the blocks are sufficiently long, and is given by:
$$G(z;\mu, \sigma, \xi) = \exp\left\{-\left[1+\xi\left(\frac{z - \mu}{\sigma}\right)\right]^{-1/\xi}\right\}$$defined on $\{z : 1 + \xi(z - \mu)/\sigma > 0\}$. The parameters $\mu$ ($-\infty < \mu < \infty$), $\sigma$ ($\sigma > 0$) and $\xi$ ($-\infty < \xi < \infty$) are location, scale and shape parameters, respectively. The value of the shape parameter $\xi$ differentiates between the three types of extreme value distribution in the Extremal Types Theorem (also known as the Fisher–Tippett–Gnedenko theorem, the Fisher–Tippett theorem or the extreme value theorem):
$\xi = 0$ (interpreted as the limit $\xi \to 0$) corresponds to the Gumbel (type I),
$\xi > 0$ corresponds to the Fréchet (type II) and
$\xi < 0$ corresponds to the Weibull (type III)
distributions, respectively. In practice, when we estimate the shape parameter $\xi$, the standard error for $\xi$ accounts for our uncertainty in choosing between the three models.
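The sketch below is a minimal Python implementation of $G(z;\mu,\sigma,\xi)$ written directly from the formula above, treating $\xi = 0$ as the Gumbel limit and returning 0 or 1 outside the support. The cross-check against scipy.stats.genextreme (which parameterizes the same family with shape $c = -\xi$) assumes SciPy is available and is included only as a sanity check.

```python
import numpy as np
from scipy.stats import genextreme

def gev_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    """G(z; mu, sigma, xi); xi = 0 is read as the Gumbel (type I) limit."""
    s = (np.atleast_1d(z).astype(float) - mu) / sigma
    if xi == 0.0:
        return np.exp(-np.exp(-s))                 # Gumbel limit
    t = 1.0 + xi * s
    # Outside the support {z : 1 + xi*(z - mu)/sigma > 0} the cdf is
    # 0 for xi > 0 (below the lower endpoint) or 1 for xi < 0 (above it).
    out = np.full_like(t, 0.0 if xi > 0 else 1.0)
    inside = t > 0
    out[inside] = np.exp(-t[inside] ** (-1.0 / xi))
    return out

# scipy's genextreme uses shape c = -xi for the same family:
z = np.linspace(-2, 6, 5)
print(gev_cdf(z, mu=0.0, sigma=1.0, xi=0.2))
print(genextreme.cdf(z, c=-0.2, loc=0.0, scale=1.0))  # should agree
```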
TODO
TODO
TODO
TODO