Before going into multidimensional integrals, a remark: people now talk about machine learning, but many of its notions are already known in statistics under different names (see table).
Machine learning | Statistics |
---|---|
network, graphs | model |
weights | parameters |
learning | fitting |
generalization | test set performance |
supervised learning | regression/classification |
unsupervised learning | density estimation, clustering |
large grant = \$1,000,000 | large grant = \$50,000 |
Paskov, Traub, Faster valuation of financial derivatives
They considered a 360-dimensional collateralized mortgage obligation problem.
A typical approach is Monte Carlo sampling: sample $N$ points and compute the average of the integrand over them.
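As a rough illustration of the sampling idea, here is a minimal sketch of Monte Carlo integration over the unit cube $[0, 1]^d$ with $d = 360$. The integrand `f` is a hypothetical test function chosen only because its exact integral is known in closed form; it is not the mortgage problem from the paper, and the helper name `monte_carlo_integral` is an assumption of this sketch.

```python
import numpy as np

def monte_carlo_integral(f, d, n_samples, seed=0):
    """Estimate the integral of f over [0, 1]^d by averaging f at random points."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_samples, d))          # N uniform points in the unit cube
    values = f(x)                           # one function value per point
    estimate = values.mean()
    std_error = values.std(ddof=1) / np.sqrt(n_samples)
    return estimate, std_error

d = 360
# Hypothetical integrand: exp(-(x_1 + ... + x_d) / d), which is separable,
# so its exact integral is (d * (1 - exp(-1/d)))**d.
f = lambda x: np.exp(-x.mean(axis=1))
estimate, std_error = monte_carlo_integral(f, d, n_samples=10_000)
exact = (d * -np.expm1(-1.0 / d)) ** d
print(estimate, exact, std_error)
```

The standard error decays like $N^{-1/2}$ regardless of $d$, which is why the approach is popular in high dimensions even though the convergence is slow.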
The key idea is separation of variables:
$$A(i_1, \ldots, i_d) = U_1(i_1) \ldots U_d(i_d).$$
In probability theory, it corresponds to the idea of independence of variables.
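A small sketch of why separability helps, under the assumption of random nonnegative factors `U[k]` (hypothetical data): the sum over all $n^d$ entries of a separable tensor factorizes into a product of $d$ one-dimensional sums, so the full array never has to be formed. The full tensor is built here only to check the identity.

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(0)
d, n = 4, 5
U = [rng.random(n) for _ in range(d)]       # one vector per index (assumed data)

# Full tensor A(i_1, ..., i_d) = U_1(i_1) ... U_d(i_d): n**d entries.
A = reduce(np.multiply.outer, U)

total_full = A.sum()                        # costs O(n**d)
total_sep = np.prod([u.sum() for u in U])   # costs O(d * n)
print(A.shape, total_full, total_sep)
```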
This class is too narrow to represent useful functions (or probability distributions).
For $d \geq 3$, the canonical decomposition of the form
$$A(i_1, \ldots, i_d) = \sum_{\alpha=1}^r U_1(i_1, \alpha) \ldots U_d(i_d, \alpha)$$
is very difficult to compute numerically; there is even an example of a $9 \times 9 \times 9$
tensor for which the tensor rank is not known!
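Going from canonical factors to the tensor is the easy direction; the hard part is recovering the factors. A minimal sketch of the easy direction, assuming each factor $U_k$ is stored as an $n_k \times r$ array (the function name `cp_to_full` and the random test data are assumptions of this example):

```python
import numpy as np

def cp_to_full(factors):
    """Assemble the full tensor from canonical factors U_k of shape (n_k, r)."""
    r = factors[0].shape[1]
    full = np.zeros([U.shape[0] for U in factors])
    for alpha in range(r):
        # Rank-one term: outer product of the alpha-th columns.
        term = factors[0][:, alpha]
        for U in factors[1:]:
            term = np.multiply.outer(term, U[:, alpha])
        full += term
    return full

rng = np.random.default_rng(0)
factors = [rng.standard_normal((n, 3)) for n in (4, 5, 6)]   # r = 3, d = 3
A = cp_to_full(factors)
print(A.shape)
```

The factors hold $r(n_1 + \ldots + n_d)$ numbers, compared with $n_1 \ldots n_d$ for the full array.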
But this is not the only tensor decomposition we can use.
Sometimes it is better to study the code and try it on your own application.
The maximal-volume principle is present in many areas. One of the earliest related references I know is D-optimality in experimental design (Fedorov, 1972); it is related to the selection of optimal interpolation points for a given basis set:
$$f(x) \approx \sum_{\alpha=1}^r c_{\alpha} g_{\alpha}(x).$$
Instead of doing linear regression, just do interpolation at $r$ points.
Given an $n \times r$ matrix, find the $r \times r$ submatrix of largest volume (absolute value of the determinant) in it.
Use a greedy algorithm (we now know that this is D-optimality, dating back to at least 1972).
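One common form of such a greedy search can be sketched as follows; this is an illustrative version under stated assumptions, not necessarily the exact variant meant in these notes. Starting from $r$ rows picked by pivoting, it repeatedly swaps in the row that increases the volume the most, and stops when no swap gains more than a factor `tol`. The names `initial_rows` and `maxvol`, the tolerance, and the random test matrix are assumptions of this sketch.

```python
import numpy as np

def initial_rows(A):
    """Pick r distinct starting rows by Gaussian elimination with pivoting."""
    W = np.array(A, dtype=float)
    idx = []
    for j in range(W.shape[1]):
        i = int(np.argmax(np.abs(W[:, j])))          # largest entry in column j
        idx.append(i)
        W -= np.outer(W[:, j], W[i, :]) / W[i, j]    # zero out row i and column j
    return idx

def maxvol(A, tol=1.05, max_iters=1000):
    """Greedy row swaps; each swap multiplies |det A[idx]| by more than tol."""
    A = np.asarray(A, dtype=float)
    idx = initial_rows(A)
    for _ in range(max_iters):
        B = A @ np.linalg.inv(A[idx])                # rows of A in the basis A[idx]
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) <= tol:                      # no swap helps enough: stop
            break
        idx[j] = int(i)                              # det changes by factor B[i, j]
    return np.array(idx)

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
rows = maxvol(A)
B = A @ np.linalg.inv(A[rows])
print(rows, np.abs(B).max())
```

On exit all entries of `B` are bounded by `tol` in modulus, which is what makes the selected rows good interpolation points: every other row is expressed through them with small coefficients.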
Bad things about the canonical format:
You can compute Tucker by means of the SVD:
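A minimal sketch of one SVD-based construction (often called the higher-order SVD), assuming the target ranks are given: the factor for mode $k$ is taken from the leading left singular vectors of the mode-$k$ unfolding, and the core is obtained by projecting onto these factors. The helper names and the test tensor are assumptions of this example.

```python
import numpy as np

def unfold(T, k):
    """Mode-k unfolding: an n_k x (product of the other sizes) matrix."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def mode_product(T, M, k):
    """Multiply tensor T by matrix M along mode k."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, k)), 0, k)

def hosvd(T, ranks):
    """Tucker factors from SVDs of the unfoldings, then the projected core."""
    factors = []
    for k, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, k), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for k, U in enumerate(factors):
        core = mode_product(core, U.T, k)
    return core, factors

def tucker_to_full(core, factors):
    T = core
    for k, U in enumerate(factors):
        T = mode_product(T, U, k)
    return T

# Test tensor with exact multilinear ranks (2, 3, 2) (assumed data).
rng = np.random.default_rng(0)
mats = [rng.standard_normal((n, r)) for n, r in zip((6, 7, 8), (2, 3, 2))]
T = tucker_to_full(rng.standard_normal((2, 3, 2)), mats)
core, factors = hosvd(T, ranks=(2, 3, 2))
err = np.linalg.norm(tucker_to_full(core, factors) - T) / np.linalg.norm(T)
print(core.shape, err)
```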
The simplest construction separates one index at a time and gives the tensor-train (or matrix-product state) decomposition
$$A(i_1, \ldots, i_d) \approx G_1(i_1) \ldots G_d(i_d),$$
where $G_k(i_k)$ is an $r_{k-1} \times r_k$ matrix for any fixed $i_k$, with $r_0 = r_d = 1$.
Similar to Hidden Markov models
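The "one index at a time" construction of the tensor-train format can be sketched with successive SVDs. This is a minimal version under the assumption that the full tensor fits in memory; the relative truncation threshold `eps` and the test tensor $\sin(i_1 + \ldots + i_4)$, whose TT-ranks are 2, are choices made for this example. Each core is stored as an array of shape $(r_{k-1}, n_k, r_k)$.

```python
import numpy as np

def tt_svd(T, eps=1e-10):
    """Split off one index at a time with a truncated SVD."""
    shape, cores, r_prev = T.shape, [], 1
    C = np.asarray(T, dtype=float)
    for n_k in shape[:-1]:
        C = C.reshape(r_prev * n_k, -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = max(1, int(np.sum(s > eps * s[0])))      # numerical rank
        cores.append(U[:, :r].reshape(r_prev, n_k, r))
        C = s[:r, None] * Vt[:r]                     # remainder, still to be split
        r_prev = r
    cores.append(C.reshape(r_prev, shape[-1], 1))
    return cores

def tt_element(cores, index):
    """A(i_1, ..., i_d) as a product of the matrices G_k(i_k)."""
    v = np.ones((1, 1))
    for G, i in zip(cores, index):
        v = v @ G[:, i, :]
    return v[0, 0]

n = 6
T = np.sin(np.indices((n, n, n, n)).sum(axis=0))     # assumed test tensor
cores = tt_svd(T)
ranks = [G.shape[2] for G in cores]
value = tt_element(cores, (1, 2, 3, 4))
print(ranks, value, T[1, 2, 3, 4])
```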