The library scikit-learn is a great machine-learning toolkit that provides a large collection of regression methods.
By default, chaospy only supports traditional least-squares regression, but it is also designed to work together with the various regression functions provided by scikit-learn.
Because scikit-learn isn't a required dependency, you might need to install it first, e.g. with pip install scikit-learn. When that is done, it should be importable:
In [1]:
import sklearn
As an example to follow, consider the following artificial case:
In [2]:
import numpy
from matplotlib import pyplot
import chaospy
%matplotlib inline
samples = numpy.linspace(0, 5, 50)
numpy.random.seed(1000)
noise = chaospy.Normal(0, 0.1).sample(50)
evals = numpy.sin(samples) + noise
pyplot.scatter(samples, evals)
pyplot.show()
In [3]:
import chaospy
q0 = chaospy.variable()
expansion = chaospy.polynomial([1, q0, q0**2, q0**3])
fitted_polynomial = chaospy.fit_regression(
    expansion, samples, evals)
pyplot.scatter(samples, evals)
pyplot.plot(samples, fitted_polynomial(samples))
pyplot.show()
fitted_polynomial.round(4)
Out[3]:
Least squares regression is also supported by sklearn, so it is possible to get the same result using the LinearRegression model. For example:
In [4]:
from sklearn.linear_model import LinearRegression
model = LinearRegression(fit_intercept=False)
fitted_polynomial = chaospy.fit_regression(
    expansion, samples, evals, model=model)
pyplot.scatter(samples, evals)
pyplot.plot(samples, fitted_polynomial(samples))
pyplot.show()
fitted_polynomial.round(4)
Out[4]:
It is important to note that sklearn often performs extra operations that may interfere with its compatibility with chaospy. Here fit_intercept=False ensures that an extra constant column isn't added needlessly. An error will be raised if this is forgotten.
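For instance, the failure mode can be demonstrated with a small sketch (the exact exception type raised by chaospy.fit_regression may vary between versions):
# Sketch: forgetting fit_intercept=False should trigger an error, since
# sklearn would add a constant column that duplicates the constant term
# already present in the expansion.
bad_model = LinearRegression()  # fit_intercept defaults to True
try:
    chaospy.fit_regression(expansion, samples, evals, model=bad_model)
except Exception as error:
    print(f"raised {type(error).__name__}: {error}")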
While least squares regression is sufficient in most cases, it is not always. For those deviating cases, scikit-learn provides a set of alternative methods. Even though chaospy doesn't differentiate between single-dimensional and multi-dimensional responses, scikit-learn does.
The methods that support single dimensional responses are:
- least squares -- Simple $L_2$ regression without any extra features.
- elastic net -- $L_2$ regression with both $L_1$ and $L_2$ regularization terms.
- lasso -- $L_2$ regression with an extra $L_1$ regularization term, and a preference for fewer non-zero terms.
- lasso lars -- An implementation of lasso meant for high dimensional data.
- orthogonal matching pursuit -- $L_2$ regression with an enforced number of non-zero terms.
- ridge -- $L_2$ regression with an $L_2$ regularization term.
- bayesian ridge -- Same as ridge, but uses Bayesian probability to let the data estimate the complexity parameter.
- auto relevant determination -- Same as bayesian ridge, but also favors fewer non-zero terms.
In [5]:
from sklearn import linear_model as lm
kws = {"fit_intercept": False}
univariate_models = {
    "least squares": lm.LinearRegression(**kws),
    "elastic net": lm.ElasticNet(alpha=0.1, **kws),
    "lasso": lm.Lasso(alpha=0.1, **kws),
    "lasso lars": lm.LassoLars(alpha=0.1, **kws),
    "orthogonal matching pursuit":
        lm.OrthogonalMatchingPursuit(n_nonzero_coefs=3, **kws),
    "ridge": lm.Ridge(alpha=0.1, **kws),
    "bayesian ridge": lm.BayesianRidge(**kws),
    "auto relevant determination": lm.ARDRegression(**kws),
}
Again, as the polynomial expansion already addresses the constant term, it is important to remember to include fit_intercept=False for each model.
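As a quick sanity check (just a sketch, relying on scikit-learn's standard fit_intercept attribute), one can verify the setting on every model before fitting:
# Sketch: confirm each model has the intercept disabled, so no redundant
# constant column is added alongside the expansion's own constant term.
for label, model in univariate_models.items():
    assert not model.fit_intercept, f"{label}: set fit_intercept=False"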
We can then create a fit for each of the univariate models:
In [6]:
for label, model in univariate_models.items():
    fitted_polynomial = chaospy.fit_regression(
        expansion, samples, evals, model=model)
    pyplot.plot(samples, fitted_polynomial(samples), label=label)

pyplot.scatter(samples, evals)
pyplot.legend(loc="lower left")
pyplot.show()
This part of the tutorial uses the same example as in the example introduction. In other words:
In [7]:
from chaospy.example import (
    coordinates, exponential_model, distribution,
    error_mean, error_variance,
)
The methods that support multi-dimensional responses are:
- least squares -- Simple $L_2$ regression without any extra features.
- elastic net -- $L_2$ regression with both $L_1$ and $L_2$ regularization terms.
- lasso -- $L_2$ regression with an extra $L_1$ regularization term, and a preference for fewer non-zero terms.
- lasso lars -- An implementation of lasso meant for high dimensional data.
- orthogonal matching pursuit -- $L_2$ regression with an enforced number of non-zero terms.
- ridge -- $L_2$ regression with an $L_2$ regularization term.
In [8]:
multivariate_models = {
    "least squares": lm.LinearRegression(**kws),
    "elastic net": lm.MultiTaskElasticNet(alpha=0.2, **kws),
    "lasso": lm.MultiTaskLasso(alpha=0.2, **kws),
    "lasso lars": lm.LassoLars(alpha=0.2, **kws),
    "orthogonal matching pursuit": lm.OrthogonalMatchingPursuit(
        n_nonzero_coefs=3, **kws),
    "ridge": lm.Ridge(alpha=0.2, **kws),
}
To illustrate the differences between the methods, we perform a simple error analysis:
In [9]:
import warnings
warnings.filterwarnings("ignore", category=RuntimeWarning)

expansion = chaospy.orth_ttr(2, distribution)
samples = distribution.sample(50)
evals = numpy.array([exponential_model(sample)
                     for sample in samples.T])

for label, model in multivariate_models.items():
    fitted_polynomial, coeffs = chaospy.fit_regression(
        expansion, samples, evals, model=model, retall=True)
    self_evals = fitted_polynomial(*samples)
    error_mean_ = error_mean(chaospy.E(
        fitted_polynomial, distribution))
    error_var_ = error_variance(chaospy.Var(
        fitted_polynomial, distribution))
    count_non_zero = numpy.sum(numpy.any(coeffs, axis=-1))
    print(f"{label:<30} {error_mean_:.5f} " +
          f"{error_var_:.5f} {count_non_zero}")
It is worth noting that since some of the methods remove coefficients, they can function at higher polynomial orders than the saturated methods.
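As a rough illustration of this point, here is a sketch (the polynomial order and alpha value are arbitrary choices) that fits a sparsity-inducing model on a higher-order expansion and counts how many terms survive:
# Sketch: sparse regression on a higher-order expansion; most of the
# extra terms are expected to be pruned away by the L1 penalty.
expansion4 = chaospy.orth_ttr(4, distribution)
sparse_model = lm.MultiTaskLasso(alpha=0.2, fit_intercept=False)
fitted_sparse, coeffs = chaospy.fit_regression(
    expansion4, samples, evals, model=sparse_model, retall=True)
print("non-zero terms:", numpy.sum(numpy.any(coeffs, axis=-1)),
      "out of", len(expansion4))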