PAUL: Python Active-subspaces Utility Library

Tutorial 3: Bounded inputs and given data

This iPython notebook shows many of the features of the Python Active-subspaces Utility Library (PAUL) for working with functions of several variables. We examine a quadratic function of 3 variables, and we approximate it by a function of 1 variable and 2 variables. We assume we are given a set of samples of the function's inputs and corresponding outputs. If the gradients are also given, we can use them. If the gradients are not given, we estimate them with local linear models from neighboring subsets of the input/output pairs.

The variables are each assumed bounded between 0 and 1, and the space of variables is equipped with a uniform weight function. We must normalize the inputs to the interval [-1,1] for analysis.

First import the library.


In [1]:
import numpy as np
import active_subspaces as asub
import matplotlib.pyplot as plt

Assume the inputs, outputs, and gradients are given as a text file. The first three columns are contain samples of the three-dimensional inputs. The fourth column contains the corresponding function values. The last three columns contain the corresponding gradients.


In [2]:
# Load the data from a text file.
data = np.loadtxt('inputs_outputs.txt')

# The inputs are the first three columns. 
X0 = data[:,:3]

# The corresponding outputs are in the fourth column. 
# Shape the outputs as a column vector.
f = data[:,3].reshape((data.shape[0],1))

# The gradients are the last three columns.
df = data[:,4:]

The inputs to the simulation are each between 0 and 1. The arrays lb and ub contain the lower and upper bounds, respectively. We use the BoundedNormalizer class to normalize the variables to the interval [-1,1].


In [3]:
lb, ub = np.zeros((1,3)), np.ones((1,3))
bn = asub.utils.misc.BoundedNormalizer(lb, ub)
X = bn.normalize(X0)

QUICK START: Using the wrapper interfaces

What follows is the simplest way to construct and evaluate a model that uses active subspaces to reduce the dimension. The class ActiveSubspaceReducedModel contains many convenient functions for evaluating the model's prediction, estimating its average (i.e., integrating against the uniform weight function), and estimating the model's minimum.

The function build_from_data takes the dimension m of the inputs, the interface, an optional argument dfun for the gradient interface, an optional argument avdim for the dimension of the active subspace, and a Boolean variable that tells the model that the inputs are bounded.

If df is not specified, the code estimates the gradients with local linear models using the given inputs X and outputs f. If avdim is not specified, the code uses a simple heuristic based on the eigenvalue gaps to choose a dimension for the active subspace.

After estimating the active subspace, build_from_data constructs a radial basis response surface with a quadratic polynomial using the given inputs and outputs. The response surface is called by the class's predict function, which returns the prediction and the gradient of the prediction.

The class's average function estimates the average using a specially designed quadrature rule on the active variables applied to the response surface. The argument to the function average is the total number of points in the active variable quadrature rule.

The class's minimum function wraps the scipy.optimize sequential least-squares quadratic program (SLSQP) method applied to the response surface on the active variables. It then uses a heuristic for finding the minimum over the inactive variables.


In [4]:
# Instantiate a wrapper class for the active subspace utilities.
m = 3
bounded_inputs = True
model = asub.base.ActiveSubspaceReducedModel(m, bounded_inputs)

# Build the active subspace-enabled model using the interfaces.
model.build_from_data(X, f, df=df, avdim=1)

# Compute the model's prediction and gradient at a random input point.
x = np.random.uniform(-1.0, 1.0, size=(1,m))
fx, dfx = model.predict(x, compgrad=True)
print 'At the point x = ({:4.2f}, {:4.2f}, {:4.2f})...'.format(
    x[0,1], x[0,1], x[0,2])
print '\tthe estimate is {:4.2f},'.format(fx[0,0])
print '\tand the estimated gradient is ({:4.2f}, {:4.2f}, {:4.2f}).\n'.format(
    dfx[0,0], dfx[0,1], dfx[0,2])

# Estimate the average of the model.
avavg = model.average(20)
print 'The estimated average is {:4.2f}.'.format(avavg[0])

# Estimate the minimum of the model and its minimizer.
avmin = model.minimum()
print 'The estimated minimum is {:4.2f} at x=({:4.2f}, {:4.2f}, {:4.2f}).'.format(
    avmin[0], avmin[1][0,0], avmin[1][0,1], avmin[1][0,2])


The dimension of the active subspace is 1.
At the point x = (0.08, 0.08, -0.12)...
	the estimate is 0.11,
	and the estimated gradient is (-0.19, 0.32, 0.21).

The estimated average is 0.18.
The estimated minimum is 0.02 at x=(-0.00, 0.00, 0.00).

Let's dig deeper into the capabilities of the library. The functions exposed to the user allow greater control of the approximations.

Estimating the subspaces and the dimension of the active subspace

Next, we instantiate a Subspaces object and feed the gradient samples to the object's compute function, which estimates the active and inactive subspaces. This estimation uses a nonparametric bootstrap to estimate variability in the eigenvalue and subspace estimates. The plotting utilities eigenvalues and subspace_errors show, respectively, the estimated eigenvalues with boostrap ranges and the estimated error in the active subspace with bootstrap ranges.


In [5]:
# Instantiate the Subspaces object.
ss = asub.subspaces.Subspaces()

# Use the gradient samples to compute the eigenvectors that define the active subspace. 
ss.compute(df)

# Plot the estimated eigenvalues and errors in the estimated active subspace for varying dimension.
asub.utils.plotters.eigenvalues(ss.eigenvalues, e_br=ss.e_br)
asub.utils.plotters.subspace_errors(ss.sub_br)


Those plots can help the user determine the appropriate dimension for the active subspace. Calling Subspaces's function partition explicitly sets the active subspace's dimension. The class BoundedActiveVariableDomain takes a Subspaces object in its constructor. It then determines the vertices of the n-dimensional polytope (i.e., the zonotope) that contains the range of the active variables.

The class BoundedActiveVariableMap takes an ActiveVariableDomain in its constructor. It offers functions forward and inverse for going between the original variables and the active variables. forward returns a value for the active variables and inactive variables as tuple given a value for the original variables. inverse returns several values of the original variables that map to the given value of the active variables.


In [6]:
# Define the active subspace to be one-dimensional.
ss.partition(1)

# Instantiate the domain of functions of the active variables.
avdom = asub.domains.BoundedActiveVariableDomain(ss)

# Instantiate a map between the active variables and the original variables.
avmap = asub.domains.BoundedActiveVariableMap(avdom)

The plotting utility sufficient_summary is an incredibly useful tool for verifying the quality of the model when the dimension of the active subspace is 1 or 2. It plots function values at particular inputs against the corresponding values of the active variables. If the plot shows a strong trend, then the active subspace-enabled model may be a good model.

The next sequence uses the BoundedActiveVariableMap function forward to compute the values for the active variables corresponding to the given inputs. The plot then shows the relationship between the active variables and the function output.


In [7]:
# Compute the value of the active variables for each sampled input.
Y = avmap.forward(X)[0]

# Plot the outputs versus the value of the active variables. 
asub.utils.plotters.sufficient_summary(Y, f)


The library contains tools for building multivariate, least-squares-fit polynomial approximation models. We can build a polynomial response surface on the active varibles and plot it next to the training data for comparison.


In [8]:
# Instantiate a PolynomialApproximation object of degree 2.
pr = asub.utils.response_surfaces.PolynomialApproximation(N=2)

# Fit the response surface on the active variables. 
pr.train(Y, f)

# Get the upper and lower bounds of the 1d active variable
a, b = avdom.vertY[0,0], avdom.vertY[1,0]

# Evaluate the response surface over a grid of 200 points.
Y2 = np.linspace(a, b, num=200).reshape((200,1))
fY2 = pr.predict(Y2)[0]

# Plot the response surface at its training data. 
plt.figure(figsize=(7,7))
plt.plot(Y, f, 'bo', Y2, fY2, 'r-')
plt.grid(True)
plt.xlabel('Active variable')
plt.ylabel('Output')
plt.show()


Response surfaces

The library contains a class ActiveSubspaceResponseSurface, whose constructor takes an ActiveVariableMap. We can train (or fit) the response surface by giving it inputs X and outputs f. The radial basis response surface treats the eigenvalues from the active subspace analysis to determine characteristic length scales for each variable in the radial basis.

The response surface's predict function takes a set of inputs and returns the response surface's prediction and gradient. estimated variance.


In [9]:
# Instantiate the ActiveSubspaceResponseSurface.
asrs = asub.response_surfaces.ActiveSubspaceResponseSurface(avmap)

# Fit the response surface with 21 points in the design.
asrs.train_with_data(X, f)

# Randomly sample the inputs.
x = np.random.uniform(-1.0, 1.0, size=(1,m))

# Compute the corresponding value of the active variable.
y = avmap.forward(x)[0][0,0]

# Evaluate the prediction, the gradient, and the variance. 
fx, dfx = asrs.predict(x, compgrad=True)
print 'At the point x = ({:4.2f}, {:4.2f}, {:4.2f}) and y = {:4.2f}...'.format(
    x[0,1], x[0,1], x[0,2], y)
print '\tthe estimate is {:4.2f},'.format(fx[0,0])
print '\tand the estimated gradient is ({:4.2f}, {:4.2f}, {:4.2f}).'.format(
    dfx[0,0], dfx[0,1], dfx[0,2])


At the point x = (-0.58, -0.58, 0.36) and y = 0.28...
	the estimate is 0.06,
	and the estimated gradient is (0.12, -0.20, -0.14).

Integrating

We can exploit the active subspace to efficiently integrate a function of several variables. The function av_integrate takes the ActiveSubspaceResponseSurface, the ActiveVariableMap, and the number of quadrature nodes in the active variable.


In [10]:
avavg = asub.integrals.av_integrate(asrs, avmap, 50)
print 'The estimated average is {:4.2f}.'.format(avavg)


The estimated average is 0.18.

Minimizing

The library contains utilities that use the response surface to estimate the function's minimum. The minimize function uses sequential least-squares quadratic program (SLSQP) solver from the scipy.optimize package to estimate the minimizer of the response surface over the space of active variables. Then it uses a global quadratic function on the inactive variables to estimate the function's minimum over the inactive variables with the active variables fixed at the previously computed minimizer.


In [11]:
avmin = asub.optimizers.minimize(asrs, X, f)
print 'The estimated minimum is {:4.2f} at x=({:4.2f}, {:4.2f}, {:4.2f}).'.format(
    avmin[1], avmin[0][0,0], avmin[0][0,1], avmin[0][0,2])


The estimated minimum is 0.02 at x=(-0.00, 0.00, 0.00).

Increase the dimension of the active subspace and repeat

Next, we manually set the dimension of the active subspace to 2 and repeat the computations. Note that the ActiveVariableMap has the ActiveVariableDomain, which has the Subspaces object. Thus, changing the active subspace dimension with the Subspaces.partition function automatically changes the relationship between the active variables and the original variables. For example, the function ActiveVariableMap.forward now returns a vector of dimension 2 for every point in the input space you give it.

We can make a sufficient summary plot in 1 and 2 dimensions. The function sufficient_summary will make both the 1d and the 2d sufficient summary plot (where the latter is a scatter plot with color corresponding to the output) when the input vectors are 2d.


In [12]:
# Manually set the dimension of the active subspace to 2.
ss.partition(2)

# Recompute the active variable range.
avdom.compute_boundary()

# Compute the value of the active variables for samples of the original variables.
Y3 = avmap.forward(X)[0]

# Show the 1d and 2d sufficient summary plots. 
asub.utils.plotters.sufficient_summary(Y3, f)


We can construct and plot the polynomial response surface on the two active variables. In principle, we can evaluate the quadratic polynomial of the active variables at points outside the range of the active variable. But this is bad practice, since those points fall outside the range of the active variables. Therefore, we use the constraints from the zonotope to throw away points from an initial grid. The remaining points are guaranteed to have corresponding points in the original variables.


In [13]:
# Instantiate a new PolynomialApproximation object.
pr2 = asub.utils.response_surfaces.PolynomialApproximation(N=2)

# Fit the response surface with the same outputs and the 2d active variable samples.
pr2.train(Y3, f)

# Get all the points on a grid.
yy1, yy2 = np.meshgrid(np.arange(-3.0, 3.0, 0.1), np.arange(-3.0, 3.0, 0.1))
Y4 = np.hstack((yy1.reshape((yy1.size,1)), yy2.reshape((yy2.size,1))))

# Check to see if each point falls inside the zonotope.
ypoints = []
for y in Y4:
    if np.all(avdom.constraints['fun'](y.reshape((2,1))) >= 0):
        ypoints.append(y)
Y5 = np.array(ypoints)

# Evaluate the response surface at each point in the zonotope.
fY5 = pr2.predict(Y5)[0]

# Scatter plot of the response surface. 
y1, y2 = Y5[:,0], Y5[:,1]
plt.figure(figsize=(7,7))
plt.scatter(y1, y2, c=fY5, s=100.0, vmin=np.min(fY5), vmax=np.max(fY5))
plt.xlabel('Active variable 1')
plt.ylabel('Active variable 2')
ymin = 1.1*np.amin([np.amin(y1), np.amin(y2)])
ymax = 1.1*np.amax([np.amax(y1), np.amax(y2)])
plt.axis([ymin, ymax, ymin, ymax])
plt.axes().set_aspect('equal')
plt.grid(True)
plt.colorbar()


Out[13]:
<matplotlib.colorbar.Colorbar instance at 0x10cfd5a70>

Construct and train a new ActiveSubspaceResponseSurface on the two active variables. By default, this constructor uses a radial basis approximations with a quadratic mean. Then repeat the response surface predictions, the estimated average, and the estimated minimum.


In [14]:
# Instantiate a new ActiveSubspaceResponseSurface with the 2d active variables.
asrs2 = asub.response_surfaces.ActiveSubspaceResponseSurface(avmap)

# Train the response surface on a maximin design on the zonotope.
asrs2.train_with_data(X, f)

# Evaluate the response surface prediction.
x = np.random.uniform(-1.0, 1.0, size=(1,m))
y = avmap.forward(x)[0][0,0]
fx, dfx = asrs2.predict(x, compgrad=True)
print 'At the point x = ({:4.2f}, {:4.2f}, {:4.2f}) and y = {:4.2f}...'.format(
    x[0,1], x[0,1], x[0,2], y)
print '\tthe estimate is {:4.2f},'.format(fx[0,0])
print '\tand the estimated gradient is ({:4.2f}, {:4.2f}, {:4.2f}).\n'.format(
    dfx[0,0], dfx[0,1], dfx[0,2])

# Estimate the function's average.
avavg = asub.integrals.av_integrate(asrs2, avmap, 50)
print 'The estimated average is {:4.2f}.'.format(avavg)

# Estimate the function's minimum.
avmin = asub.optimizers.minimize(asrs2, X, f)
print 'The estimated minimum is {:4.2f} at x=({:4.2f}, {:4.2f}, {:4.2f}).'.format(
    avmin[1], avmin[0][0,0], avmin[0][0,1], avmin[0][0,2])


At the point x = (-0.50, -0.50, 0.82) and y = 0.03...
	the estimate is 0.02,
	and the estimated gradient is (0.05, -0.00, 0.04).

The estimated average is 0.20.
The estimated minimum is 0.00 at x=(-0.00, 0.00, 0.00).