In [57]:
include("../optimizers/linesearch.jl")
include("../utils/functions.jl")
using Gadfly
The goal of this notebook is to show how to convert an optimization problem with linear equality constraints into an unconstrained optimization problem.
The problem of interest is $$ \min_x f(x) \text{ subject to } Ax = b,$$ where $A \in \mathbb{R}^{m \times n}$ is the matrix of constraints, $x \in \mathbb{R}^n$ is a vector, and $b \in \mathbb{R}^m$ is a known vector.
For any particular solution $x_0$ of $Ax = b$, every other solution $x$ can be written as $x = x_0 + v$, where $v \in \ker(A)$. Thus the constrained minimization problem reduces to $$\min_{v \in \ker(A)} f(x_0 + v),$$ which is an unconstrained optimization problem over the null space of $A$.
Then we can use our favorite unconstrained method to solve it, e.g., a Newton line search.
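Concretely, if the columns of $B \in \mathbb{R}^{n \times k}$ form a basis for $\ker(A)$, then $v = By$ for some $y \in \mathbb{R}^k$, and we can minimize $g(y) = f(x_0 + By)$ over $y$ with no constraints at all. The chain rule gives the reduced gradient and Hessian that Newton's method needs: $$\nabla g(y) = B^\top \nabla f(x_0 + By), \qquad \nabla^2 g(y) = B^\top \nabla^2 f(x_0 + By)\, B.$$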
In [109]:
"""
    optimize_linear_constraints(f, A, f_g, f_h, yinit=0; b=zeros(size(A, 1)))

Convert the constrained problem min f(x) s.t. Ax = b to the unconstrained
problem min g(y) = f(B*y + x0), where x0 is a particular solution of Ax = b
and the columns of B form a basis for the kernel of A.
Solves the reduced problem with Newton's line search algorithm.

f:     constrained objective
A:     constraint matrix
f_g:   gradient of f
f_h:   Hessian of f
yinit: initialize y as a vector with all entries equal to yinit
b:     right-hand side of the constraints (defaults to the zero vector)
"""
function optimize_linear_constraints(f, A, f_g, f_h, yinit=0; b=zeros(size(A, 1)))
    x0 = A \ b           # particular (least-norm) solution of Ax = b
    B = nullspace(A)     # orthonormal basis for ker(A)
    y0 = ones(size(B, 2)) .* yinit
    # Reduced objective, gradient, and Hessian via the chain rule
    g(y) = f(B*y + x0)
    g_g(y) = B' * f_g(B*y + x0)
    g_h(y) = B' * f_h(B*y + x0) * B
    yvals = line_search(g, y0, g_g, g_h, "newton", 5000)
    xvals = [B*y + x0 for y in yvals]
    # Recover Lagrange multiplier estimates from stationarity: grad f(x) = A' * lambda
    lambdas = [A' \ f_g(x) for x in xvals]
    return yvals, xvals, lambdas
end
Out[109]:
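The `lambdas` returned above are Lagrange multiplier estimates: at a stationary point of the equality-constrained problem, $\nabla f(x^\star) = A^\top \lambda^\star$, so at each iterate we solve $A^\top \lambda = \nabla f(x)$ in the least-squares sense, which is exactly what `A' \ f_g(x)` computes.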
Suppose we have the problem $$ \min_x f(x) \text{ subject to } \sum_{i=1}^n x_i = 0 \text{ and } \sum_{i=1}^n (-1)^i x_i = 0,$$ where $f$ is the cute function: $$ f(x) = \sum_{i=1}^{n-4} \left[ (-4x_i+3)^2 + (x_i^2 + 2x_{i+1}^2 + 3x_{i+2}^2 + 4 x_{i+3}^2 + 5x_n^2)^2 \right]. $$
Suppose $n$ is even. We can express these constraints using the matrix \begin{align} A = \begin{bmatrix} 1 & 1 & 1 & \ldots & 1 \\ -1 & 1 & -1 & \ldots & 1 \end{bmatrix}, \end{align} where the number of columns is $n$, the dimension of $x$, and the second row alternates $-1, 1, \ldots$ to encode $\sum_{i=1}^n (-1)^i x_i = 0$.
In [110]:
function generate_constraint_matrix(n)
    # First row is all ones; second row alternates -1, 1, ... starting with -1
    A = ones(2, n)
    for i in 1:n
        if i % 2 == 1
            A[2, i] = -1
        end
    end
    return A
end
Out[110]:
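As a quick sanity check (a small illustrative snippet, not part of the original run), for $n = 4$ the matrix has an all-ones first row and an alternating second row, and a vector of the form $(1, 1, -1, -1)$ satisfies both constraints:
A4 = generate_constraint_matrix(4)   # [1 1 1 1; -1 1 -1 1]
x4 = [1.0, 1.0, -1.0, -1.0]          # both the sum and the alternating sum vanish
A4 * x4                              # == [0.0, 0.0]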
In [141]:
A = generate_constraint_matrix(30)
b = zeros(2)
x0 = A \ b; B = nullspace(A)
f = cute
f_g = cute_g
f_h = cute_h
# Rebuild the reduced functions at top level so we can evaluate them on the iterates below
g(y) = f(B*y + x0)
g_g(y) = B' * f_g(B*y + x0)
g_h(y) = B' * f_h(B*y + x0) * B
Out[141]:
In [142]:
yvals, xvals, lambdas = optimize_linear_constraints(f, A, f_g, f_h, 0);
In [143]:
niters = length(xvals)
fx = [f(x) for x in xvals]
Gadfly.plot(x=1:niters, y=fx, Geom.line,
Guide.xlabel("iteration"), Guide.ylabel("f(x)"), Guide.title("Value of function"),
Scale.x_log10, Scale.y_log10)
Out[143]:
In [144]:
grads = [norm(g_g(y), 2) for y in yvals]
ratios = grads[2:end] ./ grads[1:end-1]
Gadfly.plot(x=1:niters-1, y=ratios, Geom.line,
    Guide.xlabel("iteration"), Guide.ylabel("||g_g(y)|| ratio"), Guide.title("Gradient norm ratios"),
    Scale.x_log10, Scale.y_log10)
Out[144]:
In [145]:
Gadfly.plot(x=1:niters, y=grads, Geom.line,
    Guide.xlabel("iteration"), Guide.ylabel("||g_g(y)||"), Guide.title("Gradient norm"),
    Scale.x_log10, Scale.y_log10)
Out[145]:
In [108]:
# Lagrange multipliers
lambdas[end]
Out[108]:
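Before moving on, it is worth verifying feasibility and stationarity of the final iterate (an illustrative check, not in the original notebook): $\|Ax\|$ should be at roundoff level, since every iterate lies in $x_0 + \ker(A)$ by construction, and the KKT residual $\|\nabla f(x) - A^\top \lambda\|$ should be small if Newton converged.
x_final = xvals[end]
println(norm(A * x_final))                        # constraint residual, ~ machine precision
println(norm(f_g(x_final) - A' * lambdas[end]))   # stationarity (KKT) residual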
In [165]:
A = generate_constraint_matrix(100)
x0 = A \ b; B = nullspace(A)   # reuses b = zeros(2) from the previous example
f = cute
f_g = cute_g
f_h = cute_h
g(y) = f(B*y + x0)
g_g(y) = B' * f_g(B*y + x0)
g_h(y) = B' * f_h(B*y + x0) * B
yvals, xvals, lambdas = optimize_linear_constraints(f, A, f_g, f_h, 0);
# Lagrange multipliers
println(lambdas[end])
# Final value
println(f(xvals[end]))
In [166]:
niters = length(xvals)
fx = [f(x) for x in xvals]
#gy = [g(y) for y in yvals]
Gadfly.plot(x=1:niters, y=fx, Geom.line,
Guide.xlabel("iteration"), Guide.ylabel("f(x)"), Guide.title("Value of function"),
Scale.x_log10, Scale.y_log10)
Out[166]:
In [167]:
grads = [norm(g_g(y), 2) for y in yvals]
ratios = grads[2:end] ./ grads[1:end-1]
Gadfly.plot(x=1:niters-1, y=ratios, Geom.line,
    Guide.xlabel("iteration"), Guide.ylabel("||g_g(y)|| ratio"), Guide.title("Gradient norm ratios"),
    Scale.x_log10, Scale.y_log10)
Out[167]:
In [168]:
Gadfly.plot(x=1:niters, y=grads, Geom.line,
    Guide.xlabel("iteration"), Guide.ylabel("||g_g(y)||"), Guide.title("Gradient norm"),
    Scale.x_log10, Scale.y_log10)
Out[168]: