The basic components
In [1]:
import numpy as np
objective = np.poly1d([1.3, 4.0, 0.6])
print objective
In [2]:
import scipy.optimize as opt
x_ = opt.fmin(objective, [3])
print "solved: x={}".format(x_)
In [3]:
%matplotlib inline
In [4]:
x = np.linspace(-4,1,101.)
import matplotlib.pylab as mpl
mpl.plot(x, objective(x))
mpl.plot(x_, objective(x_), 'ro')
Out[4]:
Additional components
In [5]:
import scipy.special as ss
import scipy.optimize as opt
import numpy as np
import matplotlib.pylab as mpl
x = np.linspace(2, 7, 200)
# 1st order Bessel
j1x = ss.j1(x)
mpl.plot(x, j1x)
# use scipy.optimize's more modern "results object" interface
result = opt.minimize_scalar(ss.j1, method="bounded", bounds=[2, 4])
j1_min = ss.j1(result.x)
mpl.plot(result.x, j1_min,'ro')
Out[5]:
In [6]:
import mystic.models as models
print(models.rosen.__doc__)
In [9]:
!mystic_model_plotter.py mystic.models.rosen -f -d -x 1 -b "-3:3:.1, -1:5:.1, 1"
In [5]:
import mystic
mystic.model_plotter(mystic.models.rosen, fill=True, depth=True, scale=1, bounds="-3:3:.1, -1:5:.1, 1")
In [10]:
import scipy.optimize as opt
import numpy as np
# initial guess
x0 = [1.3, 1.6, -0.5, -1.8, 0.8]
result = opt.minimize(opt.rosen, x0)
print result.x
# number of function evaluations
print result.nfev
# again, but this time provide the derivative
result = opt.minimize(opt.rosen, x0, jac=opt.rosen_der)
print result.x
# number of function evaluations and derivative evaluations
print result.nfev, result.njev
print ''
# however, note for a different x0...
for i in range(5):
x0 = np.random.randint(-20,20,5)
result = opt.minimize(opt.rosen, x0, jac=opt.rosen_der)
print "{} @ {} evals".format(result.x, result.nfev)
$\psi(x) = f(x) + k*p(x)$
In [42]:
# http://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html#tutorial-sqlsp
'''
Maximize: f(x) = 2*x0*x1 + 2*x0 - x0**2 - 2*x1**2
Subject to: x0**3 - x1 == 0
x1 >= 1
'''
import numpy as np
def objective(x, sign=1.0):
return sign*(2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2)
def derivative(x, sign=1.0):
dfdx0 = sign*(-2*x[0] + 2*x[1] + 2)
dfdx1 = sign*(2*x[0] - 4*x[1])
return np.array([ dfdx0, dfdx1 ])
# unconstrained
result = opt.minimize(objective, [-1.0,1.0], args=(-1.0,),
jac=derivative, method='SLSQP', options={'disp': True})
print("unconstrained: {}".format(result.x))
cons = ({'type': 'eq',
'fun' : lambda x: np.array([x[0]**3 - x[1]]),
'jac' : lambda x: np.array([3.0*(x[0]**2.0), -1.0])},
{'type': 'ineq',
'fun' : lambda x: np.array([x[1] - 1]),
'jac' : lambda x: np.array([0.0, 1.0])})
# constrained
result = opt.minimize(objective, [-1.0,1.0], args=(-1.0,), jac=derivative,
constraints=cons, method='SLSQP', options={'disp': True})
print("constrained: {}".format(result.x))
Optimizer classifications
In [ ]:
# from scipy.optimize.minimize documentation
'''
**Unconstrained minimization**
Method *Nelder-Mead* uses the Simplex algorithm [1]_, [2]_. This
algorithm has been successful in many applications but other algorithms
using the first and/or second derivatives information might be preferred
for their better performances and robustness in general.
Method *Powell* is a modification of Powell's method [3]_, [4]_ which
is a conjugate direction method. It performs sequential one-dimensional
minimizations along each vector of the directions set (`direc` field in
`options` and `info`), which is updated at each iteration of the main
minimization loop. The function need not be differentiable, and no
derivatives are taken.
Method *CG* uses a nonlinear conjugate gradient algorithm by Polak and
Ribiere, a variant of the Fletcher-Reeves method described in [5]_ pp.
120-122. Only the first derivatives are used.
Method *BFGS* uses the quasi-Newton method of Broyden, Fletcher,
Goldfarb, and Shanno (BFGS) [5]_ pp. 136. It uses the first derivatives
only. BFGS has proven good performance even for non-smooth
optimizations. This method also returns an approximation of the Hessian
inverse, stored as `hess_inv` in the OptimizeResult object.
Method *Newton-CG* uses a Newton-CG algorithm [5]_ pp. 168 (also known
as the truncated Newton method). It uses a CG method to the compute the
search direction. See also *TNC* method for a box-constrained
minimization with a similar algorithm.
Method *Anneal* uses simulated annealing, which is a probabilistic
metaheuristic algorithm for global optimization. It uses no derivative
information from the function being optimized.
Method *dogleg* uses the dog-leg trust-region algorithm [5]_
for unconstrained minimization. This algorithm requires the gradient
and Hessian; furthermore the Hessian is required to be positive definite.
Method *trust-ncg* uses the Newton conjugate gradient trust-region
algorithm [5]_ for unconstrained minimization. This algorithm requires
the gradient and either the Hessian or a function that computes the
product of the Hessian with a given vector.
**Constrained minimization**
Method *L-BFGS-B* uses the L-BFGS-B algorithm [6]_, [7]_ for bound
constrained minimization.
Method *TNC* uses a truncated Newton algorithm [5]_, [8]_ to minimize a
function with variables subject to bounds. This algorithm uses
gradient information; it is also called Newton Conjugate-Gradient. It
differs from the *Newton-CG* method described above as it wraps a C
implementation and allows each variable to be given upper and lower
bounds.
Method *COBYLA* uses the Constrained Optimization BY Linear
Approximation (COBYLA) method [9]_, [10]_, [11]_. The algorithm is
based on linear approximations to the objective function and each
constraint. The method wraps a FORTRAN implementation of the algorithm.
Method *SLSQP* uses Sequential Least SQuares Programming to minimize a
function of several variables with any combination of bounds, equality
and inequality constraints. The method wraps the SLSQP Optimization
subroutine originally implemented by Dieter Kraft [12]_. Note that the
wrapper handles infinite values in bounds by converting them into large
floating values.
'''
In [49]:
import scipy.optimize as opt
# constrained: linear (i.e. A*x + b)
print opt.cobyla.fmin_cobyla
print opt.linprog
# constrained: quadratic programming (i.e. up to x**2)
print opt.fmin_slsqp
In [47]:
# http://cvxopt.org/examples/tutorial/lp.html
'''
minimize: f = 2*x0 + x1
subject to:
-x0 + x1 <= 1
x0 + x1 >= 2
x1 >= 0
x0 - 2*x1 <= 4
'''
import cvxopt as cvx
from cvxopt import solvers as cvx_solvers
A = cvx.matrix([ [-1.0, -1.0, 0.0, 1.0], [1.0, -1.0, -1.0, -2.0] ])
b = cvx.matrix([ 1.0, -2.0, 0.0, 4.0 ])
cost = cvx.matrix([ 2.0, 1.0 ])
sol = cvx_solvers.lp(cost, A, b)
print(sol['x'])
In [48]:
# http://cvxopt.org/examples/tutorial/qp.html
'''
minimize: f = 2*x1**2 + x2**2 + x1*x2 + x1 + x2
subject to:
x1 >= 0
x2 >= 0
x1 + x2 == 1
'''
import cvxopt as cvx
from cvxopt import solvers as cvx_solvers
Q = 2*cvx.matrix([ [2, .5], [.5, 1] ])
p = cvx.matrix([1.0, 1.0])
G = cvx.matrix([[-1.0,0.0],[0.0,-1.0]])
h = cvx.matrix([0.0,0.0])
A = cvx.matrix([1.0, 1.0], (1,2))
b = cvx.matrix(1.0)
sol = cvx_solvers.qp(Q, p, G, h, A, b)
print(sol['x'])
Notice how much nicer it is to see the optimizer "trajectory". Now, instead of a single number, we have the path the optimizer took. scipy.optimize
has a version of this, with options={'retall':True}
, which returns the solver trajectory.
EXERCISE: Solve the constrained programming problem by any of the means above.
Minimize: f = -1x[0] + 4x[1]
Subject to: -3x[0] + 1x[1] <= 6 1x[0] + 2x[1] <= 4 x[1] >= -3
where: -inf <= x[0] <= inf
In [50]:
import scipy.optimize as opt
# probabilstic solvers, that use random hopping/mutations
print opt.differential_evolution
print opt.basinhopping
print opt.anneal
In [12]:
import scipy.optimize as opt
# bounds instead of an initial guess
bounds = [(-10., 10)]*5
for i in range(10):
result = opt.differential_evolution(opt.rosen, bounds)
print result.x,
# number of function evaluations
print '@ {} evals'.format(result.nfev)
Other important special cases:
In [79]:
import scipy.optimize as opt
import scipy.stats as stats
import numpy as np
# Define the function to fit.
def function(x, a, b, f, phi):
result = a * np.exp(-b * np.sin(f * x + phi))
return result
# Create a noisy data set around the actual parameters
true_params = [3, 2, 1, np.pi/4]
print "target parameters: {}".format(true_params)
x = np.linspace(0, 2*np.pi, 25)
exact = function(x, *true_params)
noisy = exact + 0.3*stats.norm.rvs(size=len(x))
# Use curve_fit to estimate the function parameters from the noisy data.
initial_guess = [1,1,1,1]
estimated_params, err_est = opt.curve_fit(function, x, noisy, p0=initial_guess)
print "solved parameters: {}".format(estimated_params)
# err_est is an estimate of the covariance matrix of the estimates
print "covarance: {}".format(err_est.diagonal())
import matplotlib.pylab as mpl
mpl.plot(x, noisy, 'ro')
mpl.plot(x, function(x, *estimated_params))
Out[79]:
Typical uses
In [82]:
import numpy as np
import scipy.optimize as opt
def system(x,a,b,c):
x0, x1, x2 = x
eqs= [
3 * x0 - np.cos(x1*x2) + a, # == 0
x0**2 - 81*(x1+0.1)**2 + np.sin(x2) + b, # == 0
np.exp(-x0*x1) + 20*x2 + c # == 0
]
return eqs
# coefficients
a = -0.5
b = 1.06
c = (10 * np.pi - 3.0) / 3
# initial guess
x0 = [0.1, 0.1, -0.1]
# Solve the system of non-linear equations.
result = opt.root(system, x0, args=(a, b, c))
print "root:", result.x
print "solution:", result.fun
In [91]:
import numpy as np
import scipy.stats as stats
# Create clean data.
x = np.linspace(0, 4.0, 100)
y = 1.5 * np.exp(-0.2 * x) + 0.3
# Add a bit of noise.
noise = 0.1 * stats.norm.rvs(size=100)
noisy_y = y + noise
# Fit noisy data with a linear model.
linear_coef = np.polyfit(x, noisy_y, 1)
linear_poly = np.poly1d(linear_coef)
linear_y = linear_poly(x)
# Fit noisy data with a quadratic model.
quad_coef = np.polyfit(x, noisy_y, 2)
quad_poly = np.poly1d(quad_coef)
quad_y = quad_poly(x)
import matplotlib.pylab as mpl
mpl.plot(x, noisy_y, 'ro')
mpl.plot(x, linear_y)
mpl.plot(x, quad_y)
#mpl.plot(x, y)
Out[91]:
Standard diagnostic tools
EXERCISE: Use any of the solvers we've seen thus far to find the minimum of the zimmermann
function (i.e. use mystic.models.zimmermann
as the objective). Use the bounds suggested below, if your choice of solver allows it.
In [93]:
import mystic.models as models
print models.zimmermann.__doc__
EXERCISE: Do the same for the fosc3d
function found at mystic.models.fosc3d
, using the bounds suggested by the documentation, if your chosen solver accepts bounds or constraints.
More to ponder: what about high-dimenstional and nonlinear constraints?
Let's look at optimization "redesigned" in mystic...