Dimensionality of the problem: The scale of an optimization problem is pretty much set by its dimensionality, i.e. the number of scalar variables on which the search is performed
Optimizing convex functions is easy. Optimizing non-convex functions can be very hard
Optimizing smooth functions is easier
Noisy gradients: Many optimization methods rely on gradients of the objective function. If the gradient function is not given, gradients are computed numerically, which induces errors. In such a situation, even if the objective function is not noisy, a gradient-based optimization may be a noisy optimization.
Optimization under constraints. For example, $-1 < x_1 < 1$ (a bound-constraint sketch follows below)
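As a taste of the constrained case, simple bound constraints like the one above can be passed straight to scipy.optimize.minimize through its bounds argument. A minimal sketch, where the quadratic objective is only a placeholder chosen for this illustration:

import numpy as np
from scipy import optimize

def f(x):  # placeholder objective: unconstrained minimum at (1.5, 1.5), outside the box
    return (x[0] - 1.5) ** 2 + (x[1] - 1.5) ** 2

# L-BFGS-B handles simple box constraints such as -1 <= x_i <= 1
optimize.minimize(f, [0, 0], method="L-BFGS-B", bounds=[(-1, 1), (-1, 1)])

The constrained minimizer sits on the boundary of the box, near (1, 1), rather than at the unconstrained minimum.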
In [14]:
from scipy import optimize
import numpy as np
In [15]:
def f(x): # a smooth function with a single minimum at x = 0.7
    return -np.exp(-(x - 0.7) ** 2)
In [16]:
optimize.minimize_scalar(f)
Out[16]:
In [17]:
def f(x): # The Rosenbrock function
    return .5 * (1 - x[0]) ** 2 + (x[1] - x[0]**2)**2
In [18]:
optimize.minimize(f, [2, -1], method="CG")
Out[18]:
Gradient methods need the Jacobian (gradient) of the function. They can compute it numerically, but they will perform better if you can pass them the gradient.
In [19]:
def jacobian(x): # analytical gradient of the Rosenbrock function above
    return np.array((-2*.5*(1-x[0]) - 4*x[0]*(x[1] - x[0]**2), 2*(x[1] - x[0]**2)))
In [20]:
optimize.minimize(f, [2, -1], method="CG", jac=jacobian)
Out[20]:
Note that the function has been evaluated only 27 times, compared to 108 times without the gradient.
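Both counts are reported on the OptimizeResult object that minimize returns, so the comparison above can be reproduced directly. A quick check, re-running the two calls and storing them under names introduced here (res_num, res_jac):

res_num = optimize.minimize(f, [2, -1], method="CG")                # numerical gradient
res_jac = optimize.minimize(f, [2, -1], method="CG", jac=jacobian)  # analytical gradient
print(res_num.nfev, res_jac.nfev)  # objective evaluations without / with the gradient
print(res_jac.njev)                # gradient evaluations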
In [21]:
def f(x): # The Rosenbrock function
    return .5 * (1 - x[0]) ** 2 + (x[1] - x[0]**2)**2
In [22]:
def jacobian(x):
    return np.array((-2*.5*(1-x[0]) - 4*x[0]*(x[1] - x[0]**2), 2*(x[1] - x[0]**2)))
In [23]:
optimize.minimize(f, [2, -1], method="Newton-CG", jac=jacobian)
Out[23]:
Newton-CG needs fewer function evaluations but more gradient evaluations, as it uses them to approximate the Hessian. Let's compute the Hessian and pass it to the algorithm.
In [24]:
def hessian(x): # analytical Hessian of the Rosenbrock function
    return np.array(((1 - 4*x[1] + 12*x[0]**2, -4*x[0]), (-4*x[0], 2)))
In [25]:
optimize.minimize(f, [2, -1], method="Newton-CG", jac=jacobian, hess=hessian)
Out[25]:
In [26]:
optimize.brute
Out[26]:
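optimize.brute is the brute-force option: it evaluates the objective on a regular grid over a given box and keeps the best point, which needs no gradient but scales very badly with dimension. A minimal sketch on the Rosenbrock f defined above, with the search box and grid density chosen arbitrarily for this illustration:

# grid search over [-2, 3] x [-2, 3] with 25 points per axis; by default the
# best grid point is then polished with a local downhill simplex search
optimize.brute(f, ((-2, 3), (-2, 3)), Ns=25)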
In [28]:
def f(x): # residuals: arctan(x) compared to arctan of 10 evenly spaced targets
    return np.arctan(x) - np.arctan(np.linspace(0, 1, len(x)))
x0 = np.zeros(10)
optimize.leastsq(f, x0)
Out[28]:
In [29]:
def g(x): # scalar objective: sum of squared residuals
    return np.sum(f(x) ** 2)
optimize.minimize(g, x0, method="BFGS")
Out[29]:
In [31]:
f(x0)
Out[31]:
BFGS needs more function calls, and gives a less precise result.
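One way to make that comparison concrete is to evaluate the residual sum of squares at both solutions. A quick check, with the results stored under names introduced here (x_lsq, res_bfgs):

x_lsq, ier = optimize.leastsq(f, x0)
res_bfgs = optimize.minimize(g, x0, method="BFGS")
print(g(x_lsq), g(res_bfgs.x))  # sum of squared residuals: leastsq vs BFGS
print(res_bfgs.nfev)            # number of objective evaluations used by BFGS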