Lecture #3

These notes are supplementary material to the slides for this lecture. In here I will discuss tangential topics that are easier to explore using a scientific computing environment.

A side note on exponential growth models

You will notice that the paper we read this week has many references to the rate at which energy demand is increasing in different parts of the world or for different sectors of the economy. Given this, I thought it would be appropriate to ensure that we all know what exactly does exponential growth mean.

Let $e_n$ be the energy demand at a discrete time $n$. Exponential growth at a rate of $r$ means that for time $n+1$ we have:

$e_{n+1} = (1 + r) \times e_n$

And more generally, at time $n + p$ (i.e., p time intervals later) we have:

$e_{n+1} = (1 + r)^p \times e_n$

Here, $r$ is the rate of growth, expressed as a percentage.

Let's declare a function that implements this formula so that we can play with it.


In [5]:
def expgrowth(e_n, r, p):
    return ((1+r)**p)*e_n

Now let's use that function with some values. Say, for instance, that the current demand is 3,000 MWh and it will grow at an annual rate of 3.2% as it does for emerging economies. When will this value double?

First, let's just test one number to make sure that the function works. Let's test for $p = 50$


In [15]:
print expgrowth(3000, 0.032, 50)


14491.2590228

So, as we can see, the solution is definitely below $p = 50$, and certainly above $p = 1$, but where exactly is it?

As we explained in class, there are a number of ways to solve this problem. I want to walk you through one particular approach in which we cast the problem as an optimization program (albeit a very simple one) and solve it using an optimization package in the scipy library.

First, let's declare a new function that returns the absolute value of the difference between $e_{n+p}$ and $2 e_n$ . This function will be zero when the two terms are equal to each other, which is exactly what we are interested in.


In [47]:
def doublediff(e_n,r,p):
   return abs(expgrowth(e_n,r,p) - 2*e_n)

Now, let's import the optimization package:


In [17]:
from scipy import optimize as opt

What functions does it have?


In [18]:
info(opt)


=====================================================
Optimization and root finding (:mod:`scipy.optimize`)
=====================================================

.. currentmodule:: scipy.optimize

Optimization
============

General-purpose
---------------

.. autosummary::
   :toctree: generated/

   minimize - Unified interface for minimizers of multivariate functions
   fmin - Nelder-Mead Simplex algorithm
   fmin_powell - Powell's (modified) level set method
   fmin_cg - Non-linear (Polak-Ribiere) conjugate gradient algorithm
   fmin_bfgs - Quasi-Newton method (Broydon-Fletcher-Goldfarb-Shanno)
   fmin_ncg - Line-search Newton Conjugate Gradient
   leastsq - Minimize the sum of squares of M equations in N unknowns

Constrained (multivariate)
--------------------------

.. autosummary::
   :toctree: generated/

   fmin_l_bfgs_b - Zhu, Byrd, and Nocedal's constrained optimizer
   fmin_tnc - Truncated Newton code
   fmin_cobyla - Constrained optimization by linear approximation
   fmin_slsqp - Minimization using sequential least-squares programming
   nnls - Linear least-squares problem with non-negativity constraint

Global
------

.. autosummary::
   :toctree: generated/

   anneal - Simulated annealing
   basinhopping - Basinhopping stochastic optimizer
   brute - Brute force searching optimizer

Scalar function minimizers
--------------------------

.. autosummary::
   :toctree: generated/

   minimize_scalar - Unified interface for minimizers of univariate functions
   fminbound - Bounded minimization of a scalar function
   brent - 1-D function minimization using Brent method
   golden - 1-D function minimization using Golden Section method
   bracket - Bracket a minimum, given two starting points

Rosenbrock function
-------------------

.. autosummary::
   :toctree: generated/

   rosen - The Rosenbrock function.
   rosen_der - The derivative of the Rosenbrock function.
   rosen_hess - The Hessian matrix of the Rosenbrock function.
   rosen_hess_prod - Product of the Rosenbrock Hessian with a vector.

Fitting
=======

.. autosummary::
   :toctree: generated/

   curve_fit -- Fit curve to a set of points

Root finding
============

Scalar functions
----------------
.. autosummary::
   :toctree: generated/

   brentq - quadratic interpolation Brent method
   brenth - Brent method, modified by Harris with hyperbolic extrapolation
   ridder - Ridder's method
   bisect - Bisection method
   newton - Secant method or Newton's method

Fixed point finding:

.. autosummary::
   :toctree: generated/

   fixed_point - Single-variable fixed-point solver

Multidimensional
----------------

General nonlinear solvers:

.. autosummary::
   :toctree: generated/

   root - Unified interface for nonlinear solvers of multivariate functions
   fsolve - Non-linear multi-variable equation solver
   broyden1 - Broyden's first method
   broyden2 - Broyden's second method

Large-scale nonlinear solvers:

.. autosummary::
   :toctree: generated/

   newton_krylov
   anderson

Simple iterations:

.. autosummary::
   :toctree: generated/

   excitingmixing
   linearmixing
   diagbroyden

:mod:`Additional information on the nonlinear solvers <scipy.optimize.nonlin>`

Utility Functions
=================

.. autosummary::
   :toctree: generated/

   line_search - Return a step that satisfies the strong Wolfe conditions
   check_grad - Check the supplied derivative using finite differences

   show_options - Show specific options optimization solvers

From that list, the minimize function looks interesting. Let's find out more about it:


In [19]:
info(opt.minimize)


 minimize(fun, x0, args=(), method='BFGS', jac=None, hess=None, hessp=None,
          bounds=None, constraints=(), tol=None, callback=None, options=None)

Minimization of scalar function of one or more variables.

.. versionadded:: 0.11.0

Parameters
----------
fun : callable
    Objective function.
x0 : ndarray
    Initial guess.
args : tuple, optional
    Extra arguments passed to the objective function and its
    derivatives (Jacobian, Hessian).
method : str, optional
    Type of solver.  Should be one of

        - 'Nelder-Mead'
        - 'Powell'
        - 'CG'
        - 'BFGS'
        - 'Newton-CG'
        - 'Anneal'
        - 'L-BFGS-B'
        - 'TNC'
        - 'COBYLA'
        - 'SLSQP'
        - 'dogleg'
        - 'trust-ncg'

jac : bool or callable, optional
    Jacobian of objective function. Only for CG, BFGS, Newton-CG,
    dogleg, trust-ncg.
    If `jac` is a Boolean and is True, `fun` is assumed to return the
    value of Jacobian along with the objective function. If False, the
    Jacobian will be estimated numerically.
    `jac` can also be a callable returning the Jacobian of the
    objective. In this case, it must accept the same arguments as `fun`.
hess, hessp : callable, optional
    Hessian of objective function or Hessian of objective function
    times an arbitrary vector p.  Only for Newton-CG,
    dogleg, trust-ncg.
    Only one of `hessp` or `hess` needs to be given.  If `hess` is
    provided, then `hessp` will be ignored.  If neither `hess` nor
    `hessp` is provided, then the hessian product will be approximated
    using finite differences on `jac`. `hessp` must compute the Hessian
    times an arbitrary vector.
bounds : sequence, optional
    Bounds for variables (only for L-BFGS-B, TNC and SLSQP).
    ``(min, max)`` pairs for each element in ``x``, defining
    the bounds on that parameter. Use None for one of ``min`` or
    ``max`` when there is no bound in that direction.
constraints : dict or sequence of dict, optional
    Constraints definition (only for COBYLA and SLSQP).
    Each constraint is defined in a dictionary with fields:
        type : str
            Constraint type: 'eq' for equality, 'ineq' for inequality.
        fun : callable
            The function defining the constraint.
        jac : callable, optional
            The Jacobian of `fun` (only for SLSQP).
        args : sequence, optional
            Extra arguments to be passed to the function and Jacobian.
    Equality constraint means that the constraint function result is to
    be zero whereas inequality means that it is to be non-negative.
    Note that COBYLA only supports inequality constraints.
tol : float, optional
    Tolerance for termination. For detailed control, use solver-specific
    options.
options : dict, optional
    A dictionary of solver options. All methods accept the following
    generic options:
        maxiter : int
            Maximum number of iterations to perform.
        disp : bool
            Set to True to print convergence messages.
    For method-specific options, see `show_options('minimize', method)`.
callback : callable, optional
    Called after each iteration, as ``callback(xk)``, where ``xk`` is the
    current parameter vector.

Returns
-------
res : Result
    The optimization result represented as a ``Result`` object.
    Important attributes are: ``x`` the solution array, ``success`` a
    Boolean flag indicating if the optimizer exited successfully and
    ``message`` which describes the cause of the termination. See
    `Result` for a description of other attributes.


See also
--------
minimize_scalar: Interface to minimization algorithms for scalar
    univariate functions.

Notes
-----
This section describes the available solvers that can be selected by the
'method' parameter. The default method is *BFGS*.

**Unconstrained minimization**

Method *Nelder-Mead* uses the Simplex algorithm [1]_, [2]_. This
algorithm has been successful in many applications but other algorithms
using the first and/or second derivatives information might be preferred
for their better performances and robustness in general.

Method *Powell* is a modification of Powell's method [3]_, [4]_ which
is a conjugate direction method. It performs sequential one-dimensional
minimizations along each vector of the directions set (`direc` field in
`options` and `info`), which is updated at each iteration of the main
minimization loop. The function need not be differentiable, and no
derivatives are taken.

Method *CG* uses a nonlinear conjugate gradient algorithm by Polak and
Ribiere, a variant of the Fletcher-Reeves method described in [5]_ pp.
120-122. Only the first derivatives are used.

Method *BFGS* uses the quasi-Newton method of Broyden, Fletcher,
Goldfarb, and Shanno (BFGS) [5]_ pp. 136. It uses the first derivatives
only. BFGS has proven good performance even for non-smooth
optimizations. This method also returns an approximation of the Hessian
inverse, stored as `hess_inv` in the Result object.

Method *Newton-CG* uses a Newton-CG algorithm [5]_ pp. 168 (also known
as the truncated Newton method). It uses a CG method to the compute the
search direction. See also *TNC* method for a box-constrained
minimization with a similar algorithm.

Method *Anneal* uses simulated annealing, which is a probabilistic
metaheuristic algorithm for global optimization. It uses no derivative
information from the function being optimized.

Method *dogleg* uses the dog-leg trust-region algorithm [5]_
for unconstrained minimization. This algorithm requires the gradient
and Hessian; furthermore the Hessian is required to be positive definite.

Method *trust-ncg* uses the Newton conjugate gradient trust-region
algorithm [5]_ for unconstrained minimization. This algorithm requires
the gradient and either the Hessian or a function that computes the
product of the Hessian with a given vector.

**Constrained minimization**

Method *L-BFGS-B* uses the L-BFGS-B algorithm [6]_, [7]_ for bound
constrained minimization.

Method *TNC* uses a truncated Newton algorithm [5]_, [8]_ to minimize a
function with variables subject to bounds. This algorithm uses
gradient information; it is also called Newton Conjugate-Gradient. It
differs from the *Newton-CG* method described above as it wraps a C
implementation and allows each variable to be given upper and lower
bounds.

Method *COBYLA* uses the Constrained Optimization BY Linear
Approximation (COBYLA) method [9]_, [10]_, [11]_. The algorithm is
based on linear approximations to the objective function and each
constraint. The method wraps a FORTRAN implementation of the algorithm.

Method *SLSQP* uses Sequential Least SQuares Programming to minimize a
function of several variables with any combination of bounds, equality
and inequality constraints. The method wraps the SLSQP Optimization
subroutine originally implemented by Dieter Kraft [12]_.

References
----------
.. [1] Nelder, J A, and R Mead. 1965. A Simplex Method for Function
    Minimization. The Computer Journal 7: 308-13.
.. [2] Wright M H. 1996. Direct search methods: Once scorned, now
    respectable, in Numerical Analysis 1995: Proceedings of the 1995
    Dundee Biennial Conference in Numerical Analysis (Eds. D F
    Griffiths and G A Watson). Addison Wesley Longman, Harlow, UK.
    191-208.
.. [3] Powell, M J D. 1964. An efficient method for finding the minimum of
   a function of several variables without calculating derivatives. The
   Computer Journal 7: 155-162.
.. [4] Press W, S A Teukolsky, W T Vetterling and B P Flannery.
   Numerical Recipes (any edition), Cambridge University Press.
.. [5] Nocedal, J, and S J Wright. 2006. Numerical Optimization.
   Springer New York.
.. [6] Byrd, R H and P Lu and J. Nocedal. 1995. A Limited Memory
   Algorithm for Bound Constrained Optimization. SIAM Journal on
   Scientific and Statistical Computing 16 (5): 1190-1208.
.. [7] Zhu, C and R H Byrd and J Nocedal. 1997. L-BFGS-B: Algorithm
   778: L-BFGS-B, FORTRAN routines for large scale bound constrained
   optimization. ACM Transactions on Mathematical Software 23 (4):
   550-560.
.. [8] Nash, S G. Newton-Type Minimization Via the Lanczos Method.
   1984. SIAM Journal of Numerical Analysis 21: 770-778.
.. [9] Powell, M J D. A direct search optimization method that models
   the objective and constraint functions by linear interpolation.
   1994. Advances in Optimization and Numerical Analysis, eds. S. Gomez
   and J-P Hennart, Kluwer Academic (Dordrecht), 51-67.
.. [10] Powell M J D. Direct search algorithms for optimization
   calculations. 1998. Acta Numerica 7: 287-336.
.. [11] Powell M J D. A view of algorithms for optimization without
   derivatives. 2007.Cambridge University Technical Report DAMTP
   2007/NA03
.. [12] Kraft, D. A software package for sequential quadratic
   programming. 1988. Tech. Rep. DFVLR-FB 88-28, DLR German Aerospace
   Center -- Institute for Flight Mechanics, Koln, Germany.

Examples
--------
Let us consider the problem of minimizing the Rosenbrock function. This
function (and its respective derivatives) is implemented in `rosen`
(resp. `rosen_der`, `rosen_hess`) in the `scipy.optimize`.

>>> from scipy.optimize import minimize, rosen, rosen_der

A simple application of the *Nelder-Mead* method is:

>>> x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
>>> res = minimize(rosen, x0, method='Nelder-Mead')
>>> res.x
[ 1.  1.  1.  1.  1.]

Now using the *BFGS* algorithm, using the first derivative and a few
options:

>>> res = minimize(rosen, x0, method='BFGS', jac=rosen_der,
...                options={'gtol': 1e-6, 'disp': True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 52
         Function evaluations: 64
         Gradient evaluations: 64
>>> res.x
[ 1.  1.  1.  1.  1.]
>>> print res.message
Optimization terminated successfully.
>>> res.hess
[[ 0.00749589  0.01255155  0.02396251  0.04750988  0.09495377]
 [ 0.01255155  0.02510441  0.04794055  0.09502834  0.18996269]
 [ 0.02396251  0.04794055  0.09631614  0.19092151  0.38165151]
 [ 0.04750988  0.09502834  0.19092151  0.38341252  0.7664427 ]
 [ 0.09495377  0.18996269  0.38165151  0.7664427   1.53713523]]


Next, consider a minimization problem with several constraints (namely
Example 16.4 from [5]_). The objective function is:

>>> fun = lambda x: (x[0] - 1)**2 + (x[1] - 2.5)**2

There are three constraints defined as:

>>> cons = ({'type': 'ineq', 'fun': lambda x:  x[0] - 2 * x[1] + 2},
...         {'type': 'ineq', 'fun': lambda x: -x[0] - 2 * x[1] + 6},
...         {'type': 'ineq', 'fun': lambda x: -x[0] + 2 * x[1] + 2})

And variables must be positive, hence the following bounds:

>>> bnds = ((0, None), (0, None))

The optimization problem is solved using the SLSQP method as:

>>> res = minimize(fun, (2, 0), method='SLSQP', bounds=bnds,
...                constraints=cons)

It should converge to the theoretical solution (1.4 ,1.7).

Well, that was a long documentation. The bottom line, though, is that it needs at least two inputs (the rest are optional), which are a function fun and a set of initial guesses x0. This last one, needs to be an ndarray with as many dimensions as there are variables to be optimized over. In our case, we only have a single variable, namely $p$. Let's create that ndarray:


In [34]:
import numpy as np  # Normally, we would do this at the beginning

p0 = np.array([3])

Now, ideally we would just run:

    opt.minimize(doublediff,p0)

and be done with it. However, if you look carefully, we are trying to optimize an objective function (doublediff) that has 3 variables ($e_n$, $r$ and $p$), by only feeding it one initial guess for a single variable ($p0$). This won't work.

Thus, we need to make our objective function more specific by providing it with the values we want for $e_n$ and $r$:


In [38]:
def objective(p0):
    return doublediff(3000, 0.032, p0)

Now we can optimize (minimize) it:


In [39]:
opt.minimize(objective,p0)


Out[39]:
   status: 0
  success: True
     njev: 21
     nfev: 63
 hess_inv: array([[ 232869.54979428]])
      fun: -5999.999741246641
        x: array([-516.40289312])
  message: 'Optimization terminated successfully.'
      jac: array([ 0.])

Well, not quite. If you notice, the optimization program we formulated did not take into account the fact that there are some constraints. For instance, $p > 1$ and we also know that $p < 50$. This is actually more formally known as a bound. How do we add that? Well, you can read the documentation above and look for bounds, but here's the answer:


In [50]:
opt.minimize(objective,p0,method='L-BFGS-B', bounds=[(1,50)])


Out[50]:
  status: 2
 success: False
    nfev: 97
     fun: array([  7.44742465e-07])
       x: array([ 22.00560357])
 message: 'ABNORMAL_TERMINATION_IN_LNSRCH'
     jac: array([ 40.04359653])
     nit: 6

And there you have it, the answer is the value of x in the results above. Let's check it out:


In [51]:
expgrowth(3000,0.032,22.00560357)


Out[51]:
5999.999998391946

Pretty close!


In [ ]: