Optimization in 1-D and N-D. Debugging Functions

Unit 11, Lecture 1

Numerical Methods and Statistics


Prof. Andrew White, April 19, 2020

Goals:

  1. Learn the meaning of root-finding and minimization, the two types of optimization
  2. Understand the iterative nature of these methods
  3. Debug common problems when defining functions, which is essential for optimization
  4. Be able to identify convex problems and understand their complexities
  5. Learn the two standard methods for minimization and root-finding and how to call them in Python

In [1]:
import numpy as np
import matplotlib.pyplot as plt

Numerical Optimization - Root Finding

What is $x$ in

$$\cos (x) = x$$

To use a root-finding method, we must make our equation have one side be 0.

$$\cos (x) - x = 0$$

Newton's Method - For finding Roots


In [2]:
from scipy.optimize import newton

root =  newton(lambda x: np.cos(x) - x, x0=0)
print(root, np.cos(root) - root)


0.7390851332151606 1.1102230246251565e-16

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}$$

Source: Wikipedia, Newton's Method

This method, like all the methods in this unit, is iterative. You can control the iteration by choosing the tolerance or the maximum number of iterations; the functions have sensible defaults.
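To see the update formula in action, here is a minimal hand-rolled sketch for $\cos(x) - x = 0$, using the derivative $f'(x) = -\sin(x) - 1$ (scipy's newton is more careful about tolerances and convergence checks than this):

x = 0.0  # starting guess, matching x0=0 above
for i in range(10):
    fx = np.cos(x) - x          # f(x)
    fpx = -np.sin(x) - 1        # f'(x)
    x = x - fx / fpx            # the Newton update from the formula above
print(x, np.cos(x) - x)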


In [3]:
root =  newton(lambda x: np.cos(x) - x, x0=0, tol=1e-3)
print(root, np.cos(root) - root)


0.73908511214521 3.526292269295794e-08

In [4]:
root =  newton(lambda x: np.cos(x) - x, x0=0, tol=1e-3, maxiter=5)
print(root, np.cos(root) - root)


0.73908511214521 3.526292269295794e-08

Newton's Method

Use root instead; newton is shown here just for instructional purposes. root is better.

Type: Root Finding

Discrete/Continuous: Continuous

Dimensions: 1

Derivative: optional

Non-Convex: not recommended

Python: newton
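
For comparison, here is the same $\cos(x) - x = 0$ problem solved with root, which we will cover in more detail below (a quick sketch; root returns a result object rather than a bare number):

from scipy.optimize import root

result = root(lambda x: np.cos(x) - x, x0=0)
print(result.x)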

Let's try to solve a more interesting equation with Newton's method!

$$ \int_0^x e^{-s^2}\,ds = \frac{1}{2}$$

Find $x$


In [5]:
#Note: this code is intentionally really bad
from scipy.integrate import quad
def equation(x):
    return quad(lambda s: np.exp(-s**2), 0, x)
root = newton(equation, x0=0)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-b5fda6a0acf0> in <module>
      3 def equation(x):
      4     return quad(lambda s: np.exp(-s**2), 0, x)
----> 5 root = newton(equation, x0=0)

~/miniconda3/lib/python3.7/site-packages/scipy/optimize/zeros.py in newton(func, x0, fprime, args, tol, maxiter, fprime2, x1, rtol, full_output, disp)
    316         q1 = func(p1, *args)
    317         funcalls += 1
--> 318         if abs(q1) < abs(q0):
    319             p0, p1, q0, q1 = p1, p0, q1, q0
    320         for itr in range(maxiter):

TypeError: bad operand type for abs(): 'tuple'

Steps for Debugging Code

  1. Correct any Python errors
  2. Restart the Kernel
  3. Correct any Python errors
  4. Print to discover logical errors

Most Common Mistakes

  1. Copy-pasting code and leaving in old variables
  2. Mixing Numpy arrays and for loops
  3. Forgetting order of function args and return values

Review of Functions

For routines like quad, newton, and minimize, the function you pass in should take one argument and return one value.

Correct Examples

$$x^2 - 3x + 2$$

In [6]:
def version_1(x):
    return x**2 - 3 *x + 2

version_2 = lambda x: x ** 2 - 3 *x + 2

np_version = np.vectorize(version_1)

print(version_1(3.))
print(version_2(3.))

some_threes = np.zeros(10)
some_threes[:] = 3.0 # -> Notice how we don't replace the numpy array with 3, but instead make all elements in it equal to three
#some_threes = 3 -> This would delete the numpy array and now some_threes would be a single 3

print(np_version(some_threes))


2.0
2.0
[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
$$\int_{\pi}^x \sin^2(s)\,ds$$

In [7]:
from scipy.integrate import quad
import numpy as np
from math import pi

def integrate_sin2(x):
    ans, err = quad(lambda s: np.sin(s) ** 2, pi, x)
    return ans
    
print(integrate_sin2(2 * pi))


1.5707963267948966

Two vectors, $\vec{x}$ and $\vec{y}$, where one component of $\vec{x}$ is changing and we want to know the distance between them.


In [8]:
from math import sqrt

def distance(x, y):
    sum_sq = np.sum((x - y)**2)
    return sqrt(sum_sq)

def my_distance(s):
    x = np.zeros(3)
    y = np.zeros(3)
    x[0] = 2.0
    x[1] = s
    x[2] = -3.5
    y[0] = 1.0
    y[1] = -3.0
    y[2] = 0.0
    return distance(x, y)

print(my_distance(1))


5.408326913195984

Incorrect Examples

No Return Value


In [9]:
def version_1(x):
    x**2 - 3 *x + 2

print(version_1(3.))


None

Bad Return Value


In [10]:
def integrate_sin2(x):
    return quad(lambda s: np.sin(s) ** 2, pi, x)

print(integrate_sin2(2 * pi))


(1.5707963267948966, 1.743934249004316e-14)

Too Many Arguments


In [11]:
def distance(x, y):
    sum_sq = np.sum((x - y)**2)
    return sqrt(sum_sq)

print(distance(1))


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-6fd995bbf352> in <module>
      3     return sqrt(sum_sq)
      4 
----> 5 print(distance(1))

TypeError: distance() missing 1 required positional argument: 'y'
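
One way to fix this is to wrap the function so that only one argument is exposed, filling in the fixed values with a lambda. A sketch using the same vectors as my_distance above:

# fix y and all but one component of x, so the wrapped function takes one argument
y_fixed = np.array([1.0, -3.0, 0.0])
one_arg_distance = lambda s: distance(np.array([2.0, s, -3.5]), y_fixed)
print(one_arg_distance(1))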

Let's return to our example from above:

$$ \int_0^x e^{-s^2}\,ds = \frac{1}{2}$$

Find $x$


In [12]:
#note still wrong
def equation(x):
    ans, err = quad(lambda s: np.exp(-s**2), 0, x)
    return ans
root = newton(equation, x0=0)
print(root)


0.0

In [13]:
equation(root)


Out[13]:
0.0

We forgot to rearrange the equation so that one side equals $0$


In [14]:
root = newton(lambda x: equation(x) - 0.5, x0=0)
print(root, equation(root))


0.5510394276090265 0.4999999999999999

Scope

Scope is the set of variables and functions that are defined and accessible at a given point in your code


In [15]:
x = 4
y = 2

#Right now, there is x,y and all other functions/variables defined or imported above in the scope

Scopes nest


In [16]:
x = 4
y = 2
#Here, I have x and y in scope
def scope_example():
    z = 2
    #Here, I have x,y and z in scope
    
#Here I again have only x and y in scope

In [17]:
x = 4 
y = 2
print(y,"Before function")
def scope_example():
    y = 25 #This is a new version of y that exists only in this scope
    print(y, "Inside function")
scope_example()
print(y, "After Function")


2 Before function
25 Inside function
2 After Function

In [18]:
x = 4 
y = [2]
print(y,"Before function")
def scope_example():
    y[0] = 25 #Here I'm not creating a new y, but modifying the existing list
    print(y, "Inside function")
scope_example()
print(y, "After Function")


[2] Before function
[25] Inside function
[25] After Function

Things to remember about scope:

  1. Scopes nest, so you can access things defined above your scope
  2. You can modify mutable objects (like lists) from any scope you can see, but names you create inside a function disappear outside of that scope

Returning to Optimization

Applications of Optimization

  1. Solving non-linear equations
  2. Solving systems of equations
  3. Optimizing equations with or without constraints
  4. Fitting models to data

Choosing which method to use

There are five things to consider when doing optimization:

  1. Is it 1 or N dimensions?
  2. Are you minimizing or root-finding?
  3. Are there constraints?
  4. Are there bounds?
  5. Is it convex or non-convex?

Identifying Convexity

If a function has more than one minimum (more than one point where the derivative is 0), then the problem is non-convex. The opposite of convex is non-convex. A concave function can be made convex by multiplying it by $-1$, but a non-convex function cannot be made convex.
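
For example, a concave function like $-(x - 1)^2$ can be maximized by minimizing its negation, which is convex. A minimal sketch (this function is just an illustration, not one of the lecture problems):

from scipy.optimize import minimize

concave = lambda x: -(x - 1)**2
# negate the concave function to get a convex one, then minimize
result = minimize(lambda x: -concave(x), x0=0.0)
print(result.x)  # the maximum of the concave function, near x = 1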

Knowing about convexity can come from:

  • Plots
  • Running a convex optimizer from two different starting positions and getting different minima
  • Knowing something specific about the problem

Consider a function with two minima:


In [19]:
def two_well(x):
    if x < 0.125:
        return (x + 2) ** 2
    if x >= 0.125:
        return (x - 2) ** 2 + 1
    
np_two_well = np.vectorize(two_well)

In [20]:
x = np.linspace(-4, 4, 1000)
plt.plot(x, np_two_well(x))
plt.show()


BFGS Optimization - Minimization

Broyden–Fletcher–Goldfarb–Shanno

BFGS

Type: Minimization

Discrete/Continuous: Continuous

Dimensions: N

Derivative: optional

Non-Convex: not recommended

Python: minimize if no constraints or bounds are given

Nomenclature

In optimization, you have a function called the objective function. That's what you're minimizing. It always returns a single value. It is sometimes called the fit, the error, the residual, or the penalty.


In [21]:
def obj(x):
    return x**2

x = np.linspace(-1,1,100)
plt.plot(x, obj(x))
plt.show()



In [22]:
from scipy.optimize import minimize

minimize(obj, x0=3)


Out[22]:
      fun: 2.5388964258140254e-16
 hess_inv: array([[0.5]])
      jac: array([4.67689909e-08])
  message: 'Optimization terminated successfully.'
     nfev: 9
      nit: 2
     njev: 3
   status: 0
  success: True
        x: array([1.59339149e-08])

Minimize Return value

     fun: The value of the function at the minimum
hess_inv: The inverse of the Hessian
     jac: The value of the Jacobian
 message: A string describing what happened
    nfev: Number of function evaluations
     nit: Number of iterations (updates of the x point)
    njev: Number of times it computed the Jacobian
  status: The status code (0 = success, nonzero = some error)
 success: Boolean indicating success
       x: The minimum x
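
In code, these are attributes of the returned object, so you can pull out just what you need. A quick sketch using the obj function from above:

result = minimize(obj, x0=3)
print(result.x, result.fun)        # location and value of the minimum
print(result.success, result.nit)  # did it converge, and in how many iterations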

Objective Functions

Be careful that your objective function is convex and its minimum isn't at $\infty$ or $-\infty$.

A good objective function

Minimize the following:

$$f(x) = \frac{(x - 4)^2}{2} + \frac{(x - 2)^2}{4}$$


In [23]:
def f(x):
    return ((x-4)**2)/2+((x-2)**2)/4
x = np.linspace(-10, 10, 100)
plt.plot(x, f(x))
plt.show()



In [24]:
result = minimize(f, x0=0)
print(f'the minimum value occurs at {result.x} and is {result.fun}')


the minimum value occurs at [3.33333336] and is 0.6666666666666674

A bad objective function

Minimize the following:

$$ f(x) = \frac{(x - 4)} { 2 } $$

The minimum is at $-\infty$!


In [25]:
result = minimize(lambda x: (x - 4) / 2, x0=0)
print(f'the minimum value occurs at {result.x} and is {result.fun}')


the minimum value occurs at [-1.7895697e+08] and is -89478487.25

A good objective function but bad x0

Minimize the following function

$$4 \left[ r^{-12} - r^{-6}\right]$$

In [26]:
result = minimize(lambda r: 4 * (r**(-12) - r**(-6)), x0=0)
print(f'the minimum is at {result.x}')


the minimum is at [0.]
/home/whitead/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in power
  """Entry point for launching an IPython kernel.
/home/whitead/miniconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in subtract
  """Entry point for launching an IPython kernel.

Our initial value was not in the domain!


In [27]:
r = np.linspace(0.9, 2, 100)
y = 4 * (r**(-12) - r**(-6))
plt.plot(r, y)
plt.show()



In [28]:
result = minimize(lambda r: 4 * (r**(-12) - r**(-6)), x0=1)
print(f'the minimum is at {result.x} and its value is {result.fun}')


the minimum is at [1.12246208] and its value is -0.9999999999999768

Maximizing

In order to maximize a function, just add a minus sign

Maximize the following function:

$$ -\left[x - \cos(x)\right]^2 $$

In [30]:
#negate the function we want to maximize, turning it into a minimization problem
result = minimize(lambda x: (x - np.cos(x))**2, x0=1)
print(f'the maximum is at {result.x}')


the maximum is at [0.7390852]
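
Note that result.fun is the minimum of the negated function; to report the maximum value of the original function, flip the sign back. A quick sketch:

result = minimize(lambda x: (x - np.cos(x))**2, x0=1)
print(result.x, -result.fun)  # location of the maximum and the maximum value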

Multiple Dimensions

Just indicate multiple dimensions by using a multidimensional x0. Note that your $x$ becomes a vector!

Minimize the following:

$$ f(x, y) = \frac{(x - 4)^2} { 2 } + \frac{(y + 3)^2} { 5 } $$

In [31]:
result = minimize(lambda x: (x[0] - 4)**2 / 2 + (x[1] + 3)**2 / 5, x0=[0,0])
print(f'minimum occurs when x = {result.x[0]} and y = {result.x[1]}')


minimum occurs when x = 3.999999680451366 and y = -3.0000083683518133

Identifying Convexity for Minimization - Example

The best ways to identify convexity are:

  • plot it
  • try optimizing in multiple starting positions

In [32]:
x = np.linspace(-1, 1, 1000)
plt.plot(x,  2 * x ** 3 - 0 * x **2  - x)
plt.show()



In [33]:
from scipy.optimize import minimize
minimize(lambda x:  2 * x ** 3 - 0 * x **2  - x, x0=0.05)


Out[33]:
      fun: -0.27216552697590846
 hess_inv: array([[0.20398928]])
      jac: array([-1.11758709e-08])
  message: 'Optimization terminated successfully.'
     nfev: 24
      nit: 5
     njev: 8
   status: 0
  success: True
        x: array([0.40824828])

In [34]:
x = np.linspace(-1, 1, 1000)
plt.plot(x,  2 * x ** 3 - 0 * x **2  - x)
plt.show()



In [35]:
from scipy.optimize import minimize
minimize(lambda x:  2 * x ** 3 - 0 * x **2  - x, x0=-0.5)


Out[35]:
      fun: -1.146240476075504e+25
 hess_inv: array([[1]])
      jac: array([0.])
  message: 'Optimization terminated successfully.'
     nfev: 48
      nit: 1
     njev: 16
   status: 0
  success: True
        x: array([-1.78956955e+08])

Powell hybrid method

Type: Root finding

Discrete/Continuous: Continuous

Dimensions: N

Derivative: optional

Non-Convex: not recommended

Python: root unless method argument specifies a different method

1D Example

This is exactly like newton from above. Solve this equation:

$$ \cos x + \sin x = x $$

In [36]:
from scipy.optimize import root
#rearranged equation so all terms on one side
result = root(lambda x: np.cos(x) + np.sin(x) - x, x0=1)
print(result)


    fjac: array([[-1.]])
     fun: array([2.22044605e-16])
 message: 'The solution converged.'
    nfev: 8
     qtf: array([-2.42716958e-12])
       r: array([1.64467312])
  status: 1
 success: True
       x: array([1.25872818])

The result type is like what we saw for minimize. It has similar fields, including the root (x) and the value of the function at the root (fun). Notice the function value is not exactly $0$ at the root.


In [37]:
x = result.x 
print(np.cos(x) + np.sin(x) - x)


[2.22044605e-16]

Solve the following system of equations: $$ 3 x^2 - 2 y = 4$$ $$ x - 4 y ^ 2 = -2$$


In [38]:
def sys(v):
    #I'm using v here to distinguish from the x in the equations
    #extract x and y
    x = v[0]
    y = v[1]
    #compute equations
    eq1 = 3 * x ** 2 - 2 * y - 4
    eq2 = x - 4 * y**2 + 2
    #pack into list
    sys_eq = [eq1, eq2]
    return sys_eq
root(sys, x0=[1,1])


Out[38]:
    fjac: array([[-0.99129609, -0.13165126],
       [ 0.13165126, -0.99129609]])
     fun: array([2.87769808e-13, 3.16324744e-12])
 message: 'The solution converged.'
    nfev: 10
     qtf: array([4.70588464e-09, 2.08104163e-08])
       r: array([-8.45721872,  2.95677276,  7.06016501])
  status: 1
 success: True
       x: array([1.39555277, 0.92135129])

So the answer is $x = 1.40,\: y = 0.921$

You should not ignore the information in the output of root. Imagine this small modification:

$$ 3 x^2 - 2 y^2 = 4$$

$$ x^3 - 4 y = -2$$

In [39]:
def sys2(v):
    #I'm using v here to distinguish from the x in the equations
    #extract x and y
    x = v[0]
    y = v[1]
    #compute equations
    eq1 = 3 * x ** 2 - 2 * y**2 - 4
    eq2 = x**3 - 4 * y + 2
    #pack into list
    sys_eq = [eq1, eq2]
    return sys_eq
root(sys2, x0=[1,1])


Out[39]:
    fjac: array([[-0.79311064, -0.60907759],
       [ 0.60907759, -0.79311064]])
     fun: array([-0.31652211,  0.4128055 ])
 message: 'The iteration is not making good progress, as measured by the \n  improvement from the last ten iterations.'
    nfev: 30
     qtf: array([-3.93523929e-04, -5.20186955e-01])
       r: array([-1.16194697e+01,  6.56882367e+00, -1.16411656e-03])
  status: 5
 success: False
       x: array([1.53592084, 1.30262823])

If you did not read the message or status, you might believe the method succeeded.
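
Rather than reading the printout, you can check convergence programmatically. A minimal sketch using the sys2 function above:

result = root(sys2, x0=[1, 1])
if not result.success:
    print('root finding failed:', result.message)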

More compact version:


In [40]:
root(lambda x: [3 * x[0] ** 2 - 2 * x[1] - 4, x[0] - 4 * x[1] ** 2 + 2], x0=[1,1])


Out[40]:
    fjac: array([[-0.99129609, -0.13165126],
       [ 0.13165126, -0.99129609]])
     fun: array([2.87769808e-13, 3.16324744e-12])
 message: 'The solution converged.'
    nfev: 10
     qtf: array([4.70588464e-09, 2.08104163e-08])
       r: array([-8.45721872,  2.95677276,  7.06016501])
  status: 1
 success: True
       x: array([1.39555277, 0.92135129])

Importance of starting position

By the way, there are two roots to this system!


In [41]:
root(lambda x: [3 * x[0] ** 2 - 2 * x[1] - 4, x[0] - 4 * x[1] ** 2 + 2], x0=[0, 0])


Out[41]:
    fjac: array([[ 0.98296038,  0.18381755],
       [-0.18381755,  0.98296038]])
     fun: array([ 2.66453526e-15, -4.44089210e-16])
 message: 'The solution converged.'
    nfev: 16
     qtf: array([ 2.87224221e-12, -7.25968027e-13])
       r: array([ 5.41242052, -0.44135572,  6.9600742 ])
  status: 1
 success: True
       x: array([ 0.87635927, -0.84799164])
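
When you suspect multiple roots, a simple approach is to scan over several starting positions and keep the answers that converged. A sketch (the starting points here are arbitrary):

f = lambda x: [3 * x[0] ** 2 - 2 * x[1] - 4, x[0] - 4 * x[1] ** 2 + 2]
for x0 in ([1, 1], [0, 0], [-1, -1]):
    result = root(f, x0=x0)
    if result.success:
        print(x0, '->', result.x)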