In [2]:
# this line is needed for the web notebook only
%matplotlib inline

CIVL4250 Numerical Methods in Engineering

Dr Dorival Pedroso

Code available on https://github.com/cpmech/CIVL4250py

Newton-Raphson method

See, e.g. Wikipedia

The Newton-Raphson method (also known as Newton's method) finds an approximate solution to a nonlinear equation (or a system of nonlinear equations) by successively updating an initial trial solution using the derivative (Jacobian) of the equation/system evaluated at the current trial solution.
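
For a single scalar equation $f(x)=0$ the update reduces to $x_{k+1} = x_k - f(x_k)/f'(x_k)$. As a minimal sketch of this idea (using a made-up example equation $x^2 - 2 = 0$, which is not part of the system studied below):

In [ ]:
def f(x):    return x**2 - 2.0  # example equation f(x) = 0
def dfdx(x): return 2.0*x       # its derivative

x = 1.0                         # initial guess
for it in range(20):
    if abs(f(x)) < 1e-13: break # converged
    x = x - f(x) / dfdx(x)      # Newton update
print 'x =', x, ' (close to sqrt(2))'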

Suppose we have three unknowns $x_0$, $x_1$ and $x_2$ and the following system of three nonlinear equations

$$ \begin{matrix} x_0^2 + x_1^3 + x_2 \, x_0 = 10 \\ x_1^2 + x_2^3 + x_0 \, x_1 = 20 \\ x_2^2 + x_0^3 + x_1 \, x_2 = 30 \end{matrix} $$

Or, in matrix notation

$$\boldsymbol{r}(\boldsymbol{x}) = \boldsymbol{0}$$

with

$$ \boldsymbol{r}(\boldsymbol{x}) = \begin{Bmatrix} x_0^2 + x_1^3 + x_2 \, x_0 - 10 \\ x_1^2 + x_2^3 + x_0 \, x_1 - 20 \\ x_2^2 + x_0^3 + x_1 \, x_2 - 30 \end{Bmatrix} $$

The solution via Newton's method is achieved by updating $\boldsymbol{x}$ and, hence, $\boldsymbol{r}$ using the following formula

$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \boldsymbol{\delta x}
\qquad
\boldsymbol{r}_{k+1} = \boldsymbol{r}(\boldsymbol{x}_{k+1})$$

with

$$\boldsymbol{\delta x} = -\boldsymbol{J}^{-1} \cdot \boldsymbol{r}_k$$

and

$$\boldsymbol{J} = \left. \frac{\mathrm{d}\boldsymbol{r}}{\mathrm{d}\boldsymbol{x}} \right\rvert_k $$

denoting the Jacobian matrix of the nonlinear system evaluated at the current iterate $\boldsymbol{x}_k$.

In Python, the nonlinear system can be defined in vector form (as a NumPy array) as follows


In [3]:
import numpy as np

def my_nonlinear_system(x):
    
    # auxiliary variables
    x0, x1, x2 = x[0], x[1], x[2]
    
    # residuals
    r0 = x0**2.0 + x1**3.0 + x2 * x0 - 10
    r1 = x1**2.0 + x2**3.0 + x0 * x1 - 20
    r2 = x2**2.0 + x0**3.0 + x1 * x2 - 30
    
    # nonlinear system as a vector 'r'
    return np.array([r0, r1, r2])

It is worth noting that Newton's method may fail to converge under a number of conditions, among them an initial guess that is not sufficiently close to the true solution.

To obtain an approximate solution, an initial guess/trial $\boldsymbol{x}^{trial}$ has to be defined. This can be done as follows


In [4]:
x_trial = np.array([1.0, 2.0, 3.0])

The components of $\boldsymbol{r}$ can now be evaluated by means of


In [5]:
r_trial = my_nonlinear_system(x_trial)
print 'r(x_trial) =', r_trial


r(x_trial) = [  2.  13. -14.]

which are clearly nonzero; thus, our initial guess is not a solution.

This can be improved; but first the Jacobian function needs to be defined. Mathematically, the Jacobian matrix is given by

$$ \boldsymbol{J} = \begin{bmatrix} \frac{\partial r_0}{\partial x_0} & \frac{\partial r_0}{\partial x_1} & \frac{\partial r_0}{\partial x_2} \\ \frac{\partial r_1}{\partial x_0} & \frac{\partial r_1}{\partial x_1} & \frac{\partial r_1}{\partial x_2} \\ \frac{\partial r_2}{\partial x_0} & \frac{\partial r_2}{\partial x_1} & \frac{\partial r_2}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 2\,x_0+x_2 & 3\,x_1^2 & x_0 \\ x_1 & 2\,x_1+x_0 & 3\,x_2^2 \\ 3\,x_0^2 & x_2 & 2\,x_2+x_1 \end{bmatrix} $$

In [6]:
def my_jacobian_matrix(x):
    
    # auxiliary variables
    x0, x1, x2 = x[0], x[1], x[2]
    
    # Jacobian matrix
    return np.array([
        [2.0*x0 + x2,  3.0*x1**2.0,           x0],
        [         x1,  2.0*x1 + x0,  3.0*x2**2.0],
        [3.0*x0**2.0,           x2,  2.0*x2 + x1]
    ])

Having the Jacobian function defined, the next improved solution can be computed. First, we create a copy of x_trial because we do not want to change its values. To do so, the copy() method of arrays is called; otherwise, if x were modified, x_trial would be changed as well (both names would refer to the same array).

Then, the Jacobian is computed and the following linear system is solved

$$\boldsymbol{J} \cdot \boldsymbol{\delta x} = -\boldsymbol{r}_k$$

which is more efficient and sometimes more accurate than explicitly inverting $\boldsymbol{J}$ and multiplying it by the right-hand side (rhs) as in the formula above.
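
As a quick sketch of this point (not in the original notebook; it reuses x_trial, r_trial and my_jacobian_matrix from above), both approaches yield the same increment for the trial point:

In [ ]:
import scipy.linalg as la

J_trial  = my_jacobian_matrix(x_trial)        # Jacobian at the trial point
dx_solve = la.solve(J_trial, -r_trial)        # solve J . dx = -r (preferred)
dx_inv   = np.dot(la.inv(J_trial), -r_trial)  # invert J, then multiply (avoid)
print 'dx (solve)   =', dx_solve
print 'dx (inverse) =', dx_inv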

A loop with a fixed maximum number of iterations is also created for the sake of automation. A tolerance is specified in order to decide when the residual $\boldsymbol{r}$ is small enough; this is accomplished by comparing the Euclidean norm $||\boldsymbol{r}||$ of $\boldsymbol{r}$ against the given tolerance. The resulting Python code is given below (the loop counter it stands for iteration)


In [7]:
import scipy.linalg as la

# define constants
nmaxIt = 20 # max number of iterations
tol = 1e-13 # tolerance for the norm of r

# make a copy of x_trial vector
x = x_trial.copy()

# print table header
print '%4s%13s%13s%13s%15s' % ('it', 'x0', 'x1', 'x2', 'error') # to make a nice table

# loop over number of iterations
for it in range(nmaxIt):
    
    # perform computations
    r = my_nonlinear_system(x) # residual
    e = la.norm(r)             # error
    if e < tol:                # converged?
        break                  #   yes => we can stop 'looping' now
    J  = my_jacobian_matrix(x) # compute Jacobian @ x
    dx = la.solve(J, -r)       # solve for delta_x
    x += dx                    # update x
    
    # print table line
    print '%4d%13f%13.6f%13.6f%15.6e' % (it, x[0], x[1], x[2], e)

# 'else' clause of the for-loop: runs only if no break occurred (no convergence)
else: print 'Newton-Raphson method failed to converge after %d iterations' % it

# the last results
print '%4d%13f%13.6f%13.6f%15.6e' % (it, x[0], x[1], x[2], e)


  it           x0           x1           x2          error
   0    11.171378    -2.369258     2.574205   1.920937e+01
   1     7.536949    -6.016252     4.518216   1.371121e+03
   2     5.213785    -4.257941     3.388951   4.193975e+02
   3     3.846175    -3.085263     2.901270   1.176758e+02
   4     3.210926    -2.367179     2.807643   2.965297e+01
   5     3.040960    -2.039219     2.806050   5.863901e+00
   6     3.024322    -1.971146     2.805409   7.525181e-01
   7     3.023969    -1.968479     2.805335   2.807536e-02
   8     3.023968    -1.968475     2.805335   4.238791e-05
   9     3.023968    -1.968475     2.805335   9.363986e-11
  10     3.023968    -1.968475     2.805335   0.000000e+00

Thus, the (approximate) solution (under the specified tolerance) is

$$ \boldsymbol{x} = \begin{Bmatrix} \phantom{-} 3.023968 \\ -1.968475 \\ \phantom{-} 2.805335 \end{Bmatrix} $$

If you call my_nonlinear_system again with this solution, you may find that the resulting values are very close to zero, but not as close as above, because when typing those numbers back in we truncated the digits beyond the sixth decimal place. This is what you get


In [8]:
x_sol = np.array([3.023968, -1.968475, 2.805335])
r_sol = my_nonlinear_system(x_sol)
print 'r(xsol) =', r_sol


r(xsol) = [ -5.88909317e-06   6.85136097e-06  -7.39790631e-06]

and you can see that the accuracy dropped to about $10^{-6}$, corresponding more or less to the number of digits that were kept when typing the solution back in.
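
For comparison (a small check, not in the original notebook), evaluating the residual at the full-precision vector x kept from the Newton-Raphson loop above should give values at or near machine precision:

In [ ]:
# x still holds the full-precision result of the iteration loop above
print 'r(x) =', my_nonlinear_system(x)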

Numerical Jacobian

Sometimes the analytical determination of the Jacobian is too complicated. Furthermore, it is good practice to check your Jacobian function by means of numerical differentiation. This can be accomplished with the derivative function from SciPy (scipy.misc.derivative), which works by varying a scalar $t$ and computing the derivative of a scalar function $\phi(t)$; i.e. $\frac{\mathrm{d}\phi}{\mathrm{d}t}$.

In our case, we have to create a loop within a loop in order to check each component of $\boldsymbol{J}$. Moreover, we have to define $\phi(t)$ in such a way that the requested component of $\boldsymbol{r}$, say $r_i$, is computed.

The $\phi(t)$ function can be easily implemented using the lambda keyword, which defines a short anonymous function that automatically returns its result. For example:

lambda t: t**2

is a function of $t$ that returns $t^2$.
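
For instance (a trivial illustration only):

In [ ]:
square = lambda t: t**2  # equivalent to: def square(t): return t**2
print square(3.0)        # prints 9.0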

To use the numerical derivative tool, an auxiliary/temporary function is first defined that computes component $i$ of the residual $\boldsymbol{r}$ after replacing component $j$ of $\boldsymbol{x}$ with a given scalar $t$:


In [9]:
def auxiliary_residual(i, j, t, x):
    xcopy = x.copy()
    xcopy[j] = t
    return my_nonlinear_system(xcopy)[i]
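
For example (a quick check, not in the original notebook), setting $t$ equal to the current value of $x_2$ leaves the trial vector unchanged and therefore simply reproduces the first residual component:

In [ ]:
print auxiliary_residual(0, 2, x_trial[2], x_trial)  # same as r_trial[0]
print r_trial[0]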

Now, a double loop is implemented to go through each component (i, j) of the Jacobian matrix, compute the numerical derivative, and compare it against the analytical value


In [10]:
from scipy.misc import derivative # numerical derivative

J = my_jacobian_matrix(x_trial) # evaluated @ trial x
print '%13s%13s%13s' % ('Jana', 'Jnum', 'error') # to print a nice table
for i in range(3):
    for j in range(3):
        Jnum = derivative(lambda t: auxiliary_residual(i, j, t, x_trial), x_trial[j])
        print '%13.8f%13.8f%23.15e' % (J[i,j], Jnum, abs(J[i,j]-Jnum))


         Jana         Jnum        error
   5.00000000   5.00000000  0.000000000000000e+00
  12.00000000  13.00000000  1.000000000000000e+00
   1.00000000   1.00000000  0.000000000000000e+00
   2.00000000   2.00000000  0.000000000000000e+00
   5.00000000   5.00000000  0.000000000000000e+00
  27.00000000  28.00000000  1.000000000000000e+00
   3.00000000   4.00000000  1.000000000000000e+00
   3.00000000   3.00000000  0.000000000000000e+00
   8.00000000   8.00000000  0.000000000000000e+00

where it can be seen that the analytical and numerical values coincide exactly for the entries coming from the linear and quadratic terms, but differ by exactly 1 for the entries coming from the cubic terms. This is because derivative uses a default step size of dx=1.0, and the central-difference approximation of a cubic term with a unit step carries an error of exactly 1. Passing a smaller step to derivative brings the numerical values much closer to the analytical ones.
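
As a sketch of this remedy (the code below is mine, reusing J, auxiliary_residual and x_trial from above together with the dx keyword of scipy.misc.derivative), the comparison can be repeated with a smaller step:

In [ ]:
print '%13s%13s%23s' % ('Jana', 'Jnum', 'error')
for i in range(3):
    for j in range(3):
        # smaller step => smaller central-difference error (~ dx**2 here)
        Jnum = derivative(lambda t: auxiliary_residual(i, j, t, x_trial),
                          x_trial[j], dx=1e-5)
        print '%13.8f%13.8f%23.15e' % (J[i,j], Jnum, abs(J[i,j]-Jnum))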