In [2]:
# this line is needed for the web notebook only
%matplotlib inline
CIVL4250 Numerical Methods in Engineering
Dr Dorival Pedroso
Code available on https://github.com/cpmech/CIVL4250py
See, e.g., Wikipedia.
The Newton-Raphson method (also known as Newton's method) finds an approximate solution to a nonlinear equation (or a system of nonlinear equations) by successively updating an initial trial solution using the derivative (Jacobian) of the equation (system) evaluated at the current trial solution.
Suppose we have three unknowns $x_0$, $x_1$ and $x_2$ and the following system of three nonlinear equations
$$ \begin{matrix} x_0^2 + x_1^3 + x_2 \, x_0 = 10 \\ x_1^2 + x_2^3 + x_0 \, x_1 = 20 \\ x_2^2 + x_0^3 + x_1 \, x_2 = 30 \end{matrix} $$
or, in matrix notation,
$$\boldsymbol{r}(\boldsymbol{x}) = \boldsymbol{0}$$
with
$$ \boldsymbol{r}(\boldsymbol{x}) = \begin{Bmatrix} x_0^2 + x_1^3 + x_2 \, x_0 - 10 \\ x_1^2 + x_2^3 + x_0 \, x_1 - 20 \\ x_2^2 + x_0^3 + x_1 \, x_2 - 30 \end{Bmatrix} $$
The solution via Newton's method is achieved by updating $\boldsymbol{x}$ and, hence, $\boldsymbol{r}$ using the following formulae
$$\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \boldsymbol{\delta x}$$
$$\boldsymbol{r}_{k+1} = \boldsymbol{r}(\boldsymbol{x}_{k+1})$$
with
$$\boldsymbol{\delta x} = -\boldsymbol{J}^{-1} \cdot \boldsymbol{r}_k$$
and
$$\boldsymbol{J} = \left. \frac{\mathrm{d}\boldsymbol{r}}{\mathrm{d}\boldsymbol{x}} \right\rvert_k$$
symbolising the Jacobian matrix of the nonlinear system.
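To make the update formula concrete, here is a minimal one-dimensional sketch (an illustrative aside, not part of the course example): Newton's method applied to $f(x) = x^2 - 2$, whose root is $\sqrt{2}$. In one dimension the Jacobian reduces to the scalar derivative $f'(x)$, so $\delta x = -f(x_k)/f'(x_k)$.
In [ ]:
# one-dimensional Newton's method for f(x) = x**2 - 2 (root: sqrt(2) ~ 1.414214)
f  = lambda x: x**2 - 2.0  # residual
df = lambda x: 2.0*x       # derivative (the 1D 'Jacobian')
x  = 1.0                   # initial guess
for k in range(5):
    x = x - f(x)/df(x)     # x_{k+1} = x_k - f(x_k)/f'(x_k)
    print(k, x)            # converges to 1.41421356... in a few iterations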
In Python, the nonlinear system using matrix notation (an array from numpy) can be defined as follows
In [3]:
import numpy as np

def my_nonlinear_system(x):
    # auxiliary variables
    x0, x1, x2 = x[0], x[1], x[2]
    # residuals
    r0 = x0**2.0 + x1**3.0 + x2 * x0 - 10
    r1 = x1**2.0 + x2**3.0 + x0 * x1 - 20
    r2 = x2**2.0 + x0**3.0 + x1 * x2 - 30
    # nonlinear system as a vector 'r'
    return np.array([r0, r1, r2])
It is worth noting that Newton's method may fail to converge under a number of conditions; among them, an initial guess that is not sufficiently close to the true solution.
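For instance (an illustrative aside, not part of the original notes), Newton's method applied to the single equation $f(x) = x^3 - 2x + 2 = 0$ with initial guess $x_0 = 0$ never converges: the iterates simply cycle between $0$ and $1$.
In [ ]:
# Newton's method can fail: f(x) = x**3 - 2*x + 2 starting at x = 0
f  = lambda x: x**3 - 2.0*x + 2.0
df = lambda x: 3.0*x**2 - 2.0
x = 0.0
for k in range(6):
    x = x - f(x)/df(x)
    print(k, x)  # alternates 1.0, 0.0, 1.0, 0.0, ... and never converges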
To obtain an approximate solution, an initial guess/trial $\boldsymbol{x}^{trial}$ has to be defined. This can be done as follows
In [4]:
x_trial = np.array([1.0, 2.0, 3.0])
The components of $\boldsymbol{r}$ can now be evaluated by means of
In [5]:
r_trial = my_nonlinear_system(x_trial)
print('r(x_trial) =', r_trial)
which are clearly nonzero; thus, our initial guess is not a solution.
This guess can be improved, but first the Jacobian function needs to be defined. Mathematically, the Jacobian matrix is given by
$$ \boldsymbol{J} = \begin{bmatrix} \frac{\partial r_0}{\partial x_0} & \frac{\partial r_0}{\partial x_1} & \frac{\partial r_0}{\partial x_2} \\ \frac{\partial r_1}{\partial x_0} & \frac{\partial r_1}{\partial x_1} & \frac{\partial r_1}{\partial x_2} \\ \frac{\partial r_2}{\partial x_0} & \frac{\partial r_2}{\partial x_1} & \frac{\partial r_2}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 2\,x_0+x_2 & 3\,x_1^2 & x_0 \\ x_1 & 2\,x_1+x_0 & 3\,x_2^2 \\ 3\,x_0^2 & x_2 & 2\,x_2+x_1 \end{bmatrix} $$
In [6]:
def my_jacobian_matrix(x):
    # auxiliary variables
    x0, x1, x2 = x[0], x[1], x[2]
    # Jacobian matrix
    return np.array([
        [2.0*x0 + x2, 3.0*x1**2.0,          x0],
        [         x1, 2.0*x1 + x0, 3.0*x2**2.0],
        [3.0*x0**2.0,          x2, 2.0*x2 + x1]
    ])
Having the Jacobian function defined, the next improved solution can be computed. First we create a copy of x_trial because we do not want to change its values. To do so, the copy() method of NumPy arrays is called; otherwise, modifying x would change x_trial as well.
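As a quick aside (with hypothetical values), the following sketch shows the difference between assigning and copying a NumPy array:
In [ ]:
a = np.array([1.0, 2.0, 3.0])
b = a         # 'b' is just another name for the same array (no data copied)
c = a.copy()  # 'c' owns an independent copy of the data
b[0] = 99.0
print(a[0])   # 99.0 => modifying 'b' changed 'a' as well
print(c[0])   # 1.0  => 'c' is unaffected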
Then, the Jacobian is computed and the following linear system is solved
$$\boldsymbol{J} \cdot \boldsymbol{\delta x} = -\boldsymbol{r}_k$$
which is more efficient and sometimes more accurate than simply inverting $\boldsymbol{J}$ and multiplying it by the right-hand side (rhs) as above.
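Both approaches give the same $\boldsymbol{\delta x}$, as the following sketch illustrates (assuming the functions and the trial vector x_trial defined above):
In [ ]:
import scipy.linalg as la
J = my_jacobian_matrix(x_trial)
r = my_nonlinear_system(x_trial)
dx1 = la.solve(J, -r)    # solve the linear system directly (preferred)
dx2 = la.inv(J).dot(-r)  # invert, then multiply (wasteful for large systems)
print(dx1)
print(dx2)               # same values, computed less efficiently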
A loop with a fixed maximum number of iterations is also created for the sake of automation. A tolerance is specified in order to decide when the residual $\boldsymbol{r}$ is small enough; this is accomplished by comparing the Euclidean norm $||\boldsymbol{r}||$ of $\boldsymbol{r}$ against the given tolerance. The resulting Python code is given below (the variable it means iteration)
In [7]:
import scipy.linalg as la

# define constants
nmaxIt = 20    # max number of iterations
tol    = 1e-13 # tolerance for the norm of r

# make a copy of the x_trial vector
x = x_trial.copy()

# print table header
print('%4s%13s%13s%13s%15s' % ('it', 'x0', 'x1', 'x2', 'error')) # to make a nice table

# loop over number of iterations
for it in range(nmaxIt):
    # perform computations
    r = my_nonlinear_system(x)  # residual
    e = la.norm(r)              # error
    if e < tol:                 # converged?
        break                   # yes => we can stop 'looping' now
    J  = my_jacobian_matrix(x)  # compute Jacobian @ x
    dx = la.solve(J, -r)        # solve for delta_x
    x += dx                     # update x
    # print table line
    print('%4d%13.6f%13.6f%13.6f%15.6e' % (it, x[0], x[1], x[2], e))
# fallback point: the 'else' clause runs only if the loop finished without 'break'
else:
    print('Newton-Raphson method failed to converge after %d iterations' % nmaxIt)

# the last results
print('%4d%13.6f%13.6f%13.6f%15.6e' % (it, x[0], x[1], x[2], e))
Thus, the (approximate) solution (under the specified tolerance) is
$$ \boldsymbol{x} = \begin{Bmatrix} \phantom{-} 3.023968 \\ -1.968475 \\ \phantom{-} 2.805335 \end{Bmatrix} $$
If you call my_nonlinear_system again with this solution, you will find that the resulting values are very close to zero, but not as close as above, because in copying those numbers we truncated them after the sixth decimal place. This is what you get
In [8]:
x_sol = np.array([3.023968, -1.968475, 2.805335])
r_sol = my_nonlinear_system(x_sol)
print('r(x_sol) =', r_sol)
and you can see that the accuracy dropped to about $10^{-6}$, corresponding more or less to the number of digits that were kept (after typing the solution back in).
Sometimes, the analytical determination of the Jacobian is too complicated. Furthermore, it is good practice to check your Jacobian function by means of numerical differentiation. This can be accomplished by importing the derivative tool from scipy.misc, which works by varying $t$ and computing the derivative of a scalar function $\phi(t)$; i.e. $\frac{\mathrm{d}\phi}{\mathrm{d}t}$.
In our case, we have to create a loop within a loop to check each component of $\boldsymbol{J}$. Moreover, we have to define $\phi(t)$ in such a way that the requested component of $\boldsymbol{r}$, say $r_i$, is computed.
The $\phi(t)$ function can be easily implemented using the lambda keyword, which defines a short anonymous function that automatically returns its result. For example, lambda t: t**2 is a function of $t$ that returns $t^2$.
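For example (a standalone sketch with a hypothetical $\phi$), the derivative of $\phi(t) = t^3$ at $t = 2$ is $12$; note that the accuracy of derivative depends on the step size dx and the stencil order:
In [ ]:
from scipy.misc import derivative
phi = lambda t: t**3                          # phi(t) defined with 'lambda'
print(derivative(phi, 2.0))                   # default dx=1.0, order=3: 13.0 (poor)
print(derivative(phi, 2.0, dx=1e-6))          # smaller step: ~12.0
print(derivative(phi, 2.0, dx=1.0, order=5))  # 5-point stencil: 12.0 (exact for cubics)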
To use the numerical derivative tool, first an auxiliary/temporary function that computes the residual component $r_i$ for a given scalar $t$ and indices $(i, j)$ of the Jacobian matrix is defined as follows
In [9]:
def auxiliary_residual(i, j, t, x):
    # phi(t): component r_i of the residual with x_j replaced by t
    xcopy = x.copy()
    xcopy[j] = t
    return my_nonlinear_system(xcopy)[i]
Now, a double loop is implemented to go through each component $(i,j)$ of the Jacobian matrix, compute the numerical derivative, and compare it against the analytical value
In [10]:
from scipy.misc import derivative # numerical derivative

J = my_jacobian_matrix(x_trial) # analytical Jacobian evaluated @ trial x
print('%13s%13s%23s' % ('Jana', 'Jnum', 'error')) # to print a nice table
for i in range(3):
    for j in range(3):
        # order=5 => 5-point central difference, exact for our cubic polynomials
        Jnum = derivative(lambda t: auxiliary_residual(i, j, t, x_trial), x_trial[j], dx=1.0, order=5)
        print('%13.8f%13.8f%23.15e' % (J[i,j], Jnum, abs(J[i,j]-Jnum)))
where it can be seen that the analytical and numerical values of each component coincide exactly (to machine precision). Note that this happens only because the system above is fairly simple (polynomial, so the difference stencil is exact); in other cases, small differences will appear.
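As a final cross-check (an aside beyond the original notes), SciPy's general-purpose nonlinear solver can be used to verify the result, reusing the functions defined above:
In [ ]:
from scipy.optimize import fsolve
x_check = fsolve(my_nonlinear_system, x_trial, fprime=my_jacobian_matrix)
print('x_check =', x_check)  # should match the Newton-Raphson solution above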