This is a Jupyter notebook using Python. You can install Jupyter locally to edit and interact with this notebook.
Rootfinding is the process of solving $f(x) = 0$ for $x$. The standard assumption is that $f : \mathbb{R} \to \mathbb{R}$ is continuous. We are interested in developing general-purpose algorithms---those that can use $f(x)$ as a black box without needing to look inside. When we implement our rootfinding algorithm in software, the user will pass a function or program to compute $f(x)$. Rootfinding methods for differentiable functions may also use the derivative $f'(x)$.
Some questions immediately arise:
Let's consider some test functions, defined here along with their derivatives which we'll use later.
In [114]:
%matplotlib notebook
from matplotlib import pyplot
import numpy
tests = []
@tests.append
def f0(x):
    return x*x - 2, 2*x
@tests.append
def f1(x):
    return numpy.cos(x) - x, -numpy.sin(x) - 1
@tests.append
def f2(x):
    return numpy.exp(-numpy.abs(x)) + numpy.sin(x), numpy.exp(-numpy.abs(x))*(-numpy.sign(x)) + numpy.cos(x)
@tests.append
def f3(x):
    return x*x - x + 0.25, 2*x - 1
@tests.append
def f4(x):
    return numpy.tan(x+2), numpy.cos(x+2)**(-2)
    # return numpy.exp(-x*x), numpy.exp(-x*x)*(-2*x)  # unreachable second return, left commented out
@tests.append
def f5(x):
    return (x <= 1)*1.0, 0*x
@tests.append
def f6(x):
    return x*numpy.sin(5/x), numpy.sin(5/x) - numpy.cos(5/x)*5/x

x = numpy.linspace(-2,2,100)
pyplot.plot(x, 0*x, color='k')
for f in tests:
    pyplot.plot(x, f(x)[0], label=f.__name__)
pyplot.legend(loc='upper right')
pyplot.style.use('ggplot')
pyplot.show()
Bisection is a rootfinding technique that starts with an interval $[a,b]$ containing a root and does not require derivatives.
In [65]:
def hasroot(f, a, b):
    return f(a)[0]*f(b)[0] < 0

def bisect(f, a, b, verbose=False):
    mid = (a + b)/2.
    if b-a < 1e-5:
        return mid
    if verbose:
        print('bisect', mid)
    if hasroot(f, a, mid):
        return bisect(f, a, mid, verbose)
    else:
        return bisect(f, mid, b, verbose)
The bisect function relies on the sign-change test hasroot defined above. Let's try running it:
In [66]:
bisect(tests[0], 0, 2)
Out[66]:
In [67]:
numpy.sqrt(2) - bisect(tests[0], 0, 2)
Out[67]:
We get about 5 digits of accuracy. Why? How fast did we get there?
In [68]:
bisect(tests[0], 0, 2, verbose=True)
Out[68]:
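As a back-of-the-envelope check (added here as a sketch, not an original cell), the hard-coded stopping tolerance explains both the accuracy and the iteration count.

import math
# bisect stops once the bracket is narrower than 1e-5, so the returned midpoint
# is within 5e-6 of the root -- roughly 5 correct digits.
width0, tol = 2.0, 1e-5                    # initial bracket [0, 2] and the tolerance in bisect
print(math.ceil(math.log2(width0 / tol)))  # about 18 halvings are needed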
Can you find any problems with this implementation? List them below:
Let's try running it on the rest of the test problem set:
In [69]:
for f in tests:
    print(f.__name__, bisect(f, -2, 2.1))
What's going wrong here? How can we improve the implementation and what are fundamental limitations of the algorithm?
Let's quantitatively revisit the convergence rate. A convergent rootfinding algorithm produces a sequence of approximations $x_i$ such that $$\lim_{i \to \infty} x_i = x_*$$ where $f(x_*) = 0$. For analysis, it is convenient to define the errors $e_i = x_i - x_*$. We say that an algorithm is linearly convergent if $$\lim_{i \to \infty} |e_{i+1}| / |e_i| = \rho < 1.$$ A smaller convergence factor $\rho$ represents faster convergence.
What is $\rho$ for bisection?
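One way to see the answer empirically is to track the bracket width; the sketch below (bisect_history is an added helper, not part of the original notebook) shows that the width is cut in half at every step.

def bisect_history(f, a, b, widths=None):
    # Same logic as bisect above, but records the bracket width at each level of recursion.
    if widths is None:
        widths = []
    widths.append(b - a)
    mid = (a + b)/2.
    if b - a < 1e-5:
        return mid, widths
    if hasroot(f, a, mid):
        return bisect_history(f, a, mid, widths)
    return bisect_history(f, mid, b, widths)

_, widths = bisect_history(tests[0], 0, 2)
print([widths[i+1]/widths[i] for i in range(len(widths)-1)])  # each ratio is ~0.5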
Much of numerical analysis reduces to Taylor series, the approximation $$ f(x) = f(x_0) + f'(x_0) (x-x_0) + f''(x_0) (x - x_0)^2 / 2 + \dotsb $$ centered on some reference point $x_0$. In numerical computation, it is exceedingly rare to look beyond the first-order approximation $$ \tilde f_{x_0}(x) = f(x_0) + f'(x_0)(x - x_0) . $$ Since $\tilde f_{x_0}(x)$ is a linear function, we can explicitly compute the unique solution of $\tilde f_{x_0}(x) = 0$ as $$ x = x_0 - f(x_0) / f'(x_0) . $$ This is Newton's Method (aka Newton-Raphson or Newton-Raphson-Simpson) for finding the roots of differentiable functions.
In [116]:
def newton(f, x, verbose=False):
    for i in range(100):
        fx, dfx = f(x)
        if verbose:
            print(f.__name__, i, x, fx)
        if numpy.abs(fx) < 1e-12:
            return x, fx, i
        try:
            x -= fx / dfx
        except ZeroDivisionError:
            return x, numpy.NaN, i

for f in tests:
    print(f.__name__, newton(f, 1))
In [71]:
for f in tests:
    print(f.__name__, '{0:15.12f} {1:8.2e} {2:2d}'.format(*newton(f, -0.1)))
We would like to know sharp conditions on when Newton-type algorithms converge and, if so, how fast. This theory will build on that for a general Fixed Point Iteration $x_{i+1} = g(x_i)$ where $g$ is a continuously differentiable function. Suppose that there exists a fixed point $r = g(r)$. By the mean value theorem, we have that $$ x_{i+1} - r = g(x_i) - g(r) = g'(c_i) (x_i - r) $$ for some $c_i$ with $|c_i - r| < |x_i - r|$. In other words, $|e_{i+1}| = |g'(c_i)| |e_i|$, which converges to zero if $|g'(c_i)| < 1$. If $|g'(r)| < 1$ then for any $\epsilon > 0$ there is a neighborhood of $r$ such that $|g'(c)| < |g'(r)| + \epsilon$ for all $c$ in that neighborhood. Consequently, we have:
If $g$ is continuously differentiable, $r = g(r)$, and $|g'(r)| < 1$ then the fixed point iteration $x_{i+1} = g(x_i)$ is locally linearly convergent with rate $|g'(r)|$.
In other words, $$ x_{i+1} = x_i - ??? . $$
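As a concrete check of the fixed point theorem above (an added sketch; $g(x) = \cos x$ is a standard example with fixed point $r \approx 0.739$ and $|g'(r)| = |\sin r| \approx 0.67$):

g = numpy.cos
r = 1.0
for i in range(200):
    r = g(r)              # converge to the fixed point r = cos(r) to machine precision
xi = 1.0
errors = []
for i in range(20):
    xi = g(xi)
    errors.append(abs(xi - r))
print([errors[i+1]/errors[i] for i in range(5, 10)])  # ratios approach |g'(r)| = |sin(r)|
print(abs(numpy.sin(r)))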
The following code appeared literally (including comments) in the Quake III Arena source code (late 1990s).
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
We now have vector instructions for approximate inverse square root. More at https://en.wikipedia.org/wiki/Fast_inverse_square_root
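The refinement line y = y*(threehalfs - x2*y*y) is exactly one Newton step for $f(y) = 1/y^2 - a$, whose root is $1/\sqrt{a}$; the bit-level trick only supplies the initial guess. A hedged Python sketch (added here, not from the original notebook):

def rsqrt_newton_step(y, a):
    # Newton for f(y) = 1/y**2 - a: f'(y) = -2/y**3, so
    # y - f(y)/f'(y) = y + (y - a*y**3)/2 = y*(1.5 - 0.5*a*y*y)
    return y*(1.5 - 0.5*a*y*y)

a = 2.0
y = 0.7               # crude initial guess for 1/sqrt(2) ~ 0.7071
for i in range(4):
    y = rsqrt_newton_step(y, a)
    print(y - 1/numpy.sqrt(a))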
In [72]:
def fquartic(x):
    return (x - 0.9)**4, 4*(x - 0.9)**3

newton(fquartic, 0)
Out[72]:
We only get three digits correct despite a very small residual (and it takes many iterations to get there).
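A short calculation (added here) explains both observations: the residual tolerance only controls the error through the fourth power, $$ |f(x)| = |x - 0.9|^4 < 10^{-12} \quad\Longrightarrow\quad |x - 0.9| < 10^{-3}, $$ i.e., about three digits. Moreover, at a root of multiplicity $m$, Newton's method is only linearly convergent with factor $(m-1)/m$ (here $3/4$), which is why so many iterations are needed.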
Difficulty computing zeros of polynomials can also arise when all the roots are simple. For example, the Wilkinson polynomial
$$ \prod_{i=1}^{20} (x - i) = \sum_{i=0}^{20} a_i x^i $$
has roots at each of the positive integers up to 20, but the roots are extremely sensitive to perturbations of the coefficients $a_i$, as shown in this figure from Trefethen and Bau (1999).
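A small sketch (added here; not part of the original notebook) reproduces the sensitivity with numpy.poly and numpy.roots:

numpy.random.seed(2)
wilkinson = numpy.poly(numpy.arange(1, 21))     # coefficients a_i of prod_{i=1}^{20} (x - i)
perturbed = wilkinson * (1 + 1e-10*numpy.random.randn(wilkinson.size))
print(numpy.roots(perturbed))  # the larger roots typically move far from the integers, some becoming complex pairs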
Numerical difficulties in which "correct" algorithms produce unreliable solutions almost always stem from lack of stability and/or ill conditioning.
Consider a function $f: X \to Y$ and define the absolute condition number $$ \hat\kappa = \lim_{\delta \to 0} \max_{|\delta x| < \delta} \frac{|f(x + \delta x) - f(x)|}{|\delta x|} = \max_{\delta x} \frac{|\delta f|}{|\delta x|}. $$ If $f$ is differentiable, then $\hat\kappa = |f'(x)|$.
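For example (an added one-line illustration), $f(x) = \sqrt{x}$ has $$ \hat\kappa = |f'(x)| = \frac{1}{2\sqrt{x}} , $$ which grows without bound as $x \to 0^+$ even though the function values themselves stay small.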
Floating point arithmetic $x \circledast y := \text{float}(x * y)$ is exact within a relative accuracy $\epsilon_{\text{machine}}$. Formally, $$ x \circledast y = (x * y) (1 + \epsilon) $$ for some $|\epsilon| \le \epsilon_{\text{machine}}$.
In [73]:
format((.2 - 1/3) + 2/15, '.20f')
# format((.2 - 1/3) + (1/3 - 0.2), '.20f')
# format((1 + 1e-12) - 1, '.20f')
Out[73]:
In [74]:
eps = 1
while 1 + eps > 1:
    eps /= 2
eps_machine = eps  # We call this "machine epsilon"
In [75]:
numpy.log(1 + 1e-12) - numpy.log1p(1e-12)
Out[75]:
In [76]:
(numpy.log(1 + 1e-12) - numpy.log1p(1e-12)) / numpy.log1p(1e-12)
Out[76]:
In [77]:
x = numpy.array([1,1e5,1e10,1e15])
numpy.sin(numpy.pi*x)
Out[77]:
In [78]:
numpy.sin(x)**2 + numpy.cos(x)**2 - 1
Out[78]:
In [79]:
[numpy.tan((3.14159+eps)/2) for eps in [1e-6,1e-8]], 1/numpy.cos(3.14159)**2
Out[79]:
Given the relative nature of floating point arithmetic, it is more useful to discuss the relative condition number, $$ \kappa = \max_{\delta x} \frac{|\delta f|/|f|}{|\delta x|/|x|} = \max_{\delta x} \Big[ \frac{|\delta f|/|\delta x|}{|f| / |x|} \Big] $$ or, if $f$ is differentiable, $$ \kappa = |f'(x)| \frac{|x|}{|f(x)|} . $$
How does a condition number get big?
The relative accuracy of the best-case algorithm will not be reliably better than $\epsilon_{\text{machine}}$ times the condition number. $$ \max_{\delta x} \frac{|\delta f|}{|f|} \ge \kappa \cdot \epsilon_{\text{machine}} $$
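As an illustration (an added sketch): $f(x) = x + 1$ has $\kappa = |x|/|x+1|$, which blows up near $x = -1$, and the computed result degrades accordingly.

# Forming x_demo rounds to the nearest double, a relative error of about eps_machine;
# since kappa(x -> x + 1) ~ 1/|x + 1| ~ 1e13 here, that small input error is amplified.
x_demo = -1 + 1e-13
print(x_demo + 1, abs((x_demo + 1) - 1e-13)/1e-13)  # relative error ~ kappa*eps_machine, far above eps_machine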
Suppose we want to apply Newton's method to a function that we know how to evaluate, but don't have code to differentiate. This is often because it's difficult/error-prone to write or because the interface by which we call it does not support derivatives. (Commercial packages often fall in this category.)
In [80]:
def diff(f, x, epsilon=1e-5):
    return (f(x + epsilon) - f(x)) / epsilon

diff(numpy.sin, 0.7, 1e-8) - numpy.cos(0.7)
Out[80]:
In [81]:
x = .5
diff(numpy.tan, x) - 1/numpy.cos(x)**2
Out[81]:
In [104]:
x = 3.14/2
[(eps, diff(numpy.tan, x, eps) - 1/numpy.cos(x)**2) for eps in [1e-14, 1e-12, 1e-10, 1e-8, 1e-6, 1e-4]]
Out[104]:
In [108]:
x = 1e4
[(eps, diff(numpy.log, x, eps) - 1/x) for eps in [1e-14, 1e-12, 1e-10, 1e-8, 1e-6, 1e-4, 1e-2]]
Out[108]:
In [109]:
def diff_wp(f, x, eps=1e-8):
    """Numerical derivative with Walker and Pernice (1998) choice of step"""
    h = eps * (1 + abs(x))
    return (f(x+h) - f(x)) / h

x = 1
[(eps, diff_wp(numpy.log, x, eps) - 1/x) for eps in [1e-14, 1e-12, 1e-10, 1e-8, 1e-6, 1e-4, 1e-2]]
Out[109]:
In [110]:
x = 1e-4
[(eps, diff_wp(numpy.log, x, eps) - 1/x) for eps in [1e-14, 1e-12, 1e-10, 1e-8, 1e-6, 1e-4, 1e-2]]
Out[110]:
This algorithm is imperfect, leaving some scaling responsibility to the user. It is the default in PETSc's "matrix-free" Newton-type solvers.
In [86]:
numpy.log(-1)
Out[86]:
In [95]:
x = numpy.sqrt(-1)
1/numpy.inf
Out[95]:
In [99]:
x = numpy.linspace(0,3,100)
pyplot.plot(x, numpy.sqrt(x + 1e-1))
pyplot.show()
In [100]:
numpy.tan(1e100)
Out[100]:
In [102]:
numpy.tan(1e100*(1 + 2*eps_machine))
Out[102]:
In [103]:
1e100 - 1e100*(1 + 2*eps_machine)
Out[103]:
In [123]:
x = numpy.linspace(-5,5)
pyplot.figure()
f = numpy.sin(x) + 1.1*x
pyplot.plot(x, f)
Out[123]:
The diff and diff_wp functions use a "forward difference" formula: $\tilde f'(x) := (f(x+h) - f(x))/h$.
Using the Taylor expansion of $f(x+h)$, we compute the discretization error
$$ \begin{split} \frac{f(x+h) - f(x)}{h} - f'(x) &= \frac{f(x) + f'(x) h + f''(x) h^2/2 + O(h^3) - f(x)}{h} - f'(x) \\
&= \frac{f'(x) h + f''(x) h^2/2 + O(h^3)}{h} - f'(x) \\
&= f''(x) h/2 + O(h^2)
\end{split} . $$
This is the discretization error caused by choosing a finite (not infinitesimal) differencing parameter $h$, and the leading order term depends linearly on $h$.
We have an additional source of error, rounding error, which comes from not being able to compute $f(x)$ or $f(x+h)$ exactly, nor subtract them exactly. Suppose that we can, however, compute these functions with a relative error on the order of $\epsilon_{\text{machine}}$. This leads to $$ \begin{split} \tilde f(x) &= f(x)(1 + \epsilon_1) \\ \tilde f(x \oplus h) &= \tilde f((x+h)(1 + \epsilon_2)) \\ &= f((x + h)(1 + \epsilon_2))(1 + \epsilon_3) \\ &= [f(x+h) + f'(x+h)(x+h)\epsilon_2 + O(\epsilon_2^2)](1 + \epsilon_3) \\ &= f(x+h)(1 + \epsilon_3) + f'(x+h)x\epsilon_2 + O(\epsilon_{\text{machine}}^2 + \epsilon_{\text{machine}} h) \end{split} $$
where each $\epsilon_i$ is an independent relative error on the order of $\epsilon_{\text{machine}}$ and we have used a Taylor expansion at $x+h$ to approximate $f(x \oplus h)$. We thus write the rounding error in the forward difference approximation as $$ \begin{split} \left\lvert \frac{\tilde f(x+h) \ominus \tilde f(x)}{h} - \frac{f(x+h) - f(x)}{h} \right\rvert &= \left\lvert \frac{f(x+h)(1 + \epsilon_3) + f'(x+h)x\epsilon_2 + O(\epsilon_{\text{machine}}^2 + \epsilon_{\text{machine}} h) - f(x)(1 + \epsilon_1) - f(x+h) + f(x)}{h} \right\rvert \\ &\le \frac{|f(x+h)\epsilon_3| + |f'(x+h)x\epsilon_2| + |f(x)\epsilon_1| + O(\epsilon_{\text{machine}}^2 + \epsilon_{\text{machine}}h)}{h} \\ &\le \frac{\big(2 \max_{[x,x+h]} |f| + \max_{[x,x+h]} |f'x|\big) \epsilon_{\text{machine}} + O(\epsilon_{\text{machine}}^2 + \epsilon_{\text{machine}} h)}{h} \\ &= (2\max|f| + \max|f'x|) \frac{\epsilon_{\text{machine}}}{h} + O(\epsilon_{\text{machine}}) \end{split} $$ where we have assumed that $h \ge \epsilon_{\text{machine}}$. Because this bound scales like $\epsilon_{\text{machine}}/h$, the rounding error becomes large (relative to $f'$ -- we are concerned with relative error after all) as $h \to 0$.
Suppose we would like to choose $h$ to minimize the combined discretization and rounding error, $$ h^* = \arg\min_h | f''(x) h/2 | + (2\max|f| + \max|f'x|) \frac{\epsilon_{\text{machine}}}{h} $$ (dropping the higher order terms), which we can compute by differentiating with respect to $h$ and setting the result equal to zero, $$ |f''|/2 - (2\max|f| + \max|f'x|) \frac{\epsilon_{\text{machine}}}{h^2} = 0 , $$ which can be rearranged as $$ h^* = \sqrt{\frac{4\max|f| + 2\max|f'x|}{|f''|}} \sqrt{\epsilon_{\text{machine}}} .$$ Of course this formula is of little use for choosing $h$ in practice: the whole point is to compute $f'$, which we don't yet know, much less $f''$. However, it does have value:
1e-8 (i.e., $\sqrt{\epsilon_{\text{machine}}}$) was empirically found to be about optimal for well-behaved functions.

Instead of the forward difference approximation $$ \frac{f(x+h) - f(x)}{h} $$ we could use the centered difference formula, $$ \frac{f(x+h) - f(x-h)}{2h} . $$ (One way to derive this formula is to average a forward and a backward difference. We will learn a more general method later in the course when we do interpolation.) We can compute the discretization error by Taylor expansion, $$ \frac{f(x) + f'(x)h + f''(x)h^2/2 + f'''(x)h^3/6 - f(x) + f'(x)h - f''(x)h^2/2 + f'''(x) h^3/6 + O(h^4)}{2h} = f'(x) + f'''(x)h^2/6 + O(h^3) $$ showing that the leading error term is of order $h^2$, versus order $h$ for forward differences. A similar computation including rounding error finds that the optimal $h$ is now of order $\sqrt[3]{\epsilon_{\text{machine}}}$, so the best attainable accuracy is $\epsilon_{\text{machine}}^{2/3}$. This accuracy improvement (versus $\sqrt{\epsilon_{\text{machine}}}$) is significant, but we'll also see that it is twice as expensive when computing derivatives of multivariate functions.
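A hedged comparison sketch (added here; diff_centered is a new helper, not part of the notebook) sweeps $h$ for both formulas on $\sin$ at $x=1$; the forward difference is best near $h \approx \sqrt{\epsilon_{\text{machine}}} \approx 10^{-8}$, while the centered difference is best near $h \approx \epsilon_{\text{machine}}^{1/3} \approx 10^{-5}$ with error around $\epsilon_{\text{machine}}^{2/3}$.

def diff_centered(f, x, h):
    return (f(x + h) - f(x - h)) / (2*h)

xc = 1.0
for h in [1e-12, 1e-10, 1e-8, 1e-6, 1e-5, 1e-4, 1e-2]:
    fwd = (numpy.sin(xc + h) - numpy.sin(xc)) / h - numpy.cos(xc)   # forward difference error
    ctr = diff_centered(numpy.sin, xc, h) - numpy.cos(xc)           # centered difference error
    print('{0:8.0e} {1:10.2e} {2:10.2e}'.format(h, fwd, ctr))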
We use the notation $\tilde f(x)$ to mean a numerical algorithm for approximating $f(x)$. Additionally, $\tilde x = x (1 + \epsilon)$ is some "good" approximation of the exact input $x$.
"nearly the right answer to nearly the right question" $$ \frac{\lvert \tilde f(x) - f(\tilde x) \rvert}{| f(\tilde x) |} \in O(\epsilon_{\text{machine}}) $$ for some $\tilde x$ that is close to $x$
"exactly the right answer to nearly the right question" $$ \tilde f(x) = f(\tilde x) $$ for some $\tilde x$ that is close to $x$
The algorithm computes $$\tilde f(x) = \text{float}(x) \oplus 1 = [x(1+\epsilon_1) + 1](1 + \epsilon_2) = (x + 1 + x\epsilon_1)(1 + \epsilon_2) $$ and we can express any $\tilde x = x(1 + \epsilon_3)$. To see if the algorithm is stable, we compute $$ \frac{\tilde f(x) - f(\tilde x)}{|f(\tilde x)|} = \frac{(x + 1 + x\epsilon_1)(1 + \epsilon_2) - [x(1+ \epsilon_3) + 1]}{\tilde x + 1} = \frac{(x + 1)\epsilon_2 + x(\epsilon_1 - \epsilon_3) + O(\epsilon^2)}{x + 1 + x\epsilon_3} . $$ If we can choose $\epsilon_3$ to make this small, then the method will be (forward) stable, and if we can make this expression exactly zero, then we'll have backward stability. Trying for the latter, we solve for $\epsilon_3$ by setting the numerator equal to zero, $$ \epsilon_3 = \frac{x + 1}{x}\epsilon_2 + \epsilon_1 + O(\epsilon^2)/x $$ which is small so long as $|x| \gg 0$, but the first term blows up as $x \to 0$. In other words, the fact that $\epsilon_2$ can produce a large error relative to the input causes this algorithm to not be backward stable. In contrast, this $x\to 0$ case is not a problem for forward stability because $\epsilon_3 = \epsilon_1$ yields error on the order of $\epsilon_2$.
Now we are interested in $$ \frac{\tilde f(x,y) - f(\tilde x,\tilde y)}{f(\tilde x,\tilde y)} $$ and we can vary both $\tilde x$ and $\tilde y$. If we choose $y=1$, then the ability to vary $\tilde y$ is powerful enough to ensure backward stability.
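A one-line calculation (added here, following the standard argument) shows why: $$ \text{fl}(x) \oplus \text{fl}(y) = \big(x(1+\epsilon_1) + y(1+\epsilon_2)\big)(1+\epsilon_3) = \underbrace{x(1+\epsilon_1)(1+\epsilon_3)}_{\tilde x} + \underbrace{y(1+\epsilon_2)(1+\epsilon_3)}_{\tilde y} , $$ so the computed sum is the exact sum of inputs $\tilde x, \tilde y$ that are each within $O(\epsilon_{\text{machine}})$ of the true inputs; with $y = 1$ (stored exactly) this gives $\tilde y = 1 + \epsilon_3$.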
A backward stable algorithm for computing $f(x)$ has relative accuracy $$ \left\lvert \frac{\tilde f(x) - f(x)}{f(x)} \right\rvert \in O(\kappa(f) \epsilon_{\text{machine}}) . $$ This is a rewording of a statement made earlier -- backward stability is the best case.