This is a Jupyter notebook using Python. You can install Jupyter locally to edit and interact with this notebook.
Here, we consider the problem of numerically evaluating definite integrals $$\int_a^b f(x) dx $$ and will mostly consider finite domains $-\infty < a < b < \infty$. As with interpolation, we will be able to create very accurate and efficient methods for certain classes of functions while other functions will be more challenging. For each integration technique we try, we're going to need to test its accuracy. Let's create some test functions and their derivatives.
In [1]:
%matplotlib notebook
import numpy
from matplotlib import pyplot
tests = []

# Each test is a class used as a simple namespace pairing an integrand f
# with its antiderivative F (so these "methods" take x directly, no self).
@tests.append
class f_expx:
    def F(x):
        return numpy.exp(2*x)/(1+x**2)
    def f(x):
        return 2*numpy.exp(2*x)/(1+x**2) - numpy.exp(2*x)/(1+x**2)**2 * 2*x

@tests.append
class f_quadratic:
    def F(x):
        return x**3/3 - x**2 + 3*x - 1
    def f(x):
        return x**2 - 2*x + 3

@tests.append
class f_tanh:
    def F(x):
        return numpy.tanh(x)
    def f(x):
        return numpy.cosh(x)**(-2)

@tests.append
class f_logx2:
    def F(x):
        return numpy.log(x**2)
    def f(x):
        return 1/x**2 * 2*x

pyplot.style.use('ggplot')
pyplot.rcParams['figure.max_open_warning'] = False
pyplot.figure()
x = numpy.linspace(-2,2)
for t in tests:
    pyplot.plot(x, t.f(x), label=t.__name__)
pyplot.ylim(-10,10)
pyplot.legend(loc='upper left')
Out[1]:
Let $f(x)$ be a continuous function and define $F(x)$ by $$ F(x) = \int_a^x f(s) ds . $$ Then $F(x)$ is uniformly continuous, differentiable, and $$ F'(x) = f(x) . $$ We say that $F$ is an antiderivative of $f$. This implies that $$ \int_a^b f(x) dx = F(b) - F(a) . $$ We will test the accuracy of our integration schemes using an antiderivative provided in our tests.
Analytically integrating an arbitrary function is hard, tends to require trickery, and is not always possible to express in closed form (e.g., elliptic integrals). In contrast, analytic differentiation involves straightforward application of the product rule and chain rule. So if we just choose an arbitrary function $F$ (the antiderivative), we can compute $f = F'$, then numerically integrate $\int_a^b f$ and compare to $F(b) - F(a)$. We have used this technique to produce the `tests` list above.
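As a quick sanity check on this construction, each `f` above should match a centered difference of its `F`. This snippet is a sketch we add here (the step size $10^{-5}$ is arbitrary, and we stay on $x > 0$ to avoid the singularity of `f_logx2` at $x=0$):

eps = 1e-5
xs = numpy.linspace(0.5, 1.5, 5)
for t in tests:
    dF = (t.F(xs + eps) - t.F(xs - eps)) / (2*eps)  # centered difference of F
    print('{:12s} max |dF - f| = {:.2e}'.format(t.__name__, numpy.max(numpy.abs(dF - t.f(xs)))))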
Newton-Cotes methods approximate $f(x)$ using piecewise polynomial functions. Polynomials are easy to integrate analytically so we need only sum up the integrals over each piecewise interval.
The midpoint method uses piecewise constant functions over some number $n$ of intervals.
In [2]:
def fint_midpoint(f, a, b, n=20):
    dx = (b - a) / n  # width of each interval
    x = numpy.linspace(a + dx/2, b - dx/2, n)  # midpoints of the n intervals
    return numpy.sum(f(x)) * dx

for t in tests:
    a, b = -2, 2.01
    I = fint_midpoint(t.f, a, b, 51)
    print('{:12s}: I={: 10e} error={: 10e}'.
          format(t.__name__, I, I - (t.F(b) - t.F(a))))
In [3]:
# How fast does this method converge?
def plot_accuracy(fint, tests, ns, ref=[1,2], plot=pyplot.loglog):
    a, b = -2, 2.001
    ns = numpy.array(ns)
    pyplot.figure()
    for t in tests:
        Is = numpy.array([fint(t.f, a, b, n) for n in ns])
        Errors = numpy.abs(Is - (t.F(b) - t.F(a)))
        plot(ns, Errors, '*', label=t.__name__)
    for k in ref:
        plot(ns, 1/ns**k, label='$1/n^{:d}$'.format(k))
    pyplot.ylabel('Error')
    pyplot.xlabel('n')
    pyplot.legend(loc='lower left')

plot_accuracy(fint_midpoint, tests, range(2,30))
A common situation with numerical integration is that the function $f(x)$ has a significant cost relative to the additions and multiplications needed to scale and sum the values. The cost per integration point is constant for most methods that we will consider, therefore the number of function evaluations $n$ is a good measure of the cost. Some algorithms may have higher cost per quadrature point in which case we will need to trade off the superlinear cost of quadrature with the linear cost of evaluating the function.
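Since the number of function evaluations is our cost model, a simple way to measure it is to wrap the integrand in a counter. This sketch is ours (the helper `counted` is not part of the notebook):

def counted(f):
    def wrapper(x):
        wrapper.evals += numpy.size(x)  # record how many points f is asked for
        return f(x)
    wrapper.evals = 0
    return wrapper

g = counted(tests[0].f)
fint_midpoint(g, -2, 2, n=100)
print('midpoint rule with n=100 used', g.evals, 'function evaluations')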
The trapezoid rule uses piecewise linear functions on each interval.
$$\begin{split} \int_a^b \Big( f(a) + \frac{f(b) - f(a)}{b - a} (x - a) \Big) \, dx &= f(a) (x-a) + \frac{f(b) - f(a)}{2(b - a)} (x - a)^2 \Big|_{x=a}^b \\ &= f(a) (b-a) + \frac{f(b) - f(a)}{2(b - a)} (b-a)^2 \\ &= \frac{b-a}{2} \big( f(a) + f(b) \big) . \end{split} $$
In [4]:
# Demonstrate fancy indexing: a list of indices selects several entries at once
x = numpy.linspace(-1,2,4)
print(x[0:2])
ix = [0,2]
print(x[ix])
print(x[[0,-1]])
In [5]:
def fint_trapezoid(f, a, b, n=20):
    dx = (b - a) / (n - 1)  # n points define n-1 intervals
    x = numpy.linspace(a, b, n)
    fx = f(x)
    fx[[0,-1]] *= .5        # endpoints belong to only one interval
    # fx[0] *= .5; fx[-1] *= .5
    return numpy.sum(fx) * dx

plot_accuracy(fint_trapezoid, tests, range(2,30))
In [6]:
for t in tests:
    a, b = -2, 2
    npoints = 10 * 2**numpy.arange(5)
    for n in npoints:
        I = fint_trapezoid(t.f, a, b, n)
        print('{:12s}: n={: 4d} I={: 10f} error={: 10f}'.
              format(t.__name__, n, I, I - (t.F(b) - t.F(a))))
In [7]:
pyplot.figure()
for t in tests[:-1]:
    h = 1/(npoints - 1)
    pyplot.loglog(h, [numpy.abs(fint_trapezoid(t.f, a, b, n) - (t.F(b) - t.F(a)))
                      for n in npoints], '*', label=t.__name__)
pyplot.loglog(h, h**2, label='$h^2$')
pyplot.xlabel('h')
pyplot.ylabel('Error')
pyplot.legend(loc='upper left')
Out[7]:
The trapezoid rule with $n$ points has an interval spacing of $h = (b-a)/(n-1)$. Let $I_h$ be the value of the integral approximated using interval spacing $h$. We have numerical evidence that the leading error term is $O(h^2)$, i.e., $$ I_h - I_0 = c h^2 + O(h^3) $$ for some as-yet unknown constant $c$ that depends on the function being integrated and the domain of integration. If we can determine $c$ from two approximations, say $I_h$ and $I_{2h}$, then we can extrapolate to $h=0$. For sufficiently small $h$, we can neglect the $O(h^3)$ term and write $$\begin{split} I_h - I_0 &= c h^2 \\ I_{2h} - I_0 &= c (2h)^2 . \end{split}$$ Subtracting these two lines gives $$ I_{h} - I_{2h} = c (h^2 - 4 h^2) , $$ which can be solved for $c$ as $$ c = \frac{I_{h} - I_{2h}}{h^2 - 4 h^2} . $$ Substituting back into the first equation, we solve for $I_0$: $$ I_0 = I_h - c h^2 = I_h + \frac{I_{h} - I_{2h}}{4 - 1} . $$ This is called Richardson extrapolation.
In [8]:
for t in tests[:-1]:
    a, b = -2, 2
    for n in [10, 20, 40, 80]:
        I_h = fint_trapezoid(t.f, a, b, n)
        I_2h = fint_trapezoid(t.f, a, b, n//2)  # roughly twice the spacing
        I_extrap = I_h + (I_h - I_2h) / 3
        I_exact = t.F(b) - t.F(a)
        print('{:12s}: n={: 4d} error={: 10f} {: 10f} {: 10f}'.
              format(t.__name__, n, I_h-I_exact, I_2h-I_exact, I_extrap-I_exact))
In [9]:
@tests.append
class f_sin10:
    def F(x):
        return numpy.sin(10*x)
    def f(x):
        return 10*numpy.cos(10*x)

def fint_richardson(f, a, b, n):
    n = (n // 2) * 2 + 1    # round up to an odd number of points
    h = (b - a) / (n - 1)
    x = numpy.linspace(a, b, n)
    fx = f(x)
    fx[[0,-1]] *= .5
    I_h = numpy.sum(fx) * h
    I_2h = numpy.sum(fx[::2]) * 2*h  # coarse trapezoid reuses every other point
    return I_h + (I_h - I_2h) / 3

plot_accuracy(fint_richardson, tests, range(3,60,2), ref=[2,3,4])
Evidently we created a fourth order accurate method by extrapolating a second order accurate method. This process can even be applied recursively!
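Recursive extrapolation of nested trapezoid rules is known as Romberg integration. Here is a minimal sketch (the name `fint_romberg` and the default depth are our choices, not from this notebook):

def fint_romberg(f, a, b, n=17, levels=4):
    assert (n - 1) % 2**(levels-1) == 0  # coarser grids must nest in the fine one
    h = (b - a) / (n - 1)
    fx = f(numpy.linspace(a, b, n))
    fx[[0,-1]] *= .5
    # Trapezoid values with spacing h, 2h, 4h, ... reuse the same evaluations
    R = [numpy.sum(fx[::2**k]) * 2**k * h for k in range(levels)]
    # Each Richardson pass cancels the next error term (denominators 3, 15, 63, ...)
    for k in range(1, levels):
        R = [R[i] + (R[i] - R[i+1]) / (4**k - 1) for i in range(len(R)-1)]
    return R[0]

print(fint_romberg(tests[0].f, -2, 2) - (tests[0].F(2) - tests[0].F(-2)))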
Recall the Legendre polynomials from the Interpolation unit. These polynomials are orthogonal, so if we have a polynomial represented in the basis of Legendre polynomials, i.e., $$ p_n(x) = \sum_i c_i P_i(x) $$ where $P_i(x)$ is the $i$th Legendre polynomial, then $$ \int_{-1}^1 p_n(x) \, dx = 2 c_0 $$ (each $P_i$ with $i \ge 1$ integrates to zero because it is orthogonal to $P_0 = 1$, while $\int_{-1}^1 P_0 \, dx = 2$).
In [10]:
def vander_legendre(x, n=None):
    if n is None:
        n = len(x)
    P = numpy.ones((len(x), n))
    if n > 1:
        P[:,1] = x
    for k in range(1,n-1):
        P[:,k+1] = ((2*k+1) * x * P[:,k] - k * P[:,k-1]) / (k + 1)
    return P

def fint_legendre_lin(f, a, b, n):
    x = numpy.linspace(a, b, n)
    fx = f(x)
    P = vander_legendre(numpy.linspace(-1,1,n))
    c = numpy.linalg.solve(P, fx)
    return c[0] * (b-a)  # 2*c[0] on [-1,1], scaled by the Jacobian (b-a)/2
plot_accuracy(fint_legendre_lin, tests[:-1], range(3,25), ref=[2,3,4])
In [11]:
def cosspace(a, b, n=50):
    return (a + b)/2 + (b - a)/2 * numpy.cos(numpy.linspace(0, numpy.pi, n))

def fint_legendre_cos(f, a, b, n):
    x = cosspace(a, b, n)
    fx = f(x)
    P = vander_legendre(cosspace(-1,1,n))
    c = numpy.linalg.solve(P, fx)
    return c[0] * (b-a)

plot_accuracy(fint_legendre_cos, tests[:-1], range(3,25), ref=[2,3,4])
This scheme fits the $n$ data points with an $n-1$ degree polynomial, thus is exact for integrating polynomials of degree $n-1$. The leading error term is $O(h^n)$ where $h$ is the width of the interval.
To integrate a complicated function, one can either raise the degree $n$ or partition the interval $(a,b)$.
In `fint_legendre_cos`, we created the Vandermonde matrix $P$ in the Legendre basis. But since we only need $c_0$, we could have written `c0 = inv(P)[0].dot(fx)` (just use the first row of $P^{-1}$). In other words, our integration scheme is
$$ \int_a^b f(x) \, dx \approx (b-a) \sum_i w_i f(x_i) $$
where `w = inv(P)[0]`. This set $\{x_i, w_i\}$ is called a quadrature.
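Concretely, we can precompute the weights once on the reference interval and reuse them for any integrand and interval. A sketch (the variable names are ours):

nq = 8
xq = cosspace(-1, 1, nq)                      # reference points on [-1,1]
w = numpy.linalg.inv(vander_legendre(xq))[0]  # first row of P^{-1}
aq, bq = -2, 2
I = (bq - aq) * w.dot(tests[0].f((aq+bq)/2 + (bq-aq)/2 * xq))
print(I - (tests[0].F(bq) - tests[0].F(aq)))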
We used the `cosspace` points because they make interpolation stable (versus `linspace`, for example). We needed interpolation to be well-behaved for the integral of the interpolating polynomial to be a good approximation. But perhaps there are other stable points that would allow us to integrate higher-degree polynomials. To this end, suppose a polynomial of degree $2n-1$ on the interval $[-1,1]$ can be written as
$$ p_{2n-1}(x) = P_n(x) q(x) + r(x) , $$
where $P_n(x)$ is the $n$th Legendre polynomial and both $q(x)$ and $r(x)$ are polynomials of maximum degree $n-1$.
If $P_n(x_i) = 0$ for each quadrature point $x_i$, then $p_{2n-1}(x_i) = r(x_i)$, and since $P_n$ is orthogonal to every polynomial of degree less than $n$, $\int_{-1}^1 P_n(x) q(x) \, dx = 0$. So we need only integrate $r(x)$, which is done exactly by integrating its interpolating polynomial. How do we find these roots $x_i$?
In the Interpolation unit, we learned that Legendre polynomials satisfy the recurrence $$\begin{split} P_0(x) &= 1 \\ P_1(x) &= x \\ (n+1) P_{n+1}(x) &= (2n+1) x P_n(x) - n P_{n-1}(x) . \end{split}$$ It can also be shown that the derivatives satisfy a recurrence, $$ P_{n+1}'(x) = (2n+1) P_n(x) + P_{n-1}'(x) . $$
With the ability to compute $P_n(x)$ and $P_n'(x)$, we can use Newton's method to compute the roots.
It turns out that `cos(linspace(.5/(n-1), 1-.5/(n-1), n) * pi)` is a good initial guess.
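Here is a minimal sketch of the evaluation building block (our helper `legendre_eval`; turning the roots into a full quadrature with weights is left for HW6):

def legendre_eval(n, x):
    """Evaluate P_n(x) and P_n'(x) together using the two recurrences above."""
    Pm, P = numpy.ones_like(x), 1.0*x                  # P_0, P_1
    dPm, dP = numpy.zeros_like(x), numpy.ones_like(x)  # P_0', P_1'
    if n == 0:
        return Pm, dPm
    for k in range(1, n):
        # Tuple assignment evaluates the right-hand side with the old values
        Pm, P, dPm, dP = P, ((2*k+1)*x*P - k*Pm)/(k+1), dP, (2*k+1)*P + dPm
    return P, dP

n = 5
x = numpy.cos(numpy.linspace(.5/(n-1), 1-.5/(n-1), n) * numpy.pi)  # initial guess above
for i in range(10):
    P, dP = legendre_eval(n, x)
    x -= P / dP  # Newton step toward the roots of P_n
print(x)         # the 5 Gauss-Legendre points on (-1,1)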
Implementing Gauss-Legendre integration using this procedure will be part of HW6.
Below, I use the Golub-Welsch algorithm to compute the quadrature points and weights. This method is recommended for production use.
In [12]:
def fint_legendre(f, a, b, n):
    """Gauss-Legendre integration using Golub-Welsch algorithm"""
    beta = .5/numpy.sqrt(1-(2.*numpy.arange(1,n))**(-2))
    T = numpy.diag(beta,-1) + numpy.diag(beta, 1)  # Jacobi matrix for Legendre
    D, V = numpy.linalg.eig(T)     # eigenvalues are the nodes on (-1,1)
    w = V[0,:]**2 * (b-a)          # weights from the first eigenvector components
    x = (a+b)/2 + (b-a)/2 * D      # map nodes to (a,b)
    return w.dot(f(x))

fint_legendre(tests[0].f, -1, 1, 3)
plot_accuracy(fint_legendre, tests, range(3,25), ref=[2,3,4], plot=pyplot.semilogy)
pyplot.title('Gauss-Legendre Integration')
Out[12]:
In [13]:
# n=5 Gauss points integrate polynomials up to degree 2n-1 = 9 exactly
fint_legendre(lambda x: (x-.5)**9, -1, 1, n=5) - ((1-.5)**10/10 - (-1-.5)**10/10)
Out[13]: