First define a simple function for which we know the exact answer:

In [ ]:
```python
# These names (linspace, plot, etc.) were presumably provided by %pylab in the
# original notebook; explicit imports are added here so the cells run standalone.
from numpy import linspace, array, zeros, sin, cos, sum
from matplotlib.pyplot import plot, loglog, xlabel, ylabel

def f1(x):
    return 1. + x**3

a1 = 0.
b1 = 2.
int_true1 = (b1 - a1) + (b1**4 - a1**4) / 4.
print("true integral: %22.14e" % int_true1)
```

In [ ]:
```python
def plot_trap(f, a, b, n):
    x = linspace(a - 0.2, b + 0.2, 10000)  # points for smooth plot
    plot(x, f(x), 'b-')
    xj = linspace(a, b, n)
    plot(xj, f(xj), 'ro-')
    for xi in xj:
        plot([xi, xi], [0, f(xi)], 'r')  # vertical edges of the trapezoids
    plot([a, b], [0, 0], 'r')  # along x-axis

plot_trap(f1, a1, b1, 5)
```

The area of a single trapezoid is the width of the base times the average height, so between points $x_j$ and $x_{j+1}$ this gives: $$ \frac{h}{2} \left(f(x_j) + f(x_{j+1})\right). $$

Summing this up over all the trapezoids gives: $$ h\left(\frac 1 2 f(x_0) + f(x_1) + f(x_2) + \cdots + f(x_{n-2}) + \frac 1 2 f(x_{n-1})\right) = h\sum_{j=0}^{n-1} f(x_j) - \frac h 2 \left(f(x_0) + f(x_{n-1})\right) = h\sum_{j=0}^{n-1} f(x_j) - \frac h 2 \left(f(a) + f(b)\right). $$
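As a quick sanity check (my addition, not from the original notebook), we can verify numerically that summing the individual trapezoid areas gives the same result as the compact form above; the integrand and points here are arbitrary choices.

```python
import numpy as np

f = lambda x: 1. + x**3          # sample integrand (an arbitrary choice)
a, b, n = 0., 2., 5
h = (b - a) / (n - 1)
xj = np.linspace(a, b, n)
fj = f(xj)

# sum of individual trapezoid areas, (h/2)*(f(x_j) + f(x_{j+1}))
piecewise = sum(0.5 * h * (fj[j] + fj[j+1]) for j in range(n - 1))

# compact form: h*sum(f(x_j)) - (h/2)*(f(a) + f(b))
compact = h * np.sum(fj) - 0.5 * h * (fj[0] + fj[-1])

print(piecewise, compact)   # the two agree to rounding error
```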

This can be implemented as follows (note that in Python fj[-1] refers to the last element of fj, and similarly fj[-2] would be the next to last element).

In [ ]:
```python
def trapezoid(f, a, b, n):
    h = (b - a) / (n - 1)
    xj = linspace(a, b, n)
    fj = f(xj)
    int_trapezoid = h * sum(fj) - 0.5 * h * (fj[0] + fj[-1])
    return int_trapezoid
```
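As an aside on the negative indexing mentioned above, here is a quick illustration (not from the original notebook):

```python
fj = [10., 20., 30., 40.]
print(fj[-1])   # last element: prints 40.0
print(fj[-2])   # next-to-last element: prints 30.0
```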

We can test it out for the points used in the figure above:

In [ ]:
```python
n = 5
int_trap = trapezoid(f1, a1, b1, n)
error = abs(int_trap - int_true1)
print("trapezoid rule approximation: %22.14e,  error: %10.3e" % (int_trap, error))
```

Using more points gives a better approximation; try changing `n` in the cell above.

If we increase n, the number of points used, and hence decrease h, the spacing between points, we expect the error to converge to zero for reasonable functions $f(x)$.

The trapezoid rule is "second order accurate", meaning that the error goes to zero like $O(h^2)$ for a function that is sufficiently smooth (for example, if its second derivative is continuous). For small $h$, the error is expected to behave like $Ch^2 + O(h^3)~$ as $h$ goes to zero, where $C$ is some constant that depends on how smooth $f$ is.

If we double n (and thereby roughly halve h), we expect the error to go down by a factor of about 4 (from $Ch^2$ to $C(h/2)^2~$).

We can check this by trying several values of n and making a table of the errors and the ratio from one n to the next:

In [ ]:
```python
def error_table(f, a, b, nvals, int_true, method=trapezoid):
    """
    An improved version that takes the function defining the method as an
    input argument.
    """
    print("       n          approximation          error       ratio")
    last_error = 0.  # need something for the first ratio
    for n in nvals:
        int_approx = method(f, a, b, n)
        error = abs(int_approx - int_true)
        ratio = last_error / error
        last_error = error  # for the next n
        print("%8i  %22.14e  %10.3e  %10.3e" % (n, int_approx, error, ratio))

nvals = array([5, 10, 20, 40, 80, 160, 320])
error_table(f1, a1, b1, nvals, int_true1, trapezoid)
```

(Note that the first ratio reported is meaningless.)

Convergence might be easier to see in a plot. If a method is $p$th order accurate, then we expect the error to behave like $E\approx Ch^p$ for some constant $C$, for small $h$. This is hard to see on an ordinary plot; it is much easier to read off the order of accuracy from a log-log plot, since $E = Ch^p~$ means that $\log E = \log C + p\log h.$

In other words, $\log E~$ is a linear function of $\log h~$.

In [ ]:
```python
def error_plot(f, a, b, nvals, int_true, method=trapezoid):
    errvals = zeros(nvals.shape)  # initialize to the right shape
    for i in range(len(nvals)):
        n = nvals[i]
        int_approx = method(f, a, b, n)
        error = abs(int_approx - int_true)
        errvals[i] = error
    hvals = (b - a) / (nvals - 1)  # vector of h values for each n
    loglog(hvals, errvals, 'o-')
    xlabel('spacing h')
    ylabel('error')

error_plot(f1, a1, b1, nvals, int_true1, trapezoid)
```
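Since $\log E$ is linear in $\log h$, we can also estimate the order $p$ numerically by fitting a straight line to $(\log h, \log E)$; the slope is $p$. A minimal self-contained sketch of this idea (my addition; it repeats the trapezoid rule so it can run standalone):

```python
import numpy as np

def trapezoid(f, a, b, n):
    # same trapezoid rule as above, repeated here so the sketch is self-contained
    h = (b - a) / (n - 1)
    xj = np.linspace(a, b, n)
    return h * np.sum(f(xj)) - 0.5 * h * (f(a) + f(b))

f = lambda x: 1. + x**3
int_true = 6.0                       # exact integral of f on [0, 2]
nvals = np.array([10, 20, 40, 80, 160])
hvals = 2.0 / (nvals - 1)
errs = np.array([abs(trapezoid(f, 0., 2., n) - int_true) for n in nvals])

# slope of log E vs. log h estimates the order of accuracy p
p, logC = np.polyfit(np.log(hvals), np.log(errs), 1)
print("estimated order p = %.2f" % p)   # expect p close to 2
```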

Next consider a family of oscillatory test functions $f_2(x) = 1 + x^3 + \sin(kx)$, for which we can also compute the exact integral:

In [ ]:
```python
def f2_factory(k, a, b):
    def f2(x):
        return 1 + x**3 + sin(k*x)
    int_true = (b - a) + (b**4 - a**4) / 4. - (1./k) * (cos(k*b) - cos(k*a))
    return f2, int_true
```

First create a version of $f_2$ with $k=50$:

In [ ]:
```python
k = 50.
a2 = 0.
b2 = 2.
f2, int_true2 = f2_factory(k, a2, b2)
print("true integral: %22.14e" % int_true2)
```

For this function with $k=50$, using $n=10$ points is not going to give a very good approximation:

In [ ]:
```python
plot_trap(f2, a2, b2, 10)
```
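A rough heuristic for why this fails (my addition, not from the original notebook): the integrand oscillates with wavelength $2\pi/k$, so the spacing $h$ must be a small fraction of that wavelength before the trapezoid rule can even resolve the oscillations.

```python
import numpy as np

k, a, b, n = 50., 0., 2., 10
h = (b - a) / (n - 1)           # trapezoid point spacing
wavelength = 2 * np.pi / k      # period of sin(k*x)
print("points per wavelength: %.2f" % (wavelength / h))
```

Here there is less than one point per wavelength, so the oscillations are completely unresolved.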

In [ ]:
```python
error_plot(f2, a2, b2, nvals, int_true2)
```

Now make the function much more oscillatory with a larger value of $k$...

In [ ]:
```python
k = 1000.
f2, int_true2 = f2_factory(k, a2, b2)
print("true integral: %22.14e" % int_true2)
```

For the previous choice of nvals the method does not seem to be doing well:

In [ ]:
```python
nvals = array([5, 10, 20, 40, 80, 160, 320])
print("nvals = ", nvals)
error_plot(f2, a2, b2, nvals, int_true2, trapezoid)
```

Extending to a larger range of $n$ values:

In [ ]:
```python
nvals = array([5 * 2**i for i in range(12)])
print("nvals = ", nvals)
error_plot(f2, a2, b2, nvals, int_true2, trapezoid)
```

Eventually we see second order convergence and ratios that approach 4:

In [ ]:
```python
error_table(f2, a2, b2, nvals, int_true2, trapezoid)
```

There are much better methods than the Trapezoidal rule that are not much harder to implement but get much smaller errors with the same number of function evaluations. One such method is Simpson’s rule, which approximates the integral over a single interval from $x_i$ to $x_{i+1}$ by $$\int_{x_i}^{x_{i+1}} f(x)\, dx \approx \frac h 6 (f(x_i) + 4f(x_{i+1/2}) + f(x_{i+1})),$$ where $x_{i+1/2} = \frac 1 2 (x_i + x_{i+1}) = x_i + h/2.$

Derivation: The trapezoid method is derived by approximating the function on each interval by a linear function interpolating at the two endpoints of each interval and then integrating this linear function. Simpson's method is derived by approximating the function by a quadratic function interpolating at the endpoints and the center of the interval and integrating this quadratic function.
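To check the single-interval formula above, we can apply it to a cubic on one interval and compare with the exact integral; as noted later in this notebook, Simpson's rule turns out to be exact for cubics. (This check is my addition; the endpoints are arbitrary choices.)

```python
f = lambda x: x**3
xi, xip1 = 1.0, 3.0               # one interval [x_i, x_{i+1}]
h = xip1 - xi
xmid = 0.5 * (xi + xip1)          # x_{i+1/2}

# Simpson's rule on a single interval: (h/6)*(f(x_i) + 4*f(x_{i+1/2}) + f(x_{i+1}))
simpson_one = (h / 6.) * (f(xi) + 4. * f(xmid) + f(xip1))
exact = (xip1**4 - xi**4) / 4.    # integral of x^3 from 1 to 3
print(simpson_one, exact)         # both equal 20 up to rounding
```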

In [ ]:
```python
def simpson(f, a, b, n):
    h = (b - a) / (n - 1)
    xj = linspace(a, b, n)
    fj = f(xj)
    xc = linspace(a + h/2, b - h/2, n - 1)  # midpoints of cells
    fc = f(xc)
    int_simpson = (h/6.) * (2.*sum(fj) - (fj[0] + fj[-1]) + 4.*sum(fc))
    return int_simpson
```

In [ ]:
```python
k = 1000.
f2, int_true2 = f2_factory(k, a2, b2)
print("true integral: %22.14e" % int_true2)
error_table(f2, a2, b2, nvals, int_true2, simpson)
```

In [ ]:
```python
error_plot(f2, a2, b2, nvals, int_true2, trapezoid)
error_plot(f2, a2, b2, nvals, int_true2, simpson)
```

You might want to experiment with changing $k$ in the two cells above.

Simpson's method is derived by integrating a quadratic approximation of the function (as opposed to the linear approximation used in the Trapezoid Rule). In fact, Simpson's method integrates cubic polynomials exactly; we can see this by trying it on the function `f1` defined at the top of this notebook. (This is because the error between the cubic and the quadratic approximation on each interval is not zero, but it does have integral equal to zero, since it turns out to be an odd function about the midpoint.) For this reason, Simpson's Rule is fourth-order accurate in general, rather than only third-order as one might expect when going from a linear to a quadratic approximation.

As a result, the error ratios computed in the `error_table` associated with `f1` appear nonsensical. Notice also that the errors are not precisely zero; they are only zero to within machine precision.

In [ ]:
```python
error_table(f1, a1, b1, nvals, int_true1, simpson)
```