Approximating square root

There are a lot of useful functions that have no closed-form solution, meaning we can't just do a single computation and return the value. Instead, we need an iterative method that approximates the function value with better and better estimates. We can use this approach to approximate sine (with a Taylor series expansion), approximate square root (as we'll do in this lecture), or optimize a cost or error function (gradient descent in the next lecture).
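For example, here is a minimal sketch of the Taylor-series idea for sine; the function name sin_approx and the choice to truncate at a fixed number of terms are my own, not part of this lecture:

import math

def sin_approx(x, terms=10):
    "Approximate sin(x) by summing the first `terms` terms of its Taylor series"
    total = 0.0
    for k in range(terms):
        # kth term of the series: (-1)^k * x^(2k+1) / (2k+1)!
        total += (-1)**k * x**(2*k + 1) / math.factorial(2*k + 1)
    return total

sin_approx(1.0) # about 0.8414709848, matching math.sin(1.0)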

As with the previous lecture on uniform random variables, we must translate a recurrence relation to Python. Instead of returning a single value from the recurrence, though, we will look for convergence of the series. In other words, if we run the series out far enough, $x_{i+1}$ will be very close to $x_i$, leaving $x_i$ as a very accurate approximation of the square root. This will teach us the basics of iterative computing and prepare us for the more complicated function optimization material.

Babylonian method

To approximate square root, the idea is to pick an initial estimate, $x_0$, and then iterate with better and better estimates, $x_i$, using the (Babylonian method) recurrence relation:

$x_{i+1} = \frac{1}{2}(x_i + \frac{n}{x_i})$

There's a great deal on the web you can read to learn more about why this process works, but it relies on the fact that the average (midpoint) of $x_i$ and $n/x_i$ gets us closer to the square root of $n$: if $x_i$ underestimates $\sqrt{n}$ then $n/x_i$ overestimates it (and vice versa), so the midpoint lands in between. For example, with $n=100$ and $x_i=5$, we get $n/x_i=20$ and midpoint $x_{i+1}=12.5$, which is much closer to $10$. The cool thing is that the iteration converges quickly.

Our goal is to write a function that takes a single number and returns its square root. What do we know about this function before even beginning to code? Well, we have a clear description of the problem per our function workplan, and we also have the function signature we want:

def sqrt(n):

Because we are implementing a recurrence relation, we know that we will have a loop that computes $x_{i+1}$ from $x_{i}$.

Convergence

The terminating condition of the loop is when we have reached convergence, or close to it. Convergence just means that $x_{i+1}$ is pretty close to $x_i$. Because we can never compare two floating-point numbers for exact equality, we have to check that the difference is smaller than some precision, such as 0.00000001.
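To see why an exact equality test is hopeless, consider this tiny sketch (the classic 0.1 + 0.2 example is my own illustration, not from this lecture):

x = 0.1 + 0.2
x == 0.3 # False! floating-point round-off error
abs(x - 0.3) < 0.00000001 # True: close enough for our purposes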

Iterative method outline

Just as we have an outline for an analytics program, iterative methods all share the same basic outline. (I'm assuming here that $x_{i+1}$ depends only on a single previous value and that $i$ implicitly increments as the loop goes around.)

set $x_0$ to initial value
repeat:
    $x_{i+1} =$ function-giving-next-value$(x_i)$
until $abs(x_{i+1} - x_i) \lt precision$
return $x_{i+1}$

Because Python does not have a repeat-until loop, we fake it with an infinite loop containing a conditional that breaks us out upon convergence:

set $x_0$ to initial value
while True:
    $x_{i+1} =$ function-giving-next-value$(x_i)$
    if $abs(x_{i+1} - x_i) \lt precision$
        return $x_{i+1}$

That is a fairly straightforward implementation of the recurrence relation, but notice that we don't actually need to keep all of the previous $x_i$ around; we only need the new value and the previous value. Here is a Python implementation that tracks only those two values and follows the infinite-loop pattern:


In [19]:
def sqrt(n):
    "Compute the square root of n using the Babylonian method"
    PRECISION = 0.00000001 # stop iterating once successive estimates differ by less than this
    x_0 = 1.0 # pick any old initial value
    x_prev = x_0
    while True: # Python doesn't have a repeat-until loop so fake it
        # print(x_prev) # uncomment to trace the sequence of estimates
        x_new = 0.5 * (x_prev + n/x_prev)
        if abs(x_new - x_prev) < PRECISION:
            return x_new
        x_prev = x_new # x_{i+1} becomes x_i (the previous value)

In [17]:
sqrt(100)


Out[17]:
10.0

To test our square root approximation, we can compare it to numpy's np.sqrt() and use np.isclose() to do the comparison (exact equality would fail for the same floating-point reasons as above).


In [22]:
import numpy as np

def test_sqrt():
    def check(n):
        "Check our sqrt(n) against numpy's to within np.isclose's default tolerance"
        assert np.isclose(sqrt(n), np.sqrt(n))
    check(125348)
    check(89.2342)
    check(100)
    check(1)
    check(0)

test_sqrt()

As you can see, we can define a function within a function. The nested function isn't special in any way, except that code outside of test_sqrt() cannot see function check(). On the other hand, check() can see symbols defined outside of test_sqrt(), such as our sqrt().
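Here is a tiny sketch of that scoping rule in action (the names outer() and inner() are my own, and it assumes the sqrt() defined above):

def outer():
    def inner():
        return sqrt(4) # inner() can see names defined outside outer(), like our sqrt()
    return inner() # fine: inner() is visible here, inside outer()

outer() # 2.0
# inner() # NameError: inner() is not visible outside outer()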

Exercise

Type in (don't cut/paste) the sqrt(n) function and test with, for example, sqrt(125348.0). Make sure you get the right answer (approximately 354.045195), and then add print statements so that you can see the sequence of $x_i$ values. I get:

1.0
62674.5
31338.249992
15671.1249162
7839.56178812
3927.77547356
1979.84435152
1021.5781996
572.139273508
395.612894667
356.228988269
354.051888518
354.045194918
354.045194855

Notice how quickly it converges!


In [3]:
def sqrt_with_trace(n):
    "Compute the square root of n, printing each estimate along the way"
    PRECISION = 0.00000001 # stop iterating once successive estimates differ by less than this
    x_0 = 1.0 # pick any old initial value
    x_prev = x_0
    while True: # Python doesn't have a repeat-until loop so fake it
        print(x_prev)
        x_new = 0.5 * (x_prev + n/x_prev)
        if abs(x_new - x_prev) < PRECISION:
            return x_new
        x_prev = x_new # x_{i+1} becomes x_i (the previous value)

sqrt_with_trace(125348.000000)


1.0
62674.5
31338.249992022273
15671.12491623693
7839.561788117747
3927.7754735639414
1979.844351517673
1021.5781996033152
572.1392735077299
395.612894666841
356.2289882689982
354.0518885182295
354.04519491839494
354.04519485512014
Out[3]:
354.04519485512014

Now that we know how to implement a recurrence relation that converges, let's take a look at function optimization. At first glance, it seems completely different, but it uses the same iterative-method abstraction.
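As a preview of that shared abstraction, here is one way to factor the loop into a reusable helper. This is a sketch of my own, not code from the lecture, and the name iterate is hypothetical:

def iterate(f, x0, precision=0.00000001):
    "Apply x = f(x) repeatedly until successive values differ by less than precision"
    x_prev = x0
    while True:
        x_new = f(x_prev)
        if abs(x_new - x_prev) < precision:
            return x_new
        x_prev = x_new

iterate(lambda x: 0.5 * (x + 100/x), 1.0) # square root of 100 via the same template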