The Newton method is an iterative method to solve equations of the form $f(x)=0$, i.e. to find roots or zeros $x^\ast$ such that $f(x^\ast) = 0$. Given an initial guess $x_0$, we repeat the iteration
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$Let's implement the Newton algorithm in Julia. We start from an initial condition $x_0$:
In [1]:
x_0 = 3
Out[1]:
Julia always returns a value from any expression:
In [2]:
x_0
Out[2]:
We can use LaTeX notation and tab completion for Unicode, e.g. x\_0<TAB>
:
In [3]:
x₀ = 4
Out[3]:
In [4]:
x₀
Out[4]:
Values in Julia have associated types. We can find the type of a variable using the appropriately-named typeof
function:
In [5]:
typeof(x₀)
Out[5]:
We can guess that this means an integer with 64 bits. [This result will be Int32
if you have a 32-bit machine.]
We need to define a function whose roots we wish to find. Let's find square roots of two, for example. Julia provides a concise mathematical syntax for defining simple functions:
In [6]:
f(x) = x^2 - 2
Out[6]:
In [7]:
f(x_0)
Out[7]:
We also need the derivative function, $f'$. For the moment, let's just give it my hand. (Later, we will see a neat way to avoid this.) We may like to write f'
, using the apostrophe, '
), but the apostrophe turns out to be a special character in Julia, so we get an error if we try to define a variable or function named f'
:
In [8]:
f'(x) = 2x
However, Unicode comes to our rescue: f\prime<TAB>
:
In [9]:
f′(x) = 2x
Out[9]:
Now we can do one step of our algorithm; mathematical operations work like we expect:
In [10]:
x_1 = x_0 - f(x_0) / f′(x_0)
Out[10]:
Note that division of integers using /
gives a floating-point result:
In [11]:
typeof(x_1)
Out[11]:
Integer division
In [12]:
div(4,2)
Out[12]:
Integer division gives me a truncated value of a division if the result is actually a float
In [13]:
div(5,2)
Out[13]:
Julia has a neat built-in funcion called bits, that gives me the bynary representation of a variable
In [14]:
ξ = 1000000000000000
Out[14]:
In [15]:
bits(ξ)
Out[15]:
Non-sense could happen if you try to fit a really big number in a int64
variable for example
In [16]:
τ = 2^62
Out[16]:
In [17]:
τ^2
Out[17]:
In [18]:
bits(τ)
Out[18]:
In [19]:
τ = τ * 2
Out[19]:
A solution to this is Big Integers
In [20]:
x = big(2)
Out[20]:
In [21]:
typeof(x)
Out[21]:
In [22]:
x = big(2)^62
Out[22]:
In [23]:
x *= 2
Out[23]:
We now need to repeat such steps several times. Julia has for
loops and while
loops. As usual, we tend to use for
loops when we know how many iterations we want, and while
when we iterate until a certain condition is attained.
Let's start with a simple for
. Blocks of code in Julia always end with end
:
In [24]:
for i in 1:5
println(i) # print the value of i followed by a new line
end
Here, a variable i
is introduced that is local to the loop, i.e. it exists only inside the loop:
In [25]:
i
i
takes each value in the iterable collection 1:5
. Let's ask Julia what this object 1:5
is:
In [26]:
1:5
Out[26]:
As usual, Julia returns a value, but in this case it is (at first glance) apparently not very helpful. What type is this object?
In [27]:
typeof(1:5)
Out[27]:
We see that Julia has a special type (actually, several different types) to represent ranges, in which the elements are calculated each time a new element is required, rather than stored. We can see all the elements that will be produced using the collect
function:
In [28]:
v = collect(1:5)
Out[28]:
The result is an object of a new type, an Array
, in this case one whose elements are integers and that is of dimension 1. Note that 1
is not the number of elements in the array, which is called length
:
In [29]:
length(v)
Out[29]:
Array
s are also iterable, so we can iterate over an Array
using a for
loop. 1-dimensional arrays, also called Vector
s, are constructed using square brackets:
In [30]:
w = [3, 4, 7]
Out[30]:
In [31]:
for i in w
println(2*i)
end
We are now ready to implement the Newton method:
In [32]:
x_0 = 3
x = x_0
for i in 1:10
x_new = x - f(x) / f′(x)
println(i, "\t", x_new)
x = x_new
end
In this case, we see that the method rapidly converges to one of the square roots of two. Which root it converges to depends on the initial condition:
In [33]:
x_0 = -3
x = x_0
for i in 1:10
x_new = x - f(x) / f′(x)
println(i, "\t", x_new)
x = x_new
end
The Newton method is, in fact, not guaranteed to converge to a root (although it always does so if started "sufficiently close" to a root, at a rate that is known). Furthermore, which root it converges to can depend sensitively on the initial condition. Let's calculate this for several initial conditions.
First we create a set of initial conditions on the real line, say between -5 and 5. We now include a step size in the range:
In [34]:
initial_conditions = -5:0.125:5
collect(initial_conditions) # use tab completion for long variable names!
Out[34]:
In [35]:
showall(collect)
This range type is different:
In [36]:
typeof(-5:0.1:5)
Out[36]:
The array is also a new type: it is now an array of 64-bit floating-point numbers. We can also see that the {...}
syntax thus gives the parameters of the Array
type.
For each of these initial conditions, we will run the Newton algorithm for a certain number of steps and store the resulting value. We thus need a new array in which to store the results. One way of creating an array is using the similar
function, which, by default, creates an array of the same type and same size, but with (currently) uninitialized values:
In [37]:
roots = similar(initial_conditions)
Out[37]:
Now we do the work:
In [38]:
enumerate(initial_conditions)
Out[38]:
In [39]:
for (j, x_0) in enumerate(initial_conditions)
x = x_0
for i in 1:100
x = x - f(x) / f′(x)
end
roots[j] = x
end
Here, enumerate
iterates over initial_conditions
but returns not only the value at each step, but also a counter. (j, x_0
) is called a tuple (an ordered pair):
In [40]:
for (j, x_0) in enumerate([2,17,-100])
@show (j,x_0)
end
In [41]:
t = (3, 4)
typeof(t)
Out[41]:
NB: In Julia v0.4, tuples have been completely reworked, and the resulting type is now
Tuple{Int64,Int64}
.
In [42]:
roots
Out[42]:
Julia does not show all of the contents of an array by default. We can see everything using showall
:
In [43]:
showall(roots)
We see that, apart from the NaN
value, the results are not very exciting. Let's work harder with more initial conditions. We can find out how long the calculation takes using @time
by wrapping the code in a begin...end
block:
In [44]:
@time begin
initial_conditions = -100:0.01:100
roots = similar(initial_conditions)
for (j, x_0) in enumerate(initial_conditions)
x = x_0
for i in 1:1000
x = x - f(x) / f′(x)
end
roots[j] = x
end
end
There are now many values stored in the array, so it is hopeless to examine them:
In [45]:
length(roots)
Out[45]:
Instead, we turn to visualisation. There are several plotting packages in Julia: Gadfly
is a native Julia library that produces beautiful plots; [PyPlot
] is a Julian interface to the well-known matplotlib
Python library.
Let's start with PyPlot
. First we need to download the package. Julia provides a built-in package manager, called Pkg
, that gracefully handles dependencies, etc. To tell Julia that we require the package, we do
In [46]:
Pkg.add("PyPlot")
In [47]:
Pkg.update()
In [48]:
ENV["PYTHON"] = "/usr/bin/python3"
Pkg.build("PyCall")
This step is necessary only once. In each session where we need to use PyPlot
we do
In [ ]:
using PyPlot
Note that this process of loading a package currently can take a considerable time. Work is in progress to reduce this loading time.
In [ ]:
figure(figsize=(6,4))
plot(initial_conditions, roots, ".");
If we are used to the performance of C or Fortran, we might start to be unhappy with Julia's speed in this rather simple calculation. A close inspection of the output of the @time
operation, however, gives us a very important clue: Julia apparently allocated over a gigabyte of memory to do a simple loop with some floating-point numbers!
This is almost always a very strong signal that there is something very wrong in your Julia code! In our case, it is not at all clear what that could be. It turns out to be something very fundamental in Julia:
[almost] NEVER WORK WITH GLOBAL OBJECTS!
Due to technical details about the way that Julia works, it turns out that GLOBALS ARE BAD. What is the solution? PUT EVERYTHING INTO A FUNCTION! Let's try following this advice. We take exactly the same code and just plop it into a new function. For longer functions, Julia has an alternative syntax:
In [ ]:
function do_roots()
initial_conditions = -100:0.01:100
roots = similar(initial_conditions)
for (j, x_0) in enumerate(initial_conditions)
x = x_0
for i in 1:1000
x = x - f(x) / f′(x)
end
roots[j] = x
end
roots
end
Note the last line of the function. This will automatically return the value of the roots
object as the output of the function. So we can call it like this:
In [ ]:
roots = do_roots()
Now how long did it take?
In [ ]:
# a semi-colon suppresses output
@time roots = do_roots();
@time roots = do_roots();
It allocates a million times less memory, and is 50 times faster! This is the first lesson about performance in Julia: always put everything in a function.
Note that the first time we ran the function, it took longer. This is due to the fact that the first time a function is run with arguments of given types, the function is compiled. Subsequent runs with the same types of arguments reuse the previously-compiled code.
Exercise: Use a while
loop with a suitable condition to improve the code for the Newton method.
Our code currently is not very flexible. To make it more flexible, we would like to pass in arguments to the do_roots
function. We can make a version which takes as arguments the functions f
and f'
, for example. Functions are "first-class objects" in Julia, so they can just be passed around by name.
Let's redefine our function do_roots
to accept these arguments:
In [ ]:
function do_roots(f, f′)
initial_conditions = -100:0.01:100
roots = similar(initial_conditions)
for (j, x_0) in enumerate(initial_conditions)
x = x_0
for i in 1:1000
x = x - f(x) / f′(x)
end
roots[j] = x
end
roots
end
Note the output that Julia returns: "generic function with 2 methods". This is a sign that something interesting is happening. In fact, we have not "redefined" the function do_roots
; rather, we have defined a new version of do_roots
, which accepts a different set of arguments. (The collection and types of the arguments that a function accepts are called its type signature.)
Indeed, the function do_roots
now has two different methods or versions:
In [ ]:
methods(do_roots)
If we call do_roots
with no arguments, the first version will be used; calling it with two arguments will call the second version. The process of choosing which "version" of a function to call is called dispatch. The fundamental fact in Julia is that (almost) all functions are such "generic functions" with multiple version, i.e. Julia is one of very few languages that use multiple dispatch. This turns out to be very natural for many applications in scientific computing.
The arguments f
and f'
in the second method of do_roots
are names that are local to the function. We have functions of the same name defined globally, so we can pass those in:
In [ ]:
@time do_roots(f, f′);
You can specify the type of the variable
In [ ]:
h(x) = x^2
In [ ]:
h(3)
In [ ]:
h(3.5)
In [ ]:
h(x::Float64) = x^3
In [ ]:
methods(h)
This is faster than the first version of do_roots
, but much slower than the good version. It turns out that Julia currently cannot optimize (inline) functions passed in this way. This is something to bear in mind -- there is (currently) a trade-off between user convenience and speed.
Julia also has anonymous functions, which allow us to pass in a function that we define "in the moment", without giving it a name. For example, let's do the exercise with a more interesting function:
In [ ]:
@time roots = do_roots(x->(x-1)*(x-2)*(x-3), x->3x^2-12x+11);
We see that anonymous functions are currently very slow. However, there are workarounds, e.g. the FastAnonymous
package.
In [ ]:
Pkg.add("FastAnonymous")
using FastAnonymous
In [ ]:
f1 = @anon x->(x-1)*(x-2)*(x-3)
f2 = @anon x->3x^2-12x+11
@time roots = do_roots(f1,f2);
Let's visualize the results for this function:
In [ ]:
figure(figsize=(6,4))
plot(-100:0.01:100, roots)
xlim()
In [ ]:
figure(figsize=(6,4))
plot(-100:0.01:100, roots)
xlim(1, 3)
The previous result is still pretty boring. It turns out that the Newton method gets interesting if we look for roots of functions of complex numbers. [If you are not familiar with complex numbers, you can think of them as pairs of real numbers that have certain mathematical operations defined.]
Let's try to use the Newton method starting from initial conditions distributed in the complex plane, i.e. pairs $a + bi$, where $i = \sqrt{-1}$. First of all let's see how Julia handles complex numbers:
In [ ]:
sqrt(-1)
Oh dear, that didn't work very well. It turns out that Julia is carefully designed to respect, when possible, the type of the input argument. Indeed, let's ask Julia what it thinks sqrt
means:
In [ ]:
sqrt
We see that sqrt
is a generic function, with the following methods:
In [ ]:
methods(sqrt)
Julia gives us a list of the available methods, together with links direct to the source code on GitHub (in IJulia) or locally (in Juno).
sqrt()
acting on a Float64
returns a Float64
when it can, or throws a DomainError
when its argument is negative. To get square roots in the complex plane, we must start with a complex number.
The names of types in Julia start with capital letters, so let's try Complex
:
In [ ]:
Complex
As we will see later, types have functions with the same name that act as constructors to make objects of the type. Let's see the available functions with the name Complex
. Note that output has changed rather a lot between Julia v0.3 and Julia v0.4:
In [ ]:
methods(Complex)
Now let's try playing with Complex
:
In [ ]:
a = Complex(3)
In [ ]:
typeof(a)
In [ ]:
b = Complex(3, 4.5)
In [ ]:
typeof(b)
We see that Complex
is also parametrised by the type of its real and imaginary parts.
We can also make complex numbers directly using im
:
In [ ]:
3.0 + 4.0im
(Here, 4.0im is multiplication of 4.0 by im
, which represents $i$, the imaginary unit.)
We can do complex arithmetic:
In [ ]:
a * b
What is happening here? Julia knows how to do *
for complex numbers. Let's ask Julia what *
is:
In [ ]:
*
So, mathematical operators are generic functions too! We can list all the ways to do *
:
In [ ]:
methods(*)
All of these are defined in Julia itself. (Although the definitions for basic types like Int
are only shallow wrappers around underlying C code.) We see that generic functions can be a complicated "patchwork" made of different methods for different types.
We can find the exact method used for a given operation using @which
:
In [ ]:
@which a * b
We are now ready to think about how to generate a grid of initial conditions of the form $a+bi$ in the complex plane, $\mathbb{C}$. Firstly, we could just iterate over the initial conditions in two repeated for
s, e.g.
In [ ]:
for i in -2:1
for j in -2:1
println("($i, $j)")
end
end
In [ ]:
for i in -2:1, j in -2:1
println("($i, $j)")
end
Here we have used string interpolation: the value of the variable i
is substituted into the string instead of the sequence $i$
. [Note that this is not recommended for performance-critical applications.]
But we still require somewhere to store the results. It is natural to use a matrix. A simple way of generating a matrix is the zeros
function:
In [ ]:
zeros(3)
We see that with a single element, we generate a vector of zeros, while
In [ ]:
zeros(3, 3)
gives a matrix, i.e. a 2-dimensional Array
.
Multiple dispatch allows Julia to provide convenience versions of functions like this. For example:
In [ ]:
zeros(-3:2)
creates a vector of the same length as the range!
However, this does not work for two different ranges:
In [ ]:
zeros(-3:2, -3:2)
We can use length
for example:
In [ ]:
linear_initial_conditions = -5:0.1:5
roots = zeros(length(linear_initial_conditions), length(linear_initial_conditions))
However, if we try to store a complex number in this matrix, we find a problem:
In [ ]:
roots[1, 1] = 3+4im
An InexactError
is a sign that we are trying to put a value into a type that it "doesn't fit into", for example a Float64
into an Int64
, or, in this case, a complex number into a float. We must instead create the matrix to hold complex numbers:
In [ ]:
linear_initial_conditions = -5:0.1:5
roots = zeros(Complex128, length(linear_initial_conditions), length(linear_initial_conditions))
Here, Complex128
is just an alias (an alternative name) for Complex{Float64}
, so called because two 64-bit Float64
s require 128
bits of storage in total.
Now we can insert complex values into the matrix:
In [ ]:
roots[1, 1] = 3+4im
In [ ]:
roots
We are now ready to make a version of Newton for complex functions. We will try to find cube roots of $1$ in the complex plane, by finding zeros of the function
In [ ]:
f(z) = z^3 - 1
with derivative
In [ ]:
f′(z) = 3z^2
In [ ]:
function do_complex_roots(range=-5:0.1:5) # default value
L = length(range)
roots = zeros(Complex128, L, L)
for (i, x) in enumerate(range)
for (j, y) in enumerate(range)
z = x + y*im
for k in 1:1000
z = z - f(z) / f′(z)
end
roots[i,j] = z
end
end
roots
end
In [ ]:
roots = do_complex_roots(-5:0.1:5)
Now let's use PyPlot
to plot the result. PyPlot
only understands floating-point matrices, so we'll take the imaginary part:
In [ ]:
using PyPlot
In [ ]:
imshow(imag(roots))
Julia uses "column-major" storage, whereas Python uses "row-major", so in fact we need to flip $x$ and $y$:
In [ ]:
function do_complex_roots(range=-5:0.1:5) # default value
L = length(range)
roots = zeros(Complex128, L, L)
for (i, x) in enumerate(range)
for (j, y) in enumerate(range)
z = y + x*im
for k in 1:1000
z = z - f(z) / f′(z)
end
roots[i,j] = z
end
end
roots
end
In [ ]:
imshow(imag(do_complex_roots(-3:0.01:3)), extent=(-2,2,-2,2))
text(1, 0, L"1")
text(reim(exp(2π*im/3))..., L"e^{2\frac{\pi}{3}}")
text(reim(exp(-2π*im/3))..., L"e^{-2\frac{\pi}{3}}")
Julia has a neat syntax for constructing arrays from iterables that is very similar to mathematical notation. For example, the squares of the numbers from 1 to 10 is
$$\{x^2: x \in \{1,\ldots,10\} \},$$i.e. "the set of $x^2$ for $x$ from $1$ to $10$. In Julia we can write
In [ ]:
squares = [x^2 for x in 1:10]
Let's define a Newton function by
In [ ]:
function newton(x0, N=100)
x = x0
for i in 1:N
x = x - f(x) / f′(x)
end
x
end
Then our Newton fractal can be written very concisely as
In [ ]:
methods(newton)
Note that the effect of a default argument is simply to create an additional method.
In [ ]:
function newton_fractal(range)
[newton(b+a*im) for a in range, b in range]
end
We can add labels using PyPlot
In [ ]:
?text
In [ ]:
imshow(imag(newton_fractal(-3:0.01:3)), extent=(-3, 3, -3, 3))
text(1, 0, L"1")
text(reim(exp(2π*im/3))..., L"e^{2\frac{\pi}{3}}")
text(reim(exp(-2π*im/3))..., L"e^{-2\frac{\pi}{3}}")
Here, we have used Julia's reim
function:
In [ ]:
reim(exp(2π*im/3))
It returns a tuple. The ...
, or splat, operator, unpacks the tuple into two arguments.
The L"..."
notation is a special string macro available in the LateXStrings
package used by PyPlot
,
that makes a LaTeX string.
Exercise: Make a version that accepts functions and experiment with other complex polynomials.
How does Julia know how to iterate using for
through a vector or range? Let's look at a Unicode string:
In [1]:
s = "aαbβ" # use `\alpha<TAB>`
Out[1]:
In [2]:
typeof(s)
Out[2]:
Julia provides access to several layers between the high-level code we write and the low-level machine code that is finally produced by the compilation process. The first of those is a "lowered" version of the code, in which high-level syntax is transformed to Julia code at a lower level
In [3]:
function iterate(s)
for c in s
println(c)
end
end
Out[3]:
In [4]:
@code_lowered iterate(10)
Out[4]:
We see that there are three important functions: start
, next
and done
.
For example, iterating through a Unicode UTF8String
is complicated, since characters have different lengths:
In [5]:
s[1]
Out[5]:
In [6]:
s[2]
Out[6]:
In [7]:
s[3]
Nonetheless, we can iterate through s
:
In [8]:
function string_iterate(s)
for c in s
println(c)
end
end
Out[8]:
In [9]:
string_iterate(s)
For example, we can extract a list of the characters in s
with
In [10]:
chars = [c for c in s]
Out[10]:
In [11]:
chars[1]
Out[11]:
In [12]:
typeof(chars[1])
Out[12]:
Note that in Julia, strings are written with "
and characters with '
(as in C).
The interface that allows us to iterate over an object using for
is provided by three functions start
, next
and done
that must be defined for that type:
In [15]:
@code_lowered string_iterate(s)
Out[15]:
In [16]:
@code_typed string_iterate(s)
Out[16]:
In [17]:
@code_llvm string_iterate(s)
In [18]:
@code_native string_iterate(s)
In [13]:
start(s)
Out[13]:
In [19]:
next(s, 1)
Out[19]:
In [20]:
next(s, 2)
Out[20]:
In [21]:
done(s, 2)
Out[21]:
In [22]:
@which start(s)
Out[22]:
For more details about introspection, check out Leah Hanson's blog post.
In [23]:
@which next(s, 1)
Out[23]: