This is a small notebook to teach you some basics of Julia, and to use the power of the IPython notebook to explore finite precision arithmetic.
Julia is a new scripting language that uses a JIT (just-in-time) compiler, which allows your code to run as fast as compiled code without all the hassle.

You can use it as your personal calculator, for example to add two numbers:


In [ ]:
1+1

You need to hit shift+enter in the corresponding cell to run the code and obtain the answer. This Julia notebook allows you to modify the code (and text) and re-run it with new values.


In [ ]:
0.1949 / 15

It can handle more complicated computations; for example, let's check that \begin{equation} \lim_{n\rightarrow \infty} \left ( 1 +\frac{1}{n} \right)^n = e \end{equation} by computing


In [ ]:
n = 10;
(1+1/n)^n - e

You can increase $n$ to see how fast the expression converges; however, for $n$ large enough the expression will start to diverge. You will get an idea of why this happens in the sequel.

In the example above you declared the variable $n$, assigned it a value, and used that value to carry out a computation. In Julia you can declare and assign values to variables by


In [ ]:
a = 3;
b = 12;
c = pi;

(with the semicolon at the end of each line you don't print any output on your console) and you can perform operations with those variables:


In [ ]:
(a+b)/c

Given that Julia is a scripting language, you can easily modify the values of the variables and perform the same operation with different values


In [ ]:
b = 0
(a+b)/c

Moreover, Julia predefines a large number of common functions (see http://docs.julialang.org/en/release-0.3/stdlib/math/), for example, trigonometric functions


In [ ]:
sin(3/2*pi)

In [ ]:
cos(-pi/2)

As you can see, Julia uses floating point arithmetic: cos(-pi/2) should be exactly zero, but given that $\pi$ itself is only represented approximately (and the cosine is evaluated approximately as well), we can expect a tiny nonzero result instead of an exact zero. We will explore some of these issues in the sequel.

We can easily define functions in Julia. For example, we can write the function to evaluate a polynomial, in this case $p(x) = x^2 + 1$, which can be written as


In [ ]:
function foo(x)
    return x^2 + 1
end

Several fields and keywords are necessary to successfully define a function:

function: keyword necessary to define a function;

foo : the name of the function;

(arg1, arg2, arg3): the arguments of the function; there can be more than one;

return : this is the output of the function;

end : keyword to mark the end of the function definition.

We can call the function by typing


In [ ]:
foo(3)

In addition, it is possible to nest functions, i.e., we can declare functions that call other functions already declared. For example,


In [ ]:
function foo2(x,y)
    z = foo(x) + y
    return z^2
end

In [ ]:
foo2(3,0)

Or we can nest function calls on the fly, such as


In [ ]:
foo2(foo(1),0)

Finally, before going back to the Syllabus, we will introduce the function print. This function takes a string (which is a series of characters between quotation marks), or a series of strings separated by commas, and prints the series of characters in the console. For example,


In [ ]:
print("Hello, world! \n");
print("ping! \n", "pong \n!")

The \n character is used to signal the end of a line. If the argument is a number, it will automatically be converted to a series of characters and printed in the console. The same happens if you assign a value to a variable and then print the variable.


In [ ]:
print(1/3, "\n")
variable_to_be_print = 48920;
print(variable_to_be_print, "\n")
variable_to_be_print = "Hello World! (again)";
print(variable_to_be_print, "\n")

Now, coming back to the Syllabus, the first topic is finite precision arithmetic. Fortunately for us, Julia supports different types of precision. We will use Julia to illustrate some properties of finite precision arithmetic.

For integers, the most common representation is a so-called fixed point representation. Given that the set of integer numbers is infinite, we need to choose a subset to represent. The standard representation for integers writes a number in digits, using a fixed and limited number of them.

The representation is based on writing an integer in binary using a limited and fixed number of bits. For example, a typical int needs 32 bits, or 4 bytes (normally called int32). The first bit is used for the sign, and the rest are used to represent the number in binary. There are also ints that use 8, 16, and 64 bits, as well as unsigned ints. The main advantage of having more bits to represent a number is an increased range.

A simple combinatorial argument shows that we can write $2^{32} = 4294967296$ different numbers using 32 bits. Given that we use one digit to store the sign, the range of int32 is $-2^{31} =-2147483648$ to $2^{31}-1 = 2147483647$.
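Julia can report these limits directly; a quick check using the built-in typemax and typemin functions (bits shows the underlying binary representation, with the sign bit first):


In [ ]:
# largest and smallest values representable by an Int32
print(typemax(Int32), "\n")
print(typemin(Int32), "\n")
# the binary representation of a small Int32
print(bits(int32(5)))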

You can easily check the veracity of this statement by running


In [ ]:
print(int32(2^31-1), "\n")
print(int32(-2^31))

The result printed in your console should be the same as the one computed before. In this case the function int32 is converting a typical int in Julia (which is by default an int64) to an int32. If you try with a bigger or smaller number, such as


In [ ]:
print(int32(2^31), "\n")
print(int32(-2^31-1))

You will get an error or a completely wrong answer (depending on your Julia version); this is because the desired number exceeds the representation range allowed by an int32. For integer numbers, addition, subtraction, and multiplication are exact, up to a possible out-of-range exception. However, division is exact only if the result is itself an int32; otherwise, it will either raise an error (the behavior you would expect in Julia) or truncate to the next smaller int32 (the behavior you would expect in C). You can run the snippet below


In [ ]:
print(int32(1155/5), "\n")
print(int32(1155/2))
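
Note that 1155/2 above is actually floating-point division whose result is then converted back to an int32 (so, depending on your Julia version, it may be rounded rather than truncated). For C-style integer division, which truncates toward zero, Julia provides the built-in div function:


In [ ]:
# div performs integer division, truncating toward zero (as in C)
print(div(1155, 5), "\n")
print(div(1155, 2))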

In order to cope with division, along with other issues that will not be covered in this class, we will introduce perhaps the most important numerical "type" in scientific computing: the floating point numbers.

The floating point numbers are a finite subset of the reals, used to give a finite representation of an infinite (and uncountable) set, and they are designed as a trade-off between precision and range; for more details see https://en.wikipedia.org/wiki/Floating_point

By default Julia uses double precision floating point numbers https://en.wikipedia.org/wiki/Double-precision_floating-point_format, but it is possible to use single precision https://en.wikipedia.org/wiki/Single-precision_floating-point_format and half precision https://en.wikipedia.org/wiki/Half-precision_floating-point_format floating point numbers.

A floating point number is in general defined as \begin{equation} x = n_1.n_2...n_{L}\cdot a^{d} \end{equation} where $n_1.n_2...n_{L}$ is the significand, $L$ the precision (or number of digits in the significand), $a$ the base, and $d$ the exponent.

The base is normally fixed (binary), and the significand and exponent are written in fixed point notation.

The precision is normally related to the number of bits necessary to represent each number in a unique fashion: single uses 32 bits (or 4 bytes), double 64 bits, and half 16 bits. These types are normally denoted Float64, Float32, and Float16.

The number of bits used to represent a number tells you how fine the representation is, i.e., how many real numbers it can represent exactly; when an exact representation is not possible, the number is rounded to the closest one in the set, thus introducing a truncation (rounding) error.
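
You can inspect this layout with the built-in bits function, which returns the binary representation of a number as a string; a small illustration for the number 1.0 (the leading bit is the sign, followed by the exponent field and then the significand field):


In [ ]:
# Float16: 1 sign bit | 5 exponent bits | 10 significand bits
print(bits(float16(1.0)), "\n")
# Float64: 1 sign bit | 11 exponent bits | 52 significand bits
print(bits(float64(1.0)))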

In Julia you can easily convert any number between different precisions using the float16, float32, and float64 functions. In the example below, we approximate the number $\pi$ using different precisions.


In [ ]:
print(pi)

In [ ]:
print(float64(pi))

In [ ]:
print(float32(pi))

In [ ]:
print(float16(pi))

As you can see, each time you double the number of bits used for the representation you (roughly) double the number of digits. For the correct number of digits you need to check the number of bits reserved for the significand: half precision has 11 bits (10 stored explicitly), single 24 (23 stored), and double 53 (52 stored). You can easily obtain the precision in decimal digits by multiplying the number of bits by $\log_{10}(2)$.


In [ ]:
print(53*log(2)/log(10),"\n")
print(24*log(2)/log(10),"\n")
print(11*log(2)/log(10))

You can easily see that you have about 7 correct digits for a float32 and only 3 for a float16.

For further material you may want to check the Julia documentation on how the numbers are implemented (see http://julia.readthedocs.org/en/latest/manual/integers-and-floating-point-numbers/).

Q1. Using the representation of floating point (in binary) numbers seen in class, what is the smallest positive number that can be exactly represented using a Float64? (you can use the markdown cell below to answer)

Answer: To have the smallest positive number that can be represented exactly we need to make the significand as small as possible and the exponent as negative as possible.

We have 11 bits to represent the exponent (stored with a bias), so the smallest usable exponent is -1022 (check the Wikipedia page). We then have 52 explicitly stored bits for the significand, and we want to make it as small as possible: the smallest positive significand has all bits zero except for the right-most one, i.e., it equals $2^{-52}$ (a subnormal number).

Then the smallest positive number would be $2^{-52} \cdot 2^{-1022} = 4.940656458412465 \cdot 10^{-324}$.
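
You can also evaluate this number directly (a quick check; the product below is exact, since the result is itself a representable subnormal):


In [ ]:
print(2.0^(-1022) * 2.0^(-52))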

Test your answer by running the following function


In [ ]:
print(eps(float64(0.0)),"\n")

Q2. What is the smallest positive number for single and half precision floating point numbers?

Answer: For float32 we have 8 bits for the exponent and 23 for the significand, so the smallest positive number is $2^{-2^{7}+2} \cdot 2^{-23}$; for float16 we have 5 bits for the exponent and 10 for the significand, so it is $2^{-2^4+2} \cdot 2^{-10}$.
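
As a direct check, you can evaluate these formulas in double precision and convert the results down:


In [ ]:
# smallest positive (subnormal) Float32 and Float16
print(float32(2.0^(-126) * 2.0^(-23)), "\n")
print(float16(2.0^(-14) * 2.0^(-10)))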

Test your answer by running the following functions


In [ ]:
print(eps(float32(0.0)))

In [ ]:
print(eps(float16(0.0)))

You have seen in class that finite precision arithmetic introduces errors; in particular, those errors can be big when dealing with numbers of different orders of magnitude.

Q3. What is the biggest number in the Float16 representation such that if we add it to 1, the result is always 1?

Answer: We need to find the next number after 1 that can be represented exactly. That number has the same exponent and one extra nonzero bit (the right-most) in the significand; in other words, the next exactly representable number is $1 + 2^{-10}$. Then if we add $2^{-10}$ divided by 2, the sum rounds back to exactly one.

To see if you were correct, (numerically) run


In [ ]:
eps16 = eps(float16(1.0))
print("Answer from Julia = ", eps16, "\n")
print("Theoretical answer = ", 2.0^(-10), "\n")

Now, we can check that adding eps16/2 to 1 results in exactly 1!


In [ ]:
print(float16(eps16/2 + float16(1.) ))

Q4. Repeat the same computation for single and double precision and check your answers.

Answer: Using the same reasoning as before, we have $2^{-23}$ and $2^{-52}$.


In [ ]:
eps32 = eps(float32(1.0))
print("Answer from Julia = ", eps32, "\n")
print("Theoretical answer = ", 2.0^(-23), "\n")
eps64 = eps(float64(1.0))
print("Answer from Julia = ", eps64,"\n")
print("Theoretical answer = ", 2.0^(-52), "\n")

As you will see in the sequel, the main problem of having low accuracy (i.e., a big machine epsilon) appears when we need to divide and subtract numbers.

One of the main problems for computation in a computer is that addition and subtraction are not associative operations in finite precision arithmetic. You can easily check this fact by running the following scripts, which simulate different orderings.


In [ ]:
print(float16(float16(eps16/2 + float16(1.)) - float16(1.)), "\n")
print(float16(eps16/2 + float16(float16(1.) - float16(1.))))

This is the same problem with any precision; run the same example in double precision


In [ ]:
print(eps64/2 + 1.0     - 1.0    ,"\n")
print(1.0     + eps64/2 - 1.0    ,"\n")
print(1.0     - 1.0     + eps64/2,"\n")

This becomes extremely problematic if you have a division by a small number


In [ ]:
print((1.0 + eps64/2 - 1.0)/(eps64/2),"\n")
print((1.0 - 1.0     + eps64/2)/(eps64/2), "\n")

As you can see, the first result is completely off. Imagine something like that happening when computing your GPA :p.

Now you should be able to explain why \begin{equation} \lim_{n\rightarrow \infty} \left ( 1 +\frac{1}{n} \right)^n = e \end{equation}

does not converge numerically when $n$ is large. When $n$ is large, $1/n$ is really small, so when it is added to $1$ we make a small relative error. We then multiply that approximate number by itself, slightly increasing the error, and we repeat the same operation $n$ times, where $n$ is large. This makes the final result completely off for large $n$.
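
A small sketch to see this numerically: once $1/n$ drops below half the machine epsilon, $1 + 1/n$ rounds to exactly $1$ and the computed expression collapses to $1$ (the values of $n$ below are just illustrative choices):


In [ ]:
# as n grows, 1/n is eventually lost entirely when added to 1.0
for n in [10.0^4, 10.0^8, 10.0^12, 10.0^15, 10.0^16]
    print("n = ", n, ",  (1+1/n)^n - e = ", (1+1/n)^n - e, "\n")
end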

The main question that you may be asking yourself right now is: why not just use double or quadruple precision floating point numbers all the time? The answer is that operations with bigger representations tend to be more expensive. If you don't need much precision, you may want to use single or half. One example can be found in graphics cards, which are optimized for single precision computations in order to save space and speed up rendering; your eye won't see a 0.001% variation in the color of a pixel.

Before changing the subject: you can find an in-depth exposition of floating point arithmetic at https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf.

One of the most powerful tools in Julia is its vector support; we can easily define a vector using the following syntax


In [ ]:
x = [pi e sqrt(2) sqrt(3)].'

where we just declared and assigned \begin{equation} x = \left ( \begin{array}{c} \pi \\ e \\ \sqrt{2} \\ \sqrt{3} \end{array} \right) \end{equation}

In Julia, vectors written this way are by default row vectors; to convert them into column vectors we use the transpose ".'" operator. Julia can perform operations on full vectors component-wise, using the component-wise operator "." followed by the operation


In [ ]:
print("x.^2 = ",x.^2, "\n")
print("e.^(x) = ", e.^(x))

where we just computed the numerical result of \begin{equation} \text{x.^2} = \left ( \begin{array}{c} \pi^2 \\ e^2 \\ 2 \\ 3 \end{array} \right), \, \text{and } \text{e.^x} = \left ( \begin{array}{c} e^{\pi} \\ e^e \\ e^{\sqrt{2}} \\ e^{\sqrt{3}} \end{array} \right) \end{equation}

and you can easily apply functions to vectors.


In [ ]:
print("cos(x) = ", cos(x))

Let us apply these vectorized techniques to, for example, compute the integral of $\cos(x)$ between $0$ and $\pi$. One of the most basic ways to compute this integral is to use the quadrature given by

\begin{equation} \int_{0}^{\pi} \cos(x)\, dx \approx \sum_{i=1}^{N} \cos(x_i)\, h \end{equation}

where $h =\pi/N$ and $x_i = h(i - 1/2)$.

We will see how to easily evaluate $\sum_{i=1}^{N} \cos(x_i)\, h$ using Julia.

First, we need to choose how many quadrature points we want, $N$


In [ ]:
nPoints = 11  # number of evaluation or quadrature points

We can compute the step, $h$,


In [ ]:
h = pi/nPoints  # step size

Now we need to define the points at which the function will be evaluated, i.e., the $x_i$, which are the midpoints


In [ ]:
xQuad = [1/2:1:nPoints-1/2]*h # evaluation points

We evaluate $\cos(x_i)$ for every $i$ at once, using the vectorized form,


In [ ]:
fQuad = cos(xQuad)   # functions evaluated at the evaluation points

In [ ]:
result = sum(fQuad)*h   # computing the approximated integral

As you can see, the approximation converges extremely fast.

Q5. In the space allocated below, use the same method to compute \begin{equation} \int_0^1 x^2 dx. \end{equation} How many points do you need to achieve a relative accuracy of $10^{-6}$?


In [ ]:
nPoints = 500  # number of evaluation or quadrature points
h = 1/nPoints  # step size
xQuad = [1/2:1:nPoints-1/2]*h # evaluation points
fQuad = xQuad.^2   # functions evaluated at the evaluation points
result = sum(fQuad)*h   # computing the approximated integral
print("The relative error is = ",abs(result-1/3)/(1/3))

Q6. How many points do you need to approximate \begin{equation} \int_{0}^1 \sqrt{x}\, dx \end{equation} in order to obtain a relative accuracy of $10^{-6}$?


In [ ]:
nPoints = 2100  # number of evaluation or quadrature points
h = 1/nPoints  # step size
xQuad = [1/2:1:nPoints-1/2]*h # evaluation points
fQuad = sqrt(xQuad)   # functions evaluated at the evaluation points
result = sum(fQuad)*h   # computing the approximated integral
print("The relative error is = ",abs(result-2/3)/(2/3))

In [ ]: