Programming with Python

Vedran Šego, vsego.org

An algorithm

A process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer
http://www.oxforddictionaries.com/definition/english/algorithm?q=algorithm
Informal definition from Wikipedia: a set of rules that precisely defines a sequence of operations.
More precisely (albeit still a bit informal):
A sequence of actions that are always executed in a finite number of steps, used to solve a certain problem.
Simplified: a cookbook, often very general

Important parts of an algorithm:

get the input data,
solve the problem,
output the result(s).

Example: preparing a frozen pizza. Steps:

Get temperature temp and time t (written on the box).
Heat oven to the temperature temp.
Discard all packaging (recycle the carton :-)).
Cook for a time t.
Get pizza out of the oven.
Use some heat-resistant hand protection or execute the "go to the ER" subroutine.

Example of the input data:

temp = "conventional oven: 190C; fan oven: 180C",
t = between 15 and 17 minutes.

A program

A series of coded software instructions to control the operation of a computer or other machine.
http://www.oxforddictionaries.com/definition/english/programme
Simplified: precise instructions written in some programming language

Our focus

Algorithms, not programs!
We shall use programs to test algorithms and to see how they work.
Aim: Learn to solve problems using a computer, regardless of the choice of language (which may or may not be Python in the courses you take in your future work or studies).

Types of programming languages
(implementations)

Compiled

Require compilation (translation) of the code to the machine code (popular, albeit a bit meaningless, "zeroes and ones"),
Usually strictly typed,
Usually faster than interpreters.

Interpreted

Translated during the execution,
Usually untyped or loosely typed,
Usually slow(ish).

Where is Python?

Simplified: an interpreter.
Programs get semi-translated when run (pseudocompiler).
Dynamically typed (often loosely described as untyped), slower than compiled languages (but with various ways to speed it up).

No details – too technical and they depend on the specific Python implementation.

Python versions

Many implementations, but two major versions

Python 2
Python 3 ⇜

Minor differences in what we need (they will be mentioned).

An example: Hello world



In [8]:

    
print("Hello, World!")









    



Hello, World!

The output can be nicely formatted, but more on that in the near future.

What is "Hello World"?

A very simple program, used to show the basic syntax of a programming language.

See The Hello World Collection for examples in many other programming languages.

The need for such an example can be clearly seen from the examples of more complex, but still fairly readable languages (for example, various versions of C++ and Java).
Beware of the Assembler-Z80-Console and BIT examples. ☺

Note: A full program should always start with the following:

the first line: #!/usr/bin/env python3
This has no meaning on Windows, but it makes your programs easier to run on Linux and Mac OSX.
Right after that, a description of what the program does, possibly with some other info (system requirements, author's name and contact, etc.), between triple quotation marks. This is called a docstring and it will be further addressed next week.

So, a full "Hello World" program would look like this:



In [9]:

    
#!/usr/bin/env python3

"""
A program that prints the "Hello, World!" message.
"""

print("Hello, World!")









    



Hello, World!

We shall omit these elements in the lectures to save screen space, as we shall mostly present chunks of code (that are not necessarily full programs).

The following line that Spyder adds to new files

# -*- coding: utf-8 -*-

is not necessary in Python 3. However, if you are using international characters in some encoding other than UTF-8 (which you really shouldn't do!), this is a way to specify that encoding.

In this course we shall not cover encodings, as it is a very technical subject and most of the time it is enough to just use UTF-8 (find it in your editor's settings). However, if you ever build an application that needs international characters, do look up encodings on the internet (the Wikipedia page is a good start) and use UTF-8 whenever possible, as it is a widely accepted standard and the default in Python 3. You should also make sure that your editor saves files using the UTF-8 encoding (Spyder does that by default).

In Python 2, the default encoding is ASCII and the UTF-8 support has to be enabled manually.

Comments

It is often useful to add human language notes in the code. These are called comments and are ignored by a computer, but they help programmers read the code.

In Python, a comment starts with the hash sign # and ends at the end of the line.

It is a standard to always write comments in English:

Python coders from non-English speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who don't speak your language
Source: PEP 8

For example:



In [ ]:

    
#!/usr/bin/env python3

"""
A program that prints the "Hello, World!" message.
"""

# A welcome message
print("Hello, World!")
# TODO: ask user their name and save it as a file

As you can see, the above code runs just as the previous one, but a programmer reading it can get more information about the code itself.

Some editors will recognize certain tags in comments and highlight them to make them more noticeable (like the TODO tag in the previous example). Some of the more common ones are, as listed in Wikipedia's article on comments:

FIXME to mark potentially problematic code that requires special attention and/or review.
NOTE to document inner workings of code and indicate potential pitfalls.
TODO to indicate planned enhancements.
XXX to warn other programmers of problematic or misguiding code.

We shall often use the comments to denote what certain parts of the code do. These should always be descriptive and not merely rewritten code.

For example, this is good:

# Get the sum of primes in `L` as `prime_sum`
for x in L:
    if is_prime(x):
        prime_sum += x

as it makes clear what the code is doing, even to someone who doesn't "speak" Python.

This is bad:

# For each element `x` in the list, if `x` is a prime number,
# add it to `prime_sum`.
for x in L:
    if is_prime(x):
        prime_sum += x

because this comment is just a rewrite of the code that follows it and, as such, it is useless.

It is advisable to keep all lines (comments and docstrings) wrapped under 80 characters, although it shouldn't be forced when it reduces the code readability.

Input

How do we input some data?

Not surprisingly, using the function input().



In [10]:

    
x = input()
print("The value of x is", x)









    



17
The value of x is 17

Well, this x looks kinda important here. What could it be?

Variables

Used to store and retrieve data
Each has a value (which, in Python, can be undefined) and a type (partly hidden in Python)

Simple types

Number -- generally, what you'd consider an integer or a real number in mathematics.
For example: 17, -19, 17.19, 0, 2e3,...
String -- a text.
For example: "word", "This is a mighty deep, philosophical sentence.", "ŞƿҿÇïåĿ sɹǝʇɔɐɹɐɥɔ", "17",...
The so-called empty string, "", has no characters (its length is zero).
Boolean -- truth values (True and False).
NoneType -- the type of a special constant None that means "no value".
This is not a zero (a number), nor an empty string, nor anything else! None is different from any other constant and any other value that a variable can get.

Be careful: 17 is a number, while "17" is a string!
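
The difference between a number and a string that merely looks like one can be checked with the built-in function type(). The following is only a small sketch; the variable names number and text are made up for this illustration.

```python
number = 17     # a number (an integer)
text = "17"     # a string that happens to contain two digits

print(type(number).__name__)   # int
print(type(text).__name__)     # str
print(number == text)          # False: a number is never equal to a string

# The remaining simple types from the list above
for value in [17.19, "", True, None]:
    # repr() keeps the quotes, so strings are easy to tell apart from numbers
    print(repr(value), "is of type", type(value).__name__)
```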

Some not so simple types

Lists and tuples - most languages have only lists (usually called arrays),
Dictionaries,
Sets,
Objects,
Functions (yes, functions can be saved in variables as well),
...
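
To give a rough feeling for these types before they are covered properly, here is a minimal sketch with one value of each kind. All the names and values are invented for the illustration.

```python
primes = [2, 3, 5, 7]             # a list: an ordered collection that can change
point = (3, 4)                    # a tuple: like a list, but it cannot change
ages = {"Alice": 30, "Bob": 25}   # a dictionary: values looked up by a key
letters = {"a", "b", "a"}         # a set: duplicates are stored only once
shout = print                     # a function saved in a variable

shout(primes[0], point[1], ages["Bob"], len(letters))   # 2 4 25 2
```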

How do variables work?

Let us analyze this piece of code:



In [12]:

    
x = input()
print("The value of x is", x)









    



a mistery
The value of x is a mistery

Whatever is on the right hand side of the assignment = gets computed first. Then the result is assigned to the variable on the left hand side. When this is done, the next line of code is executed.

In our example, this means:

The function input() reads a sequence of characters from the standard input (usually user's keyboard, although it can be changed) and returns it as a string.
That value is then assigned to the variable x (on the left-hand side of the assignment operator =).
Now, x holds - as a string - whatever we have typed up to the first newline, i.e., up to the first Enter key (the newline itself is omitted).
The function print() now outputs its arguments to the standard output (usually the user's screen), in the order in which they were given, separated by a single space character. So,
- First, a string "The value of x is" is written out.
- Then a single space character is written out.
- Then the value of x is written out (not the string "x" itself, because x is a variable!).

In other words, if we type "17", our program will write

The value of x is 17

And if we type "a mistery", our program will write

The value of x is a mistery

Don't expect Python to do anything smart here. It just writes out the values as they were given.

Python 2 remark: In Python 2, print does not need parentheses (i.e., print "Hello, World!" is fine).
However, do include them even if writing a Python 2 program, to make it easier to port to Python 3, as well as to avoid problems in some more advanced uses you might encounter in the future.

On the simple types

Be careful! If it looks like a number, it may still not be one!

Let us try to input two numbers in the following code, also adding the descriptions of what is expected in each of the inputs:



In [13]:

    
x = input("x: ")
y = input("y: ")
print(x, "+", y, "=", x+y)









    



x: 17
y: 19
17 + 19 = 1719

What did just happen here?

The user types two numbers, which are saved -- as two strings -- in variables x and y. Then the program writes out (among other things) the value of x+y.

How would we "add" one string to another in the real world?

For example, if x = "Bruce" and y = "Wayne", what would x + y be?

It may come as a little surprise that "x + y" will not produce "Batman". Python is a well defined language that keeps Bruce's secret identity well hidden.

The result of x+y would, less surprisingly, be "BruceWayne". Notice that there is no additional space here: the strings are glued (concatenated) one to another, with no extra separators!

So, what happens if x = "17" and y = "19"?

It would be very bad if Python looked at these and decided that they were numbers only because they have nothing but digits. Maybe we want them concatenated (as opposed to adding them one to another)!

So, the result is -- by now not a bit surprisingly -- "1719", because the strings' addition + is a concatenation, regardless of the value of the strings in question.
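
The same can be tried without typing anything in, by assigning the strings directly. This is just a small sketch of the concatenation described above; the names a and b are chosen only for this illustration.

```python
x = "Bruce"
y = "Wayne"
print(x + y)         # BruceWayne
print(x + " " + y)   # Bruce Wayne (the separator has to be added by hand)

a = "17"
b = "19"
print(a + b)         # 1719, because both values are strings
```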

How do we tell Python to "treat these two as numbers"?

Converting basic types

We can explicitly tell Python to convert a string to an integer or a real number, and vice versa.



In [14]:

    
x = int(input())
y = float(input())
print("x = ", x)
print("y = ", y)
print("x+y = ", x+y)
z = 'This is a string: "' + str(x+y) + '"'
print(z)









    



17
1.23
x =  17
y =  1.23
x+y =  18.23
This is a string: "18.23"

We see three conversion functions:

int(), which takes a string and converts it to an integer. If the argument is not a string representation of an integer, an error occurs.
float(), which takes a string and converts it to a "real" number (also called floating point number, hence the name of the function). If the argument is not a string representation of a real number, an error occurs.
str(), which takes a number (among other allowed types) and converts it to a string.
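
The errors mentioned above are of the kind ValueError. The sketch below shows which strings int() accepts, using a hypothetical helper can_convert_to_int() and the try/except construct, neither of which has been covered in the lecture so far, so treat it as a preview rather than something to memorize.

```python
def can_convert_to_int(text):
    """Return True if int(text) succeeds (a helper made up for this example)."""
    try:
        int(text)
        return True
    except ValueError:
        # int() refuses anything that is not a string representation of an integer
        return False

for text in ["17", " 17 ", "17.5", "seventeen"]:
    print(repr(text), "->", can_convert_to_int(text))

print(float("17.5"))        # 17.5
print(str(17) + str(19))    # 1719 (strings again, so + concatenates)
```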

Python 2 remark: In Python 2, input() is similar to float(input()) in Python 3 (actually, eval(input()), but that is well beyond this lecture), i.e., it loads a number and returns it as a floating point number (causing an error or a strange behaviour if anything else is given as the input).
To load a string in Python 2, one has to call raw_input() (which does not exist in Python 3).

A note on the string creation: There are better ways to form the variable z (using various formatting methods), but this will have to wait a few weeks until we cover strings in more depth than here.

More on the assignments

What will the following code print?



In [15]:

    
x = 17
print("The value of x was", x)
x = x + 2
print("The value of x is", x)









    



The value of x was 17
The value of x is 19

As we said before: whatever is on the right hand side of the assignment =, gets computed first. Only after that, the result is assigned to the variable on the left hand side.

So, when Python encounters the command

x = x + 2

while the value of x is 17, it first computes x + 2 (for x = 17), which is 19. After that, it performs the assignment x = 19, so 19 becomes the new value of x (which is then printed with the second print).

In most of the modern languages, x = x + y can be written as x += y. The same shortcut works for other operators as well, i.e., x = x op y can be written as x op= y.

For basic numerical operations, this means that we have the following translation:

Expression	Shortcut
`x = x + y`	`x += y`
`x = x - y`	`x -= y`
`x = x * y`	`x *= y`
`x = x / y`	`x /= y`
`x = x // y`	`x //= y`
`x = x % y`	`x %= y`
`x = x ** y`	`x **= y`

A note on other languages: there are no increment (++) and decrement (--) operators in Python.
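
A short sketch of the shortcuts in action, restricted to the four operators that behave as in mathematics. The starting value is arbitrary.

```python
x = 17
x += 2    # x is now 19
x -= 4    # x is now 15
x *= 2    # x is now 30
x /= 4    # x is now 7.5 (real division)
x += 1    # what other languages would write as x++
print(x)  # 8.5
```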

Some special operators

Most of the operators from the previous slide have the same meaning as in mathematics (@C-people: / means the usual, i.e., real division). The three not used in mathematics are defined as follows:

x // y means floored quotient of x and y (also called integer division), i.e., x // y $:= \left\lfloor \mathsf{x}\ /\ \mathsf{y} \right\rfloor$,
x % y means the remainder of $x / y$, i.e., x % y := x - y * (x // y),
x ** y means $x^y$ (x to the power y).
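
A few concrete values may help here, especially for negative numbers, where "floored" matters. The following sketch also checks the identity x % y = x - y * (x // y) for one pair of values.

```python
print(7 // 2, 7 % 2)        # 3 1
print(-7 // 2, -7 % 2)      # -4 1 (the quotient is rounded down, not towards zero)
print(7.5 // 2, 7.5 % 2)    # 3.0 1.5
print(2 ** 10)              # 1024

x = -7
y = 2
print(x % y == x - y * (x // y))   # True
```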

Python 2 remark: In Python 2, the ordinary real division x/y works in a C-like manner, which means that x/y is equivalent to x//y if both x and y are integers.
In Python 3, x/y always means real division. In other words,

Python 2: 3//2 = 3/2 = 1, but 3/2.0 = 3.0 / 2 = 3.0 / 2.0 = 1.5;
Python 3: 3//2 = 1, but 3/2 = 3/2.0 = 3.0 / 2 = 3.0 / 2.0 = 1.5.

Real number trouble

When dealing with real numbers, one must be extremely careful!

Simple arithmetic

What happens when we try to compute a + b - a for several different real values of a and b? Fairly often, the result will not be b!



In [16]:

    
a = 10
b = 0.1
print("a =", a, "  b =", b, "  ->  ", "a + b - a =", a + b - a, "!=", b, "= b")
a = 10**7
b = 10**(-7)
print("a =", a, "  b =", b, "  ->  ", "a + b - a =", a + b - a, "!=", b, "= b")
a = 10**11
b = 10**(-11)
print("a =", a, "  b =", b, "  ->  ", "a + b - a =", a + b - a, "!=", b, "= b")









    



a = 10   b = 0.1   ->   a + b - a = 0.09999999999999964 != 0.1 = b
a = 10000000   b = 1e-07   ->   a + b - a = 1.0058283805847168e-07 != 1e-07 = b
a = 100000000000   b = 1e-11   ->   a + b - a = 0.0 != 1e-11 = b

A division

There is no such thing as a real number in a computer. They are all actually (something like) decimals with an upper limit on the number of correctly remembered digits. The rest of the digits are lost, which can produce weird results, like x * (1 / x) ≠ 1.



In [17]:

    
x = 474953
y = 1 / x
print(x * y)









    



0.9999999999999999
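
To see what is actually stored for a "simple" value like 0.1, one can convert it to Decimal from the standard library, which shows the stored value exactly. This is a sketch for curious readers; the modules decimal and math are not otherwise used in this lecture.

```python
from decimal import Decimal
import math

# The number that is really stored when we write 0.1
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# A classic consequence: an "obvious" equality fails...
print(0.1 + 0.2 == 0.3)               # False
# ...so real numbers are usually compared with some tolerance instead
print(math.isclose(0.1 + 0.2, 0.3))   # True
```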

Use integers whenever possible

Fibonacci numbers are defined as follows: $$F_0 := 0, \quad F_1 := 1, \quad F_{n+1} := F_n + F_{n-1}, \quad n \ge 1.$$ There is also a direct formula for computing $F_n$: $$F_n = \frac{\varphi^n - \psi^n}{\sqrt{5}}, \quad \varphi := \frac{1 + \sqrt{5}}{2}, \quad \psi := \frac{1 - \sqrt{5}}{2}.$$ However, in a computer, this will soon give you wrong results.

In the following code, fib1(n) returns the n-th Fibonacci number computed by a simple integer-arithmetic algorithm, while fib2(n) uses the above formula (never use the recursive definition for computation of Fibonacci numbers!).



In [20]:

    
def fib1(n):
    f0 = 0
    f1 = 1
    # After k passes through the loop, f0 is F_k and f1 is F_{k+1}
    while n > 0:
        (f0, f1) = (f1, f0 + f1)
        n -= 1
    return f0

def fib2(n):
    sqrt5 = 5 ** .5
    phi = (1 + sqrt5) / 2
    psi = (1 - sqrt5) / 2
    return int((phi**n - psi**n) / sqrt5)

n = int(input("Type n (try to go for 73 or more): "))
fib1n = fib1(n)
fib2n = fib2(n)
print("|fib1(n) - fib2(n)| = |" + str(fib1n), "-", str(fib2n) + "| =", abs(fib1n - fib2n))









    



Type n (try to go for 73 or more): 100
|fib1(n) - fib2(n)| = |354224848179261915075 - 354224848179263111168| = 1196093
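
Instead of guessing a suitable n, we can let the computer search for the first one at which the two approaches disagree. The sketch below uses its own function names (fib_exact and fib_formula are made up here) and, unlike the code above, rounds the formula's result to the nearest integer rather than cutting off the decimals, so the n it reports may differ from what the code above would give.

```python
def fib_exact(n):
    """n-th Fibonacci number using integers only."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_formula(n):
    """n-th Fibonacci number via the direct formula, rounded to an integer."""
    sqrt5 = 5 ** 0.5
    phi = (1 + sqrt5) / 2
    psi = (1 - sqrt5) / 2
    return round((phi**n - psi**n) / sqrt5)

# Look for the smallest n for which the two results differ
first_mismatch = None
for n in range(200):
    if fib_exact(n) != fib_formula(n):
        first_mismatch = n
        break
print("The formula first goes wrong for n =", first_mismatch)
```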

Even a simple addition

The following code computes and prints three sums: $$\sum_{i = 0}^{999} 0.1 = 100, \quad \sum_{i = 0}^{9999} 0.1 = 1000, \quad \text{and} \quad \sum_{i = 0}^{9999999} 0.1 = 10^6.$$



In [21]:

    
s = 0
for _ in range(1000):
    s += 0.1
print(s)
s = 0
for _ in range(10000):
    s += 0.1
print(s)
s = 0
for _ in range(10000000):
    s += 0.1
print(s)









    



99.9999999999986
1000.0000000001588
999999.9998389754

Notice how the result is sometimes smaller and sometimes bigger than the correct result.
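
Two possible ways around this, sketched for the first of the three sums: the function math.fsum() from the standard library, which keeps track of the lost digits while adding, and the "use integers" approach of counting tenths and dividing only once at the end. The variable names are chosen for this illustration.

```python
import math

# The straightforward summation, as above
naive = 0.0
for _ in range(1000):
    naive += 0.1
print(naive)                      # 99.9999999999986

# fsum() adds the same 1000 values, but rounds only once, at the very end
print(math.fsum([0.1] * 1000))    # 100.0

# Integers only: count the tenths, then divide once
tenths = 0
for _ in range(1000):
    tenths += 1
print(tenths / 10)                # 100.0
```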

Associativity of addition

We all know that for a finite set of real numbers $\{ a_1, \dots, a_n \}$, the following is true: $$\sum_{i=1}^n a_i = \sum_{i=n}^1 a_i = \sum_{i=1}^n a_{P(i)},$$ for any permutation $P$. However, in a computer, this isn't always so.



In [22]:

    
from math import pi
x = 15 * pi
# Create the list of series elements
elts = [ ]
f = 1
for k in range(1, 150, 2):
    elts.append(x**k / f)
    f *= -(k+1) * (k+2)
# Sum the elements in the original order
sin1 = 0
for el in elts:
    sin1 += el
print("sin1 =", sin1)
# Sum the elements in the reversed order
sin2 = 0
for el in reversed(elts):
    sin2 += el
print("sin2 =", sin2)
# Sum the elements from the middle one to the ones on the edge
cnt = len(elts)
mid = cnt // 2
sin3 = 0
for i in range(mid + 1):
    if mid + i < cnt:
        sin3 += elts[mid + i]
    if i:
        sin3 += elts[mid - i]
print("sin3 =", sin3)
# Sum the elements from the ones on the edge to the middle one
sin4 = 0
for i in reversed(range(mid + 1)):
    if mid + i < cnt:
        sin4 += elts[mid + i]
    if i:
        sin4 += elts[mid - i]
print("sin4 =", sin4)
print("|sin1 - sin4| =", abs(sin1 - sin4))
print("the first element:", elts[0])
print("the last element:", elts[-1])









    



sin1 = -3121.3699495895926
sin2 = -2947.8076865687467
sin3 = -2403.8076865683283
sin4 = -1768.0
|sin1 - sin4| = 1353.3699495895926
the first element: 47.12388980384689
the last element: 5.3966776616824465e-12

The above is the computation of $\sin 15\pi$ via the first $75$ elements $a_0, a_1, \dots, a_{74}$ of the Taylor series of the sine function (indexed as in the list elts), summed in four different orders:

sin1 starting from the first element and going towards the last,
sin2 starting from the last element and going towards the first,
sin3 going from the center out ($a_{37} + a_{38} + a_{36} + a_{39} + a_{35} + \dots$),
sin4 going from the edges in ($a_{74} + a_0 + a_{73} + a_1 + \dots$).

The difference between sin1 and sin4 is roughly $1353$, which may not look like much, but it is far more than the difference between any two sines should be.

You might also notice that $\sin 15\pi$ shouldn't be anywhere near $-3000$ or even $-1768$ (the correct value is $0$).

One might think that we should compute more elements of the sum, but this is not the case: the last element of the sum is only around $5.4 \cdot 10^{-12}$ (and those after it would be even smaller).

So, what happened here?

Here is a hint: take a look at how big the elements of our sums are.



In [23]:

    
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(elts)









    Out[23]:





[<matplotlib.lines.Line2D at 0x7f4f79586e90>]
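
The same hint can be read off as numbers. The sketch below rebuilds the list of elements (under the name terms, chosen for this illustration) and prints the size of the largest one, together with a rough estimate of the gap between neighbouring representable numbers of that size. The estimate multiplies the value by sys.float_info.epsilon, so it is accurate only up to a factor of about two.

```python
import math
import sys

x = 15 * math.pi

# The same series elements as in the code above
terms = []
factorial = 1
for k in range(1, 150, 2):
    terms.append(x**k / factorial)
    factorial *= -(k + 1) * (k + 2)

largest = max(abs(term) for term in terms)
# Rough distance between two neighbouring floats of that magnitude
spacing = largest * sys.float_info.epsilon

print("number of elements:", len(terms))   # 75
print("largest |element|:", largest)
print("approximate spacing of floats near it:", spacing)
```

If the spacing is in the thousands, a sum that passes through numbers of this size cannot be trusted to anything better than that, no matter how small the final result should be.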

Powers

Let us define $$f(x,n) := \underbrace{\sqrt{\sqrt{\dots\sqrt{x}}}\,\hskip-1em}_{n}\hskip1em, \quad g(x,n) := \hskip3.7em\overbrace{\hskip-3.7em\left(\left(\dots\left(x\right)^2\dots\right)^2\right)^2}^{n}.$$ In other words, $f(x, n)$ is the number that we get by taking the square root of $x$, $n$ times in a row, and $g(x, n)$ is the number we get by computing the second power of $x$, $n$ times in a row.

Obviously, $x = f(g(x, n), n) = g(f(x, n), n)$ for any $n \in \mathbb{N}$ and $x \in \mathbb{R}^+_0$. But, let us see what a computer has to say if we input some $x \ne 1$ and $n = 50, 60, \dots$:



In [24]:

    
from math import sqrt
x = float(input("x = "))
n = int(input("n = "))
t = x
for _ in range(n):
    t = sqrt(t)
for _ in range(n):
    t *= t
print("g(f(" + str(x) + ", " + str(n) + "), " + str(n) + ") =", t)
t = x
for _ in range(n):
    t *= t
for _ in range(n):
    t = sqrt(t)
print("f(g(" + str(x) + ", " + str(n) + "), " + str(n) + ") =", t)









    



x = 1.7
n = 53
g(f(1.7, 53), 53) = 1.0
f(g(1.7, 53), 53) = inf
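
To see where each of the two computations breaks down, we can count the steps instead of fixing n in advance. This is a sketch for x = 1.7 only; the counters roots and squarings are names made up here.

```python
from math import sqrt, isinf

# Keep taking square roots until the result is stored as exactly 1.0
t = 1.7
roots = 0
while t != 1.0:
    t = sqrt(t)
    roots += 1
print("square roots until t == 1.0:", roots)

# Keep squaring until the result is too big to be stored (it becomes inf)
t = 1.7
squarings = 0
while not isinf(t):
    t *= t
    squarings += 1
print("squarings until t == inf:", squarings)
```

Once t has become exactly 1.0 or inf, all the information about the original x is gone, so no amount of squaring or taking square roots can bring it back.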

Do these tiny errors really matter?

Consider the following two systems of linear equations:

$$\left\{\begin{array}{rcrcr} 1 \cdot x &+& 1 \cdot y &=& 2, \\ 1.000001 \cdot x &+& 1 \cdot y &=& 2.000001, \end{array}\right. \quad \text{and} \quad \left\{\begin{array}{rcrcr} 1 \cdot x &+& 1 \cdot y &=& 2, \\ 1.000001 \cdot x &+& 1 \cdot y &=& 1.999999. \end{array}\right.$$

What are their solutions?

The solution to the first one is $(x, y) = (1, 1)$, but the solution to the second one is $(x, y) = (-1, 3)$.

Notice that only one element was changed, and only by an amount of the order of $10^{-6}$, which could've easily been caused by one of the small errors shown before. Similar results can be achieved with arbitrarily small errors.
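
Both solutions can be verified without any rounding by using exact fractions from the standard library. The following is a sketch that solves a general system of two equations by Cramer's rule; the function solve_2x2 and its parameter names are made up for this illustration.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

one = Fraction(1)
two = Fraction(2)
# Fractions built from strings are exact, unlike the float 1.000001
almost_one = Fraction("1.000001")

first = solve_2x2(one, one, two, almost_one, one, Fraction("2.000001"))
second = solve_2x2(one, one, two, almost_one, one, Fraction("1.999999"))
print(first)    # (Fraction(1, 1), Fraction(1, 1))
print(second)   # (Fraction(-1, 1), Fraction(3, 1))
```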

Bottom line

Always be extra careful when working with "real" numbers in a computer (or, better, avoid them altogether if possible, like in the Fibonacci example)!

These errors cannot always be considered insignificant, as they can pile up and/or grow in subsequent computations.

Anyone intending to do any serious computations using computers should take the course "Numerical Analysis 1" (MATH20602), lest rockets miss, elevators fall, and bridges collapse...