Functions

Storing individual Python commands for re-use is one thing. Creating a function that can be repeatedly applied to different input data is quite another, and of huge importance in coding.

In VBA there are two related concepts: subroutines and functions. Subroutines perform actions, functions return results (given inputs). In Python there is no distinction: any function can both return results and perform actions.

In VBA there is a standard layout. For subroutines we have

Sub name()
'
' Comments
'
    Code
End Sub

For functions we have

Function name(arguments)
'
' Comments
'
    name = ...
End Function

A similar structure holds in Python. Here we have

def name(arguments):
    """
    Comments
    """
    return value

The def keyword says that what follows is a function. Again, the name of the function follows the same rules and conventions as variables and files. The colon : at the end of the first line is essential: every indented line that follows it is part of the code executed when the function is called. The indentation is also essential: as soon as the indentation stops, the function ends (like End Function in VBA).

Here is a simple example that you can type directly into the console or into a file:


In [1]:
def add(x, y):
    """
    Add two numbers
    
    Parameters
    ----------
    
    x : float
        First input
    y : float
        Second input
    
    Returns
    -------
    
    x + y : float
    """
    return x + y

add(1, 2)


Out[1]:
3

The line add(1, 2) is not indented, so it lies outside the function definition and is executed straight away. We can also call the function repeatedly:


In [2]:
print(add(3, 4))
print(add(10.61, 5.99))


7
16.6

The lengthy comment at the start of the function is very useful to remind yourself later what the function should do. You can see this information by typing


In [3]:
help(add)


Help on function add in module __main__:

add(x, y)
    Add two numbers
    
    Parameters
    ----------
    
    x : float
        First input
    y : float
        Second input
    
    Returns
    -------
    
    x + y : float

You can also view this in Spyder by typing add in the Object window of the Help tab in the top right.

We can save the function to a file and re-use the function by importing the file. Create a new file in the Spyder editor containing

def add(x, y):
    """
    Add two numbers

    Parameters
    ----------

    x : float
        First input
    y : float
        Second input

    Returns
    -------

    x + y : float
    """
    return x + y

and save it as script2.py. Then in the console check that it works as expected:


In [4]:
import script2
script2.add(1, 2)


Out[4]:
3
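
As noted at the start of this section, a Python function can both perform an action and return a result, where VBA would need a subroutine and a function. A minimal sketch of this, assuming a made-up function name add_and_report that is not part of the course files:

```python
def add_and_report(x, y):
    """
    Add two numbers, report what is being done, and return the sum.
    """
    total = x + y
    print("Adding", x, "and", y)  # the action (what a VBA Sub would do)
    return total                  # the result (what a VBA Function would do)

result = add_and_report(1, 2)
print("Result is", result)
```

The call prints its message while it runs, and the returned value is then stored in result for later use.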

if statements and flow control

We often need to make a decision whether to do something, or to do something else. In Visual Basic this uses an If statement:

Dim count As Integer = 0
Dim message As String

If count = 0 Then
    message = "There are no items."
ElseIf count = 1 Then
    message = "There is 1 item."
Else
    message = "There are " & count & " items."
End If

The equivalent Python code is similar:


In [5]:
count = 0

if count == 0:
    message = "There are no items."
elif count == 1:
    message = "There is 1 item."
else:
    # str() converts the integer count to text so that + can join it to the strings
    message = "There are " + str(count) + " items."

print(message)


There are no items.

We see that the Visual Basic If statement becomes the lower-case if, and ElseIf is contracted to elif. The condition here compares a variable, count, to a number using the equality comparison ==; a single = is assignment in Python and cannot be used to test equality. Once again, as with functions, the line containing the if statement ends with a colon (:), and the commands to be executed are indented.

We can include as many branches of the if statement as we like using multiple elif statements. We do not need to use any elif statements, nor an else, unless we want (or need) to. We can nest if statements inside each other.
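
The two points above (an if with no elif or else, and an if nested inside another) can be sketched as follows. The function name describe and its messages are invented for illustration; the % operator gives the remainder after division.

```python
def describe(count):
    """
    Return a short description of an integer count (illustrative only).
    """
    message = "Count is " + str(count) + "."
    # An if with no elif and no else: the body is skipped when the test fails.
    if count > 100:
        message = message + " That is a lot."
    # A nested if: the inner test is only made when the outer test succeeds.
    if count > 0:
        if count % 2 == 0:
            message = message + " It is even."
        else:
            message = message + " It is odd."
    return message

print(describe(0))    # Count is 0.
print(describe(7))    # Count is 7. It is odd.
print(describe(200))  # Count is 200. That is a lot. It is even.
```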

Exercise

Write a function that, given an integer $n$, returns the $n^{\text{th}}$ Fibonacci number $F_n = F_{n-1} + F_{n-2}$, where $F_1 = 1 = F_2$. Check that it works for $n = 1, 2, 5, 10$ ($F_5 = 5$ and $F_{10} = 55$).


In [1]:
def fibonacci(n):
    if n == 1 or n == 2:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)
    
print('F_1 = ', fibonacci(1))
print('F_2 = ', fibonacci(2))
print('F_5 = ', fibonacci(5))
print('F_10 = ', fibonacci(10))


F_1 =  1
F_2 =  1
F_5 =  5
F_10 =  55

Loops

We will often want to run the same code many times on similar input. Let us suppose we want to add $n$ to $3$, where $n$ is every number between $1$ and $5$. We could do:


In [6]:
print(add(3, 1))
print(add(3, 2))
print(add(3, 3))
print(add(3, 4))
print(add(3, 5))


4
5
6
7
8

This is tedious and there's a high chance of errors.

In VBA you can define a loop that repeats commands as, for example

For n = 1 To 5
    Z = 3 + n
Next n

In Python there is also a for loop:


In [7]:
for n in 1, 2, 3, 4, 5:
    print(add(3, n))
print("Loop has ended")


4
5
6
7
8
Loop has ended

The syntax has similarities to the syntax for functions. The line defining the loop starts with for, specifies the values that n takes, and ends with a colon. The code that is executed inside the loop is indented.

As a short-hand for integer loops, we can use the range function:


In [8]:
for n in range(1, 6):
    print("n =", n)
for m in range(3):
    print("m =", m)
for k in range(2, 7, 2):
    print("k =", k)


n = 1
n = 2
n = 3
n = 4
n = 5
m = 0
m = 1
m = 2
k = 2
k = 4
k = 6

We see that

  • if two numbers are given, range returns all integers from the first number up to but not including the second in steps of $1$;
  • if one number is given, range starts from $0$;
  • if three numbers are given, the third is the step.
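
One way to check these rules without writing a loop is to convert each range to a list and print it. This is only a sketch: the variable names are arbitrary, and the last line assumes one extra fact not shown above, namely that the step may be negative to count downwards.

```python
two_args = list(range(1, 6))       # start and end
one_arg = list(range(3))           # end only, so the start is 0
three_args = list(range(2, 7, 2))  # start, end and step
countdown = list(range(5, 0, -1))  # negative step counts down; 0 is not included

print(two_args)    # [1, 2, 3, 4, 5]
print(one_arg)     # [0, 1, 2]
print(three_args)  # [2, 4, 6]
print(countdown)   # [5, 4, 3, 2, 1]
```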

In fact, Python will iterate over any collection of objects; they do not have to be integers:


In [9]:
for thing in 1, 2.5, "hello", add:
    print("thing is ", thing)


thing is  1
thing is  2.5
thing is  hello
thing is  <function add at 0x10b0bc8c8>

This is very often used in Python code: if you have some way of collecting things together, Python will happily iterate over them all.
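
As an illustration of combining a loop with a function, the Fibonacci exercise from earlier could also be solved with a for loop instead of the function calling itself. This is a sketch of one alternative, not the intended solution; the name fibonacci_loop is made up, and it assumes $n \ge 1$.

```python
def fibonacci_loop(n):
    """
    Return the n-th Fibonacci number (F_1 = F_2 = 1) using a for loop.
    """
    previous = 1  # F_1
    current = 1   # F_2
    # Each pass moves one step along the sequence; range(n - 2) is empty
    # when n is 1 or 2, so the loop body is then never executed.
    for step in range(n - 2):
        next_value = previous + current
        previous = current
        current = next_value
    return current

for n in 1, 2, 5, 10:
    print("F_" + str(n), "=", fibonacci_loop(n))
```

Each value is computed once here, so large $n$ is handled far more quickly than with the version that calls itself twice at every step.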

Containers, sequences, lists, arrays

So what are the Python ways of collecting things together? In VBA, there are arrays:

Dim A(2) AS DOUBLE

defines an array, or vector, of length $3$, starting from $0$, of double precision floating point numbers.

Dim B() AS DOUBLE

defines an array, or vector, of arbitrary length, starting from $0$, of double precision floating point numbers. You can also start arrays from values other than $0$. The individual entries are accessed and modified using, for example, A(0).

In Python there are many ways of collecting objects together. The closest to VBA are tuples and lists.

Tuples

A tuple is a sequence with fixed size, whose entries cannot be modified:


In [10]:
t1 = (0, 1, 2, 3, 4, 5)
print(t1[0])
print(t1[3])


0
3

We see that to access individual entries we use square brackets and the number of the entry, starting from $0$. All Python tuples and lists start from $0$. To check that it cannot be modified:


In [11]:
t1[0] = 1


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-9e1f4de27f17> in <module>()
----> 1 t1[0] = 1

TypeError: 'tuple' object does not support item assignment

We can use slicing to access many entries at once:


In [12]:
print(t1[1:4])


(1, 2, 3)

As with the range function, the notation <start>:<end> returns the entries from (and including) <start> up to, but not including, <end>.

We can use negative numbers to access from the right of the sequence: -1 is the last entry, -2 the next-to-last, and so on:


In [13]:
print(t1[-1])


5
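
Slices and negative indices can be combined, and either end of a slice can be left out. The sketch below uses a fresh tuple t with the same entries as t1; the variable names are arbitrary, and the optional third number (a step, as in range) is an addition not covered above.

```python
t = (0, 1, 2, 3, 4, 5)

first_three = t[:3]     # start omitted: begin at entry 0
from_three = t[3:]      # end omitted: run to the last entry
last_two = t[-2:]       # negative start: count from the right
every_other = t[0:6:2]  # third number is the step

print(first_three)   # (0, 1, 2)
print(from_three)    # (3, 4, 5)
print(last_two)      # (4, 5)
print(every_other)   # (0, 2, 4)
```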

Lists

A list is a sequence with a size that can change, and whose entries can be modified:


In [14]:
l1 = [0, 1, 2, 3, 4, 5]
print(l1[3])
l1[3] = 7
print(l1[3])
l1.append(6)
print(l1)


3
7
[0, 1, 2, 7, 4, 5, 6]

The same slicing notation can be used, and can now also be used for assignment:


In [15]:
l1[0:2] = l1[4:6]
print(l1)


[4, 5, 2, 7, 4, 5, 6]

Crucially, lists and tuples can contain anything. As with loops, there is no restriction on types, and things can be nested:


In [16]:
l2 = [0, 1.2, "hello", ["a", 3, 4.5], (0, (1.1, 2.3, 4))]
print(l2[1])
print(l2[3][0])


1.2
a

Dictionaries

Both lists and tuples are ordered: they are accessed by an integer giving their location in the sequence. This doesn't always make sense. Consider an algorithm which depends on parameters $\omega, \Gamma, N$. We want to keep the parameters together, but there's no logical order to them. Instead we can use a dictionary, a Python container whose entries are accessed by a key (here a name) rather than by position:


In [17]:
d1 = {"omega": 1.0, "Gamma": 5.7, "N": 100}
print(d1["Gamma"])


5.7

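Dictionary entries are not numbered, so we access them using the key. (Since Python 3.7 a dictionary does remember the order in which its keys were inserted, and that is the order in which a loop visits them.) To loop over a dictionary, we take advantage of Python's loose iteration rules: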


In [18]:
for key in d1:
    print("Key is", key, "value is", d1[key])


Key is omega value is 1.0
Key is Gamma value is 5.7
Key is N value is 100

There is a shortcut to allow you to get both key and value in one go:


In [19]:
for key, value in d1.items():
    print("Key is", key, "value is", value)


Key is omega value is 1.0
Key is Gamma value is 5.7
Key is N value is 100
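
Like lists, dictionaries can be changed after they are created. The sketch below shows a few operations that are often useful; the dictionary name params and the key "tolerance" are invented for illustration.

```python
params = {"omega": 1.0, "Gamma": 5.7, "N": 100}

params["N"] = 200             # modify the value stored under an existing key
params["tolerance"] = 1e-6    # assigning to a new key adds an entry

print("omega" in params)      # True: the key exists
print("alpha" in params)      # False: no such key
# get returns a fallback value instead of raising an error for a missing key
print(params.get("alpha", 0.0))
print(len(params))            # 4
```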

Exercise

Write a dictionary with the structure:

d = {'first name' : ...,
     'last name'  : ...,
     'student ID' : ...,
     'project'    : ...}

Fill it in with suitable values. Write two functions f_name and f_project. Each should take as input a dictionary.

  1. f_name should print "My name is <first name> <last name>"
  2. f_project should print "Student <student ID> is doing project <project>"

where <X> should fill in the appropriate value from the dictionary.


In [2]:
boaty = {'first name' : 'Boaty',
     'last name'  : 'McBoatface',
     'student ID' : 123456,
     'project'    : 'Surveying the arctic ocean'}

def f_name(d):
    print("My name is {} {}".format(d['first name'], d['last name']))
    
def f_project(d):
    print("Student {} is doing project {}".format(d['student ID'], d['project']))

In [3]:
f_name(boaty)
f_project(boaty)


My name is Boaty McBoatface
Student 123456 is doing project Surveying the arctic ocean

Numpy arrays

We've seen Python's built-in lists for storing data; however, the numpy library contains the more powerful array datatype. Arrays are essentially a more powerful form of list which makes it easier to handle numerical data. Most importantly, they allow us to apply operations to all elements of an array at once, rather than looping over the elements one-by-one.

To see this, let's create a list and a numpy array, both containing the same data.


In [20]:
import numpy

In [21]:
# python list
l = [[1., 2., 3.],
     [4., 5., 6.],
     [7., 8., 9.]]

a = numpy.array([[1., 2., 3.],
                 [4., 5., 6.],
                 [7., 8., 9.]])

print('list l = {}'.format(l))
print('numpy array a = {}'.format(a))


list l = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
numpy array a = [[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]

Accessing elements of numpy arrays is very similar to accessing elements of lists, but with slightly less typing. To access elements from an n-dimensional list, we have to use multiple square brackets, e.g. l[0][4][7][8]. For a numpy array, we separate the indices using a comma: a[0, 4, 7, 8].


In [22]:
print(l[1][2])
print(a[1,2])


6.0
6.0

Let's say we now want to square every element of the array. For this 2d list, we would need a for loop:


In [23]:
import copy
squared = copy.deepcopy(l)
for i in range(3):
    for j in range(3):
        squared[i][j] = l[i][j]**2
print(squared)


[[1.0, 4.0, 9.0], [16.0, 25.0, 36.0], [49.0, 64.0, 81.0]]

Note that here we used the function deepcopy from the copy module to copy the list l. If we had simply used squared = l, then when we assigned new values to the elements of squared, this would also have changed the values in l, because both names would refer to the same list. This is in contrast to the simple variables we saw before, where changing the value of one will leave the values of others unchanged.
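
The difference between a second name for the same list and a genuine copy can be seen in a small experiment. This is a sketch with made-up variable names; the is operator tests whether two names refer to the very same object.

```python
import copy

original = [[1.0, 2.0], [3.0, 4.0]]

alias = original                       # a second name for the same list
independent = copy.deepcopy(original)  # a completely separate copy

alias[0][0] = -1.0        # also changes original
independent[1][1] = -4.0  # leaves original untouched

print(original)                 # [[-1.0, 2.0], [3.0, 4.0]]
print(independent)              # [[1.0, 2.0], [3.0, -4.0]]
print(alias is original)        # True
print(independent is original)  # False
```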

For numpy arrays, applying operations across the entire array is much simpler:


In [24]:
print(a**2)


[[  1.   4.   9.]
 [ 16.  25.  36.]
 [ 49.  64.  81.]]

Numpy has a range of routines for rearranging and combining arrays, such as those below.


In [25]:
# transpose
a.T


Out[25]:
array([[ 1.,  4.,  7.],
       [ 2.,  5.,  8.],
       [ 3.,  6.,  9.]])

In [26]:
# reshape
numpy.reshape(a, (1,9))


Out[26]:
array([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]])

In [27]:
# stack arrays horizontally
numpy.hstack((a,a,a))


Out[27]:
array([[ 1.,  2.,  3.,  1.,  2.,  3.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  4.,  5.,  6.,  4.,  5.,  6.],
       [ 7.,  8.,  9.,  7.,  8.,  9.,  7.,  8.,  9.]])

If you've used Matlab before, you may be familiar with logical indexing. This is a way of accessing the elements of an array that satisfy some criterion, e.g. all the elements which are greater than 0. We can also do this with numpy arrays using boolean array indexing:


In [28]:
a[a > 5]


Out[28]:
array([ 6.,  7.,  8.,  9.])
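
To see what is happening, it can help to look at the comparison on its own: a > 5 is itself an array of True and False values, and indexing with it keeps the entries where it is True. The sketch below rebuilds the same array so that it runs on its own; the names mask, selected and b are arbitrary. It also shows the same kind of index used on the left of an assignment, which changes only the selected entries.

```python
import numpy

a = numpy.array([[1., 2., 3.],
                 [4., 5., 6.],
                 [7., 8., 9.]])

mask = a > 5        # array of booleans with the same shape as a
print(mask)

selected = a[mask]  # one-dimensional array of the entries where mask is True
print(selected)

b = a.copy()        # work on a copy so that a itself is left unchanged
b[b > 5] = 0.0      # set every entry greater than 5 to zero
print(b)
```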

Exercise

Do the bubble sort and counting sort exercise.