Working with Python: functions and modules

Session 2: Functions

Function basics

We have already seen a number of functions built into Python that let us do useful things with strings, collections, numbers and so on. Examples are print(), and len(), which is passed some kind of sequence object and returns the length of the sequence.

This is the general form of a function; it takes some input arguments and returns some output based on the supplied arguments.

The arguments to a function, if any, are supplied in parentheses, and the value of the function call is whatever the function returns.

In [ ]:
x = abs(-3.0)

l = len("ACGGTGTCAA")

As well as using Python's built-in functions, you can write your own. Functions are a nice way to encapsulate some code that you want to reuse elsewhere in your program, rather than repeating the same bit of code multiple times. They also provide a way to name some coherent block of code and allow you to structure a complex program.

Function definition syntax

Functions are defined in Python using the def keyword followed by the name of the function. If your function takes some arguments (input data) then you can name these in parentheses after the function name. If your function does not take any arguments you still need some empty parentheses. Here we define a simple function named sayHello that prints a line of text to the screen:

In [ ]:
def sayHello():
    print('Hello world!')

Note that the code block for the function (just a single print line in this case) is indented relative to the def. The above definition just declares the function in an abstract way, and nothing will be printed when the definition is made. To actually use a function you need to invoke it (call it) by using its name and a pair of round parentheses:

In [ ]:
sayHello() # Call the function to print 'Hello world!'

If required, a function may be written so that it accepts input. Here we specify a variable called name in the parentheses of the function definition, and this variable is then used by the function. Although the input variable is referred to inside the function, it does not represent any particular value there. It only takes a value when the function is actually called.

In [ ]:
def sayHello(name):
    print('Hello', name)

When we call (invoke) this function we specify a specific value for the input. Here we pass in the value User, so the name variable takes that value and uses it to print a message, as defined in the function.

In [ ]:
sayHello('User')  # Prints 'Hello User'

When we call the function again with a different input value we naturally get a different message. Here we also illustrate that the input value can be passed in as a variable (text in this case).

In [ ]:
text = 'Mary'
sayHello(text)     # Prints 'Hello Mary'

A function may also generate output that is passed back, or returned, to the program at the point at which the function was called. For example, here we define a function to do a simple calculation of the square of an input (x) to create an output (y):

In [ ]:
def square(x):
    y = x*x
    return y

Once the return statement is reached the operation of the function will end, and anything on the return line will be passed back as output. Here we call the function on an input number and catch the output value as result. Notice how the names of the variables used inside the function definition are separate from any variable names we may choose to use when calling the function.

In [ ]:
number = 7
result = square(number) # Call the square() function which returns a result
print(result)           # Prints: 49

The function square can be used from now on anywhere in your program as many times as required on any (numeric) input values we like.

In [ ]:
print(square(1.2e-3))   # Prints: 1.4399999999999998e-06

A function can accept multiple input values, otherwise known as arguments. These are separated by commas inside the brackets of the function definition. Here we define a function that takes two arguments and performs a calculation on both, before sending back the result.

In [ ]:
def calcFunc(x, y):
    z = x*x + y*y
    return z

result = calcFunc(1.414, 2.0)
print(result)  #  5.999396

Note that this function does not check that x and y are valid forms of input. For the function to work properly we assume they are numbers. Depending on how this function is going to be used, appropriate checks could be added.
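As an illustration of the kind of check mentioned above, here is a minimal sketch. The name calcFuncChecked and the choice to raise a TypeError are assumptions made for this example, not part of the original function; other policies (such as converting the input) may suit your program better.

```python
from numbers import Real

def calcFuncChecked(x, y):
    # Hypothetical variant of calcFunc that validates its inputs first
    for value in (x, y):
        # bool is a subclass of int, so it is excluded explicitly
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError("calcFuncChecked expects two real numbers")
    return x*x + y*y

print(calcFuncChecked(3, 4))   # Prints: 25

try:
    calcFuncChecked("3", 4)    # a string is not a valid input
except TypeError as err:
    print("Rejected:", err)
```

Raising an exception early gives the caller a clear message instead of a confusing failure deeper inside the calculation.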

Functions can be arbitrarily long and can perform very complex operations. However, to make a function reusable, it is often better to give it a single responsibility and a descriptive name. Let's now define a function to calculate the Euclidean distance between two vectors:

In [ ]:
def calcDistance(vec1, vec2):    
    dist = 0
    for i in range(len(vec1)):
        delta = vec1[i] - vec2[i]
        dist += delta*delta
    dist = dist**(1/2) # square-root
    return dist

For the record, the preferred way to calculate a square root is to use the sqrt() function from the math module in the standard library:

import math
math.sqrt(x)  # square root of x

Let's experiment a little with our function.

In [ ]:
w1 = ( 23.1, 17.8, -5.6 )
w2 = ( 8.4, 15.9, 7.7 )
calcDistance( w1, w2 )

Note that the function is general and handles any two vectors (irrespective of their representation) as long as their dimensions are compatible:

In [ ]:
calcDistance( ( 1, 2 ), ( 3, 4 ) ) # dimension: 2

In [ ]:
calcDistance( [ 1, 2 ], [ 3, 4 ] ) # vectors represented as lists

In [ ]:
calcDistance( ( 1, 2 ), [ 3, 4 ] ) # mixed representation
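What happens when the dimensions are not compatible depends on which vector is longer: with calcDistance as written, a shorter vec2 causes an IndexError, while the extra elements of a longer vec2 are silently ignored. The sketch below shows one possible way to make the requirement explicit. The name calcDistanceChecked and the use of ValueError are assumptions for this example.

```python
import math

def calcDistanceChecked(vec1, vec2):
    # Hypothetical variant of calcDistance with an explicit dimension check
    if len(vec1) != len(vec2):
        raise ValueError("vectors must have the same number of dimensions")
    total = 0
    for a, b in zip(vec1, vec2):   # walk through both vectors in parallel
        delta = a - b
        total += delta*delta
    return math.sqrt(total)

print(calcDistanceChecked((0, 0), [3, 4]))   # Prints: 5.0

try:
    calcDistanceChecked((1, 2), (1, 2, 3))
except ValueError as err:
    print("Rejected:", err)
```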

Exercise 2.1

  • a. Calculate the mean
    • Write a function that takes 2 numerical arguments and returns their mean. Test your function on some examples.
    • Write another function that takes a list of numbers and returns the mean of all the numbers in the list.
  • b. Write a function that takes a single DNA sequence as an argument and estimates the molecular weight of this sequence. Test your function using some example sequences. The following table gives the weight of each (single-stranded) nucleotide in g/mol:
DNA Residue    Weight
  • c. If the sequence passed contains base N, use the mean weight of the other bases as the weight of base N.

Return value

There can be more than one return statement in a function, although typically there is only one, at the bottom. Consider the following function, which returns some text saying whether a number is positive or negative. It has three return statements: the first two pass back text strings, but the last, which is reached if the input value is zero, has no explicit return value and thus passes back the Python None object. The return keyword immediately exits the function and hands program flow back to the call site, so any function code after this final return is never run.

In [ ]:
def getSign(value):
    if value > 0:
        return "Positive"
    elif value < 0:
        return "Negative"
    return # implicit 'None'

    print("Hello world") # execution does not reach this line
print("getSign( 33.6 ):", getSign( 33.6 ))
print("getSign( -7 ):", getSign( -7 ))
print("getSign( 0 ):", getSign( 0 ))

All of the examples of functions so far have returned only a single value; however, it is possible to pass back more than one value via the return statement. In the following example we define a function that takes two arguments and passes back three values. The return values are really passed back inside a single tuple, which can be caught as a single collection of values.

In [ ]:
def myFunction(value1, value2):
    total = value1 + value2
    difference = value1 - value2
    product = value1 * value2
    return total, difference, product

values = myFunction( 3, 7 )  # Grab output as a whole tuple
print("Results as a tuple:", values)

x, y, z = myFunction( 3, 7 ) # Unpack tuple to grab individual values
print("x:", x)
print("y:", y)
print("z:", z)

Exercise 2.2

a. Write a function that counts the number of each base found in a DNA sequence. Return the result as a tuple of 4 numbers representing the counts of each base A, C, G and T.

b. Write a function to return the reverse-complement of a nucleotide sequence.

Function arguments

Mandatory arguments

The arguments we have passed to functions so far have all been mandatory. If we do not supply them, or if we supply the wrong number of arguments, Python will raise an error (also called an exception):

In [ ]:
def square(number):
    # one mandatory argument
    y = number*number
    return y

In [ ]:
square() # Error: missing required argument 'number'

Mandatory arguments are assumed to come in the same order as the arguments in the function definition, but you can also opt to specify the arguments using the argument names as keywords, supplying the values corresponding to each keyword with a = sign.

In [ ]:
print(square(number=3)) # Prints: 9

In [ ]:
def repeat(seq, n):
    # two mandatory arguments
    result = ''
    for i in range(0,n):
        result += seq
    return result

print(repeat("CTA", 3))
print(repeat(n=4, seq="GTT"))

**NOTE** Unnamed (positional) arguments must come before named arguments, even if they look to be in the right order.

In [ ]:
print(repeat(seq="CTA", n=3))
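The rule in the note above is checked when Python parses the code, before anything runs, so a call that breaks it fails with a SyntaxError. The following sketch demonstrates this by compiling the offending call from a string, which lets the rest of the cell run normally. The variable names badCall and errorRaised are hypothetical and used only for this example.

```python
# The call is kept in a string so that the rest of this cell can still run
badCall = 'repeat(seq="CTA", 3)'   # positional argument after a keyword argument

try:
    compile(badCall, "<example>", "eval")
    errorRaised = False
except SyntaxError as err:
    errorRaised = True
    print("SyntaxError:", err.msg)

print(errorRaised)   # Prints: True
```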

Arguments with default values

Sometimes it is useful to give some arguments a default value that the caller can override, but which will be used if the caller does not supply a value for this argument. We can do this by assigning some value to the named argument with the = operator in the function definition.

In [ ]:
def runSimulation(nsteps=1000):
    print("Running simulation for", nsteps, "steps")
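To see the default value in action, the function can be called with and without the argument. This is a small sketch that repeats the definition above so that it runs on its own:

```python
def runSimulation(nsteps=1000):
    # Same definition as above, repeated so this cell is self-contained
    print("Running simulation for", nsteps, "steps")

runSimulation()            # Prints: Running simulation for 1000 steps
runSimulation(500)         # Prints: Running simulation for 500 steps
runSimulation(nsteps=50)   # Prints: Running simulation for 50 steps
```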

**CAVEAT**: default arguments are defined once and keep their state between calls. This can be a problem for *mutable* objects:

In [ ]:
def myFunction(parameters=[]):
    parameters.append( 100 )
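The effect is easiest to see by calling such a function several times without an argument. The sketch below uses two hypothetical functions, appendShared and appendFresh, which return the list so that it can be printed. The second function shows a commonly used idiom (None as the default, with the list created inside the function); it is offered as one possible workaround rather than the only one.

```python
def appendShared(parameters=[]):
    # Pitfall: the same default list object is reused on every call
    parameters.append(100)
    return parameters

print(appendShared())   # Prints: [100]
print(appendShared())   # Prints: [100, 100]

def appendFresh(parameters=None):
    # Common idiom: default to None and create a new list on each call
    if parameters is None:
        parameters = []
    parameters.append(100)
    return parameters

print(appendFresh())    # Prints: [100]
print(appendFresh())    # Prints: [100]
```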

One way to sidestep the problem is to avoid modifying mutable default arguments, for example by making the argument mandatory:

In [ ]:
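def myFunction(parameters):
    # one mandatory argument without default value
    parameters.append( 100 )
    print(parameters)
my_list = []
myFunction(my_list)      # Prints: [100]
myFunction(my_list)      # Prints: [100, 100]
my_new_list = []
myFunction(my_new_list)  # Prints: [100]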

Position of mandatory arguments

Arrange function arguments so that mandatory arguments come first:

In [ ]:
def runSimulation(initialTemperature, nsteps=1000):
    # one mandatory argument followed by one with default value
    print("Running simulation starting at", initialTemperature, "K and doing", nsteps, "steps")
runSimulation(300, 500)

As before, no positional argument can appear after a keyword argument, and all required arguments must still be provided.

In [ ]:
runSimulation( nsteps=100, initialTemperature=300 )

In [ ]:
runSimulation( initialTemperature=300 )

In [ ]:
runSimulation( nsteps=100 ) # Error: missing required argument 'initialTemperature'

In [ ]:
runSimulation( nsteps=100, 300 ) # Error: positional argument follows keyword argument

Keyword names must, of course, match those declared in the function definition:

In [ ]:
runSimulation( initialTemperature=300, numSteps=100 ) # Error: unexpected keyword argument 'numSteps'

A function cannot be defined with mandatory arguments after ones with default values:

In [ ]:
def badFunction(nsteps=1000, initialTemperature): # Error: mandatory argument follows one with a default value
    pass

Variable scope

Every variable in Python has a scope in which it is defined. Variables defined at the outermost level are known as globals (although they are typically global only to the current module). In contrast, variables defined within a function are local and cannot be accessed from outside it.

In [ ]:
def mathFunction(x, y):
    math_func_result = ( x + y ) * ( x - y )
    return math_func_result

In [ ]:
answer = mathFunction( 4, 7 )
print(answer) # Prints: -33

In [ ]:
print(math_func_result) # Error: name 'math_func_result' is not defined

In the next example there are two variables named value, but they have different scopes: one is local to the function and the other is global, so they are distinct variables.

In [ ]:
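def increase(value):
    value += 1
    print("Inside the function:", value)  # Prints: Inside the function: 5
value = 4
increase(value)
print("Outside the function:", value)     # Prints: Outside the function: 4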

Generally, variables defined in an outer scope are also visible inside functions, but you should be careful when manipulating them, as this can lead to confusing code. Python will actually raise an error if a function tries to update a global variable in place, as the += in the following example does: the assignment makes the name local to the function, so there is no existing value to add to. It is therefore a good idea to avoid using global variables and instead, for example, to pass any necessary variables as parameters to your functions.

In [ ]:
counter = 4
def increment(): 
    counter += 1
    return counter
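The definition on its own does not fail; the error only appears when the function is called. The following sketch repeats the definition and catches the exception so that the message can be inspected:

```python
counter = 4

def increment():
    # The assignment makes 'counter' local to the function, so there is
    # no local value to read when += is evaluated
    counter += 1
    return counter

try:
    increment()
except UnboundLocalError as err:
    print("UnboundLocalError:", err)

print(counter)   # Prints: 4 (the global variable is unchanged)
```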


One option is to use a local variable instead:

In [ ]:
def increment(): 
    counter = 4
    counter += 1
    return counter


Alternatively, pass any necessary variables as parameters to your functions:

In [ ]:
def increment(counter): 
    counter += 1
    return counter


Exercise 2.3

Extend your solution to Exercise 2.1b, which estimates the weight of a DNA sequence, so that it can also calculate the weight of an RNA sequence. Use an optional argument to specify the molecule type, but default to DNA. The weights of RNA residues are:

RNA Residue    Weight

Next session

Go to our next notebook: python_functions_and_modules_3