Unit Testing

Unit testing is the process of breaking a program into small pieces and thoroughly testing each piece.

A unit is generally a function or class in a program.

Why should we do unit testing?

verification and validation of code
prevent bugs from being introduced by yourself or others
personal sanity (prevents computational hubris)
it's super duper easy to do but probably saves hours of headache
lets you add cool things to your project's repo

There are two schools of thought: create unit tests before writing code (test-driven development), or create unit tests after writing code to make sure it works correctly.

Unit tests are run periodically, generally after changes to the code base, to make sure the changes haven't introduced new bugs.

For Python, as for most languages, there are several frameworks for building unit tests.

Paul has briefly discussed nosetests before at hacker (https://github.com/walternathan6754/illinois/blob/master/sphinx/sphinx.pdf)

Unit tests are generally functions that test the output or performance of other functions or modules. This is often checked with assert statements that are evaluated.
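To make the role of assert concrete, here is a minimal sketch. The function is_even and the names around it are hypothetical, made up for illustration only; they are not part of the examples that follow. An assert whose condition is true does nothing, while one whose condition is false raises an AssertionError carrying the optional message.

```python
def is_even(number):
    """Hypothetical function under test."""
    return number % 2 == 0


def test_is_even():
    """A unit test is just a function that makes assertions."""
    # A true condition passes silently.
    assert is_even(4)
    # The text after the comma is reported if the condition is false.
    assert not is_even(3), "3 was reported as even"


def failure_message():
    """Return the message produced by a failing assert."""
    try:
        assert is_even(3), "3 is not even"
    except AssertionError as error:
        return str(error)
    return None
```

A test framework runs functions like test_is_even and reports any AssertionError as a failed test.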

Pytest

As a simple example, we will start with a function to divide two numbers (don't ask why we need this function to be made, we just do!)

def divide(numerator, denominator):
    """ function to perform division of two numbers. This should not perform
        integer division

        Raises:
            ZeroDivisionError: raised if denominator is zero
    """
    return numerator/denominator

As the docstring explains, the function should not perform integer division, and should raise a ZeroDivisionError if the denominator is zero.

Then, we could build a test function to test division of two ints

def test_divide_ints():
    """test division of two integers 4 and 2"""
    assert divide(4,2) == 2

These two functions are in the file simple_example.py. To execute the test function, run either of the following in a terminal:

$ py.test -v simple_example.py
$ python -m pytest -v simple_example.py

====================================================== test session starts =======================================================
platform darwin -- Python 2.7.11, pytest-2.8.1, py-1.4.30, pluggy-0.3.1 -- /Users/Nathan/anaconda/bin/python
cachedir: .cache
rootdir: /Users/Nathan/Documents/illinois/unit_testing, inifile: 
collected 1 items 

simple_example.py::test_divide_ints PASSED

==================================================== 1 passed in 0.00 seconds ====================================================

Either of those lines works for executing pytest on the given file. If no file is given, pytest will run on all files named test_*.py (or *_test.py) in the current directory and all of its subdirectories. It is worth noting that py.test could also be replaced with nosetests (a different unit testing module for Python) and this code will still work!

Within those files, pytest will run all functions named test_*.

Separating Tests

For this simple case, it is okay to include both the function and the test in the same file. However, for a larger project this is less viable. Thus, the functions and the tests can be separated:

project/
    src/
        __init__.py
        divide.py
    test/
        test_divide.py

The __init__.py file is important in the src folder because it allows Python to import the source files into the test files.

Also, it is important to add the project directory to the PYTHONPATH.

$ export PYTHONPATH=$PYTHONPATH:/path_to_project/

Now we can build many more tests into the test/test_divide.py file

from divide import divide

import pytest

def test_divide_ints():
    assert divide(4,2) == 2

def test_divide_floats():
    assert divide(5.0, 2.0) == 2.5

def test_zero_division():
    with pytest.raises(ZeroDivisionError) as e_info:
        divide(4.0,0.0)

This includes three tests: one to check division of integers, one to check division of floats, and one to check that an exception is raised when zero is provided as a denominator.

$ py.test -v test/test_divide.py

====================================================== test session starts =======================================================
platform darwin -- Python 2.7.11, pytest-2.8.1, py-1.4.30, pluggy-0.3.1 -- /Users/Nathan/anaconda/bin/python
cachedir: test/.cache
rootdir: /Users/Nathan/Documents/illinois/unit_testing/test, inifile: 
collected 3 items 

test/test_divide.py::test_divide_ints PASSED
test/test_divide.py::test_divide_floats PASSED
test/test_divide.py::test_zero_division PASSED

==================================================== 3 passed in 0.01 seconds ====================================================
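As a side note, the two value checks above follow the same pattern, so they could be folded into one test with pytest.mark.parametrize. This is only a sketch: divide is repeated inline so the block is self-contained (in the project layout above it would be imported from divide.py), and the name test_divide_values is made up here. Like the tests above, it only uses cases that divide evenly or use floats.

```python
import pytest


def divide(numerator, denominator):
    """Copy of the function under test, repeated to keep this block self-contained."""
    return numerator / denominator


@pytest.mark.parametrize("numerator, denominator, expected", [
    (4, 2, 2),        # same case as test_divide_ints
    (5.0, 2.0, 2.5),  # same case as test_divide_floats
])
def test_divide_values(numerator, denominator, expected):
    # pytest runs this function once per tuple in the list above
    assert divide(numerator, denominator) == expected


def test_zero_division():
    # The block passes only if the expected exception is raised inside it
    with pytest.raises(ZeroDivisionError):
        divide(4.0, 0.0)
```

With -v, pytest reports each parameter set as its own test, so a failure points at the specific inputs that caused it.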

However, you may notice that these tests do not check that integer division is not performed. In fact, our divide function does perform integer division when given two ints (I am running Python 2, where / truncates for ints); test_divide_ints passes only because 4 divides evenly by 2.

Maybe someone notices this and decides to fix it. And maybe they really, really, really love numpy, so the fix they make is

import numpy

def divide(numerator, denomator):
    """ function to perform division of two numbers. This should not perform
        integer division

        Raises:
        ZeroDivisionError: raised if denominator is zero
    """
    return numpy.float64(numerator)/numpy.float64(denomator)

It's not the worst fix ever. It corrects the integer division. However, it does break the test for zero division.

$ py.test -v test/test_divide_numpy.py

====================================================== test session starts =======================================================
platform darwin -- Python 2.7.11, pytest-2.8.1, py-1.4.30, pluggy-0.3.1 -- /Users/Nathan/anaconda/bin/python
cachedir: test/.cache
rootdir: /Users/Nathan/Documents/illinois/unit_testing/test, inifile: 
collected 3 items 

test/test_divide_numpy.py::test_divide_numpy_ints PASSED
test/test_divide_numpy.py::test_divide_floats PASSED
test/test_divide_numpy.py::test_zero_division FAILED

============================================================ FAILURES ============================================================
_______________________________________________________ test_zero_division _______________________________________________________

    def test_zero_division():
        with pytest.raises(ZeroDivisionError) as e_info:
>           divide(4.0,0.0)
E           Failed: DID NOT RAISE

test/test_divide_numpy.py:13: Failed
------------------------------------------------------ Captured stderr call ------------------------------------------------------
/Users/Nathan/Documents/illinois/unit_testing/src/divide_numpy.py:10: RuntimeWarning: divide by zero encountered in double_scalars
  return numpy.float64(numerator)/numpy.float64(denomator)
=============================================== 1 failed, 2 passed in 0.15 seconds ===============================================

So you can see the function did not raise the zero division error. An interesting thing about numpy.float64 values: dividing a nonzero value by zero produces inf and a RuntimeWarning (0/0 produces NaN) instead of raising an error.

Our unit tests allowed us to know immediately that our new code broke other functioning portions of our code (breakage that can be very costly to commit in a large project).
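For comparison, here is a sketch of one fix that removes the integer division while keeping the documented ZeroDivisionError. It is one possible approach, assuming the inputs are ordinary ints or floats, and not necessarily what the project ended up using; adding from __future__ import division at the top of the module is another common way to get the same behavior in Python 2. The test names are made up, and the zero division check is written with try/except only so the block has no dependencies (pytest.raises works just as well).

```python
def divide(numerator, denominator):
    """Divide two numbers without performing integer division.

    Raises:
        ZeroDivisionError: raised if denominator is zero
    """
    # Converting one operand to a built-in float forces true division,
    # and built-in floats still raise ZeroDivisionError on a zero denominator.
    return float(numerator) / denominator


def test_no_integer_division():
    assert divide(1, 2) == 0.5, "performed integer division"


def test_zero_division_still_raises():
    try:
        divide(4.0, 0.0)
    except ZeroDivisionError:
        return  # expected behavior
    raise AssertionError("ZeroDivisionError was not raised")
```

Running the existing tests against a change like this is exactly the kind of check that caught the numpy version.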

Verbosity in Pytest

These examples are in test/test_verbose_fails.py

If you are making tests, it might be very useful to include messages with your assertions when they fail.

from divide import divide

import numpy

def test_integer_division():
    assert divide(1,2) != 0, "performed integer division"

In this case, when integer division is performed, the assertion message will tell us why this is an issue.

============================================================ FAILURES ============================================================
_____________________________________________________ test_integer_division ______________________________________________________

    def test_integer_division():
>       assert divide(1,2) != 0, "performed integer division"
E       AssertionError: performed integer division
E       assert 0 != 0
E        +  where 0 = divide(1, 2)

test/test_verbose_fails.py:6: AssertionError

Notice there are two additional bits of verbose information: our message of "performed integer division", and pytest telling us that divide(1,2) returned 0, which I thought was cool.

When a test compares dictionaries or lists, pytest will print the exact difference between the two objects.

def test_lists():
    left_list  = [1,2,3,4]
    right_list = [2,2,3,4]
    assert left_list == right_list

def test_dictionaries():
    left_dic  = {'item1': 1, 'item2':2, 'item3':3}
    right_dic = {'item1': 2, 'item2':2, 'item4':3}
    assert left_dic == right_dic

def test_numpy_arrays():
    left_array  = numpy.array([1,2,3,4])
    right_array = numpy.array([2,2,3,4])
    assert numpy.array_equal(left_array, right_array) == True

$ py.test -v test/test_verbose_fails.py

___________________________________________________________ test_lists ___________________________________________________________

    def test_lists():
        left_list  = [1,2,3,4]
        right_list = [2,2,3,4]
>       assert left_list == right_list
E       assert [1, 2, 3, 4] == [2, 2, 3, 4]
E         At index 0 diff: 1 != 2
E         Full diff:
E         - [1, 2, 3, 4]
E         ?  ---
E         + [2, 2, 3, 4]
E         ?     +++

test/test_verbose_fails.py:11: AssertionError
_______________________________________________________ test_dictionaries ________________________________________________________

    def test_dictionaries():
        left_dic  = {'item1': 1, 'item2':2, 'item3':3}
        right_dic = {'item1': 2, 'item2':2, 'item4':3}
>       assert left_dic == right_dic
E       assert {'item1': 1, ...2, 'item3': 3} == {'item1': 2, '...2, 'item4': 3}
E         Common items:
E         {'item2': 2}
E         Differing items:
E         {'item1': 1} != {'item1': 2}
E         Left contains more items:
E         {'item3': 3}
E         Right contains more items:
E         {'item4': 3}
E         Full diff:
E         - {'item1': 1, 'item2': 2, 'item3': 3}
E         ?           ^                   ^
E         + {'item1': 2, 'item2': 2, 'item4': 3}
E         ?           ^                   ^

test/test_verbose_fails.py:16: AssertionError
_______________________________________________________ test_numpy_arrays ________________________________________________________

    def test_numpy_arrays():
        left_array  = numpy.array([1,2,3,4])
        right_array = numpy.array([2,2,3,4])
>       assert numpy.array_equal(left_array, right_array) == True
E       assert <function array_equal at 0x103c972a8>(array([1, 2, 3, 4]), array([2, 2, 3, 4])) == True
E        +  where <function array_equal at 0x103c972a8> = numpy.array_equal

Notice that for the dictionaries, pytest prints the common items, the differing items, and which items are only in the left container or only in the right container.

Sadly, this doesn't work on numpy arrays though :(

Setup and Teardown

The previous examples are all very simple. Most likely, in a real setting, the functions created will rely on a certain state of the program.

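To set up a certain state before a test function is performed, pytest can run functions with the names

def setup_module(module):
    pass  # runs once, before any test in the file

def setup_function(function):
    pass  # runs before every test function

and the state can be destroyed with

def teardown_module(module):
    pass  # runs once, after all tests in the file

def teardown_function(function):
    pass  # runs after every test function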

As an example, I will make a simpler version of how I used this in a project.

Imagine a package that simulates power cycles. At some point in the code, the thermal efficiency needs to be calculated from various properties of the cycle components. In order to not have to run the code until the point where the efficiency needs to be computed (which could be hours), setup functions could set the state of the code to be able to test the function.

This example is in src/power_efficiency.py

"""module that computes the power efficiency of a cycle"""

input_power = 0
output_power = 0

def compute_efficiency():
    """Computes the power efficiency of a thermal cycle
    Raises:
        ValueError: if power is negative
    """
    if output_power < 0:
        raise ValueError

    return output_power/input_power

The code computes the efficiency of the cycle from the input and output power, which would be set elsewhere in the program.

Without some setup, testing the compute_efficiency function would always raise a ZeroDivisionError, because input_power and output_power both default to 0.

So, using setup_module, we can set the state before testing the function.

import power_efficiency as pe

def setup_module(module):
    print ""
    print "module setup"
    pe.input_power = 100. # kJ
    pe.output_power = 30. # kJ

def teardown_module(module):
    print ""
    print "module teardown"
    pe.input_power = 0. # kJ
    pe.output_power = 0. # kJ

def test_input_power():
    print "test input power"
    assert pe.input_power == 100.

def test_compute_efficiency():
    print "\ntest efficiency"
    assert pe.compute_efficiency() == 0.3 # it's not very efficient

I added prints to the functions so that when the tests are run, we can see the order of the functions called.

$ py.test -v -s test/test_compute_efficiency.py

====================================================== test session starts =======================================================
platform darwin -- Python 2.7.11, pytest-2.8.1, py-1.4.30, pluggy-0.3.1 -- /Users/Nathan/anaconda/bin/python
cachedir: test/.cache
rootdir: /Users/Nathan/Documents/illinois/unit_testing/test, inifile: 
collected 2 items 

test/test_compute_efficiency.py::test_input_power 
module setup
test input power
PASSED
test/test_compute_efficiency.py::test_compute_efficiency 
test efficiency
PASSED
module teardown


==================================================== 2 passed in 0.01 seconds ====================================================

The -s flag disables pytest's output capturing, so the prints appear on screen as they are executed. From the output, you can see that the module setup is called once, then the two test functions are called and pass, and then the module teardown is called.

These functions can also be very useful if data needs to be read into a program. The setup can read the data from a file and then the test functions can be performed.

The difference between module and function setups is that function setups are called before every test function, while module setups are called once per test file.

Then I use the teardown function to undo the changes done by the setup function.

Pytest Fixtures

Another method of setting up a state of a program before performing tests is pytest fixtures.

import power_efficiency as pe
import pytest

@pytest.fixture
def set_power():
    print "\nset the power"
    pe.input_power = 100. # kJ
    pe.output_power = 30. # kJ

@pytest.fixture
def set_negative_power():
    print "\nset negative output power"
    pe.input_power = 100. # kJ
    pe.output_power = -30. # kJ

def test_input_power(set_power):
    print "test input power"
    assert pe.input_power == 100.

def test_compute_efficiency(set_power):
    print "test efficiency with setup"
    assert pe.compute_efficiency() == 0.3 # it's not very efficient

def test_negative_power(set_negative_power):
    print "test efficiency with negative power"
    with pytest.raises(ValueError) as e_info:
        pe.compute_efficiency()

To use fixtures, pytest needs to be imported.

Then the @pytest.fixture decorator is placed before each setup function.

This allows multiple setups to be created. Each test function then takes the fixture it needs as an input.

$ py.test -v -s test/test_pytest_fixtures.py

====================================================== test session starts =======================================================
platform darwin -- Python 2.7.11, pytest-2.8.1, py-1.4.30, pluggy-0.3.1 -- /Users/Nathan/anaconda/bin/python
cachedir: test/.cache
rootdir: /Users/Nathan/Documents/illinois/unit_testing/test, inifile: 
collected 3 items 

test/test_pytest_fixtures.py::test_input_power 
set the power
test input power
PASSED
test/test_pytest_fixtures.py::test_compute_efficiency 
set the power
test efficiency with setup
PASSED
test/test_pytest_fixtures.py::test_negative_power 
set negative output power
test efficiency with negative power
PASSED

==================================================== 3 passed in 0.02 seconds ====================================================

So depending on whether the test took set_power or set_negative_power as an input, the state of the module was different. This allows for testing with different states.

For more on how to start unit testing, I recommend downloading any open-source package and looking through its tests. The online examples I have found have all been far too simple.
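One caveat about the efficiency tests above: they compare floats with ==, which works for 30/100 but is fragile in general because of rounding. A sketch of a tolerance-based check follows. The helper names nearly_equal and total_output_power, the tolerance of 1e-9, and the input values are all assumptions chosen for illustration.

```python
def nearly_equal(left, right, tolerance=1e-9):
    """Return True if two floats differ by less than an absolute tolerance."""
    return abs(left - right) < tolerance


def total_output_power(stage_outputs):
    """Hypothetical helper: sum the output power of several cycle stages."""
    return sum(stage_outputs)


def test_total_output_power():
    total = total_output_power([0.1, 0.2])  # kJ
    # total == 0.3 is False here because of floating point rounding,
    # so the comparison is made with a tolerance instead.
    assert nearly_equal(total, 0.3)
```

Recent pytest versions also offer pytest.approx for this kind of comparison.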


