In [ ]:
from __future__ import print_function, division, absolute_import

Code Testing and CI

Version 0.1

This notebook contains problems about code testing and continuous integration.


E Tollerud (STScI)

Problem 1: Set up py.test in your repo

In this problem we'll aim to get the py.test testing framework up and running in the code repository you set up in the last set of problems. We can then use it to collect and run tests of the code.

1a: Ensure py.test is installed

Of course py.test must actually be installed before you can use it. The commands below should work for the Anaconda Python Distribution, but if you have some other Python installation you'll want to install pytest (and its coverage plugin) as directed in the install instructions for py.test.


In [ ]:
!conda install pytest pytest-cov

1b: Ensure your repo has code suitable for unit tests

Depending on what your code actually does, you might need to modify it to actually perform something testable. For example, if all it does is print something, you might find it difficult to write an effective unit test. Try adding a function that actually performs some operation and returns something different depending on various inputs. That tends to be the easiest function to unit-test: one with a clear "right" answer in certain situations.

Also be sure you have cd'ed to the root of the repo so that py.test can operate correctly.

1c: Add a test file with a test function

The test must be part of the package and follow the convention that the file and the function begin with test to get picked up by the test collection machinery. Inside the test function, you'll need some code that fails if the test condition fails. The easiest way to do this is with an assert statement, which raises an error if its first argument is False.

Hint: remember that to be a valid python package, a directory must have an __init__.py


In [ ]:
!mkdir #complete
!touch #complete

In [ ]:
%%file <yourpackage>/tests/test_something.py

def test_something_func():
    assert #complete

1d: Run the test directly

While this is not how you'd ordinarily run the tests, it's instructive to first try to execute the test directly, without using any fancy test framework. If your test function just runs, all is good. If you get an exception, the test failed (which in this case might be good).

Hint: you may need to use reload or just re-start your notebook kernel to get the cell below to recognize the changes.
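In Python 3, reload lives in the importlib module. The sketch below assumes a throwaway module called scratch_mod written to a temporary directory (both are stand-ins for your own package) and shows a reload picking up an edited file without restarting the kernel.

```python
import importlib
import os
import sys
import tempfile

# Write a small throwaway module to a temporary directory (hypothetical name).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "scratch_mod.py")
with open(path, "w") as f:
    f.write("def answer():\n    return 1\n")

# Make the directory importable and import the module for the first time.
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
import scratch_mod

first = scratch_mod.answer()

# Simulate editing the file on disk, as you would in your editor.
with open(path, "w") as f:
    f.write("def answer():\n    return 'changed'\n")

# Without this call, the already-imported (old) version would still be used.
importlib.reload(scratch_mod)
second = scratch_mod.answer()
```

For a test module inside a package, the same call would be applied to the imported module object, e.g. the test_something module from the cell below.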


In [ ]:
from <yourpackage>.tests import test_something

test_something.test_something_func()

1e: Run the tests with py.test

Once you have an example test, you can try invoking py.test, which is how you should run the tests in the future. This should yield a report that shows a dot for each test. If all you see are dots, the tests ran successfully. But if there's a failure, you'll see the error, and the traceback showing where the error happened.


In [ ]:
!py.test

1f: Make the test fail (or succeed...)

If your test failed when you ran it, you should now try to fix the test (or the code...) to make it work, then try running py.test again.

(Modify your test to fail if it succeeded before, or vice versa)


In [ ]:
!py.test

1g: Check coverage

The coverage plugin we installed will let you check which lines of your code are actually run by the testing suite.


In [ ]:
!py.test --cov=<yourproject> tests/ #complete

This should yield a report, which you can use to decide if you need to add more tests to achieve complete coverage. Check out the command line arguments to see if you can get a more detailed line-by-line report.

Problem 2: Implement some unit tests

The sub-problems below each contain different unit testing complications. Place the code from the snippets in your repository (either using an editor or the %%file trick), and write tests to ensure the correctness of the functions. Try to achieve 100% coverage for all of them (especially to catch some hidden bugs!).

Also, note that some of these examples are not really practical - that is, you wouldn't want to do this in real code because there are better ways to do it. But because of that, they are good examples of where something can go subtly wrong... and therefore where you want to make tests!

2a

When you have a function with a default, it's wise to test both the with-default call (function_a()), and the call where you give a value (function_a(1.2))

Hint: Beware of numbers that come close to 0... write your tests to accommodate floating-point errors!
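One way to accommodate floating-point error is to compare against a tolerance rather than testing exact equality. The sketch below is only an illustration: the helper nearly_equal and its default tolerances are assumptions chosen for the example, not something the exercise prescribes.

```python
import math


def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    # Hypothetical helper: True if a and b agree to within a relative
    # tolerance, or within an absolute tolerance (needed near zero,
    # where a purely relative comparison breaks down).
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


def test_sin_pi_is_nearly_zero():
    # sin(pi) is mathematically 0, but math.pi is only an approximation
    # of pi, so the floating-point result is tiny rather than exactly 0.
    assert math.sin(math.pi) != 0.0
    assert nearly_equal(math.sin(math.pi), 0.0)
```

The absolute tolerance matters here: with only a relative tolerance, no nonzero value would ever count as "close" to 0.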


In [ ]:
#%%file <yourproject>/<filename>.py #complete, or just use your editor

# `math` here is for *scalar* math... normally you'd use numpy but this makes it a bit simpler to debug

import math 

inf = float('inf')  # this is a quick-and-easy way to get the "infinity" value 

def function_a(angle=180):
    anglerad = math.radians(angle)
    return math.sin(anglerad/2)/math.sin(anglerad)

Solution (one of many...)


In [ ]:
def test_default_bad():
    # this will fail, although it *seems* like it should succeed...  the sin function has rounding errors
    assert function_a() == inf  

def test_default_good():
    assert function_a() > 1e10
    
def test_otherval_bad():
    # again it seems like it should succeed, but rounding errors make it fail
    assert function_a(90) == math.sqrt(2)/2
    
def test_otherval_good():
    assert abs(function_a(90) - math.sqrt(2)/2) < 1e-10

2b

This function has an intentional bug... but depending on how you write the test you might not catch it... Use unit tests to find it! (and then fix it...)


In [ ]:
#%%file <yourproject>/<filename>.py #complete, or just use your editor

def function_b(value):
    if value < 0:
        return value - 1
    else:
        value2 = subfunction_b(value + 1)
        return value + value2
    
def subfunction_b(inp):
    vals_to_accum = []
    for i in range(10):
        vals_to_accum.append(inp ** (i/10))
    if vals_to_accum[-1] > 2:
        vals.append(100)
    # really you would use numpy to do this kind of number-crunching... but we're doing this for the sake of example right now
    return sum(vals_to_accum)

Solution (one of many...)


In [ ]:
def test_neg():
    assert function_b(-10) == -11
    
def test_zero():
    assert function_b(0) == 10
    
def test_pos_lt1():
    res = function_b(.5) 
    assert res > 10
    assert res < 100
    
def test_pos_gt1():
    res = function_b(1.5) 
    assert res > 100
    
# this test reveals that `subfunction_b()` has a ``vals`` where it should have a ``vals_to_accum``

2c

There are (at least) two significant bugs in this code (one fairly apparent, one much more subtle). Try to catch them both, and write a regression test that covers those cases once you've found them.

One note about this function: in real code you're probably better off just using the Angle object from astropy.coordinates. But this example demonstrates one of the reasons why that was created, as it's very easy to write a buggy version of this code.

Hint: you might find it useful to use astropy.coordinates.Angle to create test cases...


In [ ]:
#%%file <yourproject>/<filename>.py #complete, or just use your editor

import math

# note: to avoid having to worry about this at all, you should just use `astropy.coordinates`.
def angle_to_sexigesimal(angle_in_degrees, decimals=3):
    """
    Convert the given angle to a sexigesimal string of hours of RA.
    
    Parameters
    ----------
    angle_in_degrees : float
        A scalar angle, expressed in degrees
    
    Returns
    -------
    hms_str : str
        The sexigesimal string giving the hours, minutes, and seconds of RA for the given `angle_in_degrees`
        
    """
    if math.floor(decimals) != decimals:
        raise ValueError('decimals should be an integer!')
    
    hours_num = angle_in_degrees*24/180
    hours = math.floor(hours_num)
    
    min_num = (hours_num - hours)*60
    minutes = math.floor(min_num)
    
    seconds = (min_num - minutes)*60

    format_string = '{}:{}:{:.' + str(decimals) + 'f}'
    return format_string.format(hours, minutes, seconds)

Solution (one of many...)


In [ ]:
def test_decimals():
    assert angle_to_sexigesimal(0) == '0:0:0.000'
    assert angle_to_sexigesimal(0, decimals=5) == '0:0:0.00000'
    

def test_qtrs():
    assert angle_to_sexigesimal(90, decimals=0) == '6:0:0'
    assert angle_to_sexigesimal(180, decimals=0) == '12:0:0'
    assert angle_to_sexigesimal(270, decimals=0) == '18:0:0'
    assert angle_to_sexigesimal(360, decimals=0) == '24:0:0'
    # this reveals the major bug that the 180 at the top should be 360
    
def test_350():
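    assert angle_to_sexigesimal(350, decimals=0) == '23:20:0'
    # this fails even once the 180 is fixed, revealing that floating-point error can push the 
    # minutes down and make the seconds round up to 60 (giving something like '23:19:60')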

def test_neg():
    assert angle_to_sexigesimal(-7.5, decimals=0) == '-0:30:0'
    assert angle_to_sexigesimal(-20, decimals=0) == angle_to_sexigesimal(340, decimals=0)
    # these fail, revealing a "debatable" bug: that negative degrees give negative RAs that are 
    # nonsense.  You could always tell users not to give negative values... but users, particularly 
    # future you, probably won't listen.
    
def test_neg_decimals():
    import pytest
    
    with pytest.raises(ValueError):
        angle_to_sexigesimal(10, decimals=-2)

2d

Hint: numpy has some useful functions in numpy.testing for comparing arrays.
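As a small sketch of what numpy.testing offers (the arrays a and b are made up for illustration), assert_allclose compares arrays to within a tolerance and, by default, treats nan values in matching positions as equal, whereas a plain == comparison does neither.

```python
import numpy as np
from numpy import testing as npt

# Made-up example arrays: the first elements differ only by rounding error,
# and the second elements are both nan.
a = np.array([0.1 + 0.2, np.nan])
b = np.array([0.3, np.nan])

# Element-wise == is False for both entries: 0.1 + 0.2 is not exactly 0.3,
# and nan never compares equal to anything, including itself.
exact_match = bool(np.all(a == b))

# assert_allclose raises AssertionError on a mismatch and returns None
# otherwise; here it passes because the difference is within rtol and the
# nans line up (equal_nan=True is the default).
npt.assert_allclose(a, b, rtol=1e-12)
```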


In [ ]:
#%%file <yourproject>/<filename>.py #complete, or just use your editor

import numpy as np

def function_d(array1=np.arange(10)*2, array2=np.arange(10), operation='-'):
    """
    Makes a matrix where the [i,j]th element is array1[i] <operation> array2[j]
    """
    if operation == '+':
        return array1[:, np.newaxis] + array2
    elif operation == '-':
        return array1[:, np.newaxis] - array2
    elif operation == '*':
        return array1[:, np.newaxis] * array2
    elif operation == '/':
        return array1[:, np.newaxis] / array2
    else:
        raise ValueError('Unrecognized operation "{}"'.format(operation))

Solution (one of many...)


In [ ]:
def test_minus():
    array1 = np.arange(10)*2
    array2 = np.arange(10)
    
    func_mat =  function_d(array1, array2, operation='-')
    
    for i, val1 in enumerate(array1):
        for j, val2 in enumerate(array2):
            assert func_mat[i, j] == val1 - val2
            
def test_plus():
    array1 = np.arange(10)*2
    array2 = np.arange(10)
    
    func_mat =  function_d(array1, array2, operation='+')
    
    for i, val1 in enumerate(array1):
        for j, val2 in enumerate(array2):
            assert func_mat[i, j] == val1 + val2
            
def test_times():
    array1 = np.arange(10)*2
    array2 = np.arange(10)
    
    func_mat =  function_d(array1, array2, operation='*')
    
    for i, val1 in enumerate(array1):
        for j, val2 in enumerate(array2):
            assert func_mat[i, j] == val1 * val2
            
def test_div():
    array1 = np.arange(10)*2
    array2 = np.arange(10)
    
    func_mat =  function_d(array1, array2, operation='/')
    
    for i, val1 in enumerate(array1):
        for j, val2 in enumerate(array2):
            assert func_mat[i, j] == val1 / val2
    # GOTCHA! This fails, but not because of rounding: array2[0] is 0, so the [0, 0] element is 
    # 0/0 = nan, and nan == nan is False. The numpy.testing helpers treat matching nans as equal, 
    # which is where that numpy stuff is handy - see the next function

def test_div_npt():
    from numpy import testing as npt
    
    array1 = np.arange(10)*2
    array2 = np.arange(10)
    
    func_mat =  function_d(array1, array2, operation='/')
    
    test_mat = np.empty(((len(array1), len(array2))))
    for i, val1 in enumerate(array1):
        for j, val2 in enumerate(array2):
            test_mat[i, j] =  val1 / val2
            
    npt.assert_array_almost_equal(func_mat, test_mat)

Problem 3: Set up travis to run your tests whenever a change is made

Now that you have a testing suite set up, you can try to turn on a continuous integration service to constantly check that any update you might send doesn't create a bug. We will use the Travis-CI service for this purpose, as it has one of the lowest barriers to entry from Github.

3a: Ensure the test suite is passing locally

Seems obvious, but it's easy to forget to check this and only later realize that all the trouble you thought you had setting up the CI service was because the tests were actually broken...


In [ ]:
!py.test

3b: Set up an account on travis

This turns out to be quite convenient. If you go to the Travis web site, you'll see a "Sign in with GitHub" button. You'll need to authorize Travis, but once you've done so it will automatically log you in and know which repositories are yours.

3c: Create a minimal .travis.yml file.

Before we can activate travis on our repo, we need to tell travis a variety of metadata about what's in the repository and how to run it. The template below should be sufficient for the simplest needs.


In [ ]:
%%file .travis.yml

language: python
python:
  - "3.6"
# command to install dependencies
#install: "pip install numpy"  #uncomment this if your code depends on numpy or similar
# command to run tests
script: pytest

Be sure to commit and push this to github before proceeding:


In [ ]:
!git #complete

3d: activate Travis

You can now click on your profile picture in the upper-right and choose "accounts". You should see your repo listed there, presumably with a grey X next to it. Click on the X, which should slide the button over and therefore activate travis on that repository. Once you've done this, you should be able to click on the name of the repository in the travis accounts dashboard, popping up a window showing the build already in progress (if not, just be a bit patient).

Wait for the tests to complete. If all is good you should see a green check next to the repository name. Otherwise you'll need to go in and fix it and the tests will automatically trigger when you send a new update.

3e: Break the build

Make a small change to the repository to break a test. If all else fails simply add the following test:

def test_fail():
    assert False

Push that change up and go look at travis. It should automatically run the tests and result in them failing.

3f: Have your neighbor fix your repo

Challenge your neighbor to find the bug and fix it. Have them follow the Pull Request workflow, but do not merge the PR until Travis' tests have finished (they should run automatically, and leave a note in the github PR page to that effect). Once the tests have finished, they will tell you if the fix really does cure the bug. If it does, merge it and say thank you. If it doesn't, ask your neighbor to try updating their fix with the feedback from Travis...

Hint: it may be late in the day, but keep being nice!

Challenge Problem 1: Use py.test "parametrization"

py.test has a feature called test parametrization that can be extremely useful for writing easier-to-understand tests. The key idea is that you can write one simple test function, give it multiple inputs, and have py.test break those out into separate tests. At first glance this might appear similar to a single test that iterates over lots of inputs, but it's actually much more useful because it doesn't stop at the first failure. Rather, it will run all the inputs every time, helping you debug subtle problems where only certain inputs fail.

For more info and how to actually use the feature, see the py.test docs on the subject. In this challenge problem, try adapting the Problem 2 cases to use this feature. 2c and 2d are particularly amenable to this approach.
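As a starting point, here is a minimal sketch of the pytest.mark.parametrize decorator. The helper to_hours is a hypothetical stand-in (it is not one of the Problem 2 functions), and the input/expected pairs are chosen only for illustration.

```python
import pytest


def to_hours(angle_in_degrees):
    # Hypothetical function under test: degrees to hours of RA.
    return angle_in_degrees * 24 / 360


# Each tuple becomes its own test case, reported separately by py.test,
# so a failure for one input does not hide the results for the others.
@pytest.mark.parametrize("degrees, expected_hours", [
    (0, 0),
    (90, 6),
    (180, 12),
    (360, 24),
])
def test_to_hours(degrees, expected_hours):
    assert to_hours(degrees) == expected_hours
```

When collected by py.test, this single function should show up as four tests, one per tuple.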

Challenge Problem 2: Test-driven development

Test-driven development is a radically different approach to designing code from what we're generally used to. In test-driven design, you write the tests first. That is, you write how you expect your code to behave before writing the code.

For this problem, try experimenting with test-driven design. Choose a problem (ideally from your science interests) where you know some clear cases that you could write tests for. Write the full testing suite (using the techniques you developed above). Then run the tests to ensure all the new ones are failing due to lack of implementation, and then write the new code. A few ideas are given below, but, again, for a real challenge try to come up with your own problem.

  • Compute the location of Lagrange points for two arbitrary mass bodies. (Good test cases are the Earth-Moon or Earth-Sun system, which you can probably find on wikipedia.) Consider solving the problem numerically instead of with formulae you can look up, but use the formulae to concoct the test cases.
  • Write a function that uses one of the clustering algorithms in scikit-learn to identify the centers of two 2D gaussian point-clouds. The tests are particularly easy to formulate before-hand because you know the right answer at the outset if you generate the point-clouds yourself.
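The test-first cycle described above can be sketched in miniature as follows. Everything here is an assumption made for illustration: the toy problem (wrapping an angle into the range 0 to 360 degrees), the name wrap_angle, and the small passes helper that stands in for running py.test.

```python
# Step 1: a stub, so the tests can run (and fail) before any real code exists.
def wrap_angle(degrees):
    raise NotImplementedError("wrap_angle is not written yet")


# Step 2: write down the expected behavior as tests, before implementing it.
def test_wrap_angle():
    assert wrap_angle(370) == 10
    assert wrap_angle(-20) == 340
    assert wrap_angle(360) == 0


def passes(test_func):
    # Stand-in for the test runner: True if the test ran without error.
    try:
        test_func()
    except (AssertionError, NotImplementedError):
        return False
    return True


# Step 3: confirm the tests fail for lack of an implementation.
failed_before = not passes(test_wrap_angle)

# Step 4: only now write the implementation, then re-run the same tests.
def wrap_angle(degrees):
    return degrees % 360


passed_after = passes(test_wrap_angle)
```

In a real repository the stub and the implementation would live in your package, the tests in a test file, and py.test would take the place of the passes helper.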