Lecture 7

Software design, documentation, and testing

Design of a program

From The Practice of Programming:

The essence of design is to balance competing goals and constraints. Although there may be many tradeoffs when one is writing a small self-contained system, the ramifications of particular choices remain within the system and affect only the individual programmer. But when code is to be used by others, decisions have wider repercussions.

Software Design Desirables

  • Documentation
    • names (understandable names)
    • pre+post conditions or requirements
  • Maintainability
    • Extensibility
    • Modularity and Encapsulation
  • Portability
  • Installability
  • Generality
    • Data Abstraction (change types, change data structures)
    • Functional Abstraction (the object model, overloading)
    • Robustness
      • Provability: Invariants, preconditions, postconditions
      • User Proofing, Adversarial Inputs
  • Efficiency
    • Use of appropriate algorithms and data structures
    • Optimization (but no premature optimization)

Issues to be aware of:

  • Interfaces

    Your program is being designed to be used by someone: either an end user, another programmer, or even yourself. This interface is a contract between you and the user.

  • Hiding Information

    There is information hiding between layers (a higher up layer can be more abstract). Encapsulation, abstraction, and modularization are some of the techniques used here.

  • Resource Management

    Who allocates storage for data structures, and who frees it? Generally we want resource allocation and deallocation to happen in the same layer.

  • How to Deal with Errors

    Do we return special values? Do we throw exceptions? Who handles them?
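
The two error-reporting styles asked about above can be contrasted in a short sketch. The function names sqrt_or_none and sqrt_or_raise are hypothetical and are not part of the lecture's code; the point is only to show who has to notice the failure in each style.

```python
import math


def sqrt_or_none(x):
    """Report failure through a special return value (None)."""
    if x < 0:
        return None  # the caller must remember to check for None
    return math.sqrt(x)


def sqrt_or_raise(x):
    """Report failure by raising an exception."""
    if x < 0:
        raise ValueError(f"cannot take the real square root of {x}")
    return math.sqrt(x)


# Special value: the failure is silent unless the caller checks for it.
result = sqrt_or_none(-4.0)
if result is None:
    print("no real root")

# Exception: the failure propagates until some layer chooses to handle it.
try:
    sqrt_or_raise(-4.0)
except ValueError as err:
    print(f"handled: {err}")
```

Whichever style is chosen, it becomes part of the interface and should be documented.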

Interface principles

Interfaces should:

  • hide implementation details
  • have a small set of operations exposed, the smallest possible, and these should be orthogonal. Be stingy with the user.
  • be transparent with the user in what goes on behind the scenes
  • be consistent internally: library functions should have similar signatures, classes should have similar methods, and external programs should use the same CLI flags
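
As a rough illustration of the first two principles, here is a minimal sketch of a container whose storage is hidden and whose public interface is kept small. The Stack class is a hypothetical example, not something defined elsewhere in this lecture.

```python
class Stack:
    """A last-in, first-out container with a deliberately small interface."""

    def __init__(self):
        # Leading underscore: the list is an implementation detail that
        # could be swapped for another data structure without notice.
        self._items = []

    def push(self, item):
        """Add an item to the top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top item."""
        if not self._items:
            raise IndexError("pop from an empty Stack")
        return self._items.pop()

    def __len__(self):
        return len(self._items)


s = Stack()
s.push(1)
s.push(2)
print(s.pop(), len(s))  # prints: 2 1
```

Clients that only use push, pop, and len keep working if the underlying list is later replaced.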

Testing should deal with ALL of the issues above, and each layer ought to be tested separately.

Testing

There are different kinds of tests inspired by the interface principles just described.

  • acceptance tests verify that a program meets a customer's expectations. In a sense these are a test of the interface to the customer: does the program do everything you promised the customer it would do?

  • unit tests are tests which test a unit of the program for use by another unit. These could test the interface for a client, but they must also test the internal functions that you want to use.

Exploratory testing, regression testing, and integration testing are done in both of these categories, with the latter trying to combine layers and subsystems, not necessarily at the level of an entire application.

One can also run performance tests, random and exploratory tests, and stress tests (to create adversarial situations).

Documentation

Documentation is a contract between a user (client) and an implementor (library writer).

Write good documentation

  • Follow standards of PEP 257
  • Clearly outline the inputs, outputs, default values, and expected behavior
  • Include basic usage examples when possible

In [1]:
def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrtdisc
        r2 = -b - sqrtdisc
        return (r1 / 2.0 / a, r2 / 2.0 / a)

Documenting Invariants

  • An invariant is something that is true at some point in the code.
  • Invariants and the contract are what we use to guide our implementation.
  • Pre-conditions and post-conditions are special cases of invariants.
  • Pre-conditions are true at function entry. They constrain the user.
  • Post-conditions are true at function exit. They constrain the implementation.

You can change implementations, stuff under the hood, etc, but once the software is in the wild you can't change the pre-conditions and post-conditions since the client user is depending upon them.
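
Pre-conditions and post-conditions can also be checked at runtime rather than only documented. The sketch below is one possible way to do that, assuming a hypothetical variant named checked_quad_roots; it is not the lecture's quad_roots. Pre-conditions are enforced with explicit exceptions because they constrain the caller, while post-conditions use assert because a failure there signals a bug in the implementation (assert statements are skipped when Python runs with -O).

```python
import cmath
import numbers


def checked_quad_roots(a=1.0, b=2.0, c=0.0):
    """Roots of ax^2 + bx + c = 0, with the contract checked at runtime."""
    # PRE: a, b, c have numeric type (a constraint on the caller)
    for coeff in (a, b, c):
        if not isinstance(coeff, numbers.Number):
            raise TypeError(f"coefficient {coeff!r} is not numeric")
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.")

    sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
    roots = ((-b + sqrtdisc) / (2.0 * a), (-b - sqrtdisc) / (2.0 * a))

    # POST: a 2-tuple of complex roots (a constraint on the implementation)
    assert len(roots) == 2
    assert all(isinstance(r, complex) for r in roots)
    return roots
```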


In [2]:
def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised

    NOTES
    =====
    PRE: 
         - a, b, c have numeric type
         - three or fewer inputs
    POST:
         - a, b, and c are not changed by this function
         - raises a ValueError exception if a = 0
         - returns a 2-tuple of roots

    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrtdisc
        r2 = -b - sqrtdisc
        return (r1 / 2.0 / a, r2 / 2.0 / a)

Accessing Documentation (1)

  • Documentation can be accessed through the __doc__ special attribute
  • Simply evaluating function_name.__doc__ will give a pretty ugly output
  • You can make it cleaner by making use of splitlines()

In [3]:
quad_roots.__doc__.splitlines()


Out[3]:
['Returns the roots of a quadratic equation: ax^2 + bx + c.',
 '    ',
 '    INPUTS',
 '    =======',
 '    a: float, optional, default value is 1',
 '       Coefficient of quadratic term',
 '    b: float, optional, default value is 2',
 '       Coefficient of linear term',
 '    c: float, optional, default value is 0',
 '       Constant term',
 '    ',
 '    RETURNS',
 '    ========',
 '    roots: 2-tuple of complex floats',
 '       Has the form (root1, root2) unless a = 0 ',
 '       in which case a ValueError exception is raised',
 '',
 '    NOTES',
 '    =====',
 '    PRE: ',
 '         - a, b, c have numeric type',
 '         - three or fewer inputs',
 '    POST:',
 '         - a, b, and c are not changed by this function',
 '         - raises a ValueError exception if a = 0',
 '         - returns a 2-tuple of roots',
 '',
 '    EXAMPLES',
 '    =========',
 '    >>> quad_roots(1.0, 1.0, -12.0)',
 '    ((3+0j), (-4+0j))',
 '    ']
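
Another option from the standard library is inspect.getdoc, which returns the docstring with the common leading indentation and the surrounding blank lines removed. The function quad_roots_stub below is a hypothetical stand-in with a shortened docstring, used only so that the sketch runs on its own.

```python
import inspect


def quad_roots_stub(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c = 0.

    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    """


# The four-space indentation inside the docstring is stripped, and the
# relative indentation of continuation lines is kept.
print(inspect.getdoc(quad_roots_stub))
```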

Accessing Documentation (2)

A nice way to access the documentation is to use the pydoc module.


In [4]:
import pydoc
pydoc.doc(quad_roots)


Python Library Documentation: function quad_roots in module __main__

quad_roots(a=1.0, b=2.0, c=0.0)
    Returns the roots of a quadratic equation: ax^2 + bx + c.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    NOTES
    =====
    PRE: 
         - a, b, c have numeric type
         - three or fewer inputs
    POST:
         - a, b, and c are not changed by this function
         - raises a ValueError exception if a = 0
         - returns a 2-tuple of roots
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))

Testing of a program

Test as you write your program.

This is so important that I repeat it.

Test as you go.

From The Practice of Programming:

The effort of testing as you go is minimal and pays off handsomely. Thinking about testing as you write a program will lead to better code, because that's when you know best what the code should do. If instead you wait until something breaks, you will probably have forgotten how the code works. Working under pressure, you will need to figure it out again, which takes time, and the fixes will be less thorough and more fragile because your refreshed understanding is likely to be incomplete.

Test Driven Development

doctest

The doctest module allows us to test pieces of code that we put into our docstring.

The doctests are a type of unit test, which document the interface of the function by example.

Doctests are an example of a test harness. We write some tests and execute them all at once. Individual tests can also be written and run one at a time in an ad-hoc manner, but that is inefficient.

Of course, too many doctests clutter the documentation section.

The doctests should not cover every case; they should describe the various ways a class or function can be used. There are better ways to do more comprehensive testing.


In [5]:
import doctest
doctest.testmod(verbose=True)


Trying:
    quad_roots(1.0, 1.0, -12.0)
Expecting:
    ((3+0j), (-4+0j))
ok
1 items had no tests:
    __main__
1 items passed all tests:
   1 tests in __main__.quad_roots
1 tests in 2 items.
1 passed and 0 failed.
Test passed.
Out[5]:
TestResults(failed=0, attempted=1)
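
Doctests can document error cases as well as normal results. The sketch below assumes a hypothetical function linear_root and a hypothetical helper run_doctests; the helper uses doctest.DocTestFinder and doctest.DocTestRunner so that only the docstring of the given function is checked.

```python
import doctest


def linear_root(a, b):
    """Return the root of a*x + b = 0.

    >>> linear_root(2.0, -3.0)
    1.5
    >>> linear_root(0.0, 1.0)
    Traceback (most recent call last):
        ...
    ValueError: a must be nonzero
    """
    if a == 0:
        raise ValueError("a must be nonzero")
    return -b / a


def run_doctests(func):
    """Run the doctests found in func's docstring and return the results."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # globs supplies the names that the examples are allowed to use
    for test in finder.find(func, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(linear_root)
print(results.failed, results.attempted)  # prints: 0 2
```

In an expected traceback, doctest compares only the exception type and message, so the stack between the header line and the final line can be replaced by an ellipsis.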

Principles of Testing

  • Test simple parts first
  • Test code at its boundaries
    • The idea is that most errors happen at data boundaries such as empty input, a single input item, an exactly full array, weird values, etc. If a piece of code works at the boundaries, it's likely to work elsewhere.
  • Program defensively

    "Program defensively. A useful technique is to add code to handle 'can't happen' cases, situations where it is not logically possible for something to happen but (because of some failure elsewhere) it might anyway. As an example, a program processing grades might expect that there would be no negative or huge values but should check anyway."

  • Automate using a test harness
  • Test incrementally
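
The grades example from the quotation above might look like the following minimal sketch. The function name average_grade and the default upper bound of 100 are assumptions made for illustration.

```python
def average_grade(grades, max_grade=100.0):
    """Return the mean of a list of grades, checking 'can't happen' values."""
    if len(grades) == 0:
        raise ValueError("no grades supplied")
    for g in grades:
        # Negative or huge grades should be impossible, but check anyway.
        if g < 0 or g > max_grade:
            raise ValueError(f"grade out of range: {g}")
    return sum(grades) / len(grades)


print(average_grade([80.0, 90.0]))  # prints: 85.0
```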

Test simple parts first:

A test for the quad_roots function:


In [6]:
def test_quadroots():
    assert quad_roots(1.0, 1.0, -12.0) == ((3+0j), (-4+0j))

test_quadroots()

Test at the boundaries

Here we write a test to handle the crazy case in which the user passes strings in as the coefficients.


In [7]:
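def test_quadroots_types():
    try:
        quad_roots("", "green", "hi")
    except TypeError:
        pass  # expected: strings are not valid coefficients
    else:
        # Without this branch the test would pass when nothing is raised
        raise AssertionError("expected a TypeError for string coefficients")

test_quadroots_types()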

We can also check to make sure the $a=0$ case is handled okay:


In [8]:
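def test_quadroots_zerocoeff():
    try:
        quad_roots(a=0.0)
    except ValueError:
        pass  # expected: a = 0 is not a quadratic equation
    else:
        # Without this branch the test would pass when nothing is raised
        raise AssertionError("expected a ValueError when a = 0")

test_quadroots_zerocoeff()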
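
Other boundaries for quad_roots lie in the value of the discriminant: exactly zero (a double root) and negative (complex roots). The sketch below repeats a condensed copy of quad_roots so that it runs on its own; the two test names are hypothetical additions.

```python
import cmath


def quad_roots(a=1.0, b=2.0, c=0.0):
    """Condensed copy of the lecture's quad_roots, repeated for a standalone run."""
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.")
    sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
    return ((-b + sqrtdisc) / 2.0 / a, (-b - sqrtdisc) / 2.0 / a)


def test_quadroots_double_root():
    # Discriminant is exactly zero: x^2 + 2x + 1 = (x + 1)^2
    assert quad_roots(1.0, 2.0, 1.0) == ((-1+0j), (-1+0j))


def test_quadroots_complex_roots():
    # Discriminant is negative: x^2 + 1 = 0 has roots +i and -i
    assert quad_roots(1.0, 0.0, 1.0) == (1j, -1j)


test_quadroots_double_root()
test_quadroots_complex_roots()
```

These inputs were chosen so the arithmetic is exact; for general coefficients a comparison with a tolerance would be safer than ==.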

When you get an error

It could be that:

  • you messed up an implementation
  • you did not handle a case
  • your test was messed up (be careful of this)

If the error was not found in an existing test, create a new test that represents the problem before you do anything else. The test should capture the essence of the problem: this process itself is useful in uncovering bugs. Then this error may even suggest more tests.

Automate Using a Test Harness

Great! So we've written some ad-hoc tests. It's pretty clunky. We should use a test harness.

As mentioned already, doctest is a type of test harness. It has its uses, but gets messy quickly.

We'll talk about pytest here.

Preliminaries

  1. The idea is that our code consists of several different pieces (or objects)
  2. The objects are grouped based on how they are related to each other
    • e.g. you may have a class that contains different statistical operations
    • We'll get into this idea much more in the coming weeks
  3. For now, we can think of having related functions all in one file
  4. We want to test each of those functions
    • Tests should include checking correctness of output, correctness of input, fringe cases, etc

I will work in the Jupyter notebook for demo purposes.

To create and save a file from the Jupyter notebook, put %%file file_name.py on the first line of a cell; the rest of the cell is written to that file.

I highly recommend that you actually write your code using a text editor (like vim) or an IDE like Spyder.

The toy examples that we've been working with in the class so far can be done in Jupyter, but a real project can be done more efficiently through other means.


In [9]:
%%file roots.py
def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrtdisc
        r2 = -b - sqrtdisc
        return (r1 / 2.0 / a, r2 / 2.0 / a)


Overwriting roots.py

Let's put our tests into one file.


In [10]:
%%file test_roots.py
import roots

def test_quadroots_result():
    assert roots.quad_roots(1.0, 1.0, -12.0) == ((3+0j), (-4+0j))

def test_quadroots_types():
    try:
        roots.quad_roots("", "green", "hi")
    except TypeError as err:
        assert(type(err) == TypeError)

def test_quadroots_zerocoeff():
    try:
        roots.quad_roots(a=0.0)
    except ValueError as err:
        assert(type(err) == ValueError)


Overwriting test_roots.py

In [11]:
!pytest


============================= test session starts ==============================
platform darwin -- Python 3.6.1, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /Users/dsondak/Teaching/Harvard/CS207/F17/students/cs207_david_sondak/lectures/L7, inifile:
plugins: cov-2.5.1
collected 3 items 

test_roots.py ...

=========================== 3 passed in 0.02 seconds ===========================
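
One caveat about the try/except tests in test_roots.py: if the call raises nothing, the except block never runs and the test still passes. A stricter check fails when the expected exception is missing. The helper raises_exception below is a hypothetical sketch of that idea using only the standard library, demonstrated on the built-in float; pytest provides the pytest.raises context manager for the same purpose.

```python
def raises_exception(exc_type, func, *args, **kwargs):
    """Return True only if func(*args, **kwargs) raises exc_type."""
    try:
        func(*args, **kwargs)
    except exc_type:
        return True
    return False


def test_float_rejects_text():
    # float("green") raises ValueError, so the helper returns True.
    assert raises_exception(ValueError, float, "green")


def test_float_accepts_number_string():
    # float("1.5") raises nothing. A bare try/except test would pass
    # silently here; the helper reports that no exception occurred.
    assert not raises_exception(ValueError, float, "1.5")


test_float_rejects_text()
test_float_accepts_number_string()
```

With this helper, the type test in test_roots.py could be written as assert raises_exception(TypeError, roots.quad_roots, "", "green", "hi").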

Code Coverage

In some sense, it would be nice to check that every line in a program has been exercised by a test. If you could do this, you would have some evidence that a particular line is not the source of an error. But this is hard to do: it can be difficult to construct ordinary input data that forces a program through particular statements. So we settle for testing the important lines. The pytest-cov plugin reports which statements were executed while the tests ran, and which were not.

Coverage does not mean that every edge case has been tried; it only means that each covered statement was executed at least once.
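
A small sketch makes the distinction concrete. The function mean and both tests are hypothetical examples: the first test alone executes every statement of mean, so statement coverage would be complete, yet the empty-input boundary is only exercised by the second test.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    total = sum(values)
    return total / len(values)


def test_mean_typical():
    # Executes every statement in mean(): full statement coverage.
    assert mean([1.0, 2.0, 3.0]) == 2.0


def test_mean_empty():
    # The boundary case that full coverage alone never exercised.
    try:
        mean([])
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError for empty input")


test_mean_typical()
test_mean_empty()
```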

Let's add a new function to our roots file.


In [12]:
%%file roots.py
def linear_roots(a=1.0, b=0.0):
    """Returns the roots of a linear equation: ax + b = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of linear term
    b: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    root: float
       The single root -b / a, unless a = 0
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> linear_roots(1.0, 2.0)
    -2.0
    """
    if a == 0:
        raise ValueError("The linear coefficient is zero.  This is not a linear equation.")
    else:
        return -b / a

def quad_roots(a=1.0, b=2.0, c=0.0):
    """Returns the roots of a quadratic equation: ax^2 + bx + c = 0.
    
    INPUTS
    =======
    a: float, optional, default value is 1
       Coefficient of quadratic term
    b: float, optional, default value is 2
       Coefficient of linear term
    c: float, optional, default value is 0
       Constant term
    
    RETURNS
    ========
    roots: 2-tuple of complex floats
       Has the form (root1, root2) unless a = 0 
       in which case a ValueError exception is raised
    
    EXAMPLES
    =========
    >>> quad_roots(1.0, 1.0, -12.0)
    ((3+0j), (-4+0j))
    """
    import cmath # Can return complex numbers from square roots
    if a == 0:
        raise ValueError("The quadratic coefficient is zero.  This is not a quadratic equation.")
    else:
        sqrtdisc = cmath.sqrt(b * b - 4.0 * a * c)
        r1 = -b + sqrtdisc
        r2 = -b - sqrtdisc
        return (r1 / 2.0 / a, r2 / 2.0 / a)


Overwriting roots.py

Run the tests and check code coverage


In [13]:
!pytest --cov


============================= test session starts ==============================
platform darwin -- Python 3.6.1, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /Users/dsondak/Teaching/Harvard/CS207/F17/students/cs207_david_sondak/lectures/L7, inifile:
plugins: cov-2.5.1
collected 3 items 

test_roots.py ...

---------- coverage: platform darwin, python 3.6.1-final-0 -----------
Name            Stmts   Miss  Cover
-----------------------------------
roots.py           12      3    75%
test_roots.py      13      0   100%
-----------------------------------
TOTAL              25      3    88%


=========================== 3 passed in 0.02 seconds ===========================

Run the tests, report code coverage, and report missing lines.


In [14]:
!pytest --cov --cov-report term-missing


============================= test session starts ==============================
platform darwin -- Python 3.6.1, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /Users/dsondak/Teaching/Harvard/CS207/F17/students/cs207_david_sondak/lectures/L7, inifile:
plugins: cov-2.5.1
collected 3 items 

test_roots.py ...

---------- coverage: platform darwin, python 3.6.1-final-0 -----------
Name            Stmts   Miss  Cover   Missing
---------------------------------------------
roots.py           12      3    75%   22-25
test_roots.py      13      0   100%
---------------------------------------------
TOTAL              25      3    88%


=========================== 3 passed in 0.02 seconds ===========================

Run tests, including the doctests, report code coverage, and report missing lines.


In [15]:
!pytest --doctest-modules --cov --cov-report term-missing


============================= test session starts ==============================
platform darwin -- Python 3.6.1, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /Users/dsondak/Teaching/Harvard/CS207/F17/students/cs207_david_sondak/lectures/L7, inifile:
plugins: cov-2.5.1
collected 5 items 

roots.py ..
test_roots.py ...

---------- coverage: platform darwin, python 3.6.1-final-0 -----------
Name            Stmts   Miss  Cover   Missing
---------------------------------------------
roots.py           12      1    92%   23
test_roots.py      13      0   100%
---------------------------------------------
TOTAL              25      1    96%


=========================== 5 passed in 0.05 seconds ===========================

Let's put some tests in for the linear roots function.


In [16]:
%%file test_roots.py
import roots

def test_quadroots_result():
    assert roots.quad_roots(1.0, 1.0, -12.0) == ((3+0j), (-4+0j))

def test_quadroots_types():
    try:
        roots.quad_roots("", "green", "hi")
    except TypeError as err:
        assert(type(err) == TypeError)

def test_quadroots_zerocoeff():
    try:
        roots.quad_roots(a=0.0)
    except ValueError as err:
        assert(type(err) == ValueError)

def test_linearoots_result():
    assert roots.linear_roots(2.0, -3.0) == 1.5

def test_linearroots_types():
    try:
        roots.linear_roots("ocean", 6.0)
    except TypeError as err:
        assert(type(err) == TypeError)

def test_linearroots_zerocoeff():
    try:
        roots.linear_roots(a=0.0)
    except ValueError as err:
        assert(type(err) == ValueError)


Overwriting test_roots.py

Now run the tests and check code coverage.


In [17]:
!pytest --doctest-modules --cov --cov-report term-missing


============================= test session starts ==============================
platform darwin -- Python 3.6.1, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /Users/dsondak/Teaching/Harvard/CS207/F17/students/cs207_david_sondak/lectures/L7, inifile:
plugins: cov-2.5.1
collected 8 items 

roots.py ..
test_roots.py ......

---------- coverage: platform darwin, python 3.6.1-final-0 -----------
Name            Stmts   Miss  Cover   Missing
---------------------------------------------
roots.py           12      0   100%
test_roots.py      25      0   100%
---------------------------------------------
TOTAL              37      0   100%


=========================== 8 passed in 0.06 seconds ===========================