Errors, or bugs, in your software

Today we'll cover how to deal with errors in your Python code, an important part of writing software.

What is a software bug?

According to Wikipedia (accessed 16 Oct 2018), a software bug is an error, flaw, failure, or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or behave in unintended ways.

Where did the terminology come from?

Engineers used the term well before electronic computers and software existed. Thomas Edison is sometimes credited with the first recorded use of "bug" in this sense. [Wikipedia]

If incorrect code is never executed, is it a bug?

This is the software equivalent of "If a tree falls and no one hears it, does it make a sound?"

Three classes of bugs

Let's discuss three major types of bugs in your code, from easiest to most difficult to diagnose:

  1. Syntax errors: Errors where the code is not written in a valid way. (Generally easiest to fix.)
  2. Runtime errors: Errors where the code is syntactically valid but fails while executing, often by raising an exception. (Sometimes easy to fix, harder when the failure is in someone else's code.)
  3. Semantic errors: Errors where the code is syntactically valid and runs, but contains errors in logic. (Can be difficult to fix.)

In [1]:
import numpy as np

Syntax errors


In [2]:
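print("This should only work in Python 2.x, not 3.x used in this class.")  # note: with parentheses this call is valid in 3.x too; only the Python 2 statement form, print "...", is a syntax error in 3.x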


This should only work in Python 2.x, not 3.x used in this class.

In [3]:
x = 1; y = 2
b = x == y # Boolean variable that is true when x & y have the same value
b = 1 = 2


  File "<ipython-input-3-03308e24f762>", line 3
    b = 1 = 2
       ^
SyntaxError: can't assign to literal
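
A syntax error differs from the other two classes in that Python reports it before running any of the code. The sketch below illustrates this with the built-in compile() function; the variable names and the "<example>" filename are made up for illustration.

```python
# Sketch: a syntax error is reported when the source is compiled,
# so not even the valid first line gets a chance to run.
source = 'print("this line never runs")\nb = 1 = 2\n'

compiled_ok = True
error_line = None
try:
    compile(source, "<example>", "exec")  # parse only, do not execute
except SyntaxError as err:
    compiled_ok = False
    error_line = err.lineno  # line number of the offending statement

print("compiled:", compiled_ok, "| error on line:", error_line)
```

Because the whole cell fails to parse, nothing is printed by the first line of the source string.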

Runtime errors


In [ ]:
# invalid operation
a = 0
5/a  # Division by zero raises ZeroDivisionError

In [ ]:
# invalid operation
text = '40'  # named "text" so the built-in input() is not shadowed
text/11  # Incompatible types for the operation raise TypeError

In [ ]:
str(21).index("1")
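
Runtime errors surface as exceptions, and the exception type is usually the first clue about what went wrong. Here is a small sketch that runs a few operations like the ones above and reports the exception each one raises; describe_failure is a hypothetical helper, not part of any library.

```python
def describe_failure(operation):
    """Run a zero-argument callable and return the name of the exception it raises."""
    try:
        operation()
    except Exception as err:  # broad on purpose: we only want the type name
        return type(err).__name__
    return "no error"

print(describe_failure(lambda: 5 / 0))            # ZeroDivisionError
print(describe_failure(lambda: "40" / 11))        # TypeError
print(describe_failure(lambda: "21".index("3")))  # ValueError
print(describe_failure(lambda: "21".index("1")))  # no error
```

Note that str.index only raises ValueError when the substring is missing, so the same line of code can succeed or fail depending on its input.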

Semantic errors

Say we're trying to confirm that a trigonometric identity holds. Let's use the basic relationship between sine and cosine, given by the Pythagorean identity:

$$ \sin^2 \theta + \cos^2 \theta = 1 $$

We can write a function to check this:


In [ ]:
import math
import numpy as np

def check_pythagorean_identity(theta):
    '''Checks that Pythagorean identity holds for one input, theta'''
    return math.sin(theta)**2 + math.cos(theta)**2 == 1

In [ ]:
check_pythagorean_identity(0)

In [ ]:
check_pythagorean_identity(np.pi)

Is our code correct?
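
One thing to be suspicious of is the exact comparison with == on floating-point numbers: rounding can make the sum differ from 1 in the last digit for some angles, even though the identity is mathematically true. A minimal sketch of a tolerance-based check follows, assuming an absolute tolerance of 1e-12 is acceptable; the function name and the tolerance are illustrative choices, not part of the original code.

```python
import math

def check_pythagorean_identity_tol(theta, abs_tol=1e-12):
    """Check sin^2(theta) + cos^2(theta) == 1 up to a small tolerance."""
    total = math.sin(theta) ** 2 + math.cos(theta) ** 2
    return math.isclose(total, 1.0, abs_tol=abs_tol)

# Compare the exact and the tolerant test over a range of angles.
thetas = [k * 0.1 for k in range(100)]
exact = sum(math.sin(t) ** 2 + math.cos(t) ** 2 == 1 for t in thetas)
tolerant = sum(check_pythagorean_identity_tol(t) for t in thetas)

print(exact, "of", len(thetas), "angles pass the exact test")
print(tolerant, "of", len(thetas), "angles pass the tolerant test")
```

Testing only theta = 0 and theta = pi would not reveal the problem, which is what makes semantic errors hard to spot.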

How to find and resolve bugs?

Debugging has the following steps:

  1. Detection of an exception or invalid results.
  2. Isolation of where the program causes the error. This is often the most difficult step.
  3. Resolution of how to change the code to eliminate the error. Usually this is not too bad, but sometimes it requires major revisions to the code.

Detection of Bugs

The detection of bugs is too often done by chance. While running your Python code, you encounter unexpected behavior, exceptions, or syntax errors. While we'll focus on this in today's lecture, you should never leave this up to chance in the future.

Software testing practices allow for thoughtful detection of bugs in software. We'll discuss more in the lecture on testing.
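
As a preview, here is a minimal sketch of what deliberate detection can look like: a function together with a few assert-based checks of known answers and a known failure mode. Both mean and test_mean are hypothetical names used only for this illustration.

```python
import math

def mean(values):
    """Hypothetical function under test: arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def test_mean():
    # Known answers
    assert math.isclose(mean([1.0, 2.0, 3.0]), 2.0)
    assert math.isclose(mean([5.0]), 5.0)
    # Known failure mode: empty input should raise rather than return a number
    try:
        mean([])
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError for empty input")

test_mean()
print("all checks passed")
```

If a later change breaks mean, rerunning test_mean points at the problem immediately instead of leaving it to be discovered by chance.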

Isolation of Bugs

There are three main methods commonly used for bug isolation:

  1. The "thought" method. Think about how your code is structured, and which part of your code would most likely lead to the exception or invalid result.
  2. Inserting print statements (or other logging techniques).
  3. Using a line-by-line debugger like pdb.

Typically, all three are used in combination, often repeatedly.
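
For the "other logging techniques" in the second method, one option is the standard library's logging module, which lets you leave the diagnostic lines in place and silence them later by changing the level. A minimal sketch, in which the logger name "debug_demo" and the running_total function are made up for illustration:

```python
import logging

# Show DEBUG messages; raise the level to logging.WARNING to silence them.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
logger = logging.getLogger("debug_demo")  # hypothetical logger name

def running_total(values):
    """Sum the values, logging the intermediate state at every step."""
    total = 0.0
    for i, value in enumerate(values):
        total += value
        logger.debug("step %d: value=%r total=%r", i, value, total)
    return total

print(running_total([1.0, 2.0, 3.0]))
```

Each logged line records the loop index and the state at that point, which is the same information a print statement would give.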

Using print statements

Say we're trying to compute the entropy of a set of probabilities. The form of the equation is

$$ H = -\sum_i p_i \log(p_i) $$

We can write the function like this:


In [ ]:
import numpy as np
def entropy(p):
    """
    Compute the entropy H = -sum(p_i * log(p_i)).

    arg p: list of float
    """
    items = p * np.log(p)
    return -np.sum(items)

In [ ]:
entropy([0.5, 0.5])

In [ ]:
import numpy as np
def improved_entropy(p):
    """
    Compute the entropy, skipping nan terms and printing intermediate state.

    arg p: list of float
    """
    items = p * np.log(p)
    new_items = []
    print("1 " + str(new_items))  # state before the loop
    for item in items:
        if not np.isnan(item):
            new_items.append(item)
        print("2 " + str(new_items))  # state after each item
    return -np.sum(new_items)

In [ ]:
# Detailed examination of the code
p = [1, 0.0]
p * np.log(p)

In [ ]:
improved_entropy([1, 0.0])

Next steps:

  • Other inputs
  • Determine the reason for errors by looking at the details of the code
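
Looking at the details for p = [1, 0.0]: np.log(0.0) evaluates to -inf, and 0.0 * -inf evaluates to nan, which is where the invalid term comes from. One possible fix, sketched below, assumes the usual convention that a zero probability contributes nothing to the entropy and drops zeros before taking the logarithm; masked_entropy is a hypothetical name, and rejecting negative inputs is an added assumption.

```python
import numpy as np

def masked_entropy(p):
    """Entropy -sum(p_i * log(p_i)), treating terms with p_i == 0 as 0."""
    p = np.asarray(p, dtype=float)
    if np.any(p < 0):
        raise ValueError("probabilities must be non-negative")
    positive = p[p > 0]  # drop zeros so log(0) is never evaluated
    return -np.sum(positive * np.log(positive))

print(masked_entropy([0.5, 0.5]))
print(masked_entropy([1, 0.0]))
```

Compared with filtering out nan values afterwards, masking the input avoids the RuntimeWarning and does not hide nan values that come from genuinely invalid input.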

Using Python's debugger, pdb

Python comes with a built-in debugger called pdb. It allows you to step line-by-line through a computation and examine what's happening at each step. Note that this should probably be your last resort in tracking down a bug. I've probably used it a dozen times or so in five years of coding. But it can be a useful tool to have in your toolbelt.

You can use the debugger by inserting the line

import pdb; pdb.set_trace()

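within your script. To leave the debugger, type "exit()" (or "q" for short); this aborts the running code with a BdbQuit exception. To resume normal execution instead, type "c". To see the commands you can use, type "help".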

Let's try this out:


In [4]:
def debugging_entropy(p):
    items = p * np.log(p) 
    if any([np.isnan(v) for v in items]):
        import pdb; pdb.set_trace()
    return -np.sum(items)

In [5]:
debugging_entropy([0.5, 0.5])


Out[5]:
0.6931471805599453

This can be a more convenient way to debug programs and step through the actual execution.


In [6]:
p = [.1, -.2, .3]
debugging_entropy(p)


/home/ubuntu/miniconda3/lib/python3.6/site-packages/ipykernel_launcher.py:2: RuntimeWarning: invalid value encountered in log
  
> <ipython-input-4-f48f7c4b8de6>(5)debugging_entropy()
-> return -np.sum(items)
(Pdb) exit()
--------------------------------------------------------------
BdbQuit                      Traceback (most recent call last)
<ipython-input-6-b9fbbb90fe6e> in <module>
      1 p = [.1, -.2, .3]
----> 2 debugging_entropy(p)

<ipython-input-4-f48f7c4b8de6> in debugging_entropy(p)
      3     if any([np.isnan(v) for v in items]):
      4         import pdb; pdb.set_trace()
----> 5     return -np.sum(items)

<ipython-input-4-f48f7c4b8de6> in debugging_entropy(p)
      3     if any([np.isnan(v) for v in items]):
      4         import pdb; pdb.set_trace()
----> 5     return -np.sum(items)

~/miniconda3/lib/python3.6/bdb.py in trace_dispatch(self, frame, event, arg)
     46             return # None
     47         if event == 'line':
---> 48             return self.dispatch_line(frame)
     49         if event == 'call':
     50             return self.dispatch_call(frame, arg)

~/miniconda3/lib/python3.6/bdb.py in dispatch_line(self, frame)
     65         if self.stop_here(frame) or self.break_here(frame):
     66             self.user_line(frame)
---> 67             if self.quitting: raise BdbQuit
     68         return self.trace_dispatch
     69 

BdbQuit:
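
On Python 3.7 and later, the built-in breakpoint() function can replace the import pdb; pdb.set_trace() line; by default it calls pdb.set_trace(). The session recorded above ran on Python 3.6, where breakpoint() is not available, so treat the following as a sketch for newer versions. The function name is hypothetical, and np.errstate is used here only to suppress the RuntimeWarning.

```python
import numpy as np

def entropy_with_breakpoint(p):
    """Entropy that drops into the debugger only when a nan term appears."""
    p = np.asarray(p, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        items = p * np.log(p)
    if np.isnan(items).any():
        breakpoint()  # Python 3.7+; by default this calls pdb.set_trace()
    return -np.sum(items)

# Valid input: no nan terms, so the debugger is never entered.
print(entropy_with_breakpoint([0.5, 0.5]))
```

Once inside the debugger, "p items" prints a variable, "n" runs the next line, "c" continues, and "q" quits. Setting the environment variable PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op, which is handy when a stray call has been left in the code.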