Debugging strategies

You will get errors in your scripts. This is not a bad thing! It's just part of the process -- the error messages will help guide you to the solution. The key is to not get discouraged.

A typical development pattern: Write some code. Run it. See what errors break your script. Throw in some print() statements. Google around. Fix your errors. Rinse and repeat.

n.b. Googling your error is not "cheating" -- it's often the first step in resolving your problem. And if you get really stuck, don't be afraid to ask for help.

Dissecting a Python error

Let's step through an error and discuss strategies for resolving it.

Run the code in the next cell.


In [ ]:
x = 10

if x > 20
    print('x is greater than 20!')

The "traceback" message shows you a couple of useful things:

  • What line the error is on: line 3
  • The class of error: SyntaxError (very common)
  • Exactly where the error occurred -- see where the ^ symbol is pointing?
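
If you'd like to see those same pieces of information as plain values, here is a rough sketch that hands the broken code to Python's built-in compile() function as a string and catches the SyntaxError it raises. The variable names and the '<cell>' label are placeholders chosen for illustration, and the wording of the message itself varies between Python versions.

```python
# The same broken code from the cell above, stored as a string so we can
# inspect the error instead of letting it stop the script.
source = "x = 10\n\nif x > 20\n    print('x is greater than 20!')\n"

try:
    # '<cell>' is just a made-up label that stands in for a filename
    compile(source, '<cell>', 'exec')
except SyntaxError as err:
    error_class = type(err).__name__   # the class of error
    error_line = err.lineno            # what line the error is on
    error_text = err.text              # the offending line of code
    print(error_class, 'on line', error_line)
    print(error_text)
```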

What's the problem?

Googling

If it's not immediately clear what's wrong, I might start by Googling the error message, the word "python" and maybe some keywords for what I was trying to do when I got the error. Something like "SyntaxError: invalid syntax" python if statement

Click through the first couple of links -- you'll become very familiar with StackOverflow -- and see if you spot the problem.

Read the docs

If I'm still stuck, I might check out the documentation and examples for the thing I'm trying to do. Here's the page outlining how to write an if statement in Python. From there, I would copy the example code, run it, compare it line by line with my code and see what's different.
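
For comparison purposes, here is a minimal working if statement. It is a sketch written for this walkthrough, not the example from the documentation page, and the variable names are made up. Line it up against the code that failed and look for what's different.

```python
# A small, working if / elif / else block to compare against.
x = 42

if x < 0:
    sign = 'negative'
elif x == 0:
    sign = 'zero'
else:
    sign = 'positive'

print(sign)
```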

If I'm still stuck, I might see if there are other keywords to search on and take another run at Google.

Guess and check ¯\_(ツ)_/¯

"Maybe if I changed this thing ..." -- the muttered spell that punctuates many a successful debugging session. Tinker with your script, try things out, see if something works.

If you change something and suddenly your script runs without error, great! Your next step, if you have time, is to figure out why that change worked. Google around, read the docs, ask a more experienced developer.

Use print() liberally

Especially when you're iterating over data files, the print() function can be a lifesaver. Print the value before you do any operations on it -- that will show you whether the value is what you expect, and point you to the line of data that's causing your script to fail. Here's an example:


In [ ]:
staff = [
    {'name': 'Fran', 'age': 32, 'job': 'Reporter'},
    {'name': 'John', 'age': ' 41', 'job': 'Managing Editor'},
    {'name': 'Sue', 'age': 39, 'job': 'Executive Editor'}
]

Pretend, for a moment, that we were reading in this data from a file, so it's not immediately obvious what's causing the error. I'd start by adding a print statement to dump the entire value of the person variable at the beginning of the loop:


In [ ]:

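As a sketch of that approach: the loop below assumes the script was trying to do arithmetic on each person's age. The division and the failed_rows list are hypothetical stand-ins, since the failing operation isn't shown here. The try/except is only there so the loop keeps going and you can see every record.

```python
staff = [
    {'name': 'Fran', 'age': 32, 'job': 'Reporter'},
    {'name': 'John', 'age': ' 41', 'job': 'Managing Editor'},
    {'name': 'Sue', 'age': 39, 'job': 'Executive Editor'}
]

failed_rows = []

for person in staff:
    # dump the whole record before doing anything with it
    print(person)
    try:
        # hypothetical operation: any arithmetic on a string will fail
        half_age = person['age'] / 2
    except TypeError as err:
        # keep track of the record that broke, and why
        failed_rows.append(person)
        print('  ^ problem with this record:', err)
```

The last record printed before the complaint is the one to look at closely.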
Now I've isolated the line of data causing the problem, and I can see the cause: The value for John's age is a string with a leading space, not a number. Boom.

Ask for help

If you're hopelessly stuck, it's time to ask for help. You have many skilled friends in journalism who want to help you succeed -- pick a venue you're comfortable with (see below) and ask for help.

And of course feel free to contact me (cody@ire.org) or the rest of the training staff at IRE (training@ire.org) for help.

Get one thing to work at a time

In general, if you're trying to get something to work for all the data flowing through your script, it's a good idea to get it to work on one thing first.

For instance: Let's say you're processing data in a 30,000-line data file, and you want to reformat the dates from m/d/yyyy format to yyyy-mm-dd format. You've started work on a parsing function that currently looks like this:


In [ ]:
def parse_row(row):
    age, booking_date, dob = row
    # do something to reformat the date strings via Python date objects
    return

You need to figure out how to turn "9/7/1985" into "1985-09-07". Instead of calling that function on a "real" row of data inside the with block where you're parsing your CSV file, however, start by doing something like this:


In [ ]:
from datetime import datetime

test_date = '9/7/1985'

parsed_date = datetime.strptime(test_date, '%m/%d/%Y').strftime('%Y-%m-%d')

print(parsed_date)

... and then once you've got the pattern down, add it to your parsing function.
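
One way that last step might look, as a sketch: the reformat_date helper, the decision to return a list, and the sample row are all assumptions made for illustration, and both date columns are assumed to arrive as m/d/yyyy strings.

```python
from datetime import datetime


def reformat_date(date_string):
    # hypothetical helper: turn 'm/d/yyyy' into 'yyyy-mm-dd'
    return datetime.strptime(date_string, '%m/%d/%Y').strftime('%Y-%m-%d')


def parse_row(row):
    age, booking_date, dob = row
    # reformat the date strings via Python date objects
    return [age, reformat_date(booking_date), reformat_date(dob)]


# a made-up row, just to check the function on one thing first
sample_row = ['32', '9/7/1985', '12/25/1953']
parsed = parse_row(sample_row)

print(parsed)
```

Once the function behaves on a single row, call it inside the loop that reads your real file.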

Exercises: What's the prob, Bob?

For each of these Python snippets, figure out what the problems are and solve them.


In [ ]:
print(Hello, Minneapolis!)

In [ ]:
desk = {'wood': 'fir', 'color': 'black', 'height_in': 36, 'width_in': 48, 'length_in': 68}

print(desk['drawer_count'])

In [ ]:
students = ['Kelly', 'Larry', 'José', 'Frank', 'Sarah', 'Sue']

for student in students:
    if student = 'Kelly':
    print('It's Kelly!')
    elif student == 'José':
        print("It's José!")

In [ ]:
import cvs

with open('data/import-refusal-charge-codes.csv', r) as infile:
    reader = csv.reader(infile)
    
for row in reader:
    print(row)