In [1]:
# code for loading the format for the notebook
import os
# path : store the current path to convert back to it later
path = os.getcwd()
os.chdir(os.path.join('..', 'notebook_format'))
from formats import load_style
load_style(plot_style=False)
Out[1]:
In [2]:
os.chdir(path)
# 1. magic to print version
# 2. magic so that the notebook will reload external python modules
%load_ext watermark
%load_ext autoreload
%autoreload 2
%watermark -a 'Ethen' -d -t -v
Once our application grows beyond a basic project, having good logging, as opposed to just print statements, allows us to look at behaviors and errors over time and gives us a better overall picture of what's going on.
There are 5 different logging levels. Levels allow us to specify exactly what we want to log by separating records into categories. The description of each level is as follows:

DEBUG: Detailed debug information, typically of interest only when diagnosing problems.
INFO: Confirmation that things are working as expected.
WARNING: An indication that something unexpected happened, or indicative of some problem in the near future (e.g. disk space low). The software is still working as expected.
ERROR: Due to a more serious problem, the software has not been able to perform some function.
CRITICAL: A serious error, indicating that the program itself may be unable to continue running.

The default level is WARNING, meaning that it will capture anything that is a warning or above and ignore the DEBUG and INFO levels. We can change this behavior using the basicConfig method.
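As a quick sketch of that default behavior (meant for a fresh interpreter rather than this notebook, since Jupyter already configures the root logger), only the records at WARNING level and above would show up:

import logging

logging.debug('debug message')        # ignored, below the default WARNING level
logging.info('info message')          # ignored, below the default WARNING level
logging.warning('warning message')    # emitted
logging.error('error message')        # emitted
logging.critical('critical message')  # emitted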
In [3]:
import logging
from importlib import reload  # importlib.reload replaces the deprecated imp module
# jupyter notebook already uses logging, thus we reload the module to make it work in notebooks
# http://stackoverflow.com/questions/18786912/get-output-from-the-logging-module-in-ipython-notebook
reload(logging)
# In the following not only did we change the logging level, but
# also specify a logging file to write the log to, and the format.
# the format we specified here is simply the time, the level name
# and the message that we'll later specify, for more information
# about what format we can specify, refer to the following webpage
# https://docs.python.org/3/library/logging.html#logrecord-attributes
logging.basicConfig(filename='test.log', level=logging.DEBUG,
                    format='%(asctime)s:%(levelname)s:%(message)s')


def add(x, y):
    """Add Function"""
    return x + y


def subtract(x, y):
    """Subtract Function"""
    return x - y


def multiply(x, y):
    """Multiply Function"""
    return x * y


def divide(x, y):
    """Divide Function"""
    return x / y
In [4]:
num1 = 20
num2 = 10
# add logging debug information instead of print statements
# to record what was going on; note that if we were to
# run this multiple times, the records keep getting appended to the log file
add_result = add(num1, num2)
logging.debug('Add: {} + {} = {}'.format(num1, num2, add_result))
sub_result = subtract(num1, num2)
logging.debug('Sub: {} - {} = {}'.format(num1, num2, sub_result))
mul_result = multiply(num1, num2)
logging.debug('Mul: {} * {} = {}'.format(num1, num2, mul_result))
div_result = divide(num1, num2)
logging.debug('Div: {} / {} = {}'.format(num1, num2, div_result))
After running the code, we should see a log file in the same directory as the notebook, and it should contain something along the lines of:
2017-03-16 13:55:32,075:DEBUG:Add: 20 + 10 = 30
2017-03-16 13:55:32,076:DEBUG:Sub: 20 - 10 = 10
2017-03-16 13:55:32,076:DEBUG:Mul: 20 * 10 = 200
2017-03-16 13:55:32,076:DEBUG:Div: 20 / 10 = 2.0
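One thing to keep in mind is that the file handler created by basicConfig opens the file in append mode, so running the code again keeps adding records to test.log. If we prefer to start with a clean file on every run, basicConfig also accepts a filemode argument (a small sketch for reference):

logging.basicConfig(filename='test.log', filemode='w', level=logging.DEBUG,
                    format='%(asctime)s:%(levelname)s:%(message)s')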
The code chunk above logs the information with the root logger. If we have multiple scripts that do logging, they will all get logged to the same place, which might not be ideal. Thus we can create a separate logger for each module.
In [5]:
import logging
# create the module's logger, specify the logger's level
# and add the handler to the logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.ERROR)
# boiler plate, set the format using Formatter,
# and set the file to log to with FileHandler
formatter = logging.Formatter('%(asctime)s:%(levelname)s:%(message)s')
file_handler = logging.FileHandler('math.log')
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)
# note that we do not need to reload the logging
# module this time, as this approach does not conflict
# with jupyter notebook's logging behavior
def add(x, y):
    """Add Function"""
    return x + y


def subtract(x, y):
    """Subtract Function"""
    return x - y


def multiply(x, y):
    """Multiply Function"""
    return x * y


def divide(x, y):
    """Divide Function"""
    try:
        result = x / y
    except ZeroDivisionError:
        # by calling .exception it will produce the traceback,
        # which is helpful for knowing where the bug occurred
        logger.exception('Tried to divide by 0')
    else:
        return result
In [6]:
num1 = 20
num2 = 0
# note that we'll use the logger we explicitly created
# to log the messages, as opposed to the root logging module in the last example
add_result = add(num1, num2)
logger.info('Add: {} + {} = {}'.format(num1, num2, add_result))
sub_result = subtract(num1, num2)
logger.info('Sub: {} - {} = {}'.format(num1, num2, sub_result))
mul_result = multiply(num1, num2)
logger.info('Mul: {} * {} = {}'.format(num1, num2, mul_result))
div_result = divide(num1, num2)
logger.info('Div: {} / {} = {}'.format(num1, num2, div_result))
2017-03-16 15:02:58,144:ERROR:Tried to divide by 0
Traceback (most recent call last):
File "<ipython-input-7-573995498bac>", line 37, in divide
result = x / y
ZeroDivisionError: division by zero
In the previous section, we were logging the exception to the log file (note that only the ERROR record shows up in math.log, since we set the logger's level to ERROR, which filters out the INFO messages). This is actually an important practice and should not be taken lightly.
When using try and except to catch exceptions, we should be diligent in how we handle them. In other words, NEVER do the following:
In [7]:
num1 = 20
num2 = 0
result = None
try:
    result = num1 / num2
except:  # or except Exception:
    pass

result
It might be quick for us as developers to simply ignore the exception so that we can move on with our lives and not worry about dealing with it. But when we do this, we are actually doing ourselves a massive disservice: it not only swallows up every error that might occur, including the ones we weren't expecting when writing the try, except block, but also throws away the stack trace, which provides useful information that speeds up our debugging process when an error occurs.
This is not to say that we should never use try and except to catch Exception, as this ensures our critical or long-running process doesn't go down due to some unexpected error. It is to emphasize the fact that we should not be silently swallowing the error.
The solution to this is to use a more specific exception type and log the full stack trace. In the function we've defined, we use ZeroDivisionError to specify exactly what errors we were expecting when writing the try, except block.
In [8]:
def divide(x, y):
    """Divide Function"""
    try:
        result = x / y
    except ZeroDivisionError:
        # by calling .exception it will produce the traceback,
        # which is helpful for knowing where the bug occurred
        logger.exception('Tried to divide by 0')
    else:
        return result
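As a side note, logger.exception logs the message at the ERROR level and attaches the exception information; it behaves like calling logger.error with exc_info=True and should only be called from inside an exception handler. The equivalent call would look like:

logger.error('Tried to divide by 0', exc_info=True)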
If for some reason our application does its logging in some other way, we can use the traceback module to format the exception's stack trace ourselves.
In [9]:
import traceback
def divide(x, y):
    """Divide Function"""
    try:
        result = x / y
        output = {'result': result}
    except ZeroDivisionError as ex:
        error_message = log_traceback(ex)
        output = {'result': None, 'error': error_message}

    return output


def log_traceback(ex):
    """Format full stack trace into a single string."""
    traceback_lines = traceback.format_exception(ex.__class__, ex, ex.__traceback__)
    traceback_text = ''.join(traceback_lines)
    return traceback_text
In [10]:
output = divide(20, 0)
output
Out[10]:
The bottom line is: do log the full stack trace if we decide to use the try, except pattern in our codebase. This ensures we get as much information as possible when an unexpected error occurs, which significantly speeds up the debugging process.
Apart from the standard library's logging module, there are projects like logzero that try to make it less tedious to set up logging.
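For instance, a typical logzero setup might look something like the following rough sketch (assuming logzero is installed via pip install logzero; the file name app.log is just a placeholder, and its documentation should be consulted for the exact API):

import logging
import logzero
from logzero import logger

# in addition to the colored console output, also write the records to a file
logzero.logfile('app.log')
logzero.loglevel(logging.INFO)

logger.info('hello from logzero')
logger.error('something went wrong')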