A PEP is a Python Enhancement Proposal. PEP 8 (the eighth PEP) describes how to write Python code in a common style that will be easily readable by other programmers. If this seems unnecessary, consider that programmers spend much more time reading code than writing it.
You can read PEP 8 here: https://www.python.org/dev/peps/pep-0008/
Wouldn't it be nice if you didn't need to remember all of these silly rules for how to write PEP 8-consistent code? What if there were a tool that would tell you whether your code matches PEP 8 conventions or not?
There is such a tool, called pycodestyle.
In [ ]:
"""
This is some ugly code that does not conform to PEP 8.
Check me with pycodestyle:
pycodestyle ../resources/pep8_example.py
"""
from string import *
import math, os, sys
def f(x):
"""This function has lines that are just too long. The maximum suggested line length is 80 characters."""
return 4.27321*x**3 -8.375134*x**2 + 7.451431*x + 2.214154 - math.log(3.42153*x) + (1 + math.exp(-6.231452*x**2))
def g(x,
y):
print("Bad splitting of arguments")
# examples of bad spacing
mydict = { 'ham' : 2, 'eggs' : 7 }#this is badly spaced
mylist=[ 1 , 2 , 3 ]
myvar = 7
myvar2 = myvar*myvar
myvar10 = myvar**10
# badly formatted math
a= myvar+7 * 18-myvar2 / 2
l = 1 # l looks like 1 in some fonts
I = l # also bad
O = 0 # O looks like 0 in some fonts
In [ ]:
!pycodestyle ../resources/pep8_example.py
Load the ../resources/pep8_example.py file in a text editor (you can use the Jupyter notebook, or something else) and fix the problems that pycodestyle is complaining about. Then rerun pycodestyle using the cell above, or from the terminal:
cd PythonWorkshop-ICE/resources
pycodestyle pep8_example.py
Use descriptive names for your variables, functions, and classes. In Python, the following conventions are usually observed:
index = 0
num_columns = 3
length_m = 7.2 # you can add units to a variable name
CU_SPECIFIC_HEAT_CAPACITY = 376.812 # J/(kg K)
class MyClass:
Programmers coming from other programming languages (especially FORTRAN and C/C++) should avoid using special encodings (e.g., Hungarian notation) in their variable names:
# don't do this!
iLoopVar = 0 # i indicates integer
szName = 'Test' # sz means 'string'
gGlobalVar = 7 # g indicates a global variable
Comments are helpful when they clarify code. They should be used sparingly. Why?
Consider this example:
In [ ]:
# this function does foo to the bar!
def foo(bar):
    bar = not bar # bar is active low, so we invert the logic
    if bar == True: # bar can sometimes be true
        print("The bar is True!") # success!
    else: # sometimes bar is not true
        print("Argh!") # I hate it when the bar is not true!
Only one of these comments is helpful. This code is much easier to read when written properly:
In [ ]:
def foo(bar):
    """
    This function does foo to the bar!
    Bar is active low, so we invert the logic.
    """
    bar = not bar # logic inversion
    if bar:
        print("The bar is True!")
    else:
        print("Argh!")
Doc-strings are a useful way to document what a function (or class) does.
In [ ]:
def add_two_numbers(a, b):
    """This function returns the result of a + b."""
    return a + b
In a Jupyter notebook (like this one) or an IPython shell, you can get information about what a function does and what arguments it takes by reading its doc-string:
In [ ]:
add_two_numbers?
Doc-strings can be several lines long:
In [ ]:
def analyze_data(data, old_format=False, make_plots=True):
    """
    This function analyzes our super-important data.
    If you want to use the old data format, set old_format to True.
    Set make_plots to False if you do not want to plot the data.
    """
    # analysis ...
If you are working on a large project, there may be project-specific conventions on how to write doc-strings. For example:
In [ ]:
def google_style_doc_string(arg1, arg2):
    """Example Google-style doc-string.

    Put a brief description of what the function does here.
    In this case, the function does nothing.

    Args:
        arg1 (str): Your full name (name + surname)
        arg2 (int): Your favorite number

    Returns:
        bool: The return value. True for success, False otherwise.
    """


def scipy_style_doc_string(x, y):
    """This is a SciPy/NumPy-style doc-string.

    All of the functions in SciPy and NumPy use this format for their
    doc-strings.

    Parameters
    ----------
    x : float
        Description of parameter `x`.
    y :
        Description of parameter `y` (with type not specified)

    Returns
    -------
    err_code : int
        Non-zero value indicates error code, or zero on success.
    err_msg : str or None
        Human readable error message, or None on success.
    """
In large Python projects, you may see doc-strings like this:
In [ ]:
def sphinx_example(variable):
    """This function does something.

    :param variable: Some variable that the function uses.
    :type variable: str.
    :returns: int -- the return code.
    """
    return 0
These doc-strings are for use with Sphinx, which can be used to automatically generate HTML documentation from code (similar to Doxygen).
For the love of God and all that is holy, do not do this:
In [ ]:
from numpy import *
from scipy import *
from pickle import *
from scipy.stats import *
Why not? Imagine that you import these libraries at the top of your code. At some point in a ~200 line script, you see this:
In [ ]:
with open('../resources/mystery_data', 'rb') as f:
    data = array(load(f))

x, y = data[:, 0], data[:, 1]
r = linregress(x, y)
s = polyfit(x, y, 1)
print(r.slope - s[0])
Can you identify which function belongs to which library? Don't do this!
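With explicit imports, the origin of every function is obvious at the call site. Here is a minimal sketch of the same snippet rewritten that way; it assumes the load in the original came from pickle (it could equally have been numpy.load, which is exactly the problem with star imports):
In [ ]:
import pickle

import numpy as np
from scipy import stats

# Same analysis as above, but now every name says where it came from.
with open('../resources/mystery_data', 'rb') as f:
    data = np.array(pickle.load(f))  # assumption: the data was pickled

x, y = data[:, 0], data[:, 1]
r = stats.linregress(x, y)   # clearly from scipy.stats
s = np.polyfit(x, y, 1)      # clearly from numpy
print(r.slope - s[0])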
Remember that it is best to keep your functions short and concise. In particular, avoid deeply nested if ... elif ... else structures: they can become very long, which obscures the logic and makes the code difficult to read. Consider this example:
In [ ]:
import os


class AnalyzeData:
    def __init__(self, fname):
        self.fname = fname
        self.import_data()
        self.analyze_data()

    def import_data(self):
        file_extension = os.path.splitext(self.fname)[-1]
        if 'csv' in file_extension:
            print("Import comma-separated data")
            # many lines of code, maybe with several if statements
        elif 'tab' in file_extension:
            print("Import tab-separated data")
            # many lines of code, maybe with several if statements
        elif 'dat' in file_extension:
            print("Import data with | delimiters (old-school)")
            # many lines of code, maybe with several if statements
        else:
            print("Unknown data format. I give up!") # should use an exception here; see later...
        return

    def analyze_data(self):
        """Do some super-awesome data analysis!"""
This long list of if statements is nasty to look at, and if you want to add more file types, it will become worse. Consider the alternative, which uses a dictionary with functions as values:
In [ ]:
class AnalyzeData:
    def __init__(self, fname):
        self.fname = fname
        self.import_data()
        self.analyze_data()

    def import_data(self):
        valid_extensions = {'.csv': self._import_csv,
                            '.tab': self._import_tab,
                            '.dat': self._import_dat}
        file_extension = os.path.splitext(self.fname)[-1]
        importer_function = valid_extensions[file_extension]
        importer_function()

    def _import_csv(self):
        print("Import comma-separated data")
        # many lines of code, perhaps with function calls

    def _import_tab(self):
        print("Import tab-separated data")
        # many lines of code, perhaps with function calls

    def _import_dat(self):
        print("Import data with | delimiters (old-school)")
        # many lines of code, perhaps with function calls

    def analyze_data(self):
        """Do some super-awesome data analysis!"""


a = AnalyzeData('data.tab')
# a = AnalyzeData('data.xls') # unknown file type, throws exception!
This code is much clearer and nicer to read. Adding more valid file types increases import_data() by only one line (actually, the valid_extensions dictionary could be moved out of this function), and removing file types is similarly easy.
Did you notice that the else case is gone? Because we are using a dictionary, an invalid extension will automatically generate a KeyError -- try uncommenting the last line in the cell above.
Finally, this type of structure makes unit testing much easier!
The following example was taken from a real C++ code, and converted to Python:
In [ ]:
for j in range(4): # Loop over course clipping
    for i in [0, 64]: # Loop over each attenuation
        for k in [63]: # Loop over fine clipping
            for channel in range(7): # Loop over each channel
                """Does lots of stuff (55 lines of code)"""
    """Does some other stuff (30 lines of code) at end of i, k, and channel loops"""
Don't do this!
Many things that you would need a loop for in C++ can be done in one line in Python. Code with many nested loops will also run very slowly in Python. In the NumPy lesson you will learn how NumPy eliminates the need for many nested loops.
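As a small taste of what the NumPy lesson covers (this sketch is only a preview), a whole loop can often be replaced by a single vectorized expression:
In [ ]:
import numpy as np

x = np.arange(10000)
y_loop = [xi**2 + 1 for xi in x]   # explicit loop over every element
y_vec = x**2 + 1                   # one vectorized line: no loop, and much faster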
If you absolutely must use nested loops, try to wrap the interior code in a function:
In [ ]:
def loop():
    for j in range(4):
        for i in [0, 64]:
            for k in [63]: # this is left-over code
                for channel in range(7):
                    inner_loop_function(i, k, channel)
        outer_loop_function(j)
Or, even better (with proper variable names!):
In [ ]:
def inner_loop(course_clipping):
    fine_clipping = 63
    for attenuation in [0, 64]:
        for channel in range(7):
            inner_loop_details(attenuation, fine_clipping, channel)


def outer_loop():
    for course_clipping in range(4):
        inner_loop(course_clipping)
        outer_loop_details(course_clipping)
Python can be used as a scripting language (like Bash or Perl), and oftentimes Python programs start out as scripts. Here is an example of a script that renames image files (call it image_renamer.py):
In [ ]:
#!/usr/bin/env python3
from glob import glob
import os

jpeg_file_list = glob('Image_*.jpg')
for old_file_name in jpeg_file_list:
    fname_parts = old_file_name.split('_')
    new_file_name = fname_parts[0] + '_0' + fname_parts[1] # add leading zero: 01 -> 001
    os.rename(old_file_name, new_file_name)
The first line indicates to the shell that this is a Python 3 script (the #! combination is called a shebang). You can run this script as an executable from the shell, just like any other program:
chmod a+x image_renamer.py
./image_renamer.py
Oftentimes, this is all you need. However, it has several disadvantages:
- The renaming logic cannot be reused from other code without copy-pasting it.
- It is hard to test: running the script immediately renames files.
- The directory it operates on is fixed (it only works where the images live).
Functions solve all of these problems. Consider this code:
In [ ]:
"""
image_renamer.py -- simple script to rename images.
"""
from glob import glob
import sys
import os
def rename_images(image_list, test=False):
for old_file_name in image_list:
fname_parts = old_file_name.split('_')
new_file_name = fname_parts[0] + '_0' + fname_parts[1] # add leading zero: 01 -> 001
if test:
print(new_file_name)
else:
os.rename(old_file_name, new_file_name)
if __name__ == '__main__': # only run this part if the file is being executed as a script
directory = './'
if len(sys.argv) == 2:
directory = sys.argv[1]
jpeg_file_list = glob(directory + '/Image_*.jpg')
rename_images(jpeg_file_list)
To be fair, the code is now longer, and in some ways more complicated. However, it has several advantages over the simple script. Recalling our previous list, note that:
- rename_images() can now be imported and reused from other code.
- The test=True flag lets you check the new file names without renaming anything.
- The directory to operate on can be passed as a command-line argument.
Reusing the code is now very easy:
"""new_code.py"""
from image_renamer import rename_images
rename_images(some_directory)
In [ ]:
rename_images(['Image_01.jpg', 'Image_02.jpg'], test=True)
Functions should have names that describe what they are for.
For example, what does this function do?
In [ ]:
def myfunc(mylist):
    import re
    f = re.compile('([0-9]+)_.*')
    return [int(f.findall(mystr)[0]) for mystr in mylist]


myfunc(['000_Image.png', '123_Image.png', '054_Image.png'])
A better name could be:
def extract_integer_index(file_list):
If you name things well, it makes comments unnecessary. Your code will speak for itself!
Here is an example of a function that is a bit too long. It is kept fairly short here because it is only an example, but in real physics code it is not uncommon to find single functions that are hundreds of lines long!
In [ ]:
def analyze():
    print("******************************")
    print(" Starting the Analysis! ")
    print("******************************")
    # create fake data
    x = [4.1, 2.8, 6.7, 3.5, 7.9, 8.0, 2.1, 6.3, 6.6, 4.2, 1.5]
    y = [2.2, 5.3, 6.3, 2.4, 0.1, 0.67, 7.8, 9.1, 7.1, 4.9, 5.1]
    # make tuple and sort
    data = list(zip(x, y))
    data.sort()
    # calculate statistics
    y_sum = 0
    xy_sum = 0
    xxy_sum = 0
    for xx, yy in data:
        y_sum += yy
        xy_sum += xx*yy
        xxy_sum += xx*xx*yy
    xbar = xy_sum / y_sum
    x2bar = xxy_sum/y_sum
    std_dev = (x2bar - xbar**2)**0.5
    # print the results
    print("Mean: ", xbar)
    print("Std Dev:", std_dev)
    print("Analysis successful!")


analyze()
How can we improve this code? Our analyze() function is really doing three things: it creates some fake data, it calculates statistics (the mean and standard deviation), and it prints the results. Each of these things can be put in a separate function.
In [ ]:
def generate_fake_data():
    x = [4.1, 2.8, 6.7, 3.5, 7.9, 8.0, 2.1, 6.3, 6.6, 4.2, 1.5]
    y = [2.2, 5.3, 6.3, 2.4, 0.1, 0.67, 7.8, 9.1, 7.1, 4.9, 5.1]
    data = list(zip(x, y))
    data.sort()
    return data


def calculate_mean_and_stddev(xy_data):
    y_sum = 0
    xy_sum = 0
    xxy_sum = 0
    for xx, yy in xy_data:
        y_sum += yy
        xy_sum += xx*yy
        xxy_sum += xx*xx*yy
    xbar = xy_sum / y_sum
    x2bar = xxy_sum/y_sum
    std_dev = (x2bar - xbar**2)**0.5
    return xbar, std_dev


def generate_data_and_compute_statistics():
    data = generate_fake_data()
    mean, std_dev = calculate_mean_and_stddev(data)
    print("Mean: ", mean)
    print("Std Dev:", std_dev)


generate_data_and_compute_statistics()
We note three important results of this code restructuring:
- It is now much easier to see what analyze() does.
- generate_fake_data() and calculate_mean_and_stddev() can now be reused elsewhere.
- Each piece can now be tested separately.
A useful principle for guiding the creation of functions is that functions should do one thing.
In the previous section, our large analyze() function was doing several things, so we broke it up into smaller functions. Notice that calculate_mean_and_stddev() does two things. Should we break it up into two functions, calculate_mean() and calculate_stddev()?
The answer depends on two things: whether you ever need one of the two statistics without the other, and whether the two calculations share intermediate results (here they share the same sums, so keeping them together is reasonable).
Another thing to watch out for is functions with hidden side effects. Consider this function, which quietly modifies the data passed to it before writing it to a file:
In [ ]:
def write_data_to_file(data, filename='data.dat'):
    with open(filename, 'w') as f:
        data *= 2
        f.write(data)
Try to imagine a much larger code where you have a factor of two introduced, and you can't figure out where it came from. Then try to imagine searching a large code for the number 2.
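If the doubling really is needed for the output, a side-effect-free sketch does it on a new object so the caller's data is left untouched (the function name here is illustrative):
In [ ]:
def write_doubled_data_to_file(data, filename='data.dat'):
    """Write two copies of data to filename without modifying the caller's list."""
    doubled = data * 2  # a new object; the caller's data is untouched
    with open(filename, 'w') as f:
        f.write(str(doubled))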
Exceptions are a mechanism for handling errors. Traditionally, errors were handled with return codes, like this:
In [ ]:
def example_only_does_not_work():
    fin = open('does_not_exist.txt', 'r')
    if not fin:
        return -1
    # ... do stuff with file
    fin.close()
    return 0
This kind of code is problematic for a few reasons:
- You have to remember to check the return value of every function you call.
- Error codes are easily lost: if one function in the call chain forgets to check, the error silently disappears.
- The return value is being used for error signalling, so it cannot cleanly be used to return an actual result.
To illustrate point #2, consider the following code:
In [ ]:
def foo():
    return -1 # error code!


def bar():
    foo()
    return 0 # return success?


def baz():
    bar()
    return 0 # no errors, right?
Exceptions offer an elegant solution to all three of the problems listed above.
Exceptions must derive from the BaseException class (user-defined exceptions should be derived from Exception). It is common to use one of the built-in exception subclasses. Common examples include:
- ImportError - raised when trying to import an unknown module.
- IndexError - raised when trying to access an invalid element in an array.
- KeyError - raised when trying to use an invalid key with a dictionary.
- NameError - raised when trying to use a variable that hasn't been defined.
- TypeError - raised when trying to use an object of the wrong type.
- ValueError - raised when an argument has the correct type but a bad value.
- OSError - base exception for problems with reading/writing a file (and other things).
- RuntimeError - catch-all class for errors while code is running.
In general, you can use these built-in exceptions when there is one that suits the problem. For instance, you might raise a ValueError or TypeError when checking arguments to a function:
In [ ]:
def foobar(value):
    if not isinstance(value, int):
        raise TypeError("foobar requires an int!")
    if value < 0:
        raise ValueError("foobar argument 'value' should be >= 0; you passed: %i" % value)


# uncomment to test:
# foobar(2.7)
# foobar(-7)
You do not need to add a string argument when raising an exception. This works fine:
In [ ]:
raise Exception
However, this is not very helpful. In general, you should add some descriptive text to your exceptions to explain to the user what exactly went wrong.
To make your exceptions even more useful, or when there isn't a built-in exception that meets your needs, you can roll your own by sub-classing Exception or one of the other built-in exceptions:
In [ ]:
class MyCustomException(Exception):
    pass
    # using a doc-string instead of 'pass' is more helpful


class CorruptFile(OSError):
    """Raise this exception when attempting to read a file that is corrupt."""


# uncomment to test...
# raise MyCustomException("Test")
# raise CorruptFile("Oh no, the file is corrupted!")
Handling exceptions is done by using try ... except blocks. That is, you try some operation where you suspect there may be some problems. If there are no problems, you continue on your merry way, except in error cases, where you deal with the problem before continuing on.
Let's return to the example from the top of this section to see how this works:
In [ ]:
def foo():
    raise RuntimeError("Oh no! Can't foo!")


def bar():
    foo()


def baz():
    try:
        bar()
    except RuntimeError:
        print("Foo had an error, but it is being handled...")
        # do something useful to handle the error, or keep going


baz()
This is much better than using return codes (e.g., return -1 for errors) because:
- An unhandled exception stops the program instead of being silently ignored.
- The exception can carry a descriptive message explaining what went wrong.
- The exception can be handled at whatever level of the call stack makes sense, without every intermediate function having to pass error codes along.
What about that nice, descriptive error message that we wrote? Wouldn't it be nice if we could reuse that information in our except block? You can, and it's easy! Just convert the exception to a string:
In [ ]:
def baz():
    try:
        bar()
    except RuntimeError as e:
        print('baz error: ' + str(e))


baz()
Finally, in some cases you may want to do something in the event that an exception is not thrown. Maybe you were expecting an exception, but for some bizarre reason it wasn't raised, which might be interesting. In these cases, you can add an else to the end of the try ... except block:
In [ ]:
def foo():
    """This foo actually foos."""
    pass


def baz():
    try:
        bar()
    except RuntimeError:
        print("Bar raised an exception!")
    else:
        print("No exception was raised??")


baz()
Don't dismiss this as being a useless edge case -- exceptions are used for all kinds of things in Python. For instance, did you remember to install the pycodestyle package for this module?
In [ ]:
try:
    import pycodestyle
except ImportError:
    print("You didn't remember to install it. :(")
else:
    print("Nice job!")
In [ ]:
import random

values = {'a': 0, 'b': 1, 'c': 2}


# DON'T CHANGE THESE
def one(values):
    print(v) # throws NameError because v is not defined


def two(values):
    values['c'] /= values['a']


def three(values):
    return values['d']


def tricky():
    if random.randint(0, 1):
        raise ValueError
    else:
        raise RuntimeError
Handle the exceptions thrown by each of the functions. The first one is done as an example.
In [ ]:
try:
    one(values)
except NameError:
    pass

# two(values)
# three(values)

# try:
#     for i in range(10):
#         tricky()
# except __:
Consider this line from some earlier code:
In [ ]:
fin = open('does_not_exist.txt', 'r')
The file does not exist, so it raises an error -- very sensibly, a FileNotFoundError. Here, we have not handled this exception, so Python prints a "stack trace" or "traceback" (basically unrolling your code to show you where the error occurred).
These tracebacks are an excellent way to figure out what went wrong in your program. However, they can appear to be a little cryptic to the uninitiated, so we will look at how to understand them.
Consider this example, where you are trying to fit a quadratic function to two data points:
In [ ]:
from scipy.optimize import curve_fit

def f(x, a, b, c):
    return a*x**2 + b*x + c

x = [0, 1]
y = [2, 3]
curve_fit(f, x, y)
The traceback indicates that the error is a TypeError, and then starts in the current file (listed in green), where the offending call is made. It tells you that the error originates on line 8 (in this case, of the notebook cell).
Aside: you can view line numbers in a notebook by selecting a cell, pressing escape, and then pressing the (lowercase) 'L' key. Press 'L' again to turn the line numbers off.
The traceback then goes to the file where the offending function resides (in this case, in minpack.pyc in the scipy library). The exception originated during a call to leastsq().
Finally, the traceback shows you where the actual TypeError exception was raised (also in the minpack.pyc file, just at a different line). The TypeError tells you that N=3 must not exceed M=2.
This doesn't seem very helpful at first. What actually went wrong? What are N and M? In fact, the problem is one of basic linear algebra: we are trying to fit three unknowns (from our quadratic) with only two equations (one from each (x,y) data point). We need more data! Try adding another junk data point, and you will see that the error goes away.
To summarize, we note the following useful lessons:
- Read the last line of the traceback first: it gives the exception type and its error message.
- The top of the traceback shows where in your own code the offending call was made.
- Even when the exception is raised deep inside a library, the underlying cause (here, too few data points for three fit parameters) is usually in your code.
Someone hands you the following code to calculate $n!$:
In [ ]:
def factorial(n):
    n_fact = n
    while n > 1:
        n -= 1
        n_fact *= n
    return n_fact
Usually, you check that this code is working by doing something like this:
In [ ]:
print(factorial(3), 3*2)
print(factorial(5), 5*4*3*2)
This sort of testing works fine, but it has a few issues:
- You have to inspect the output by eye every time you change the code.
- You only test the cases you happen to think of, so edge cases are easily missed.
- The checks are ad hoc and not automated, so they tend not to be re-run.
To illustrate point 2, note the following:
In [ ]:
factorial(0)
Oops! Recall that $0! \equiv 1$. Also, note that factorial(-1) = -1, which is also wrong!
Writing a unit test is not much more work than our manual testing above. A possible test suite could look like this:
In [ ]:
correct_factorials = {0: 1, 1: 1, 2: 2, 3: 6, 4: 24, 5: 120}
for n, expected in correct_factorials.items():
    assert factorial(n) == expected
The test fails because factorial(0) = 0, but you wouldn't know that from the output. All you know is that something isn't working.
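One small improvement (a sketch) is to attach a message to each assert, so that a failure at least tells you which input broke and what came out:
In [ ]:
correct_factorials = {0: 1, 1: 1, 2: 2, 3: 6, 4: 24, 5: 120}
for n, expected in correct_factorials.items():
    result = factorial(n)
    # the message is only shown when the assertion fails
    assert result == expected, "factorial(%d) returned %d, expected %d" % (n, result, expected)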
A more realistic example of unit testing using pytest can be found in ../resources/pytest_example. Please open this directory and have a look at the code, which is organized as follows (ignoring the __pycache__ directories):
../resources/pytest_example
    factorial.py
    tests/
        __init__.py
        test_factorial.py
The file names and directory structure are important (see the pytest website). pytest can be run as follows:
In [ ]:
!pytest ../resources/pytest_example/
pytest tells us that 2/3 tests failed. One test, test_n_zero, fails because we are trying to assert that factorial(0), which equals zero, is equivalent to one: 0 == 1.
The other test that fails is test_n_negative. A proper version of our factorial function might be expected to raise a ValueError for a negative number, but the one above doesn't, so it fails the test.
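For orientation, tests along these lines would reproduce the behaviour described above. This is only a sketch; the actual contents of tests/test_factorial.py in the resources directory may differ:
In [ ]:
import pytest

from factorial import factorial  # assumes factorial.py is importable; the real tests may handle this differently


def test_n_positive():
    assert factorial(5) == 120


def test_n_zero():
    assert factorial(0) == 1


def test_n_negative():
    # pytest.raises passes only if the expected exception is raised
    with pytest.raises(ValueError):
        factorial(-1)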
For a quick pytest tutorial, look here. For more details, see the pytest website. Several other unit testing frameworks exist, but we prefer pytest because it requires the least amount of code to set up tests and has the cleanest-looking tests.
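A version of factorial() that would pass all three tests might look like this (a sketch; the workshop's own solution may differ):
In [ ]:
def factorial(n):
    """Return n! for non-negative integers; raise ValueError otherwise."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    n_fact = 1  # starting from 1 makes factorial(0) == 1
    while n > 1:
        n_fact *= n
        n -= 1
    return n_fact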
The question "When should I use classes?" is more difficult to answer than "When should I use functions?" (for which the answer is: almost always). Classes are generally used in Object-Oriented Programming (OOP). A full discussion of OOP is beyond the scope of this course, so we will just give some general guidance here.
You should consider using classes when:
- you have some data and a collection of functions that operate on that data (they naturally belong together);
- you need to keep track of internal state between function calls;
- you want to create several independent objects of the same kind.
Consider this code:
In [ ]:
import random


def create_data_set(length, lower_bound=0, upper_bound=10, seed_value=None):
    random.seed(seed_value)
    return [random.uniform(lower_bound, upper_bound) for i in range(length)]


def shuffle(data):
    random.shuffle(data)
    return data


def mean(data):
    return sum(data)/len(data)


def display(data):
    print(data)


def analyze(data):
    print(mean(data))
    display(data)
    new_data = shuffle(data)
    display(new_data)


data = create_data_set(5)
analyze(data)
The first function creates a data set (initialization), while the other functions manipulate this data set. In this case, it may make sense to create a class:
In [ ]:
import random


class DataSet:
    def __init__(self, length, lower_bound=0, upper_bound=10, seed_value=None):
        random.seed(seed_value)
        self.data = [random.uniform(lower_bound, upper_bound) for i in range(length)]

    def shuffle(self):
        random.shuffle(self.data)

    def mean(self):
        return sum(self.data)/len(self.data)

    def display(self):
        print(self.data)

    def analyze(self):
        print(self.mean())
        self.display()
        self.shuffle()
        self.display()


a = DataSet(length=5)
a.analyze()
In this simple case, the class version and the function version appear more-or-less the same. However, the function version is actually better because it allows more flexibility: what if you wanted to analyze some other data set besides a set of random numbers?
To see the real benefit of using classes, we need to consider something a bit more complex:
In [ ]:
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import matplotlib.animation as animation
from IPython.core.display import display, HTML
from math import sin, cos, atan2
import random


def generate_random_path(num_points):
    """Generate a random list of points pacman should visit."""
    xlim = (-PacMan.X_BOUNDS, PacMan.X_BOUNDS)
    ylim = (-PacMan.Y_BOUNDS, PacMan.Y_BOUNDS)
    waypoints = []
    for i in range(num_points):
        waypoints.append((random.uniform(*xlim), random.uniform(*ylim)))
    return waypoints


class PacMan:
    RADIUS = 0.1 # size of pacman
    ANGLE_DELTA = 5 # degrees; controls how fast pacman's mouth opens/closes
    MAX_MOUTH_ANGLE = 30 # degrees; maximum mouth opening half-angle
    MAX_SPEED = 0.02 # controls how fast pacman moves
    X_BOUNDS = 1 # controls x-axis display range
    Y_BOUNDS = 0.5 # controls y-axis display range

    def __init__(self, waypoints=None):
        self._init_figure()
        self._init_pacman()
        if waypoints:
            self.waypoints = waypoints
            self.go_home()
        else:
            self.waypoints = []
        self._show_animation()

    def _init_figure(self):
        self.fig = plt.figure(figsize=(10, 8))
        self.ax = self.fig.add_subplot(111, aspect='equal')
        self.ax.set_xlim(-self.X_BOUNDS, self.X_BOUNDS)
        self.ax.set_ylim(-self.Y_BOUNDS, self.Y_BOUNDS)
        plt.tight_layout()

    def _init_pacman(self):
        self.x = 0
        self.y = 0
        self.angle = 0
        self.angle_set = False
        self.mouth_closing = True
        self.mouth_open_angle = 30
        pacman_patch = patches.Wedge((self.x, self.y), self.RADIUS,
                                     self.mouth_open_angle, -self.mouth_open_angle,
                                     color="yellow", ec="none")
        self.pacman = self.ax.add_patch(pacman_patch)

    def _animate_mouth(self):
        if self.mouth_closing:
            self.mouth_open_angle -= self.ANGLE_DELTA
        else:
            self.mouth_open_angle += self.ANGLE_DELTA
        if self.mouth_open_angle <= 0:
            self.mouth_open_angle = 1
            self.mouth_closing = False
        if self.mouth_open_angle >= self.MAX_MOUTH_ANGLE:
            self.mouth_closing = True
        self.pacman.set_theta1(self.mouth_open_angle)
        self.pacman.set_theta2(-self.mouth_open_angle)

    def _calculate_angle_to_point(self, x, y):
        dx = x - self.x
        dy = y - self.y
        angle_rad = atan2(dy, dx)
        return angle_rad

    def _animate_motion(self):
        if not self.waypoints:
            return
        way_x, way_y = self.waypoints[0]
        if (self.x == way_x) and (self.y == way_y):
            self.waypoints.pop(0)
            self.angle_set = False
            return
        if not self.angle_set:
            self.angle = self._calculate_angle_to_point(way_x, way_y)
            self.angle_set = True
        dx = self.MAX_SPEED*cos(self.angle)
        dy = self.MAX_SPEED*sin(self.angle)
        if abs(way_x - (self.x + dx)) >= self.MAX_SPEED:
            self.x += dx
        else:
            self.x = way_x
        if abs(way_y - (self.y + dy)) >= self.MAX_SPEED:
            self.y += dy
        else:
            self.y = way_y
        tx = mpl.transforms.Affine2D().rotate(self.angle) + \
            mpl.transforms.Affine2D().translate(self.x, self.y) + self.ax.transData
        self.pacman.set_transform(tx)

    def _next_frame(self, i):
        self._animate_mouth()
        self._animate_motion()
        return self.pacman,

    def _show_animation(self):
        if u'inline' in mpl.get_backend():
            ani = animation.FuncAnimation(self.fig, self._next_frame, frames=500, interval=30, blit=True)
            display(HTML(ani.to_html5_video()))
            plt.clf()
        else:
            ani = animation.FuncAnimation(self.fig, self._next_frame, interval=30)
            if mpl.get_backend() == u'MacOSX':
                plt.show(block=False)
            else:
                plt.show()

    def add_waypoint(self, x, y):
        """Add a point where pacman should go. This function is non-blocking."""
        self.waypoints.append((x, y))

    def add_random_path(self, num_points):
        """Add a list of random points to pacman's waypoint list."""
        random_points = generate_random_path(num_points)
        self.waypoints.extend(random_points)

    def go_home(self):
        """Send pacman back to the origin (0, 0)."""
        self.add_waypoint(-self.MAX_SPEED, 0)
        self.add_waypoint(0, 0)
In [ ]:
random_path = generate_random_path(num_points=10)
pac = PacMan(random_path)
Note that Pacman is responsible for maintaining his own internal state. There are functions to manage how Pacman moves and opens/closes his mouth. All the user has to do is tell him where to go.
If you have a Mac (needed for non-blocking animation), you can move Pacman via three "public" functions (the last three), and you can use them without understanding exactly what is happening inside the class. Otherwise, you should tell Pacman where to go using the __init__ function.
In [ ]:
# Insert code here
Variables inside classes are called fields. Functions inside classes are called methods.
By convention, fields and methods that start with an underscore (e.g., _init_pacman()) are "private", although not in the way that Java or C++ methods are private. These items can still be accessed by users, but the underscore indicates that users should not generally mess with them (they are not part of the public API).
Fields and methods that start with two underscores can also be considered private, but the two underscores have a particular use in Python called "name mangling", and they are intended to help prevent conflicts during inheritance. Unless you know what you are doing, stick to single underscores.
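For example, here is a minimal sketch of what name mangling does (the class and attribute names are made up for illustration):
In [ ]:
class Base:
    def __init__(self):
        self.__secret = 1  # actually stored as _Base__secret ("name mangling")


b = Base()
# print(b.__secret)      # AttributeError: there is no attribute '__secret'
print(b._Base__secret)   # 1 -- still reachable, but clearly not meant to be used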
Methods that start and end with two underscores (e.g., __init__()) are generally reserved for Python system calls. Don't name your methods this way.
Going back to the Pacman example, we note that there are only three methods needed to make pacman move: add_waypoint(), add_random_path(), and go_home(). Each of these can be easily used without any knowledge of the complicated class internals. It is good programming practice to provide a simple, easy-to-use interface to classes that is difficult to use incorrectly.
Encapsulation is the object-oriented programming idea that users should be prevented from meddling with the internals of your class except via an approved external interface.
In traditional OO languages like Java and C++, encapsulation is strongly encouraged, while Python is less strict.
Here is an example of how Python classes are typically written:
In [ ]:
class Rect:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width*self.height

    def perimeter(self):
        return 2*self.width + 2*self.height
This has a minimum of extra code ("boilerplate" in programmer-speak) and is generally the right way to make a Python class. However, note that we can do the following:
In [ ]:
a = Rect(3, -1) # fine
print('Area of a:', a.area())
b = Rect(2, 's') # also fine?
print('Area of b:', b.area())
It is generally good practice to validate the inputs of your classes (e.g., to avoid generating string Rects as above). We may also want to prevent users from changing the internal variables of our class accidentally or in ways that would ultimately generate bad outputs. This is traditionally done using the getter/setter model:
In [ ]:
from numbers import Number


class EncapsulatedRect:
    def __init__(self, width, height):
        self.set_width(width)
        self.set_height(height)

    def area(self):
        return self._width*self._height

    def perimeter(self):
        return 2*self._width + 2*self._height

    def get_width(self):
        return self._width

    def get_height(self):
        return self._height

    def set_width(self, width):
        if isinstance(width, Number) and width > 0:
            self._width = width
        else:
            raise ValueError('set_width: value should be a positive number.')

    def set_height(self, height):
        if isinstance(height, Number) and height > 0:
            self._height = height
        else:
            raise ValueError('set_height: value should be a positive number.')
Here, _width and _height are internal variables, which can only be changed by approved setters that make sure the values are good.
Unlike in C++ and Java, however, even in our EncapsulatedRect we can still modify _width and _height directly:
In [ ]:
d = EncapsulatedRect(4, 5)
d._width = 2
print(d.area())
In general, the more "Pythonic" approach is actually Rect rather than EncapsulatedRect. In particular, Python encourages directly accessing fields rather than using getters and setters, which add boilerplate and clutter the code. Python expects users to be smart enough to use classes correctly.
Note that it is still good practice to validate inputs in Python. But how can you do that without using a set_... method? Python offers a @property decorator for this purpose, but we will not discuss its use here.
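For the curious, here is a minimal sketch of the @property approach (the class name ValidatedRect is made up for illustration; this is not covered further in the course). Users still write plain attribute access, but assignments are validated:
In [ ]:
from numbers import Number


class ValidatedRect:
    def __init__(self, width, height):
        self.width = width    # this assignment goes through the setter below
        self.height = height

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        if not (isinstance(value, Number) and value > 0):
            raise ValueError('width must be a positive number')
        self._width = value

    @property
    def height(self):
        return self._height

    @height.setter
    def height(self, value):
        if not (isinstance(value, Number) and value > 0):
            raise ValueError('height must be a positive number')
        self._height = value

    def area(self):
        return self.width*self.height


r = ValidatedRect(3, 4)
r.width = 5     # validated on assignment, but reads like plain attribute access
# r.width = -1  # would raise ValueError
print(r.area())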
Inheritance is a more advanced Python topic, so in case you have forgotten or didn't get to the end of your Python tutorial, here is a brief example:
In [ ]:
class Foo:
    def __init__(self, value):
        self.value = value

    def square(self):
        return self.value**2


class Bar(Foo): # Bar inherits from Foo
    def __init__(self, value):
        self.value = value

    def double(self):
        return 2*self.value


baz = Bar(9)
print(baz.double()) # baz knows how to double because it is a Bar
print(baz.square()) # baz inherited the ability to square from Foo
The classic example of using inheritance for specialization is something like this:
In [ ]:
class BasicClass:
    name = "Test"
    value = 42


class AdvancedClass(BasicClass):
    extra = [1, 2, 3]


adv = AdvancedClass()
adv.value
The AdvancedClass has everything that the BasicClass has, plus more! However, in Python, you could also do this:
In [ ]:
basic = BasicClass()
basic.extra = [1, 2, 3] # works fine
You can do the same thing with functions:
In [ ]:
basic.f = lambda x: x + 7
basic.f(3)
However, note that a new BasicClass object will not have these features:
In [ ]:
basic2 = BasicClass()
# basic2.extra # this won't work
# basic2.f(8) # this won't work either
Finally, there are (at least) four cases when you should definitely use inheritance:
In general, you should prefer using inheritance over manually adding fields or methods.
Duplicated code is evil!
Duplicating code wastes your time, makes your programs longer and harder to read, and makes them more error-prone. If you make a change to a block of code that is duplicated elsewhere, you will then need to manually change that code in each location it is repeated. Yuck!
Here is a trivial example of how the super function can save you time and money!
In [ ]:
class Foo:
    def __init__(self, value):
        self.value = value

    def compute(self):
        print("Foo does some complicated calculations here.")
        self.value += 3
        print("Value:", self.value)


class Bar(Foo):
    def __init__(self, value):
        self.value = value

    def compute(self):
        print("Bar does its own complicated calculations here.")
        self.value *= 2
        super(Bar, self).compute() # calls compute() function of parent, Foo


b = Bar(7)
b.compute()
We can also use super to call "special" functions, like the __init__ function (constructor):
In [ ]:
class Bar(Foo):
    def __init__(self, value):
        """
        This constructor is actually not needed. If you comment it out,
        then Foo's constructor will be called automatically. (Try it!)
        However, imagine you want to do something before calling Foo's
        constructor.
        """
        super(Bar, self).__init__(value) # explicitly calls Foo's constructor

    def compute(self):
        print("Bar does its own complicated calculations here.")
        self.value *= 2
        super(Bar, self).compute() # calls compute() function of parent, Foo


b = Bar(9)
b.compute()
These are very trivial examples, but please believe that the super function can really cut down on a lot of duplicated code! Use it as often as you can.