One of the best things about coding is not having to do the same thing over and over again. You automate. You work things into functions and objects and have them worry about completing a series of actions for you. Why wouldn't you do the same thing when actually writing code?
There are times when you find yourself repeating code; when this happens, consider whether you can refactor and break the problem into a reusable piece of code. Generally, the rule of three comes into play:
There are two "rules of three" in [software] reuse:
* It is three times as difficult to build reusable components as single use components, and
* a reusable component should be tried out in three different applications before it will be sufficiently general to accept into a reuse library.
Facts and Fallacies of Software Engineering, #18
Credit to Jeff Atwood's Coding Horror post about the Rule of Three for bringing it to my attention.
This post is a brief overview of common techniques and patterns for avoiding writing the same thing over and over again, starting with functions and moving into objects, inheritance, mixins, composition, decorators and context managers. There are plenty of other techniques, patterns and idioms that I don't touch on; this post isn't meant to be an exhaustive list.
In [1]:
def calc(a, b, x):
    """Our business crucial algorithm"""
    return (a + b) * x

calc(1, 2, 3)
Out[1]:
Python also offers a limited form of anonymous functions called lambda. They're limited to just a single expression with no statements in them. A lot of the time, they serve as basic callbacks or as key functions for a sort or group method. The syntax is simple and the return value is the outcome of the expression.
In [2]:
sorted([(1,2), (3,-1), (0,0)], key=lambda x: x[1])
Out[2]:
While lambdas are incredibly useful in many instances, it's generally considered bad form to assign them to variables (since they're supposed to be anonymous functions), not that I've never done that when it suited my needs. ;)
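As a rough sketch of that trade-off, here is the same sort written three ways: with an inline lambda, with a named function, and with operator.itemgetter from the standard library. The names pairs and second_item are made up for this illustration.

```python
from operator import itemgetter

pairs = [(1, 2), (3, -1), (0, 0)]

# Inline lambda: fine for a one-off key function.
by_lambda = sorted(pairs, key=lambda pair: pair[1])


# Named function: usually nicer than `second_item = lambda ...` once you
# want a name, since it gets a real __name__ and can carry a docstring.
def second_item(pair):
    """Return the second element of a pair."""
    return pair[1]


by_def = sorted(pairs, key=second_item)

# operator.itemgetter builds an equivalent callable without a lambda.
by_itemgetter = sorted(pairs, key=itemgetter(1))

print(by_lambda)  # [(3, -1), (0, 0), (1, 2)]
```

All three produce the same ordering; which one reads best is mostly a matter of taste and of how often the key function gets reused.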
Objects are really the poster child for code reuse. Essentially, an object is a collection of data and functions that interrelate. Many in the Python community are fond of calling them a pile of dictionaries -- because that's what they essentially are in Python.
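To make the "pile of dictionaries" idea a little more concrete, here is a minimal sketch. Point is a hypothetical class that exists only for this illustration, and the picture is simplified (classes that use __slots__, for example, behave differently).

```python
class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def total(self):
        return self.x + self.y


p = Point(1, 2)

# Instance attributes live in the instance's __dict__ ...
print(vars(p))                  # {'x': 1, 'y': 2}

# ... while methods live in the class's namespace.
print('total' in vars(Point))   # True

# Attribute access is (roughly) a lookup through those dictionaries,
# so changing the dictionary changes what the object sees.
p.__dict__['x'] = 10
print(p.total())                # 12
```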
Objects offer all sorts of possibilities such as inheritance and composition, which I'll briefly touch upon here. For now, a simple example will suffice: take our business critical algorithm and turn it into a spreadsheet row.
In [3]:
class SpreadsheetRow:
    def __init__(self, a, b, x):
        self.a = a
        self.b = b
        self.x = x

    def calc(self):
        return calc(self.a, self.b, self.x)

row = SpreadsheetRow(1, 2, 3)
print(row.calc())
Notice how we're already reusing code to find our business critical total of 9! If, later, someone in accounting realizes that we should actually be doing a * (b + x), we simply change the original calculation function.
Inheritance is simply a way of giving one class access to all the data and methods of another class. It's commonly called "specialization," though Raymond Hettinger aptly describes it as "delegating work." If later, accounting wants to be able to label all of our spreadsheet rows, we could go back and modify the original class or we could design a new one that does this for us.
Accessing information in the inherited class is done through super(). I won't delve into its details here, but it is quite super.
In [4]:
class LabeledSpreadsheetRow(SpreadsheetRow):
    def __init__(self, label, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.label = label

row = LabeledSpreadsheetRow(label='1', a=1, b=2, x=3)
print("The total for {} is {}".format(row.label, row.calc()))
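super() isn't limited to __init__. As a small sketch of the "delegating work" idea, a subclass can override any method and hand part of the job back to its parent. Base, Labeled and describe are hypothetical names invented for this example.

```python
class Base:
    """Hypothetical stand-in for a class like SpreadsheetRow."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def describe(self):
        return "a={}, b={}".format(self.a, self.b)


class Labeled(Base):
    def __init__(self, label, *args, **kwargs):
        # Let the parent set up a and b, then add our own attribute.
        super().__init__(*args, **kwargs)
        self.label = label

    def describe(self):
        # Reuse the parent's output instead of rebuilding it here.
        return "{}: {}".format(self.label, super().describe())


item = Labeled("first", a=1, b=2)
print(item.describe())  # first: a=1, b=2
```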
Mixins are a type of multiple inheritance, which I won't fully delve into here because it's a complicated and touchy subject. However, Python supports it. Because of this and its support for duck typing, we can completely forgo the Interfaces and Traits that are common in single inheritance languages.
Mixins are a way of writing logic that is common to many objects and placing it in a single location. Mixins are also classes that aren't meant to be instantiated on their own, since they represent a small piece of a puzzle rather than the whole picture. A common problem I use mixins for is creating a generic __repr__ method for objects.
In [5]:
class ReprMixin:
    def __repr__(self):
        name = self.__class__.__name__
        attrs = ', '.join(["{}={}".format(k, v) for k, v in vars(self).items()])
        return "<{} {}>".format(name, attrs)

class Row(LabeledSpreadsheetRow, ReprMixin):
    pass

row = Row(label='1', a=1, b=2, x=3)
repr(row)
Out[5]:
This showcases the power of inheritance and mixins: composing complex objects from smaller parts into what you're wanting. The actual class we're using implements no logic of its own but we're now provided with:
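One thing worth knowing is that the order of the base classes matters. In the Row class above, ReprMixin is listed last, which works because none of the classes before it define __repr__. The sketch below, using a hypothetical Base class that does define its own __repr__, shows how the method resolution order (MRO) decides which version wins.

```python
class ReprMixin:
    def __repr__(self):
        name = self.__class__.__name__
        attrs = ', '.join("{}={!r}".format(k, v) for k, v in vars(self).items())
        return "<{} {}>".format(name, attrs)


class Base:
    """Hypothetical base class that defines its own __repr__."""

    def __init__(self, a):
        self.a = a

    def __repr__(self):
        return "Base repr"


class MixinLast(Base, ReprMixin):
    pass


class MixinFirst(ReprMixin, Base):
    pass


# Python searches the MRO left to right, so the first __repr__ found wins.
print([cls.__name__ for cls in MixinLast.__mro__])
# ['MixinLast', 'Base', 'ReprMixin', 'object']
print(repr(MixinLast(1)))   # Base repr
print(repr(MixinFirst(1)))  # <MixinFirst a=1>
```

This is why you'll often see mixins listed before the main base class.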
Composition is a fancy way of saying we're going to build an object using other objects, in other words: composing them from parts. It's a similar idea to inheritance, but instead the objects we're using are stored as attributes on the main object. We have spreadsheet rows, why not a spreadsheet to hold them?
In [6]:
class Spreadsheet(ReprMixin):
    def __init__(self, name):
        self.name = name
        self.rows = []

    def show_all(self):
        for row in self.rows:
            print("The total for {} is {}".format(row.label, row.calc()))

    def total(self):
        return sum(r.calc() for r in self.rows)

sheet = Spreadsheet("alec's totals")
sheet.rows.extend([Row(label=1, a=1, b=2, x=3), Row(label=2, a=3, b=5, x=8)])
sheet.show_all()
print(sheet.total())
repr(sheet)
Out[6]:
Here we're not only reusing the ReprMixin so we can have accurate information about our Spreadsheet object, we're also reusing the Row objects to provide that logic for free, leaving us to just implement the show_all and total methods.
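Composition also lets the outer object decide how much of the inner object it exposes. As a loose sketch of one way to do that (not something the Spreadsheet class above does), a container can forward a few operations to the list it holds rather than inheriting from list. Basket and its methods are hypothetical.

```python
class Basket:
    """Hypothetical container that holds its parts in a list attribute."""

    def __init__(self, name):
        self.name = name
        self.items = []

    def add(self, item):
        # Delegate storage to the list instead of inheriting from list.
        self.items.append(item)

    def __len__(self):
        return len(self.items)

    def __iter__(self):
        return iter(self.items)

    def total(self):
        # sum() works on self because __iter__ is defined above.
        return sum(self)


basket = Basket("numbers")
basket.add(3)
basket.add(5)

print(len(basket))     # 2
print(list(basket))    # [3, 5]
print(basket.total())  # 8
```

Callers get len() and iteration without being able to reach every list method through the object itself, which keeps the public surface small.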
Decorators are a way of factoring logic out of a class or function and into another class or function, or of adding extra logic to it. That sounds confusing, but it's really not. I've written about them elsewhere, so if you're unfamiliar with them I recommend reading that first. Here, we're going to use two decorators that Python provides: total_ordering, from the standard library's functools module, so we can sort our Row objects, and the built-in property decorator, which allows us to treat a method as if it were an attribute (via the descriptor protocol, which is a fantastic code reuse ability that I won't explore here).
In [7]:
from functools import total_ordering

@total_ordering
class ComparableRow(Row):
    @property
    def __key(self):
        return (self.a, self.b, self.x)

    def __eq__(self, other):
        return self.__key == other.__key

    def __lt__(self, other):
        return self.__key < other.__key

rows = sorted([ComparableRow(label=1, a=3, b=5, x=8), ComparableRow(label=2, a=1, b=2, x=3)])
print(rows)
What total_ordering does is provide the missing rich comparison operators for us. Even though we only defined __lt__ and __eq__ here, we also have __le__, __gt__, and __ge__ available to us. (__ne__ works too, but that one comes from Python 3 itself, which derives it from __eq__ by default.)
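Here is a small, self-contained sketch that exercises each operator. Version is a hypothetical class invented for this example; the isinstance checks are an extra precaution I've added, not something total_ordering requires.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical class used only to exercise total_ordering."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


old, new = Version(1, 2), Version(1, 10)

print(old < new)    # True  (defined by hand)
print(old <= new)   # True  (filled in by total_ordering)
print(old > new)    # False (filled in by total_ordering)
print(old >= new)   # False (filled in by total_ordering)
print(old != new)   # True  (derived from __eq__ by Python 3)

# total_ordering adds the methods to the class itself; __ne__ is not one of them.
print('__le__' in vars(Version), '__ne__' in vars(Version))  # True False
```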
Decorators are an incredibly powerful way to modify your regular Python functions and objects.
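The decorators used so far ship with Python, but writing your own follows the same idea. Below is a minimal sketch of a function decorator that pulls "remember every call" logic out of the function it wraps. The name logged and the calls attribute are hypothetical, chosen just for this example.

```python
from functools import wraps


def logged(func):
    """Hypothetical decorator: record each call's arguments and result."""
    @wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls.append((args, kwargs, result))
        return result
    wrapper.calls = []
    return wrapper


@logged
def calc(a, b, x):
    """Our business crucial algorithm"""
    return (a + b) * x


print(calc(1, 2, 3))   # 9
print(calc.calls)      # [((1, 2, 3), {}, 9)]
print(calc.__name__)   # calc
```

The calculation itself stays untouched; the bookkeeping lives in one place and can be applied to any other function with a single line.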
Context managers are a way of handling operations you typically do in pairs: open a file, close a file; start a timer, end a timer; acquire a lock, release a lock; start a transaction, end a transaction. Really, anything you do in pairs should be a candidate for context managers.
Writing context managers is pretty easy, though how easy depends on which approach you take. I'll likely explore them in a future post. For now, I'm going to stick to using the generator context manager form as an example:
In [8]:
from contextlib import contextmanager

@contextmanager
def greeting(name=None):
    print("Before the greeting.")
    yield "Hello {!s}".format(name)
    print("After the greeting.")

with greeting("Alec") as greet:
    print(greet)
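One caveat with the generator form: code after the yield only runs if the body of the with block finishes without raising, unless you wrap the yield in try/finally. The sketch below uses the "start a timer, end a timer" pair to show that, alongside a class-based equivalent built on __enter__ and __exit__. The names timer, Timer and results are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(results, label):
    """Hypothetical helper: store elapsed seconds in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the body of the with block raises.
        results[label] = time.perf_counter() - start


class Timer:
    """Class-based equivalent using __enter__ and __exit__."""

    def __init__(self, results, label):
        self.results = results
        self.label = label

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.results[self.label] = time.perf_counter() - self.start
        return False  # do not swallow exceptions


results = {}

with timer(results, "generator"):
    sum(range(1000))

try:
    with Timer(results, "class"):
        raise ValueError("boom")
except ValueError:
    pass

print(sorted(results))  # ['class', 'generator']
```

Both timings are recorded, including the one whose block raised.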
We won't be writing a context manager here, but rather using one to implement an "alternate constructor" for our Spreadsheet class. Alternate constructors are a way of initializing an object in a specific way. These are especially handy if you find yourself occasionally creating an object under certain conditions. Consider dict.fromkeys, which lets you fill a dictionary with keys from an iterable that all have the same value:
In [9]:
print(dict.fromkeys(range(5), None))
In our case, we'll probably want to draw our information from a CSV file occasionally. If we do it often enough, rewriting the setup logic all over the place becomes tedious.
In [10]:
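import csv

class CSVSpreadsheet(Spreadsheet):
    @classmethod
    def from_csv(cls, sheetname, filename):
        sheet = cls(sheetname)
        # newline='' is what the csv module recommends when opening files
        with open(filename, newline='') as fh:
            # csv.reader accepts the file object directly; no readlines() needed
            reader = csv.reader(fh)
            # Columns are passed positionally, so they map to label, a, b, x
            sheet.rows = [ComparableRow(*map(int, row)) for row in reader]
        return sheet

sheet = CSVSpreadsheet.from_csv('awesome', 'row.csv')
sheet.show_all()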
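Since row.csv isn't included here, the following is a self-contained sketch of the same pattern that writes its own throwaway file first. Table and SubTable are hypothetical stand-ins for the spreadsheet classes, and the rows are kept as plain tuples to keep the example short. It also shows why the constructor uses cls rather than a hard-coded class name.

```python
import csv
import os
import tempfile


class Table:
    """Hypothetical stand-in for the Spreadsheet class."""

    def __init__(self, name):
        self.name = name
        self.rows = []

    @classmethod
    def from_csv(cls, name, filename):
        # Using cls (not Table) means subclasses get instances of themselves.
        table = cls(name)
        with open(filename, newline='') as fh:
            table.rows = [tuple(map(int, row)) for row in csv.reader(fh)]
        return table


class SubTable(Table):
    pass


# Write a small throwaway CSV so the example does not depend on row.csv.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='', delete=False) as tmp:
    csv.writer(tmp).writerows([[1, 1, 2, 3], [2, 3, 5, 8]])

try:
    table = SubTable.from_csv('awesome', tmp.name)
finally:
    os.remove(tmp.name)

print(type(table).__name__, table.rows)
# SubTable [(1, 1, 2, 3), (2, 3, 5, 8)]
```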