Josh Montague, 2015-12
Built on OS X, with IPython 3.0 on Python 2.7
In this session, we'll dig into some details of Python functions. The end goal will be to understand how and why you might want to create decorators with Python functions.
Note: there is an Appendix at the end of this notebook that dives deeper into scope and Python namespaces. I wrote out the content because it's quite relevant to talking about decorators. But, ultimately, we only have 45 minutes, and it couldn't all fit. If you're curious, take a few extra minutes to review that material, as well.
A lot of this RST has to do with understanding the subtleties of Python functions. So, we're going to spend some time exploring them.
In Python, functions are objects
[T]his means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures. (wiki)
This is not true (or at least not easy) in all programming languages. I don't have a ton of experience to back this up. But, many moons ago, I remember that Java functions only lived inside objects and classes.
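To make that quote a little more concrete, here's a minimal sketch that touches three of the abilities it lists (returning functions from functions comes up later). The names to_upper, to_lower, apply_to and styles are made up for this illustration.

```python
def to_upper(text):
    """Return text in upper case."""
    return text.upper()

def to_lower(text):
    """Return text in lower case."""
    return text.lower()

def apply_to(func, text):
    """Call whatever function was passed in on text."""
    return func(text)

# 1. assign a function to a variable (no parens, so nothing is called)
loud = to_upper

# 2. store functions in a data structure
styles = {'loud': to_upper, 'quiet': to_lower}

# 3. pass a function as an argument to another function
result_loud = apply_to(loud, 'Hello')
result_quiet = apply_to(styles['quiet'], 'Hello')

print(result_loud)
print(result_quiet)
```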
Let's take a moment to look at a relatively simple function and appreciate what it does and what we can do with it.
In [1]:
def duplicator(str_arg):
    """Create a string that is a doubling of the passed-in arg."""
    # use the passed arg to create a larger string (double it, with a space between)
    internal_variable = ' '.join( [str_arg, str_arg] )
    return internal_variable
In [2]:
# print (don't call) the function
print( duplicator )
# equivalently (in IPython):
#duplicator
Remember that IPython and Jupyter will automatically (and conveniently!) call the __repr__ of an object if it is the last thing in the cell. But I'll use the print() function explicitly just to be clear.
This displays the string representation of the object. For a function, it usually includes the object's type, its name, and its memory location.
Now, let's actually call the function (which we do with use of the parentheses), and assign the return value (a string) to a new variable.
In [3]:
# now *call* the function by using parens
output = duplicator('yo')
In [4]:
# verify the expected behavior
print(output)
Because functions are objects, they have attributes just like any other Python object.
In [5]:
# the dir() built-in function displays the argument's attributes
dir(duplicator)
Out[5]:
Because functions are objects, we can pass them around like any other data type. For example, we can assign them to other variables! If you occasionally still have dreams about the Enumerator, this will look familiar.
In [6]:
# first, recall the normal behavior of duplicator()
duplicator('ring')
Out[6]:
In [7]:
# now create a new variable and assign our function to it
another_duplicator = duplicator
In [8]:
# now, we can use the *call* notation because the new variable is
# assigned the original function
another_duplicator('giggity')
Out[8]:
In [9]:
# and we can verify that this is actually a reference to the
# original function
print( "original function: %s" % duplicator )
print('')
print( "new function: %s" % another_duplicator )
By looking at the memory location, we can see that the second name is just another reference to the very same function object! Cool!
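If you'd rather not compare hex addresses by eye, here's a small sketch of the same check done in code. It redefines duplicator() so that it runs on its own.

```python
def duplicator(str_arg):
    """Create a string that is a doubling of the passed-in arg."""
    return ' '.join([str_arg, str_arg])

another_duplicator = duplicator

# `is` compares identity: both names are bound to the very same object
same_object = another_duplicator is duplicator
print(same_object)

# id() returns an integer identifying the object; in CPython it is the
# memory address that shows up (in hex) in the printed representation
print(id(duplicator) == id(another_duplicator))

# the function still remembers the name it was defined with
print(another_duplicator.__name__)
```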
With an understanding of what's inside a function and what we can do with it, consider the case where we define a new function within another function.
This may seem overly complicated for a little while, but stick with me.
In the example below, we'll define an outer function which includes a local variable, then a local function definition. The inner function returns a string. The outer function calls the inner function, and returns the resulting value (a string).
In [10]:
def speaker():
    """
    Simply return a word (a string).
    Other than possibly asking 'why are you writing this simple
    function in such a complicated fashion?' this should
    hopefully be pretty clear.
    """
    # define a local variable
    word = 'hello'
    def shout():
        """Return a capitalized version of word."""
        # though not in the innermost scope, this is in the namespace one
        # level out from here
        return word.upper()
    # call shout and then return the result of calling it (a string)
    return shout()
In [11]:
# remember that the result is a string, now print it. the sequence:
# - put word and shout in local namespace
# - define shout()
# - call shout()
# - look for 'word', return it
# - return the return value of shout()
print( speaker() )
Now, this may be intuitive, but it's important to note that the inner function is not accessible outside of the outer function. The interpreter can always step out into larger (or "more outer") namespaces, but we can't dig deeper into smaller ones.
In [12]:
try:
    # this function only exists in the local scope of the outer function
    shout()
except NameError as e:
    print(e)
In [13]:
def speaker_func():
    """Similar to speaker(), but this time return the actual inner function!"""
    word = 'hello'
    def shout():
        """Return an all-caps version of word."""
        return word.upper()
    # don't *call* shout(), just return it
    return shout
In [14]:
# remember: our function returns another function
print( speaker_func() )
Remember that the return value of the outer function is another function. And just like we saw earlier, we can print the function to see the name and memory location.
Note that the name is that of the inner function. Makes sense, since that's what we returned.
Like we said before, since this is an object, we can pass this function around and assign it to other variables.
In [15]:
# this will assign to the variable new_shout, a value that is the shout function
new_shout = speaker_func()
Which means we can also call it with parens, as usual.
In [16]:
# which means we can *call* it
new_shout()
Out[16]:
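Returning an inner function gets more interesting once the outer function takes an argument. Here's a sketch of a variation on speaker_func(); the name make_speaker and its word parameter are made up for this illustration.

```python
def make_speaker(word):
    """Return a function that shouts the given word."""
    def shout():
        # word is found one scope out, in make_speaker's namespace
        return word.upper()
    return shout

# each call to make_speaker() hands back a separate function object
shout_hello = make_speaker('hello')
shout_bye = make_speaker('bye')

print(shout_hello())
print(shout_bye())
```

Each returned function holds on to the word it was created with, which is the behavior the Appendix digs into.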
In [17]:
from operator import itemgetter
# we might want to sort this by the first or second item
tuple_list = [(1,5),(9,2),(5,4)]
# itemgetter is a callable (like a function) that we pass in as an argument to sorted()
sorted(tuple_list, key=itemgetter(1))
Out[17]:
In [18]:
def tuple_add(tup):
    """Sum the items in a tuple."""
    return sum(tup)
# now we can map the tuple_add() function across the tuple_list iterable.
# note that we're passing a function as an argument!
map(tuple_add, tuple_list)
Out[18]:
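The callable you pass in doesn't have to come from the standard library. As a sketch, here's the same sort using a hand-written key function (second_item is a made-up name), which should behave like itemgetter(1) for this data.

```python
from operator import itemgetter

tuple_list = [(1, 5), (9, 2), (5, 4)]

def second_item(tup):
    """Return the item at index 1 of a tuple."""
    return tup[1]

# sorted() calls the key function once per element to decide the order
by_itemgetter = sorted(tuple_list, key=itemgetter(1))
by_own_function = sorted(tuple_list, key=second_item)

print(by_own_function)
print(by_itemgetter == by_own_function)

# wrapping map() in list() gives a list on both Python 2 and Python 3
sums = list(map(sum, tuple_list))
print(sums)
```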
If we can pass functions into and out of other functions, then I propose that we can extend or modify the behavior of a function without actually editing the original function!
🎉💥🎉💥🎉💥🎉💥🎉💥
For example, say there's some previously-defined function and you'd like it to be more verbose. For now, let's just assume that printing a bunch of information to stdout is our goal.
Below, we define a function verbose() that takes another function as an argument. It does other tasks both before and after actually calling the passed-in function.
In [19]:
def verbose(func):
    """Add some marginally annoying verbosity to the passed func."""
    def inner():
        print("heeeeey everyone, I'm about to do a thing!")
        print("hey hey hey, I'm about to call a function called: {}".format(func.__name__))
        print('')
        # now call (and print) the passed function
        print(func())
        print('')
        print("whoa, did everyone see that thing I just did?! SICK!!")
    return inner
Now, imagine we have a function that we wish had more of this type of "logging." But, we don't want to jump in and add a bunch of code to the original function.
In [20]:
# here's our original function (that we don't want to modify)
def say_hi():
    """Return 'hi'."""
    return '--> hi. <--'
In [21]:
# understand the original behavior of the function
say_hi()
Out[21]:
Instead, we pass the original function as an arg to our verbose function. Remember that this returns the inner function, so we can assign it and then call it.
In [22]:
# this is now a function...
verbose_say_hi = verbose(say_hi)
In [23]:
# which we can call...
verbose_say_hi()
Looking at the output, we can see that when we called verbose_say_hi(), all of the code in it ran: the two opening print statements ran, then say_hi() was called and its return value was printed, and finally the closing print statement ran.
We'd now say that verbose_say_hi() is a decorated version of say_hi(). And, correspondingly, that verbose() is our decorator.
A decorator is a callable that takes a function as an argument and returns a function (probably a modified version of the original function).
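One limitation of verbose() as written: its inner() takes no arguments and doesn't hand back the result, so it only suits zero-argument functions. Here's a hedged sketch of a more general shape. The names announce and multiply are made up, and this is one common pattern rather than the only way to do it.

```python
def announce(func):
    """Print a message before and after calling func, then return its result."""
    def inner(*args, **kwargs):
        print("about to call: {}".format(func.__name__))
        # forward whatever arguments we were given to the original function
        result = func(*args, **kwargs)
        print("finished calling: {}".format(func.__name__))
        return result
    return inner

def multiply(a, b):
    """Return the product of a and b."""
    return a * b

announced_multiply = announce(multiply)
total = announced_multiply(2, b=3)
print(total)
```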
Now, you may also decide that the modified version of the function is the only version you want around. And, further, you don't want to change any other code that may depend on this. In that case, you want to overwrite the namespace value for the original function!
In [24]:
# this will clobber the existing namespace value (the original function def).
# in its place we have the verbose version!
say_hi = verbose(say_hi)
say_hi()
One use-case where this technique can be useful is when you need to use an existing base of code that you can't edit. There's an existing library that defines classes and methods that are aligned with your needs, but you need a slight variation on them.
Imagine there is a library called (creatively) uneditable_lib that implements a Coordinate class (a point in two-dimensional space), and an add() method. The add() method allows you to add the vectors of two Coordinates together and returns a new Coordinate object. It has great documentation and you know the Python source code looks like this:
In [25]:
! cat _uneditable_lib.py
In [26]:
! ls | grep .pyc
In [27]:
# you can still *use* the compiled code
from uneditable_lib import Coordinate, add
In [28]:
# make a couple of coordinates using the existing library
coord_1 = Coordinate(x=100, y=200)
coord_2 = Coordinate(x=-500, y=400)
print( coord_1 )
In [29]:
print( add(coord_1, coord_2) )
But, imagine that for our particular use-case, we need to confine the resulting coordinates to the first quadrant (that is, x > 0 and y > 0). We want any negative component in the coordinates to just be truncated to zero.
We can't edit the source code, but we can decorate (and modify) it!
In [30]:
def coordinate_decorator(func):
    """Decorates the pre-built source code for Coordinates.
    We need the resulting coordinates to only exist in the
    first quadrant, so we'll truncate negative values to zero.
    """
    def checker(a, b):
        """Enforces first-quadrant coordinates."""
        ret = func(a, b)
        # check the result and make sure we're still in the
        # first quadrant at the end [ that is, x and y > 0 ]
        if ret.x < 0 or ret.y < 0:
            ret = Coordinate(ret.x if ret.x > 0 else 0,
                             ret.y if ret.y > 0 else 0
                             )
        return ret
    return checker
We can decorate the preexisting add() function with our new wrapper. And since we may be using other code from uneditable_lib with an API that expects the function to still be called add(), we can just overwrite that namespace variable.
In [31]:
# first we decorate the existing function
add = coordinate_decorator(add)
In [32]:
# then we can call it as before
print( add(coord_1, coord_2) )
And, we now have a truncated Coordinate that lives in the first quadrant.
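If you don't have uneditable_lib handy, here's a self-contained sketch of the same idea. The Coordinate namedtuple and add() below are stand-ins that I'm assuming behave like the library's versions (the real ones may differ), and the truncation uses max() in place of the conditional expressions above.

```python
from collections import namedtuple

# stand-in for the library's class; the real Coordinate may differ
Coordinate = namedtuple('Coordinate', ['x', 'y'])

def add(a, b):
    """Stand-in for the library's add(): component-wise vector addition."""
    return Coordinate(a.x + b.x, a.y + b.y)

def coordinate_decorator(func):
    """Truncate negative components of func's result to zero."""
    def checker(a, b):
        ret = func(a, b)
        # max(value, 0) leaves non-negative values alone and lifts negatives to 0
        return Coordinate(max(ret.x, 0), max(ret.y, 0))
    return checker

coord_1 = Coordinate(x=100, y=200)
coord_2 = Coordinate(x=-500, y=400)

raw_sum = add(coord_1, coord_2)

# overwrite the name add with the decorated version, then call it as before
add = coordinate_decorator(add)
clipped_sum = add(coord_1, coord_2)

print(raw_sum)
print(clipped_sum)
```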
In [33]:
from IPython.display import Image
Image(url='http://i.giphy.com/8VrtCswiLDNnO.gif')
Out[33]:
If we are running out of time, this is an ok place to wrap up.
Here are some real examples you might run across in the wild:
@app.route is a decorator that lets you decorate an arbitrary Python function and turn it into a URL path.
@login_required is a decorator that lets your function define the appropriate authentication.
If you go home tonight and can't possibly wait to learn more about decorators, here are the next things to look up:
@functools.wraps
If there is sufficient interest in a Decorators, Part Deux, those would be good starters.
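As a small teaser for functools.wraps, here's a sketch of the problem it addresses: a decorated function normally reports the name and docstring of the inner function. The decorators verbose_plain and verbose_wrapped are made-up names, and the example uses the @ shorthand that's covered further down.

```python
import functools

def verbose_plain(func):
    """Decorator without functools.wraps."""
    def inner():
        print("calling {}".format(func.__name__))
        return func()
    return inner

def verbose_wrapped(func):
    """Same decorator, but inner is itself decorated with functools.wraps."""
    @functools.wraps(func)
    def inner():
        print("calling {}".format(func.__name__))
        return func()
    return inner

def say_hi():
    """Return 'hi'."""
    return '--> hi. <--'

plain = verbose_plain(say_hi)
wrapped = verbose_wrapped(say_hi)

# without wraps, the decorated function reports inner's name and (missing) docstring
print(plain.__name__)
print(plain.__doc__)

# with wraps, the original name and docstring are copied onto the wrapper
print(wrapped.__name__)
print(wrapped.__doc__)
```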
If so, here are a couple of other useful things worth saying, quickly...
You might still want to use a decorator to modify a function that you wrote in your own code.
You might ask "But if we're already writing the code, why not just make the function do what we want in the first place?" Valid question.
One place where this comes up is in following DRY (Don't Repeat Yourself) software engineering practices. If an identical block of logic is to be used in many places, that code should ideally be written in only one place.
In our case, we could imagine making a bunch of different functions more verbose. Instead of adding the verbosity (print statements) to each of the functions, we should define that once and then decorate the other functions.
Another nice example is making your code easier to understand by separating necessary operational logic from the business logic.
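Here's a rough sketch of both points. The logging is written once in a decorator (logged is a made-up name), and the decorated functions (load_data and count_rows, also made up) contain nothing but their own logic.

```python
def logged(func):
    """Print a line before and after calling func, and return its result."""
    def inner():
        print("--> calling {}".format(func.__name__))
        result = func()
        print("<-- done with {}".format(func.__name__))
        return result
    return inner

def load_data():
    """Pretend to load some rows."""
    return [1, 2, 3]

def count_rows():
    """Pretend to count some rows."""
    return 3

# the logging logic lives in one place and is applied to as many functions as we like
load_data = logged(load_data)
count_rows = logged(count_rows)

data = load_data()
rows = count_rows()
print(data)
print(rows)
```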
There's a nice shorthand - some syntactic sugar - for this kind of statement. To illustrate it, let's just use a variation on a function from earlier. First, see how the original function behaves:
In [34]:
def say_bye():
    """Return 'bye'."""
    return '--> bye. <--'
say_bye()
Out[34]:
Remember the verbose() decorator that we already created? If this function (and perhaps others) should be made verbose at the time they're defined, we can apply the decorator right then and there using the @ shorthand:
In [35]:
@verbose
def say_bye():
    """Return 'bye'."""
    return '--> bye. <--'
say_bye()
In [ ]:
Image(url='http://i.giphy.com/piupi6AXoUgTe.gif')
But that shouldn't actually blow your mind. Based on our discussion before, you can probably guess that the decorator notation is just shorthand for:
say_bye = verbose( say_bye )
One place where this shorthand can come in particularly handy is when you need to stack a bunch of decorators. In place of nested decorators like this:
my_func = do_thing_a( add_numbers( subtract( verify( my_func ))))
We can write this as:
@do_thing_a
@add_numbers
@subtract
@verify
def my_func():
    # something useful happens here
    pass
Note that the order matters!
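To see why, here's a small sketch with two made-up decorators, bracket and exclaim, applied in both orders. The decorator closest to the def line is applied first.

```python
def bracket(func):
    """Wrap the string returned by func in square brackets."""
    def inner():
        return '[' + func() + ']'
    return inner

def exclaim(func):
    """Append an exclamation mark to the string returned by func."""
    def inner():
        return func() + '!'
    return inner

# the decorator closest to the def is applied first,
# so this is shorthand for greet_a = bracket(exclaim(greet_a))
@bracket
@exclaim
def greet_a():
    return 'hi'

# swapping the order gives greet_b = exclaim(bracket(greet_b))
@exclaim
@bracket
def greet_b():
    return 'hi'

print(greet_a())  # [hi!]
print(greet_b())  # [hi]!
```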
Ok, thank you, please come again.
Ok, final round, I promise.
This is material that I originally intended to include in this RST (because it's relevant), but ultimately cut for clarity. You can come back and revisit it any time.
Roughly speaking, scope and namespace are the reason you can type some code (like variable_1 = 'dog') and then transparently use variable_1 later in your code. Sounds obvious, but easy to take for granted!
The concepts of scope and namespace in Python are pretty interesting and complex. Some time when you're bored and want to learn some details, have a read of this nice explainer or the official docs on the Python Execution Model.
A short way to think about them is the following:
While this RST isn't explicitly about scope, understanding these concepts will make it easier to read the code later on. Let's look at some examples.
There are two built-in functions that can aid in exploring the namespace at various points in your code: globals() and locals() return a dictionary of the names and values in their respective scope.
Since the namespaces in IPython are often huge, let's use IPython's bash magic to call out to a normal Python session to test how globals() works:
In [ ]:
# -c option starts a new interpreter session in a subshell and evaluates the code in quotes.
# here, we just assign the value 3 to the variable x and print the global namespace
! python -c 'x=3; print( globals() )'
Note that there are a bunch of other dunder names that are in the global namespace. In particular, note that '__name__' = '__main__' because we ran this code from the command line (a comparison that you've made many times in the past!). And you can see the variable x that we assigned the value of 3.
We can also look at the namespace in a more local scope with the locals() function. Inside the body of a function, the local namespace is limited to those variables defined within the function.
In [ ]:
# this var is defined at the "outermost" level of this code block
z = 10
def printer(x):
    """Print some things to stdout."""
    # create a new var within the scope of this function
    animal = 'baboon'
    # ask about the namespace of the inner-most scope, "local" scope
    print('local namespace: {}\n'.format(locals()))
    # now, what about this var, which is defined *outside* the function?
    print('variable defined *outside* the function: {}'.format(z))
In [ ]:
printer(17)
First, you can see that when our scope is 'inside the function', the namespace is very small. It's the local variables defined within the function, including the arg we passed the function.
But, you can also see that we can still "see" the variable z, which was defined outside the function. This is because even though z doesn't exist in the local namespace, this is just the "innermost" of a series of nested namespaces. When we failed to find z in locals(), the interpreter steps "out" a layer, and looks for a namespace key (variable name) that's defined outside of the function. If we look through this (and any larger) namespace and still fail to find a key (variable name) for z, the interpreter will raise a NameError.
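Here's a sketch of that outward search. It uses a nested function so there are several layers to step through, and then a lookup that fails at every layer. All of the names (outer_scope, inner_scope, missing_name, not_defined_anywhere) are made up for the illustration.

```python
def outer_scope():
    """Define z one layer out from where it is used."""
    z = 10                      # lives in outer_scope()'s namespace
    def inner_scope():
        animal = 'baboon'       # lives in inner_scope()'s namespace
        # z isn't assigned in here, so the lookup steps out one layer to find it;
        # len isn't assigned anywhere in this code, so that lookup keeps stepping
        # out until it reaches the built-in names
        return animal, z, len(animal)
    return inner_scope()

def missing_name():
    """Look up a name that exists in no enclosing namespace."""
    try:
        return not_defined_anywhere
    except NameError as e:
        # every layer was searched and none had the name
        return str(e)

found = outer_scope()
message = missing_name()
print(found)
print(message)
```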
While the interpreter will always continue looking in larger or more outer scopes, it can't do the opposite. Since animal is created and assigned within the scope of our function, it goes "out of scope" as soon as the function returns. Local variables defined within the scope of a function are only accessible from that same scope - inside the function.
In [ ]:
try:
    # remember that this var was created and assigned only within the function
    animal
except NameError as e:
    print(e)
In [ ]:
def outer(x):
    def inner():
        print(x)
    return inner
We saw earlier that the variable x isn't directly accessible outside of the function outer() because it's created within the scope of that function. But, Python's function closures mean that because inner() is not defined in the global scope, it keeps track of the surrounding namespace wherein it was defined. We can verify this by inspecting an example object:
In [ ]:
o = outer(7)
o()
In [ ]:
try:
    x
except NameError as e:
    print(e)
In [ ]:
print( dir(o) )
In [ ]:
print( o.func_closure )
And, there in the repr of the object's func_closure attribute, we can see there is an int still stored! This is the value that we passed in during the creation of the function.
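If you want to pull the stored value out rather than read it off the repr, here's a sketch. It uses the __closure__ spelling of the attribute, which is available on Python 2.7 and Python 3 (func_closure is Python 2 only), and it has inner() return x instead of printing it so the value is easy to check.

```python
def outer(x):
    def inner():
        return x
    return inner

o = outer(7)

# __closure__ is a tuple of "cell" objects, one per captured variable
cells = o.__closure__
captured = [cell.cell_contents for cell in cells]
print(captured)

# each call to outer() creates a separate closure with its own captured value
p = outer('dog')
print(p.__closure__[0].cell_contents)
```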