Python hints and tricks

I’ve put together a collection of nice tricks and time savers that might help make your Python more Pythonic!

In no particular order...

Use list comprehensions

These one-line constructs make creating list objects trivially easy, e.g.


In [ ]:
my_list = [ x**2 for x in range(100) ]

print(my_list[2])

For the more adventurous, it’s also possible to include conditions and nested comprehensions, but don’t overdo it: I’ve seen five-line comprehensions before and it’s not pretty!


In [ ]:
my_constrained_list = [ x**2 for x in range(100) if x%2 == 0]

print(my_constrained_list[2])
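
As a rough sketch of the nested form mentioned above (the names `pairs`, `nested` and `flattened` are just made up for illustration), the `for` clauses read left to right, exactly like nested for loops:

```python
# Two 'for' clauses behave like an outer loop followed by an inner loop
pairs = [(x, y) for x in range(3) for y in range(2)]

# The same pattern flattens a list of lists
nested = [[1, 2], [3, 4], [5]]
flattened = [item for row in nested for item in row]

print(pairs)
print(flattened)
```

Anything much more involved than this is usually clearer as an ordinary loop.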

Know when not to use list comprehensions - using generators instead

Generators allow you to declare a function that behaves as an iterator. That is, the values are not all evaluated and stored in memory when the generator is declared (as they are for a list comprehension); rather, each value is computed only when the next item is requested.

For cases where the values are only iterated over once, or where the full result would be too large to store in memory, the benefits are obvious. It is easy to define functions which act as generators, but you can also use a ‘generator expression’ (sometimes called a generator comprehension), which is almost identical to a list comprehension except that it uses parentheses, e.g.


In [ ]:
my_gen = ( x**2 for x in range(10**9) )

Note however that you can't index them directly, only iterate over them (which makes sense if you think about it):


In [ ]:
total = 0
for val in my_gen:
    total += val
    if val > 10**4: 
        break
     
print(total)
# print(my_gen[4]) <-- This won't work: generators can't be indexed (TypeError)
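
For completeness, here is a minimal sketch of a generator written as a function, which is the other option mentioned above. The function name `squares` is my own, and `itertools.islice` is used as one way of skipping ahead when you really do want "the fifth value":

```python
from itertools import islice

def squares(limit):
    """Yield x**2 for x in range(limit), one value at a time."""
    for x in range(limit):
        yield x**2

# Nothing is computed until we start asking for values
gen = squares(10**9)

# islice skips ahead lazily, consuming the generator as it goes
fifth = next(islice(gen, 4, 5))

# The generator carries on from where it stopped
following = next(gen)

print(fifth, following)
```

Note that skipping ahead like this uses up the earlier values, so it is not a true substitute for indexing a list.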

Dictionary comprehensions

Dictionaries are a very useful construct in Python. The simplest way to create one is a literal that specifies each key: value pair, e.g.


In [ ]:
my_dict = { 'Book' : 5, 'Monkey': 7, 'Paper': 23.4 }

print(my_dict['Monkey'])

Or build them with a dictionary comprehension over any iterable:


In [ ]:
my_new_dict = { x: x**2 for x in range(100) }

print(my_new_dict[5])
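
Dictionary comprehensions are also handy for transforming an existing dictionary. A small sketch, where the names `prices`, `cheap` and `by_price` are invented for the example:

```python
prices = {'Book': 5, 'Monkey': 7, 'Paper': 23.4}

# Keep only the entries that satisfy a condition
cheap = {name: price for name, price in prices.items() if price < 10}

# Swap keys and values (only sensible if the values are hashable and unique)
by_price = {price: name for name, price in prices.items()}

print(cheap)
print(by_price[7])
```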

Dictionary values as functions / Classes

It may not be immediately obvious to new Python programmers, but because classes and functions are first-class objects it is trivially easy to store them in lists, or even dictionaries. (One great example of this is an implementation of the strategy pattern using dictionaries.)


In [ ]:
def nuts():
    return 'Peanuts'

def cheese():
    return 'Edam'

feed = {'Monkey': nuts, 'Mouse': cheese}

my_food = feed['Mouse']()

print(my_food)
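
To give a flavour of the strategy pattern mentioned above, here is one possible sketch. The functions `add`, `subtract` and `calculate` are hypothetical; the point is only that the dictionary lookup replaces a chain of if/elif statements:

```python
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

# Map each operation name to the function that implements it
operations = {'add': add, 'subtract': subtract}

def calculate(name, a, b):
    """Look up the strategy by name and call it."""
    try:
        operation = operations[name]
    except KeyError:
        raise ValueError("Unknown operation: {}".format(name))
    return operation(a, b)

print(calculate('add', 2, 3))
print(calculate('subtract', 2, 3))
```

Adding a new strategy is then just a matter of adding another entry to the dictionary.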

The 'map' function

This function makes it really easy to apply an operation to every item in a collection, e.g.


In [ ]:
def calculate_squares(x):
    return x**2

squares = map(calculate_squares, range(20))

It returns an iterator that applies the given function to each value in the list (which may in fact be any form of iterable).
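
Because the result is an iterator (this assumes Python 3), it can only be consumed once. The following sketch shows that, along with the fact that `map` accepts several iterables at a time; the variable names are illustrative only:

```python
def calculate_squares(x):
    return x**2

squares = map(calculate_squares, range(5))

# A map object is used up as it is iterated
first_pass = list(squares)
second_pass = list(squares)  # already exhausted, so this is empty

# map also accepts several iterables and passes one item from each
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))

print(first_pass, second_pass, sums)
```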

Parallel map

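It's almost as easy to map a function across all the cores in your machine (the function must be defined at module level so that it can be pickled and sent to the worker processes):


In [ ]:
from multiprocessing import Pool

# The context manager shuts the worker processes down afterwards
with Pool() as pool:
    squares = pool.map(calculate_squares, range(20))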

Unpacking arguments

It is possible to unpack a list into a function call as positional arguments (here, the mandatory ones), e.g.


In [ ]:
def example_function(food_type, amount, colour=''):
    print("Type: {}".format(food_type))
    print("Amount: {}".format(amount))
    print("Colour: {}".format(colour))
    
arguments = ['Nuts', 50]
example_function(*arguments) # unpacks my list into positional arguments

or, unpacking dictionaries as keyword arguments (here, the optional one):


In [ ]:
arg_dict = {'colour': 'Blue'}
example_function('Cheese', 2, **arg_dict)

or, both:


In [ ]:
example_function(*arguments, **arg_dict)

You can even unpack NumPy arrays! Note that order matters for arguments unpacked from a list, but not for those unpacked from a dictionary, because the latter are matched by name.
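
A quick sketch of why order matters in one case and not the other. The function `describe` is a made-up stand-in that returns a string rather than printing:

```python
def describe(food_type, amount, colour=''):
    return "{} x{} ({})".format(food_type, amount, colour)

# Unpacking a list follows the order of the list...
as_given = describe(*['Nuts', 50])
swapped = describe(*[50, 'Nuts'])

# ...whereas unpacking a dict matches on names, so order is irrelevant
by_name = describe(**{'amount': 2, 'colour': 'Blue', 'food_type': 'Cheese'})

print(as_given)
print(swapped)
print(by_name)
```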

Unpacking return values

It’s also possible to unpack return values of a function directly:


In [ ]:
import numpy as np

def moments(x):
    """Return the zeroth moment (1), the mean and the standard deviation of x"""
    return 1, np.mean(x), np.std(x)

print(moments(np.arange(10)))

first, second, third = moments(np.arange(10))

print(third)

A great example demonstrating both this and the previous trick is in-place value swapping, e.g.:


In [ ]:
a, b = 5, 10

print(a, b)

a, b = b, a

print(a, b)
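
A few more variations on the same idea, as a rough sketch. The starred form assumes Python 3, and the variable names are only for illustration:

```python
# divmod returns a (quotient, remainder) tuple
quotient, remainder = divmod(17, 5)

# A starred name collects whatever is left over (Python 3 only)
head, *tail = [1, 2, 3, 4]

# An underscore is a common convention for values you don't need
_, second, _ = ('a', 'b', 'c')

print(quotient, remainder, head, tail, second)
```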

For (almost) any numerical work use NumPy!

NumPy is a numerical library with very fast linear algebra operations and a number of extremely useful constructs. See http://www.numpy.org/.

Chained comparisons

It is really easy to chain (ternary) comparisons together in an intuitive way e.g.


In [ ]:
def five():
    print("5 being called")
    return 5

def six():
    print("6 being called")
    return 6

if 1 < five() < six():
    print(True)

if 1 > five() > six():
    print(True)

Also, the function five() only gets evaluated once per chain, and the second comparison still gets short-circuited if the first fails, so in the second example six() is never called.
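
If you want to convince yourself of the evaluation order, here is a small sketch that records each call in a list instead of printing. The helper `value` and the list `calls` are hypothetical names:

```python
calls = []

def value(n):
    """Record each call so we can see what was evaluated."""
    calls.append(n)
    return n

# Passes: the middle operand is evaluated once, although it is compared twice
passed = 1 < value(5) < value(6)
calls_when_passed = list(calls)

del calls[:]

# Fails at '1 > 5', so value(6) is never called
failed = 1 > value(5) > value(6)
calls_when_failed = list(calls)

print(passed, calls_when_passed)
print(failed, calls_when_failed)
```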

Conditional assignment

Though often frowned upon, this is actually very readable in Python:


In [ ]:
test = 'Yes' if 1 < five() < six() else 'No'

print('Did my test pass?: {}'.format(test))

Advanced indexing

There are a number of ways of indexing lists which you may not have been aware of:

  • You can count backwards, e.g. access the last element in a list using my_list[-1].
  • You can reverse a list using my_list[::-1].
  • The above is just a special case of setting a step, e.g. my_list[::2] takes every second element.
  • All of the above work on strings!
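
The indexing tricks listed above, gathered into one short sketch (the list `letters` and the string `word` are made up for the example):

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

last = letters[-1]             # count backwards from the end
reversed_copy = letters[::-1]  # a step of -1 walks the list in reverse
every_other = letters[::2]     # start at index 0 and take every second item
middle = letters[1:-1]         # drop the first and last items

# The same syntax works on strings
word = 'Python'
backwards = word[::-1]
ending = word[-3:]

print(last, reversed_copy, every_other, middle, backwards, ending)
```

Note that slicing always returns a new object, so the original list is left untouched.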

Using enumerate

The function enumerate yields a counter alongside each item of an iterable, which can be very useful if you need the index of an item as well as the item itself, e.g.


In [ ]:
for i, x in enumerate(my_list):
    print(i, x)
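
Two further uses of enumerate, sketched with an invented list called `animals`: the optional `start` argument changes where the counter begins, and the index makes it easy to find where things occur.

```python
animals = ['Monkey', 'Mouse', 'Parrot']

# start=1 gives human-friendly numbering without changing the list
numbered = ["{}. {}".format(i, name) for i, name in enumerate(animals, start=1)]

# The index is handy for finding where something occurs
positions = [i for i, name in enumerate(animals) if name.startswith('M')]

print(numbered)
print(positions)
```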

Default dictionary values

In order to avoid having to catch KeyErrors every time you query a dictionary you can use the get method to provide a default value if the key is not present.


In [ ]:
val = 0
try:
    val = my_dict[101]
except KeyError:
    print("That key of the dictionary doesn't exist")

print(my_dict.get(101, 4))

Running external processes

It's really straightforward to call another process in Python:


In [ ]:
from subprocess import check_output, call

call('ls')

In [ ]:
check_output(['ls', '-l'])  # returns the command's output as bytes
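
On Python 3.5 or later, `subprocess.run` wraps both of the calls above in a single function. A minimal sketch, which launches the Python interpreter itself as the child process so that it does not depend on `ls` being available (that substitution is my own choice, not part of the original example):

```python
import subprocess
import sys

# Run the current Python interpreter as the child process
result = subprocess.run(
    [sys.executable, '-c', 'print("hello from a child process")'],
    stdout=subprocess.PIPE,    # capture what the child prints
    universal_newlines=True,   # decode the captured bytes to str
    check=True,                # raise CalledProcessError on a non-zero exit
)

print(result.returncode, result.stdout.strip())
```

Passing the command as a list, rather than a single string, avoids any surprises with shell quoting.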

A further note on default dictionary values: there is also a defaultdict class in the collections module which gives missing keys a default value, or you can use my_dict.setdefault to set a default on a standard dict. There are some subtle differences in when the default is created, though, and some code might expect a KeyError, so take care with this one.
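
To make those subtle differences concrete, here is a small sketch comparing defaultdict, setdefault and get. The names `groups` and `plain` are just for illustration:

```python
from collections import defaultdict

# defaultdict calls the factory (here, list) the first time a key is missing
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)

# Merely looking up a missing key inserts it, which get() never does
_ = groups['z']
has_z = 'z' in groups

# setdefault does the same job on a plain dict, one call at a time
plain = {}
plain.setdefault('a', []).append('apple')
missing = plain.get('b')  # returns None and leaves the dict untouched

print(dict(groups), has_z)
print(plain, missing)
```

The silent insertion on lookup is the behaviour most likely to surprise code that expects a KeyError.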

Named formatting

You may have noticed I've been using empty, positional placeholders ({}) to fill in values. This is probably fine when there is only one value, and it works when there are more, but then it's probably best to use named placeholders, e.g.:


In [ ]:
print("The {foo} is {bar}".format(foo='answer', bar=42))

# Note that you can also unpack a dict into format!

words = {'foo': 'answer', 'bar': '7x6'}
print("The {foo} is {bar}".format(**words))
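
If you can assume Python 3.6 or later, f-strings give you the same named placeholders with less typing. This is a related feature rather than the method used above, so treat it as an optional sketch:

```python
foo = 'answer'
bar = 42

# Names inside the braces are looked up in the surrounding scope
message = f"The {foo} is {bar}"

# Format specifications work just as they do with str.format
padded = f"{bar:05d} and {bar / 8:.2f}"

print(message)
print(padded)
```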

Classes can be created at run-time

This one is definitely not for the faint-hearted. Because classes are first-class objects in Python, it is possible to define them at run-time, e.g. within if statements or even functions. Use with care!


In [ ]:
x = 6

if x < five():
    class test(object):
        def number(self):
            return x
else:
    class test(object):
        def number(self):
            return 5
        
print(test().number())
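
Defining a class inside a function works the same way, and each call produces a brand new class. A sketch, where `make_counter_class` and `Counter` are hypothetical names:

```python
def make_counter_class(step):
    """Return a new class whose instances count in increments of 'step'."""
    class Counter(object):
        def __init__(self):
            self.value = 0

        def increment(self):
            self.value += step  # 'step' is captured from the enclosing call
            return self.value

    return Counter

ByTwo = make_counter_class(2)
ByTen = make_counter_class(10)

counter = ByTwo()
counter.increment()

print(counter.increment(), ByTen().increment(), ByTwo is ByTen)
```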

The with statement is your friend

The with statement is a bit like a try/finally block, but is intended for standard code flow rather than exception handling. For example, a really common use is with file handling:


In [ ]:
with open('test', 'w') as f:
    pass
    # do something

The ‘with’ statement doesn’t take care of the fact that the file may not exist, or other IO errors, but it does ensure that if an exception occurs in the ‘do something’ block then the file gets closed regardless. Obviously, this is most useful for IO, or network connections where you have to ensure some finally block is executed, but should be extendable to more general scenarios.
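
The guarantee described above can be checked with a small sketch. It writes to a temporary directory (my own choice, so that nothing is left behind) and deliberately raises an exception part way through the block:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test')
    try:
        with open(path, 'w') as f:
            f.write('partial output')
            raise RuntimeError("something went wrong mid-write")
    except RuntimeError:
        # The exception still reaches us, but only after the file was closed
        closed_after_error = f.closed

    # Closing flushed the buffer, so the data written so far is on disk
    with open(path) as f:
        contents = f.read()

print(closed_after_error, contents)
```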

But it's also possible to create your own implementations. In order to be able to use a with statement in your own code you can create a context manager class which implements both __enter__() and __exit__() methods (see PEP-343 for details), or more simply use the contextlib module from the standard library. A good example is provided on Stack Overflow (http://stackoverflow.com/questions/3012488/what-is-the-python-with-statement-designed-for):


In [ ]:
from contextlib import contextmanager
import os

@contextmanager
def working_directory(path):
    current_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(current_dir)
        
with working_directory("/"): 
    pass
    # do something within the root directory
    
# here I am back again in the original working directory
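
For comparison, here is a rough sketch of the class-based approach, using the `__enter__` and `__exit__` methods described in PEP-343. The `Timer` class is a made-up example and not part of any library:

```python
import time

class Timer(object):
    """Hypothetical example: measure how long the with block takes."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self  # this is what 'as' binds to

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False  # do not swallow exceptions raised in the block

with Timer() as timer:
    total = sum(range(1000))

print(total, timer.elapsed >= 0)
```

Returning a false value from `__exit__` lets any exception raised inside the block propagate as normal, which is usually what you want.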