Python hints and tricks

I’ve put together a collection of nice tricks and time savers that might help make your Python more Pythonic!

In no particular order...

Use list comprehensions

These one-line constructs make creating list objects trivially easy, e.g.


In [1]:
my_list = [ x**2 for x in range(100) ]

print(my_list[2])


4

For the more adventurous, it’s also possible to include conditions and nested comprehensions, but don’t overdo it: I’ve seen five-line comprehensions before and it’s not pretty!


In [2]:
my_constrained_list = [ x**2 for x in range(100) if x%2 == 0]

print(my_constrained_list[2])


16
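
As a rough sketch of the nested form mentioned above (the names matrix, flattened and even_values are purely illustrative), a comprehension with two for clauses can flatten a list of lists. The clauses read left to right in the same order as the equivalent nested for loops:

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# The outer loop comes first, then the inner loop, as in nested for statements
flattened = [value for row in matrix for value in row]

# A condition can still be added at the end to filter the values
even_values = [value for row in matrix for value in row if value % 2 == 0]

print(flattened)    # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(even_values)  # [2, 4, 6, 8]
```

Anything much longer than this is usually easier to read as ordinary loops.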

Know when not to use list comprehensions - using generators instead

Generators behave as iterators whose values are produced lazily. That is, the resulting values are not all evaluated and stored in memory at the point of declaration (as they are for a list comprehension); instead, each value is computed only when the next item is requested.

For cases where the values are only iterated over once, or where the full sequence would be too large to store in memory, the benefits are obvious. It is easy to define functions which act as generators (using yield), but you can also use a ‘generator expression’, which is almost identical to a list comprehension except that it uses parentheses, e.g.


In [3]:
my_gen = ( x**2 for x in range(10**9) )

Note, however, that you can't index a generator directly; you can only iterate over it (which makes sense, since the values don't exist until they are requested):


In [4]:
total = 0
for val in my_gen:
    total += val
    if val > 10**4:
        break

print(total)
# print(my_gen[4]) <-- This won't work: generators don't support indexing


348551
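
The function form of a generator mentioned above might look something like the following. This is only a minimal sketch, and squares_up_to is a made-up helper rather than anything from a library:

```python
def squares_up_to(limit):
    """Yield successive squares for as long as they do not exceed limit."""
    x = 0
    while x**2 <= limit:
        yield x**2  # execution pauses here until the next value is requested
        x += 1

gen = squares_up_to(50)

print(next(gen))  # 0
print(list(gen))  # the remaining values: [1, 4, 9, 16, 25, 36, 49]
print(list(gen))  # the generator is now exhausted: []
```

Note that a generator can only be consumed once; call the function again if you need a fresh one.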

Dictionary comprehensions

Dictionaries are a very useful construct in Python. You can write one out literally by specifying each key: value pair, e.g.


In [19]:
my_dict = { 'Book' : 5, 'Monkey': 7, 'Paper': 23.4 }

print(my_dict['Monkey'])


7

Or build them from any iterable using a dictionary comprehension:


In [5]:
my_new_dict = { x: x**2 for x in range(100) }

print(my_new_dict[5])


25

Functions / classes as dictionary values

It may not be immediately obvious to new Python programmers, but because classes and functions are first-class objects it is trivially easy to store them in lists, or even dictionaries. (One great example of this is an implementation of the strategy pattern using dictionaries.)


In [6]:
def nuts():
    return 'Peanuts'

def cheese():
    return 'Edam'

feed = {'Monkey': nuts, 'Mouse': cheese}

my_food = feed['Mouse']()

print(my_food)


Edam
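
To make the strategy pattern remark a little more concrete, here is one possible sketch. The functions mean, midrange and summarise, and the strategies dictionary, are all invented for illustration:

```python
def mean(values):
    return sum(values) / len(values)

def midrange(values):
    return (min(values) + max(values)) / 2

# Each key names a strategy; each value is the function implementing it.
# Built-ins such as max can be stored in exactly the same way.
strategies = {'mean': mean, 'midrange': midrange, 'max': max}

def summarise(values, method='mean'):
    """Look up the requested strategy and apply it to values."""
    try:
        strategy = strategies[method]
    except KeyError:
        raise ValueError("Unknown method: {}".format(method))
    return strategy(values)

data = [1, 2, 3, 10]
print(summarise(data))              # 4.0
print(summarise(data, 'midrange'))  # 5.5
print(summarise(data, 'max'))       # 10
```

Adding a new behaviour then only means adding an entry to the dictionary, rather than another branch to an if/elif chain.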

The 'map' function

This function makes it really easy to apply an operation to every item in any collection of objects, e.g.


In [17]:
def calculate_squares(x):
    return x**2

squares = map(calculate_squares, range(20))

In Python 3 it returns a lazy iterator which applies the given function to each item of the input (which may be any form of iterable); wrap it in list() if you need all of the results at once.
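
A short sketch of what that laziness means in practice (assuming Python 3; calculate_squares is repeated here so the snippet runs on its own):

```python
def calculate_squares(x):
    return x**2

squares = map(calculate_squares, range(5))

print(list(squares))  # consumes the iterator: [0, 1, 4, 9, 16]
print(list(squares))  # already exhausted: []

# map also accepts several iterables, passing one item from each per call
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
print(sums)  # [11, 22, 33]
```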

Parallel map

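It's just as easy to map a function across all the cores in your machine. Note that the function must be importable by the worker processes, so in a script it is safest to define it at module level and create the Pool under an if __name__ == '__main__': guard.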


In [18]:
from multiprocessing import Pool

# The with block shuts the worker processes down once the map has finished
with Pool() as pool:
    squares = pool.map(calculate_squares, range(20))

Unpacking arguments

It is possible to unpack a list into a function call as positional arguments, e.g.


In [8]:
def example_function(food_type, amount, colour=''):
    print("Type: {}".format(food_type))
    print("Amount: {}".format(amount))
    print("Colour: {}".format(colour))
    
arguments = ['Nuts', 50]
example_function(*arguments) # unpacks my list into positional arguments


Type: Nuts
Amount: 50
Colour: 

or, unpacking dictionaries for keyword arguments:


In [9]:
arg_dict = {'colour': 'Blue'}
example_function('Cheese', 2, **arg_dict)


Type: Cheese
Amount: 2
Colour: Blue

or, both:


In [10]:
example_function(*arguments, **arg_dict)


Type: Nuts
Amount: 50
Colour: Blue

You can even unpack NumPy arrays! Note that the order matters for positional arguments, but not for keyword ones.
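
The point about ordering can be checked with a small sketch. The function describe is hypothetical and simply returns its arguments so that the effect is visible:

```python
def describe(food_type, amount, colour='', shape=''):
    return (food_type, amount, colour, shape)

# Positional unpacking follows the order of the sequence, so swapping the
# items swaps the arguments
print(describe(*['Nuts', 50]))  # ('Nuts', 50, '', '')
print(describe(*(50, 'Nuts')))  # (50, 'Nuts', '', '')

# Keyword unpacking matches on names, so the dictionary order is irrelevant
first = describe('Cheese', 2, **{'colour': 'Blue', 'shape': 'Round'})
second = describe('Cheese', 2, **{'shape': 'Round', 'colour': 'Blue'})
print(first == second)  # True
```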

Unpacking return values

It’s also possible to unpack return values of a function directly:


In [11]:
import numpy as np

def moments(x):
    """Return the first three moments of a (normal) distribution"""
    return 1, np.mean(x), np.std(x)

print(moments(np.arange(10)))

first, second, third = moments(np.arange(10))

print(third)


(1, 4.5, 2.8722813232690143)
2.87228132327

A great example combining this and the previous trick is swapping two values without a temporary variable, e.g.:


In [12]:
a, b = 5, 10

print(a, b)

a, b = b, a

print(a, b)


5 10
10 5

For (almost) any numerical work use NumPy!

NumPy is a numerical library with very fast array and linear algebra operations and a number of extremely useful constructs. See http://www.numpy.org/.

Chained comparisons

It is really easy to chain (ternary) comparisons together in an intuitive way, e.g.


In [13]:
def five():
    print("5 being called")
    return 5

def six():
    print("6 being called")
    return 6

if 1 < five() < six():
    print(True)

if 1 > five() > six():
    print(True)


5 being called
6 being called
True
5 being called

Note that the function five() only gets evaluated once per chain, and the chain still short-circuits: if the first comparison fails, six() is never called.

Conditional assignment

Though often frowned upon, a conditional expression is actually very readable in Python:


In [16]:
test = 'Yes' if 1 < five() < six() else 'No'

print('Did my test pass?: {}'.format(test))


5 being called
6 being called
Did my test pass?: Yes

Advanced indexing

There are a number of ways of indexing lists which you may not have been aware of:

  • You can count backwards, e.g. access the last element in a list using my_list[-1]
  • You can reverse a list using my_list[::-1].
  • The above is just a special case of setting an increment e.g. my_list[::2] gives a step of 2.
  • All of the above work on strings!
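
The indexing tricks listed above can be tried out with something like the following (short_list and word are just illustrative names):

```python
short_list = [x**2 for x in range(10)]  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

print(short_list[-1])     # last element: 81
print(short_list[::-1])   # reversed copy: [81, 64, 49, 36, 25, 16, 9, 4, 1, 0]
print(short_list[::2])    # every second element: [0, 4, 16, 36, 64]
print(short_list[1:8:3])  # start at 1, stop before 8, step 3: [1, 16, 49]

word = 'Python'
print(word[-1], word[::-1], word[::2])  # n nohtyP Pto
```

Slicing always returns a new object, so the original list or string is left unchanged.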

Using enumerate

The function enumerate yields a counter alongside each item it iterates over, which can be very useful if you need the index of an item as well as the item itself, e.g.


In [20]:
for i, x in enumerate(my_list[:5]):
    print(i, x)


0 0
1 1
2 4
3 9
4 16

Default dictionary values

In order to avoid having to catch KeyErrors every time you query a dictionary you can use the get method to provide a default value if the key is not present.


In [21]:
val = 0
try:
    val = my_dict[101]
except KeyError:
    print("That key of the dictionary doesn't exist")

print(my_dict.get(101, 4))


That key of the dictionary doesn't exist
4

Running external processes

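It's really straightforward to call another process in Python. Here call returns the exit code of the process, while check_output returns whatever it wrote to standard output (as bytes):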


In [14]:
from subprocess import check_output, call

call('ls')


Out[14]:
0

In [15]:
check_output(['ls', '-l'])


Out[15]:
b'total 10080\n-rwxr-xr-x@ 1 watson-parris  staff   693625  5 Dec 13:28 cartopy_intro.ipynb\n-rw-r--r--@ 1 watson-parris  staff  2577382  7 Dec 16:47 cis_introduction.ipynb\n-rw-r--r--@ 1 watson-parris  staff   110189  7 Dec 16:45 col_output.nc\n-rw-r--r--@ 1 watson-parris  staff    21233  8 Dec 16:03 hints_and_tricks.ipynb\n-rw-r--r--@ 1 watson-parris  staff   510254  6 Dec 08:16 iris_short_intro.ipynb\n-rwxr-xr-x@ 1 watson-parris  staff   661244  6 Dec 08:07 matplotlib_intro.ipynb\n-rw-r--r--@ 1 watson-parris  staff    53525  6 Dec 08:15 numpy_intro.ipynb\n-rw-r--r--@ 1 watson-parris  staff    14432  8 Dec 12:05 object_oriented_programming.ipynb\n-rw-r--r--@ 1 watson-parris  staff   157963  8 Dec 09:45 optimisation.ipynb\n-rw-r--r--@ 1 watson-parris  staff   312146  8 Dec 15:42 pandas_introduction.ipynb\n-rw-r--r--@ 1 watson-parris  staff    24031  6 Dec 10:40 python_introduction.ipynb\n'
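
On Python 3.5 and later, subprocess.run covers both of the cases above in a single call. The following is a minimal sketch; it runs the current Python interpreter (via sys.executable) purely so that the example does not depend on ls being available:

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, '-c', 'print("hello from a child process")'],
    stdout=subprocess.PIPE,   # capture standard output
    universal_newlines=True,  # decode the output to str rather than bytes
    check=True,               # raise CalledProcessError on a non-zero exit
)

print(result.returncode)      # 0
print(result.stdout.strip())  # hello from a child process
```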

Returning to default dictionary values: there is also a defaultdict class in the collections module which supplies default values for missing keys, or you can use my_dict.setdefault to set a default on a standard dict. There are some subtle differences in when the default is created and stored, though, and some code might expect a KeyError, so take care with this one.
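
Those subtle differences are easiest to see side by side. A small sketch, with the dictionaries plain and counts made up for the purpose:

```python
from collections import defaultdict

plain = {'Monkey': 7}

print(plain.get('Mouse', 0))         # 0, and 'Mouse' is NOT added
print('Mouse' in plain)              # False

print(plain.setdefault('Mouse', 0))  # 0, and 'Mouse' IS added
print('Mouse' in plain)              # True

counts = defaultdict(int)            # missing keys default to int() == 0
counts['Monkey'] += 1
print(counts['Mouse'])               # 0, but merely reading the key stores it
print(sorted(counts))                # ['Monkey', 'Mouse']
print(counts.get('Rat'))             # None: get() bypasses the default factory
```

In other words, get never modifies the dictionary, whereas setdefault and indexing a defaultdict both insert the missing key.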

Named formatting

You may have noticed I've been using positional placeholders ({}) to fill in values. This is probably fine when there is only one value, and it works when there are more, but it's probably best to use named placeholders, e.g.:


In [23]:
print("The {foo} is {bar}".format(foo='answer', bar=42))

# Note that you can also unpack a dict into format!

words = {'foo': 'answer', 'bar': '7x6'}
print("The {foo} is {bar}".format(**words))


The answer is 42
The answer is 7x6

Classes can be created at run-time

This one is definitely not for the faint-hearted. Because classes are first-class objects in Python it is possible to define them at run-time, e.g. within if statements or even functions. Use with care!


In [24]:
x = 6

if x < five():
    class test(object):
        def number(self):
            return x
else:
    class test(object):
        def number(self):
            return 5
        
print(test().number())


5 being called
5
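
The "within functions" case might look like the sketch below, where a function builds and returns a new class each time it is called. The names make_counter_class, Counter, ByTwo and ByTen are all hypothetical:

```python
def make_counter_class(step):
    """Build and return a new class whose instances count in units of step."""
    class Counter(object):
        def __init__(self):
            self.value = 0

        def increment(self):
            self.value += step  # step is captured from the enclosing call
            return self.value

    return Counter

ByTwo = make_counter_class(2)
ByTen = make_counter_class(10)

counter = ByTwo()
counter.increment()
print(counter.increment())  # 4
print(ByTen().increment())  # 10
print(ByTwo is ByTen)       # False: each call creates a distinct class
```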

The with statement is your friend

The with statement is a bit like a try/finally block, but is intended for standard code flow rather than exception handling. For example, a really common use is with file handling:


In [ ]:
with open('test', 'w') as f:
    pass
    # do something

The ‘with’ statement doesn’t take care of the fact that the file may not exist, or other IO errors, but it does ensure that if an exception occurs in the ‘do something’ block then the file gets closed regardless. Obviously, this is most useful for IO, or network connections where you have to ensure some finally block is executed, but should be extendable to more general scenarios.

But it's also possible to create your own implementations. In order to be able to use a with statement in your own code you can create a context manager class which implements both the __enter__() and __exit__() methods (see PEP 343 for details), or more simply use the contextlib module from the standard library. A good example is provided by StackOverflow (http://stackoverflow.com/questions/3012488/what-is-the-python-with-statement-designed-for):


In [ ]:
from contextlib import contextmanager
import os

@contextmanager
def working_directory(path):
    current_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(current_dir)
        
with working_directory("/"): 
    pass
    # do something within the root directory
    
# here I am back again in the original working directory
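
For comparison, the class-based alternative mentioned above could be sketched as follows. Timer is a made-up example class, not something provided by the standard library:

```python
import time

class Timer(object):
    """Context manager recording how long the with block takes."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self  # this is the object bound by 'as'

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs whether or not the block raised an exception
        self.elapsed = time.perf_counter() - self.start
        return False  # do not suppress any exception

with Timer() as timer:
    total = sum(range(1000))

print(total)  # 499500
print("Took {:.6f} seconds".format(timer.elapsed))
```

The contextlib version is shorter, but a class is handy when the context manager needs to keep state (such as elapsed here) that you want to inspect afterwards.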