This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com) for UW's [Astro 599](http://www.astro.washington.edu/users/vanderplas/Astr599/) course. Source and license info is on [GitHub](https://github.com/jakevdp/2013_fall_ASTR599/).

Advanced Python

Mishmash:

Classes, Exceptions, Iterators, and Generators

We have spent much of our time so far taking a look at scientific tools in Python. But a big part of using Python is an in-depth knowledge of the language itself. The topics here may not have direct science applications, but you'd be surprised when and where they can pop up as you use and write scientific Python code.

We'll dive a bit deeper into a few of these topics here.

Advanced Python: Outline

Here is what we plan to cover in this section:

  • Classes: defining your own objects
  • Exceptions: handling the unexpected
  • Iterators: sequences on-the-fly
  • Generator Expressions: the sky's the limit

Classes

Python can be used as an object-oriented language. An object is an entity that encapsulates data, called attributes, and functionality, called methods.

Everything in Python is an object. Take for example the complex object:


In [2]:
z = 1 + 2j

In [3]:
# The type function allows us to inspect the object type
type(z)


Out[3]:
complex

In [4]:
# "calling" an object type is akin to constructing an object
z = complex(1, 2)

In [5]:
# z has real and imaginary attributes
print z.real
print z.imag


1.0
2.0

In [6]:
# z has methods to operate on these attributes
print z.conjugate()


(1-2j)

Every data container you see in Python is an object, from integers and floats to lists to numpy arrays.

Classes: creating your own objects

Here we'll show a quick example of spinning our own complex-like object, using a class.

Class definitions look like this:


In [7]:
class MyClass(object):
    # attributes and methods are defined here
    pass

In [8]:
# create a MyClass "instance" named m
m = MyClass()
print type(m)


<class '__main__.MyClass'>

Class Initialization

Things get a bit more interesting when we define the __init__ method:


In [9]:
class MyClass(object):
    def __init__(self):
        print self
        print "initialization called"
        pass
    
m = MyClass()


<__main__.MyClass object at 0x10349d290>
initialization called

In [10]:
print m


<__main__.MyClass object at 0x10349d290>

The first argument of __init__() points to the object itself, and is usually called self by convention.

Note above that when we print self and when we print m, we see that they point to the same thing: self is m.
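To make the role of self concrete, here is a minimal sketch using a hypothetical Greeter class (not part of the original notebook). Calling a method on an instance passes that instance as self automatically, which is the same as calling the method on the class and passing the instance by hand. The snippet calls print() with a single argument so that it runs on both Python 2 and Python 3.

```python
class Greeter(object):
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "hello, " + self.name

g = Greeter("world")

# Calling the method on the instance passes g as `self` automatically...
via_instance = g.greet()

# ...which is equivalent to calling it on the class and passing g explicitly.
via_class = Greeter.greet(g)

print(via_instance)
print(via_class)
```

Both lines print the same greeting, because both calls run the same function with the same self.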

A more interesting initialization

We can use the __init__() method to accept arguments at initialization. Here we'll allow the user to pass a value to the initialization, which is saved as an attribute of the instance:


In [11]:
class MyClass(object):
    def __init__(self, value):
        self.value = value

In [12]:
m = MyClass(5.0)  # note: the self argument is always implied

print m.value


5.0

Adding some methods

Now let's add a squared() method, which returns the square of the value:


In [13]:
class MyClass(object):
    def __init__(self, value):
        self.value = value
        
    def squared(self):
        return self.value ** 2

In [14]:
m = MyClass(5)
print m.squared()


25

Methods act just like functions: they can have any number of arguments or keyword arguments, they can accept *args and **kwargs arguments, and can call other methods or functions.
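As a rough illustration, the hypothetical Scaler class below (the class and method names are invented for this sketch) has one method with a keyword argument and another that accepts *args and **kwargs and calls the first. It is written to run on both Python 2 and Python 3.

```python
class Scaler(object):
    def __init__(self, value):
        self.value = value

    def scaled(self, factor=1.0):
        # keyword argument with a default value
        return self.value * factor

    def scaled_sum(self, *args, **kwargs):
        # *args collects extra positional arguments into a tuple;
        # **kwargs collects extra keyword arguments into a dict.
        # This method also calls another method, self.scaled().
        factor = kwargs.get("factor", 1.0)
        return self.scaled(factor) + sum(a * factor for a in args)

s = Scaler(5)
print(s.scaled())                    # 5 * 1.0
print(s.scaled(factor=2))            # 5 * 2
print(s.scaled_sum(1, 2, factor=2))  # 5*2 + (1 + 2)*2
```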

Other special methods

There are numerous special methods, indicated by double underscores. One important one is the __repr__ method, which controls how an instance of the class is represented when it is output:


In [15]:
class MyClass(object):
    def __init__(self, value):
        self.value = value
        
    def squared(self):
        return self.value ** 2
    
    def __repr__(self):
        return "MyClass(value=" + str(self.value) + ")"

In [16]:
m = MyClass(10)
print m
print type(m)


MyClass(value=10)
<class '__main__.MyClass'>

Other special methods

Other special methods to be aware of:

  • String representations and hashing: __str__, __repr__, __hash__, etc.
  • Arithmetic: __add__, __sub__, __mul__, __div__, etc.
  • Item access: __getitem__, __setitem__, etc.
  • Attribute Access: __getattr__, __setattr__, etc.
  • Comparison: __eq__, __lt__, __gt__, etc.
  • Constructors/Destructors: __new__, __init__, __del__, etc.
  • Type Conversion: __int__, __long__, __float__, etc.

For a nice discussion and explanation of these and many other special double-underscore methods, see http://www.rafekettler.com/magicmethods.html
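As a small sketch of how a few of these hooks fit together, the hypothetical Polynomial class below (invented for illustration) implements __repr__, __len__, __getitem__ and __eq__, so that print, len(), indexing and == all work on its instances. It is written to run on both Python 2 and Python 3.

```python
class Polynomial(object):
    """Hypothetical example: coefficients stored lowest order first."""
    def __init__(self, *coeffs):
        self.coeffs = list(coeffs)

    def __repr__(self):
        # used when the object is printed or displayed
        return "Polynomial(" + ", ".join(str(c) for c in self.coeffs) + ")"

    def __len__(self):
        # called by len(p)
        return len(self.coeffs)

    def __getitem__(self, i):
        # called by p[i]
        return self.coeffs[i]

    def __eq__(self, other):
        # called by p == q
        return self.coeffs == other.coeffs

p = Polynomial(1, 0, 3)   # represents 1 + 0*x + 3*x**2
print(p)
print(len(p))
print(p[2])
print(p == Polynomial(1, 0, 3))
```

Each operator or built-in simply looks up the matching double-underscore method on the object's class.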

Exercise: A Custom Complex Object

Create a class MyComplex which behaves like the built-in complex numbers. You should be able to execute the following code and see these results:

>>> z = MyComplex(2, 3)
>>> print z
(2, 3j)
>>> print z.real
2
>>> print z.imag
3
>>> print z.conjugate()
(2, -3j)
>>> print type(z.conjugate())
<class '__main__.MyComplex'>

Note that the conjugate() method should return a new object of type MyComplex.


In [16]:

If you finish this quickly, search online for help on defining the __add__ method such that you can compute:

>>> z + z.conjugate()
(4, 0j)

In [16]:

Exceptions

Handling the Unexpected

Sometimes things go wrong in your code, and this is where exceptions come in. For example, you may have illegal inputs to an operation:


In [17]:
0/0


---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-17-6549dea6d1ae> in <module>()
----> 1 0/0

ZeroDivisionError: integer division or modulo by zero

Or you may call a function with the wrong number of arguments:


In [18]:
from math import sqrt
sqrt(2, 3)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-80e335c13035> in <module>()
      1 from math import sqrt
----> 2 sqrt(2, 3)

TypeError: sqrt() takes exactly one argument (2 given)

Or you may choose an index that is out of range:


In [19]:
L = [4, 5, 6]
L[100]


---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-19-25e9eefaf101> in <module>()
      1 L = [4, 5, 6]
----> 2 L[100]

IndexError: list index out of range

Or a dictionary key that doesn't exist:


In [20]:
D = {'a':2, 'b':300}
print D['Q']


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-20-3e70854ae47f> in <module>()
      1 D = {'a':2, 'b':300}
----> 2 print D['Q']

KeyError: 'Q'

Or the wrong value for a conversion function:


In [21]:
x = int('ABC')


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-3cb6d4b5dc55> in <module>()
----> 1 x = int('ABC')

ValueError: invalid literal for int() with base 10: 'ABC'

These are known as Exceptions, and handling them appropriately is a big part of writing usable code.

Handling exceptions: try...except

Exceptions can be handled using the try and except statements:


In [22]:
def try_division(value):
    try:
        x = value / value
        return x
    except ZeroDivisionError:
        return 'Not A Number'
    
print try_division(1)
print try_division(0)


1
Not A Number

In [23]:
def get_an_int():
    while True:
        try:
            x = int(raw_input("Enter an integer: "))
            print "  >> Thank you!"
            break
        except ValueError:
            print "  >> Boo.  That's not an integer."
    return x

get_an_int()


Enter an integer: R2D2
  >> Boo.  That's not an integer.
Enter an integer: C3PO
  >> Boo.  That's not an integer.
Enter an integer: 55
  >> Thank you!
Out[23]:
55

Advanced Exception Handling

Other things to be aware of:

  • you may use multiple except statements for different exception types
  • else and finally statements can fine-tune the exception handling

More information is available in the Python documentation and in the scipy lectures.
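The two points above can be sketched in one function. safe_ratio is a hypothetical helper invented for this example; it records which branches ran so the control flow is visible. It is written to run on both Python 2 and Python 3.

```python
def safe_ratio(a, b):
    """Hypothetical helper: return (a / b, log), with None if it fails."""
    log = []
    try:
        result = float(a) / float(b)
    except ZeroDivisionError:
        # handles division by zero only
        log.append("division by zero")
        result = None
    except (TypeError, ValueError):
        # one except clause can handle several exception types
        log.append("bad input")
        result = None
    else:
        # runs only if the try block raised no exception
        log.append("ok")
    finally:
        # runs no matter what happened above
        log.append("done")
    return result, log

print(safe_ratio(1, 2))
print(safe_ratio(1, 0))
print(safe_ratio(1, 'ABC'))
```

The else branch keeps the "success" code out of the try block, so only the line that can actually fail is guarded.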

Raising your own exceptions

In addition to handling exceptions, you can also raise your own exceptions using the raise keyword:


In [24]:
def laugh(N):
    if N < 0:
        raise ValueError("N must be positive")
    return N * "ha! "

In [25]:
laugh(10)


Out[25]:
'ha! ha! ha! ha! ha! ha! ha! ha! ha! ha! '

In [26]:
laugh(-4)


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-8f794d3f8feb> in <module>()
----> 1 laugh(-4)

<ipython-input-24-11268c7129b2> in laugh(N)
      1 def laugh(N):
      2     if N < 0:
----> 3         raise ValueError("N must be positive")
      4     return N * "ha! "

ValueError: N must be positive

Custom Exceptions

For your own projects, you may want to define custom exception types, which is done through class inheritance.

The important point to note here is that exceptions themselves are classes:


In [27]:
v = ValueError("message")
print type(v)


<type 'exceptions.ValueError'>

When you raise an exception, you are creating an instance of the exception type, and passing it to the raise keyword:


In [28]:
raise ValueError("error message")


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-c623d04e0b6c> in <module>()
----> 1 raise ValueError("error message")

ValueError: error message

Later in the quarter we'll dive into object-oriented programming, but here's a quick preview of the principle of inheritance, in which new classes are derived from existing classes:


In [29]:
# define a custom exception, inheriting from the base class Exception
class CustomException(Exception):
    # can define custom behavior here
    pass

raise CustomException("error message")


---------------------------------------------------------------------------
CustomException                           Traceback (most recent call last)
<ipython-input-29-06445ec64aea> in <module>()
      4     pass
      5 
----> 6 raise CustomException("error message")

CustomException: error message
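As a sketch of what "custom behavior" might look like, the hypothetical OutOfRangeError below (the exception and the check function are invented for this example) stores extra attributes on the exception and is then caught with try...except. It is written to run on both Python 2 and Python 3.

```python
class OutOfRangeError(ValueError):
    """Hypothetical custom exception that records the offending value."""
    def __init__(self, value, limit):
        # build the message, then let the parent class store it
        message = "value %r exceeds limit %r" % (value, limit)
        ValueError.__init__(self, message)
        self.value = value
        self.limit = limit

def check(value, limit=10):
    if value > limit:
        raise OutOfRangeError(value, limit)
    return value

try:
    check(42)
except OutOfRangeError as err:
    caught = err   # keep a reference for use after the except block
    print("caught: " + str(err))

# Because OutOfRangeError inherits from ValueError,
# an `except ValueError` clause catches it too.
try:
    check(42)
except ValueError:
    print("also caught as a ValueError")
```

Inheriting from a specific built-in type such as ValueError, rather than directly from Exception, lets existing handlers for that type keep working.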

In [29]:

Iterators

Iterators are a high-level concept in Python that allow the items of a collection to be examined one at a time.

We've seen a basic example of this in the for-loop:


In [30]:
for i in range(10):
    print i


0
1
2
3
4
5
6
7
8
9

One weakness here, though, is that (in Python 2.x) the range() function actually constructs a list, which is then iterated through.

So, if we were to do something like

for i in range(100000000):
    print i

then before anything in the loop is executed, Python would first construct a list containing 100 million integers: on a typical 64-bit machine that's on the order of a few gigabytes of memory!

Fortunately, Python provides iterators: objects that look and act like a list, but generate the items on-the-fly.

The iterator equivalent of range() is the function xrange(). (Note that in Python 3, range() itself returns a lazy range object rather than a list, and xrange() no longer exists.)


In [31]:
print range(10)
print xrange(10)


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
xrange(10)

In [32]:
for i in range(10):
    print i,
print

for i in xrange(10):
    print i,
print


0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9

In [33]:
import sys
print sys.getsizeof(range(1000))
print sys.getsizeof(xrange(1000))


8072
40
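Under the hood, a for-loop works with any object that follows the iterator protocol: iter() obtains an iterator, and next() is called repeatedly until StopIteration is raised. The hypothetical Countdown class below (invented for this sketch) implements that protocol by hand. The method is spelled next in Python 2 and __next__ in Python 3, so the sketch defines both names.

```python
class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # an iterator returns itself from __iter__
        return self

    def __next__(self):
        # produce the next item, or signal the end with StopIteration
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__  # Python 2 spells this method `next`

for i in Countdown(3):
    print(i)

# the same steps a for-loop performs, done by hand on a list
it = iter([4, 5])
print(next(it))
print(next(it))
```

Only the current count is stored, so the memory used does not grow with the length of the sequence.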

Handy Iterators to know about


In [34]:
D = {'a':0, 'b':1, 'c':2}

print D.keys()
print D.iterkeys()


['a', 'c', 'b']
<dictionary-keyiterator object at 0x103494d60>

In [35]:
print D.values()
print D.itervalues()


[0, 2, 1]
<dictionary-valueiterator object at 0x103494e10>

In [36]:
print D.items()
print D.iteritems()


[('a', 0), ('c', 2), ('b', 1)]
<dictionary-itemiterator object at 0x103494ec0>

In [37]:
for key in D.iterkeys():
    print key


a
c
b

In [38]:
for key in D:
    print key


a
c
b

In [39]:
for key, val in D.iteritems():
    print key, val


a 0
c 2
b 1

itertools: more sophisticated iterations


In [40]:
import itertools
dir(itertools)


Out[40]:
['__doc__',
 '__file__',
 '__name__',
 '__package__',
 'chain',
 'combinations',
 'combinations_with_replacement',
 'compress',
 'count',
 'cycle',
 'dropwhile',
 'groupby',
 'ifilter',
 'ifilterfalse',
 'imap',
 'islice',
 'izip',
 'izip_longest',
 'permutations',
 'product',
 'repeat',
 'starmap',
 'takewhile',
 'tee']

In [41]:
for c in itertools.combinations([1, 2, 3, 4], 2):
    print c


(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)

In [42]:
for p in itertools.permutations([1, 2, 3]):
    print p


(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)

In [43]:
for val in itertools.chain(range(0, 4), range(-4, 0)):
    print val,


0 1 2 3 -4 -3 -2 -1

In [44]:
# zip: itertools.izip is an iterator equivalent
for val in zip([1, 2, 3], ['a', 'b', 'c']):
    print val


(1, 'a')
(2, 'b')
(3, 'c')
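Because itertools functions are lazy, they can even work with sequences that never end. As a brief sketch, itertools.count() produces integers forever and itertools.islice() takes a finite slice of any iterator; itertools.product() yields every pairing of items from its inputs. These functions exist under the same names in Python 2 and Python 3.

```python
import itertools

# count() never terminates on its own; islice() lazily takes a finite piece
evens = (2 * n for n in itertools.count())
first_five = list(itertools.islice(evens, 5))
print(first_five)

# product() yields every pairing of items from its inputs
pairs = list(itertools.product([0, 1], ['a', 'b']))
print(pairs)
```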

Quick Exercise:

Write a function count_pairs(N, m) which returns the number of pairs of numbers in the sequence $0 ... N-1$ whose sum is divisible by m.

For example, if N = 3 and m = 2, the pairs are

[(0, 1), (0, 2), (1, 2)]

The sum of each pair respectively is [1, 2, 3], and there is a single pair whose sum is divisible by 2, so the result is 1.

  1. What is the result for $(N,m) = (10, 2)$?
  2. What is the result for $(N,m) = (1000, 5)$?

From iterators to generators: the yield statement

Python provides a yield statement that allows you to make your own iterators. Technically the result is called a "generator object".

For example, here's one way you can create a generator that yields all even numbers in a sequence:


In [45]:
def select_evens(L):
    for value in L:
        if value % 2 == 0:
            yield value

In [46]:
for val in select_evens([1,2,5,3,6,4]):
    print val


2
6
4

The yield statement is like a return statement, but the iterator remembers where it is in the execution, and comes back to that point on the next pass.
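To see this pausing and resuming directly, the sketch below steps through a hypothetical generator by hand with next(), recording in a list (trace, a name invented here) which parts of the body have run. It is written to run on both Python 2 and Python 3.

```python
trace = []

def two_steps():
    trace.append("start")
    yield 1
    trace.append("resumed")
    yield 2
    trace.append("end")

g = two_steps()      # creating the generator runs none of the body
print(trace)         # still empty
print(next(g))       # runs the body up to the first yield
print(trace)
print(next(g))       # resumes right after the first yield
print(trace)
# a third next(g) would run the rest of the body and raise StopIteration
```

A for-loop does exactly these next() calls for you and stops quietly when StopIteration is raised.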

Example: Primes via the Sieve of Eratosthenes

As an example of the power of generators, we'll create a generator object that yields the first $N$ prime numbers. The approach is a simple relative of the Sieve of Eratosthenes: each candidate is tested for divisibility by the primes found so far.


In [47]:
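def iter_primes(Nprimes):
    N = 2
    found_primes = []
    # keep going until Nprimes primes have been yielded
    # (yields nothing if Nprimes <= 0)
    while len(found_primes) < Nprimes:
        # N is prime if no previously found prime divides it
        if all(N % p != 0 for p in found_primes):
            found_primes.append(N)
            yield N
        N += 1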

In [48]:
# Find the first twenty primes
for N in iter_primes(20):
    print N,


2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71
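For comparison, here is a sketch of the classical Sieve of Eratosthenes written as a generator. Note the different interface: this hypothetical sieve_primes function (the name is invented here) yields all primes below a limit rather than the first N primes, because the sieve needs a fixed-size table to cross multiples off. It is written to run on both Python 2 and Python 3.

```python
def sieve_primes(limit):
    """Yield all primes below `limit` (hypothetical helper name)."""
    is_prime = [True] * limit
    for n in range(2, limit):
        if is_prime[n]:
            yield n
            # cross off every multiple of n, starting at n*n;
            # smaller multiples were crossed off by smaller primes
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False

print(list(sieve_primes(30)))
```

The sieve replaces repeated division with a table lookup, at the cost of memory proportional to the limit.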

Generator Expressions: Generators on the fly

We previously saw that list comprehensions can create lists succinctly in one line. Compare an explicit loop with the equivalent comprehension:


In [49]:
L = []
for i in range(20):
    if i % 3 > 0:
        L.append(i)
L


Out[49]:
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]

In [50]:
# or, as a list comprehension
[i for i in range(20) if i % 3 > 0]


Out[50]:
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]

The corresponding construction of an iterator is known as a "generator expression". First, here is the same sequence written as a generator function:


In [51]:
def genfunc():
    for i in range(20):
        if i % 3 > 0:
            yield i
print genfunc()
print list(genfunc())  # convert iterator to list


<generator object genfunc at 0x103432fa0>
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]

In [52]:
# or, equivalently, as a "generator expression"
g = (i for i in range(20) if i % 3 > 0)
print g
print list(g) # convert generator expression to list


<generator object <genexpr> at 0x103432c80>
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]

The syntax is identical to that of list comprehensions, except we surround the expression with () rather than with []. Again, this may seem a bit specialized, but it allows some extremely powerful constructions in Python, and it's one of the features of Python that some people get very excited about.
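As a rough sketch of why this matters, the example below compares a list comprehension with the matching generator expression: the list holds every item in memory, while the generator produces one item at a time and is used up after a single pass. The variable names are invented for this example, and the code is written to run on both Python 2 and Python 3.

```python
import sys

N = 100000

# list comprehension: builds all N squares in memory first
as_list = [i ** 2 for i in range(N)]

# generator expression: produces one square at a time
as_gen = (i ** 2 for i in range(N))

# the list object is far larger than the generator object
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))

# when a generator expression is the only argument, the extra () can be dropped
total = sum(i ** 2 for i in range(N))
print(total == sum(as_list))

# generators are consumed as they are used: a second pass yields nothing
print(sum(as_gen) == total)
print(list(as_gen))
```

The last line prints an empty list, which is the main trap to watch for: if the values are needed more than once, build a list or recreate the generator.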