A Generator is a type of iterator which can be constructed natively inside a Python `def` function. A `def` function is a generator function if it contains any `yield` statements, and the immediate return value of calling such a function is a generator iterator, or just a generator.
A generator doesn't execute all at once like a regular function does. In fact, calling a generator function doesn't execute any of that function's code immediately, whereas calling a regular function executes all of its code immediately until it returns. Whenever a generator is run, it runs until the end of the function OR until the next `yield` statement. At that point, the value of the `yield` expression is returned to the caller, and the execution of the generator is suspended until it is next resumed.
While execution is suspended, the full local state of the generator function's execution is stored in memory. This allows it to be resumed later, as if it had never paused.
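The standard library can report this lifecycle directly. Below is a minimal sketch (the `pauser()` generator is invented purely for illustration) that uses `inspect.getgeneratorstate()` to observe a generator being created, suspended, and closed:
import inspect

def pauser():
    yield 'first'
    yield 'second'

gen = pauser()
print(inspect.getgeneratorstate(gen))  # GEN_CREATED: no code has run yet.
next(gen)
print(inspect.getgeneratorstate(gen))  # GEN_SUSPENDED: paused at the first `yield`.
list(gen)                              # Exhaust the generator.
print(inspect.getgeneratorstate(gen))  # GEN_CLOSED: the function has returned.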
A generator is most commonly used as a normal iterator. This means that it can be automatically iterated over with a for-loop, or manually iterated over with `next()`. It can also be passed to any function that expects an iterator or an iterable. `next()` returns the value of the next `yield` statement the generator reaches, or raises `StopIteration` if the generator function completes.
In [1]:
def range_generator_function(stop):
    """Naive implementation of builtins.range generator."""
    # This function runs immediately, since it has no `yield` statements.
    # It is a normal function, which happens to return a generator iterator.
    print("Running line 1")
    if not isinstance(stop, int):
        raise TypeError('stop must be an int')
    if stop < 0:
        raise ValueError('stop must be >= 0')
    print("Running line 2")
    range_generator = _range_generator_function(stop=stop)
    print("Running line 3")
    return range_generator

def _range_generator_function(stop):
    # This function does not run immediately, since it has `yield` statements.
    # It is a generator function, and returns a generator iterator.
    index = 0
    print("Running line 4")
    while index < stop:
        print("Running line 5 with index", index)
        yield index
        print("Running line 6 with index", index)
        index += 1
    print("Running line 7 with index", index)
In [2]:
range_generator = range_generator_function(2) # Executes all prints in `range_generator_function()`,
range_generator # but none in `_range_generator_function()`.
Out[2]:
In [3]:
import collections
isinstance(range_generator, collections.Iterable), isinstance(range_generator, collections.Iterator)
Out[3]:
In [4]:
isinstance(range_generator, collections.Generator)
Out[4]:
In [5]:
next(range_generator)
Out[5]:
In [6]:
next(range_generator)
Out[6]:
In [7]:
import traceback
try:
    next(range_generator)
except StopIteration:
    traceback.print_exc()
In [8]:
next(range_generator, 2) # Generator is exhausted, nothing more will get printed.
Out[8]:
In [9]:
range_generator = range_generator_function(4)
for item in range_generator:
    print('yielded', item)
The above covers the majority of generator use cases. Most Python programmers have probably used generators this way, or at least encountered them while reading other Python code.
However, generators are more than just a special syntax for iterators. There is more to the generator protocol, though it tends to be less widely used and known.
Generators implement three extra methods besides the two (`__iter__()` and `__next__()`) that they need in order to be iterators. These are the `send()`, `throw()`, and `close()` methods. Together, they allow for even more control over how a generator executes its body when it resumes execution.
Generators haven't always had this functionality. It was added in Python 2.5, via PEP 342.
`send()` is a slight generalization of `__next__()`. It accepts a single value, and that value becomes the result of the `yield` expression as the generator resumes. The method returns the next value to be yielded, or raises `StopIteration` if the generator completes or returns. `__next__()` is actually equivalent to `send(None)`.
In [10]:
def generator_function():
    print((yield 0))
    print((yield 1))

generator = generator_function()
print(next(generator))  # Advance generator to first `yield` statement.
item = generator.send('print this')
print('yielded', item)
try:
    next(generator)  # Same as `generator.send(None)`
except StopIteration:
    pass
`throw()` accepts an exception, and raises it inside the generator, at the `yield` where the generator is currently paused. If the generator manages to yield another value, that value will be returned from the `throw()` call. Otherwise, any exception raised out of the generator will be propagated out to the caller (the same is true for `send()` and `__next__()`).
In [11]:
import traceback
class ExpectedError(Exception): pass

def generator_function():
    for i in range(2):
        try:
            yield i
        except ExpectedError as exc:
            print('Caught exception', repr(exc))
            continue
        except Exception as exc:
            print('Did not catch exception', repr(exc))
            raise
    return i

generator = generator_function()
next(generator)
item = generator.throw(ExpectedError)
print('yielded', item)
try:
    generator.throw(KeyError('key'))
except KeyError:
    traceback.print_exc()
In [12]:
generator = generator_function()
next(generator)
item = generator.throw(ExpectedError)
print('yielded', item)
try:
    generator.throw(ExpectedError)
except StopIteration as exc:
    traceback.print_exc()
    print(repr(exc))
`close()` instructs the generator to stop yielding elements and exit. It is similar, though not identical, to `throw(GeneratorExit)`.
The `GeneratorExit` class is a subclass of `BaseException`, but is not a subclass of `Exception`. This makes it less likely that a generator function will catch and ignore it by mistake.
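You can verify that class relationship directly with the builtins (a quick check, not from the original notebook):
issubclass(GeneratorExit, BaseException), issubclass(GeneratorExit, Exception)  # (True, False)
This is why the bare `except:` clauses in the examples below catch `GeneratorExit`, while an `except Exception:` clause would not.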
`close()` normally returns `None`: either the generator function returns without any errors, or the `GeneratorExit` exception propagates out of the generator function. If the generator function raises a new exception while handling `GeneratorExit`, then that exception is raised by `close()`. It is illegal for the generator to yield a new value while handling `close()`, so doing so causes a `RuntimeError` to be raised instead.
In [13]:
def generator_function():
    try:
        yield
    except:
        traceback.print_exc()
        raise
    print('About to yield 1')
    yield 1

generator = generator_function()
next(generator)
generator.close()
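Since the generator above re-raises `GeneratorExit`, `close()` returns normally and the second `yield` is never reached. For contrast, here is a sketch (invented for illustration, not from the original notebook) of the illegal case described above, where the generator swallows `GeneratorExit` and tries to yield again, so `close()` raises `RuntimeError`:
def misbehaving_generator_function():
    try:
        yield 0
    except GeneratorExit:
        pass  # Swallow the exception instead of re-raising it...
    yield 1   # ...and illegally yield another value while closing.

generator = misbehaving_generator_function()
next(generator)
try:
    generator.close()
except RuntimeError:
    traceback.print_exc()  # RuntimeError: generator ignored GeneratorExit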
In [14]:
def generator_function():
    try:
        yield 0
    except:
        raise KeyError('key')

generator = generator_function()
next(generator)
try:
    generator.close()
except KeyError:
    traceback.print_exc()
When executing a normal function, it begins execution as soon as it is called, continues (without giving up control) until it completes (by returning or throwing), and cleans up its call stack when exiting.
A generator, on the other hand, gives up control every time it hits a `yield`, even if it isn't done executing yet. When it does so, the generator becomes suspended. So that its execution can be continued later, its `locals()` are saved on the generator object. Later, it can be resumed (by calling `__next__()`, `send()`, `throw()`, or `close()`), and its `locals()` will be restored before execution resumes from the `yield` point.
This has been implicitly assumed in the previous examples. For example, in the section The basics of generators, the generator function `_range_generator_function()` relies on this mechanism in order to remember the values of `stop` and `index` between steps. There will be a more advanced example of this in the section Generators as coroutines.
There are other clever ways in which this can be utilized. One great usage of generators is in the implementation of `@contextlib.contextmanager`. This decorator combines the power of generators with the power of the context manager protocol, in order to produce something quite powerful.
Code sample from https://github.com/python/cpython/blob/v3.6.1/Lib/contextlib.py
Copyright (c) 2001-2017 Python Software Foundation.
All Rights Reserved.
License: Python license, https://www.python.org/3.6/license.html
Some modifications made so as to only highlight the interesting parts.
def contextmanager(func):
    """@contextmanager decorator.

    Typical usage:

        @contextmanager
        def some_generator(<arguments>):
            <setup>
            try:
                yield <value>
            finally:
                <cleanup>

    This makes this:

        with some_generator(<arguments>) as <variable>:
            <body>

    equivalent to this:

        <setup>
        try:
            <variable> = <value>
            <body>
        finally:
            <cleanup>
    """
    @functools.wraps(func)
    def helper(*args, **kwds):
        return _GeneratorContextManager(func, args, kwds)
    return helper


class _GeneratorContextManager(ContextDecorator, AbstractContextManager):

    def __init__(self, func, args, kwds):
        self.gen = func(*args, **kwds)

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

    def __exit__(self, type, value, traceback):
        if type is None:
            try:
                next(self.gen)
            except StopIteration:
                return
            else:
                raise RuntimeError("generator didn't stop")
        else:
            try:
                self.gen.throw(type, value, traceback)
                raise RuntimeError("generator didn't stop after throw()")
            except StopIteration as exc:
                return exc is not value
The `@contextlib.contextmanager` decorator makes it really easy for any developer to create their own modular and reusable setup/cleanup logic, or enter/exit logic, without needing to create a new class with `__enter__()` and `__exit__()` methods. `@contextlib.contextmanager` handles that, with the help of the generator that the developer defines.
When `__enter__()` is called upon entering the block, `__next__()` is called in order to perform the setup logic, which is everything up until the `yield`. When the `yield` is encountered, the generator suspends its execution and saves its state, the `__enter__()` call returns, and the Python interpreter begins executing the block.
When `__exit__()` is called upon exiting the block, either `__next__()` or `throw()` will be called in order to perform the cleanup logic. Upon resuming the generator, its internal state is restored, and the cleanup can proceed as if it were a normal function.
When the cleanup logic is placed inside a `finally` block inside the generator function, we get an awesome combination of Python features (generators, context managers, and `finally` blocks) which guarantees that your cleanup code will always be executed whenever the block exits (assuming no bugs in `@contextlib.contextmanager`).
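To make this concrete, here is a small usage sketch (the `timer()` context manager is invented for illustration, not part of contextlib). The code before the `yield` is the setup, and the `finally` block is the cleanup, which runs even if the body raises:
import contextlib
import time

@contextlib.contextmanager
def timer(label):
    start = time.monotonic()  # <setup>
    try:
        yield start           # <value>, bound by the `as` clause
    finally:                  # <cleanup> always runs when the block exits
        print(label, 'took', time.monotonic() - start, 'seconds')

with timer('sleeping') as start:
    time.sleep(0.1)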
This shows how flexible the yielding execution model is. When a generator yields, the caller doesn't have to immediately resume the generator. It can go off and do something else, and then resume it at any point in the future. The really neat thing here is that the `yield` is essentially replaced with the execution of an arbitrary block of Python code. In future chapters, we'll look at other clever uses of suspending execution which can produce very powerful models of program execution.
With the full generator protocol, a generator is equivalent to a semicoroutine. It is also possible to build a custom dispatcher/trampoline, in order to implement a coroutine system. Reference: https://en.wikipedia.org/wiki/Coroutine#Comparison_with_generators. This is used in various concurrency libraries, and is demonstrated in David Beazley's PyCon 2015 talk "Python Concurrency From the Ground Up: LIVE!". See also the examples given in PEP 342.
Here's a silly example of a semicoroutine in action.
In [15]:
import collections

class StopAdder(Exception): pass

def adder_function():
    total = 0
    while True:
        print('At start of adder loop, current total is', total)
        try:
            integers = (yield total)
        except (Exception, GeneratorExit) as exc:
            print('Adder received exception', repr(exc), 'and is returning with final total', total)
            return total
        if not isinstance(integers, (list, tuple)):
            integers = [integers]
        if integers and isinstance(integers[0], collections.Iterable):
            integers = integers[0]
        print('Adder received', integers)
        total += sum(integers)

def send_values_into_adder(adder, *integers):
    print('Sending', integers, 'into adder')
    current_total = adder.send(integers)
    print('Current total in adder is', current_total)
    return current_total

adder = adder_function()
next(adder)
send_values_into_adder(adder)
print()
send_values_into_adder(adder, 10)
print()
send_values_into_adder(adder, 1, 2, 3)
print()
send_values_into_adder(adder, range(8))
print()
print('Sending StopAdder into adder')
try:
    adder.throw(StopAdder)
except StopIteration as exc:
    print('Final total from adder is', exc.value)
In the following example, the generator will yield once, and then raise `StopIteration` when it hits the `return` statement.
In [16]:
def generator_function():
    yield 0
    return

generator = generator_function()
next(generator)
Out[16]:
In [17]:
import traceback
try:
    next(generator)
except StopIteration:
    traceback.print_exc()
It might be thought that generators can only pass values out via `yield`, and will always have empty `return` statements. This was true up through Python 3.2, but it changed in Python 3.3. Now, just like normal functions, a generator function can return arbitrary values with `return` statements (but note that this is illegal in earlier versions of Python, and will cause a `SyntaxError`).
In [18]:
import traceback

def generator_function_that_returns_a_value():
    yield 0
    return 'return_value'

generator = generator_function_that_returns_a_value()
next(generator)
try:
    next(generator)
except StopIteration as exc:
    traceback.print_exc()
    print(repr(exc))
    print(repr(exc.value))
try:
    next(generator)
except StopIteration as exc:
    traceback.print_exc()  # Subsequent calls to `next()` do not use the return value
    print(repr(exc))       # when raising `StopIteration`.
    print(repr(exc.value))
Notice that the Python interpreter converted the `return 'return_value'` into a `raise StopIteration('return_value')`. Also notice that the exception object has a `.value` attribute, which holds this value.
This behavior is hidden when using for-loops or other functionality that catches and ignores `StopIteration`.
In [19]:
for item in generator_function_that_returns_a_value():
    print('yielded', item)
In [20]:
list(generator_function_that_returns_a_value())
Out[20]:
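Since Python 3.3, the idiomatic way to capture a generator's return value is the `yield from` expression, which delegates to an inner generator and then evaluates to that generator's return value. A quick sketch (the `delegator()` wrapper is invented for illustration):
def delegator():
    result = yield from generator_function_that_returns_a_value()
    print('inner generator returned', repr(result))

for item in delegator():
    print('yielded', item)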
Generators aren't required to be implemented using generator functions. If you really wanted to, you could also implement one yourself. Just make sure to define the `send()` and `throw()` methods correctly, and to correctly define `__next__()`, `__iter__()`, and `close()` in terms of those.
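As a sketch of what that entails (this mirrors, in simplified form, the default implementations that `collections.abc.Generator` provides; the mixin name is invented), the derived methods can be built on top of `send()` and `throw()`:
class GeneratorProtocolMixin:
    """Assumes the subclass defines send() and throw() correctly."""

    def __iter__(self):
        # A generator is its own iterator.
        return self

    def __next__(self):
        # Plain iteration is just send(None).
        return self.send(None)

    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            pass  # The generator exited cleanly.
        else:
            raise RuntimeError('generator ignored GeneratorExit')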
Starting in Python 3.5, the `collections.abc` module defines an abstract `Generator` base class (also accessible as `collections.Generator`, as used below). When subclassing it, you only need to define `send()` and `throw()`; the other required methods are already implemented for you correctly.
In [21]:
import collections

class Adder(collections.Generator):

    def __init__(self):
        super().__init__()
        self.total = 0
        self.stopped = False

    def __repr__(self):
        return f"<{self.__class__.__name__}: total={self.total!r} stopped={self.stopped!r}>"

    def send(self, integers):
        if self.stopped:
            raise StopIteration
        print(f"At start of {self.send}, current total is", self.total)
        if not isinstance(integers, (list, tuple)):
            integers = [integers]
        if integers and isinstance(integers[0], collections.Iterable):
            integers = integers[0]
        print(f"{self.send} received", integers)
        self.total += sum(integers)
        print(f"At end of {self.send}, returning current total", self.total)
        return self.total

    def throw(self, exc_type, exc_value=None, exc_traceback=None):
        if self.stopped:
            raise StopIteration
        exc_info = (exc_type, exc_value, exc_traceback)
        print(f"At start of {self.throw}, current total is", self.total)
        self.stopped = True
        print(f"{self.throw} received exception", exc_info, "and is returning with final total", self.total)
        raise StopIteration(self.total)

def send_values_into_adder(adder, *integers):
    print('Sending', integers, 'into', adder)
    current_total = adder.send(integers)
    print('Current total in', adder, 'is', current_total)
    return current_total

adder = Adder()
print(adder)
print()
adder.send([])
print()
adder.send(10)
print()
adder.send([1, 2, 3])
print()
adder.send(range(8))
print()
print('Sending StopAdder into adder')
try:
    adder.throw(StopAdder)
except StopIteration as exc:
    print('Final total from adder is', exc.value)
This chapter includes a code sample from https://github.com/python/cpython/blob/v3.6.1/Lib/contextlib.py.
Copyright (c) 2001-2017 Python Software Foundation.
All Rights Reserved.
License: Python license, https://www.python.org/3.6/license.html
License: Apache License, Version 2.0
Jordan Moldow, 2017

Copyright 2017 Jordan Moldow

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.