A Generator is a type of iterator which can be constructed natively inside a Python `def` function. A `def` function is a generator function if it contains any `yield` statements, and the immediate return value of calling such a function is a generator iterator, or just a generator.
A generator doesn't execute all at once like a regular function does. In fact, calling a generator function doesn't execute any of that function's code immediately, whereas calling a regular function executes all of its code immediately until it returns. Whenever a generator is run, it runs until the end of the function OR until the next `yield` statement. At that point, the value of the `yield` expression is returned to the caller, and the execution of the generator is suspended until it is next resumed.
While execution is suspended, the full local state of the generator function's execution is stored in memory. This allows it to be resumed later, as if it had never paused.
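The standard library can report this lifecycle directly. Below is a minimal sketch (the `pauser()` generator is invented purely for illustration) that uses `inspect.getgeneratorstate()` to observe a generator being created, suspended, and closed:
import inspect

def pauser():
    yield 'first'
    yield 'second'

gen = pauser()
print(inspect.getgeneratorstate(gen))  # GEN_CREATED: no code has run yet.
next(gen)
print(inspect.getgeneratorstate(gen))  # GEN_SUSPENDED: paused at the first `yield`.
list(gen)                              # Exhaust the generator.
print(inspect.getgeneratorstate(gen))  # GEN_CLOSED: the function has returned.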
A generator is most commonly used as a normal iterator. This means that it can be automatically iterated over with a for-loop, or manually iterated over with `next()`. It can also be passed to any function that expects an iterator or an iterable. `next()` returns the value of the next `yield` statement the generator reaches, or raises `StopIteration` if the generator function completes.
In [1]:
def range_generator_function(stop):
    """Naive implementation of builtins.range generator."""
    # This function runs immediately, since it has no `yield` statements.
    # It is a normal function, which happens to return a generator iterator.
    print("Running line 1")
    if not isinstance(stop, int):
        raise TypeError('stop must be an int')
    if stop < 0:
        raise ValueError('stop must be >= 0')
    print("Running line 2")
    range_generator = _range_generator_function(stop=stop)
    print("Running line 3")
    return range_generator

def _range_generator_function(stop):
    # This function does not run immediately, since it has `yield` statements.
    # It is a generator function, and returns a generator iterator.
    index = 0
    print("Running line 4")
    while index < stop:
        print("Running line 5 with index", index)
        yield index
        print("Running line 6 with index", index)
        index += 1
    print("Running line 7 with index", index)
In [2]:
range_generator = range_generator_function(2) # Executes all prints in `range_generator_function()`,
range_generator # but none in `_range_generator_function()`.
Out[2]:
In [3]:
import collections
isinstance(range_generator, collections.Iterable), isinstance(range_generator, collections.Iterator)
Out[3]:
In [4]:
isinstance(range_generator, collections.Generator)
Out[4]:
In [5]:
next(range_generator)
Out[5]:
In [6]:
next(range_generator)
Out[6]:
In [7]:
import traceback
try:
    next(range_generator)
except StopIteration:
    traceback.print_exc()
In [8]:
next(range_generator, 2) # Generator is exhausted, nothing more will get printed.
Out[8]:
In [9]:
range_generator = range_generator_function(4)
for item in range_generator:
    print('yielded', item)
The above covers the majority of generator use cases. Most Python programmers have probably used generators this way, or at least encountered them while reading other Python code.
However, generators are more than just a special syntax for iterators. There is more to the generator protocol, though it tends to be less widely used and known.
Generators implement three extra methods besides the two (`__iter__()` and `__next__()`) that they need in order to be iterators. These are the `send()`, `throw()`, and `close()` methods. Together, they allow for even more control over how a generator executes its body when it resumes execution.
Generators haven't always had this functionality. It was added in Python 2.5, via PEP 342.
`send()` is a slight generalization of `__next__()`. It accepts a single value, and that value becomes the result of the `yield` expression as the generator resumes. The method returns the next value to be yielded, or raises `StopIteration` if the generator completes or returns. `__next__()` is actually equivalent to `send(None)`.
In [10]:
def generator_function():
    print((yield 0))
    print((yield 1))

generator = generator_function()
print(next(generator))  # Advance generator to first `yield` statement.
item = generator.send('print this')
print('yielded', item)
try:
    next(generator)  # Same as `generator.send(None)`
except StopIteration:
    pass
`throw()` accepts an exception, and raises it inside the generator, at the `yield` where the generator is currently paused. If the generator manages to yield another value, that value will be returned from the `throw()` call. Otherwise, any exception raised out of the generator will be propagated out to the caller (the same is true for `send()` and `__next__()`).
In [11]:
import traceback
class ExpectedError(Exception): pass

def generator_function():
    for i in range(2):
        try:
            yield i
        except ExpectedError as exc:
            print('Caught exception', repr(exc))
            continue
        except Exception as exc:
            print('Did not catch exception', repr(exc))
            raise
    return i

generator = generator_function()
next(generator)
item = generator.throw(ExpectedError)
print('yielded', item)
try:
    generator.throw(KeyError('key'))
except KeyError:
    traceback.print_exc()
In [12]:
generator = generator_function()
next(generator)
item = generator.throw(ExpectedError)
print('yielded', item)
try:
    generator.throw(ExpectedError)
except StopIteration as exc:
    traceback.print_exc()
    print(repr(exc))
`close()` instructs the generator to stop yielding elements and exit. It is similar, though not identical, to `throw(GeneratorExit)`.
The `GeneratorExit` class is a subclass of `BaseException`, but is not a subclass of `Exception`. This makes it less likely that a generator function will catch and ignore it by mistake.
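You can verify that class relationship directly with the builtins (a quick check, not from the original notebook):
issubclass(GeneratorExit, BaseException), issubclass(GeneratorExit, Exception)  # (True, False)
This is why the bare `except:` clauses in the examples below catch `GeneratorExit`, while an `except Exception:` clause would not.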
`close()` normally returns `None`: either the generator function returns without any errors, or the `GeneratorExit` exception propagates out of the generator function. If the generator function raises a new exception while handling `GeneratorExit`, then that exception is raised by `close()`. It is illegal for the generator to yield a new value while handling `close()`, so doing so causes a `RuntimeError` to be raised instead.
In [13]:
def generator_function():
    try:
        yield
    except:
        traceback.print_exc()
        raise
    print('About to yield 1')
    yield 1

generator = generator_function()
next(generator)
generator.close()
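Since the generator above re-raises `GeneratorExit`, `close()` returns normally and the second `yield` is never reached. For contrast, here is a sketch (invented for illustration, not from the original notebook) of the illegal case described above, where the generator swallows `GeneratorExit` and tries to yield again, so `close()` raises `RuntimeError`:
def misbehaving_generator_function():
    try:
        yield 0
    except GeneratorExit:
        pass  # Swallow the exception instead of re-raising it...
    yield 1   # ...and illegally yield another value while closing.

generator = misbehaving_generator_function()
next(generator)
try:
    generator.close()
except RuntimeError:
    traceback.print_exc()  # RuntimeError: generator ignored GeneratorExit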
In [14]:
def generator_function():
    try:
        yield 0
    except:
        raise KeyError('key')

generator = generator_function()
next(generator)
try:
    generator.close()
except KeyError:
    traceback.print_exc()
When executing a normal function, it begins execution as soon as it is called, continues (without giving up control) until it completes (by returning or throwing), and cleans up its call stack when exiting.
A generator, on the other hand, gives up control every time it hits a `yield`, even if it isn't done executing yet. When it does so, the generator becomes suspended. So that its execution can be continued later, its `locals()` are saved on the generator object. Later, it can be resumed (by calling `__next__()`, `send()`, `throw()`, or `close()`), and its `locals()` will be restored before execution resumes from the `yield` point.
This has been implicitly assumed in the previous examples. For example, in the section The basics of generators, the generator function `_range_generator_function()` relies on this mechanism in order to remember the values of `stop` and `index` between steps. There will be a more advanced example of this in the section Generators as coroutines.
There are other clever ways in which this can be utilized. One great usage of generators is in the implementation of `@contextlib.contextmanager`. This decorator combines the power of generators with the power of the context manager protocol, in order to produce something quite powerful.
Code sample from https://github.com/python/cpython/blob/v3.6.1/Lib/contextlib.py
Copyright (c) 2001-2017 Python Software Foundation.
All Rights Reserved.
License: Python license, https://www.python.org/3.6/license.html
Some modifications made so as to only highlight the interesting parts.
def contextmanager(func):
    """@contextmanager decorator.

    Typical usage:

        @contextmanager
        def some_generator(<arguments>):
            <setup>
            try:
                yield <value>
            finally:
                <cleanup>

    This makes this:

        with some_generator(<arguments>) as <variable>:
            <body>

    equivalent to this:

        <setup>
        try:
            <variable> = <value>
            <body>
        finally:
            <cleanup>
    """
    @functools.wraps(func)
    def helper(*args, **kwds):
        return _GeneratorContextManager(func, args, kwds)
    return helper


class _GeneratorContextManager(ContextDecorator, AbstractContextManager):

    def __init__(self, func, args, kwds):
        self.gen = func(*args, **kwds)

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

    def __exit__(self, type, value, traceback):
        if type is None:
            try:
                next(self.gen)
            except StopIteration:
                return
            else:
                raise RuntimeError("generator didn't stop")
        else:
            try:
                self.gen.throw(type, value, traceback)
                raise RuntimeError("generator didn't stop after throw()")
            except StopIteration as exc:
                return exc is not value
The `@contextlib.contextmanager` decorator makes it really easy for any developer to create their own modular and reusable setup/cleanup logic, or enter/exit logic, without needing to create a new class with `__enter__()` and `__exit__()` methods. `@contextlib.contextmanager` handles that, with the help of the generator that the developer defines.
When `__enter__()` is called upon entering the block, `__next__()` is called in order to perform the setup logic, which is everything up until the `yield`. When the `yield` is encountered, the generator suspends its execution and saves its state, the `__enter__()` call returns, and the Python interpreter begins executing the block.
When `__exit__()` is called upon exiting the block, either `__next__()` or `throw()` will be called in order to perform the cleanup logic. Upon resuming the generator, its internal state is restored, and the cleanup can proceed as if it were a normal function.
When the cleanup logic is placed inside a `finally` block inside the generator function, we get an awesome combination of Python features (generators, context managers, and `finally` blocks) which guarantees that your cleanup code will always be executed whenever the block exits (assuming no bugs in `@contextlib.contextmanager`).
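To make this concrete, here is a small usage sketch (the `timer()` context manager is invented for illustration, not part of contextlib). The code before the `yield` is the setup, and the `finally` block is the cleanup, which runs even if the body raises:
import contextlib
import time

@contextlib.contextmanager
def timer(label):
    start = time.monotonic()  # <setup>
    try:
        yield start           # <value>, bound by the `as` clause
    finally:                  # <cleanup> always runs when the block exits
        print(label, 'took', time.monotonic() - start, 'seconds')

with timer('sleeping') as start:
    time.sleep(0.1)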
This shows how flexible the yielding execution model is. When a generator yields, the caller doesn't have to immediately resume the generator. It can go off and do something else, and then resume it at any point in the future. The really neat thing here is that the `yield` is essentially replaced with the execution of an arbitrary block of Python code. In future chapters, we'll look at other clever uses of suspending execution which can produce very powerful models of program execution.
With the full generator protocol, a generator is equivalent to a semicoroutine. It is also possible to build a custom dispatcher/trampoline, in order to implement a coroutine system. Reference: https://en.wikipedia.org/wiki/Coroutine#Comparison_with_generators. This is used in various concurrency libraries, and is demonstrated in David Beazley's PyCon 2015 talk "Python Concurrency From the Ground Up: LIVE!". See also the examples given in PEP 342.
Here's a silly example of a semicoroutine in action.
In [15]:
import collections

class StopAdder(Exception): pass

def adder_function():
    total = 0
    while True:
        print('At start of adder loop, current total is', total)
        try:
            integers = (yield total)
        except (Exception, GeneratorExit) as exc:
            print('Adder received exception', repr(exc), 'and is returning with final total', total)
            return total
        if not isinstance(integers, (list, tuple)):
            integers = [integers]
        if integers and isinstance(integers[0], collections.Iterable):
            integers = integers[0]
        print('Adder received', integers)
        total += sum(integers)

def send_values_into_adder(adder, *integers):
    print('Sending', integers, 'into adder')
    current_total = adder.send(integers)
    print('Current total in adder is', current_total)
    return current_total

adder = adder_function()
next(adder)
send_values_into_adder(adder)
print()
send_values_into_adder(adder, 10)
print()
send_values_into_adder(adder, 1, 2, 3)
print()
send_values_into_adder(adder, range(8))
print()
print('Sending StopAdder into adder')
try:
    adder.throw(StopAdder)
except StopIteration as exc:
    print('Final total from adder is', exc.value)
In the following example, the generator will yield once, and then raise `StopIteration` when it hits the `return` statement.
In [16]:
def generator_function():
    yield 0
    return

generator = generator_function()
next(generator)
Out[16]:
In [17]:
import traceback
try:
    next(generator)
except StopIteration:
    traceback.print_exc()
It might be thought that generators can only pass values out via `yield`, and will always have empty `return` statements. This was true up through Python 3.2, but it changed in Python 3.3. Now, just like normal functions, a generator function can return arbitrary values with `return` statements (but note that this is illegal in earlier versions of Python, and will cause a `SyntaxError`).
In [18]:
import traceback

def generator_function_that_returns_a_value():
    yield 0
    return 'return_value'

generator = generator_function_that_returns_a_value()
next(generator)
try:
    next(generator)
except StopIteration as exc:
    traceback.print_exc()
    print(repr(exc))
    print(repr(exc.value))
try:
    next(generator)
except StopIteration as exc:
    traceback.print_exc()  # Subsequent calls to `next()` do not use the return value
    print(repr(exc))       # when raising `StopIteration`.
    print(repr(exc.value))
Notice that the Python interpreter converted the `return 'return_value'` into a `raise StopIteration('return_value')`. Also notice that the exception object has a `.value` attribute, which holds this value.
This behavior is hidden when using for-loops or other functionality that catches and ignores `StopIteration`.
In [19]:
for item in generator_function_that_returns_a_value():
    print('yielded', item)
In [20]:
list(generator_function_that_returns_a_value())
Out[20]:
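Since Python 3.3, the idiomatic way to capture a generator's return value is the `yield from` expression, which delegates to an inner generator and then evaluates to that generator's return value. A quick sketch (the `delegator()` wrapper is invented for illustration):
def delegator():
    result = yield from generator_function_that_returns_a_value()
    print('inner generator returned', repr(result))

for item in delegator():
    print('yielded', item)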
Generators aren't required to be implemented using generator functions. If you really wanted to, you could also implement one yourself. Just make sure to define the `send()` and `throw()` methods correctly, and to correctly define `__next__()`, `__iter__()`, and `close()` in terms of those.
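As a sketch of what that entails (this mirrors, in simplified form, the default implementations that `collections.abc.Generator` provides; the mixin name is invented), the derived methods can be built on top of `send()` and `throw()`:
class GeneratorProtocolMixin:
    """Assumes the subclass defines send() and throw() correctly."""

    def __iter__(self):
        # A generator is its own iterator.
        return self

    def __next__(self):
        # Plain iteration is just send(None).
        return self.send(None)

    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            pass  # The generator exited cleanly.
        else:
            raise RuntimeError('generator ignored GeneratorExit')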
Starting in Python 3.5, the `collections.abc` module defines an abstract `Generator` base class (also accessible as `collections.Generator`, as used below). When subclassing it, you only need to define `send()` and `throw()`; the other required methods are already implemented for you correctly.
In [21]:
import collections

class Adder(collections.Generator):

    def __init__(self):
        super().__init__()
        self.total = 0
        self.stopped = False

    def __repr__(self):
        return f"<{self.__class__.__name__}: total={self.total!r} stopped={self.stopped!r}>"

    def send(self, integers):
        if self.stopped:
            raise StopIteration
        print(f"At start of {self.send}, current total is", self.total)
        if not isinstance(integers, (list, tuple)):
            integers = [integers]
        if integers and isinstance(integers[0], collections.Iterable):
            integers = integers[0]
        print(f"{self.send} received", integers)
        self.total += sum(integers)
        print(f"At end of {self.send}, returning current total", self.total)
        return self.total

    def throw(self, exc_type, exc_value=None, exc_traceback=None):
        if self.stopped:
            raise StopIteration
        exc_info = (exc_type, exc_value, exc_traceback)
        print(f"At start of {self.throw}, current total is", self.total)
        self.stopped = True
        print(f"{self.throw} received exception", exc_info, "and is returning with final total", self.total)
        raise StopIteration(self.total)

def send_values_into_adder(adder, *integers):
    print('Sending', integers, 'into', adder)
    current_total = adder.send(integers)
    print('Current total in', adder, 'is', current_total)
    return current_total

adder = Adder()
print(adder)
print()
adder.send([])
print()
adder.send(10)
print()
adder.send([1, 2, 3])
print()
adder.send(range(8))
print()
print('Sending StopAdder into adder')
try:
    adder.throw(StopAdder)
except StopIteration as exc:
    print('Final total from adder is', exc.value)
This chapter includes a code sample from https://github.com/python/cpython/blob/v3.6.1/Lib/contextlib.py.
Copyright (c) 2001-2017 Python Software Foundation.
All Rights Reserved.
License: Python license, https://www.python.org/3.6/license.html
License: Apache License, Version 2.0
Jordan Moldow, 2017

Copyright 2017 Jordan Moldow

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.