A generator is a function that returns an iterator: an object that produces its values one at a time. In other words, a generator is a stateful function (one with memory) that, for a sequence of identical next() calls, produces a sequence of different results. Generators are usually used to implement iterators when we do not want to store all the generated values in memory.

Generators and the yield statement


In [12]:
def yrange(max_i):
    i = 0
    while i < max_i:
        yield i  # A special return that continues with the next() call.
        i += 1   # next() continues here.

In [13]:
for i in yrange(10):
    print(i, end=' ')


0 1 2 3 4 5 6 7 8 9 
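
The for loop above hides the calls to next(). As a minimal sketch, reusing the yrange() generator from the previous cell (redefined here only so that the example runs on its own), the same values can be pulled by hand. Once the generator is exhausted, next() raises StopIteration.

```python
def yrange(max_i):
    i = 0
    while i < max_i:
        yield i
        i += 1

g = yrange(2)
first = next(g)    # Runs the body up to the first yield.
second = next(g)   # Resumes after the yield; i has kept its value.
print(first, second)

try:
    next(g)        # The while loop ends, so the generator is exhausted.
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)
```

The local variable i survives between calls, which is the "memory" mentioned in the definition of a generator.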

In [57]:
# https://es.wikipedia.org/wiki/Sucesi%C3%B3n_de_Fibonacci
# https://www.mathsisfun.com/numbers/fibonacci-sequence.htm
def fib(n):
    i = 0
    a, b = 0, 1
    while i < n:
        yield a
        a, b = b, a+b
        i += 1
        
for i in fib(10):
    print(i, end=' ')


0 1 1 2 3 5 8 13 21 34 

Generator expressions

Generator expressions can be seen as a memory-efficient counterpart of list comprehensions, and also as a compact notation for simple generators.


In [38]:
list_comprehension = [x*x for x in range(100)]
print(list_comprehension)


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521, 1600, 1681, 1764, 1849, 1936, 2025, 2116, 2209, 2304, 2401, 2500, 2601, 2704, 2809, 2916, 3025, 3136, 3249, 3364, 3481, 3600, 3721, 3844, 3969, 4096, 4225, 4356, 4489, 4624, 4761, 4900, 5041, 5184, 5329, 5476, 5625, 5776, 5929, 6084, 6241, 6400, 6561, 6724, 6889, 7056, 7225, 7396, 7569, 7744, 7921, 8100, 8281, 8464, 8649, 8836, 9025, 9216, 9409, 9604, 9801]

In [24]:
import sys
sys.getsizeof(list_comprehension)


Out[24]:
904

In [25]:
generator_expression = (x*x for x in range(100))
generator_expression


Out[25]:
<generator object <genexpr> at 0x7f35d79d67b0>

In [26]:
sys.getsizeof(generator_expression)


Out[26]:
112

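A generator expression can be faster than the equivalent list comprehension when the list is too large to fit in the cache. For short sequences the list comprehension can be slightly faster, as the timings below show.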


In [33]:
%timeit sum([x*x for x in range(10)])


717 ns ± 9.73 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [31]:
%timeit sum([x*x for x in range(10000000)])


802 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [34]:
%timeit sum(x*x for x in range(10))


804 ns ± 9.54 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [32]:
%timeit sum(x*x for x in range(10000000))


683 ms ± 9.09 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
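
The timings above do not show memory use directly. The following is a minimal sketch, assuming the standard tracemalloc module; the helper peak_bytes() and the size N are hypothetical names introduced only for this illustration. It compares the peak memory allocated while summing a list comprehension and a generator expression. The exact figures depend on the interpreter and the platform, so only the comparison is printed.

```python
import tracemalloc

def peak_bytes(func):
    """Return the peak number of bytes allocated while func() runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

N = 100_000  # Hypothetical size, chosen only to keep the example fast.

# The list comprehension materializes all N squares before sum() starts.
list_peak = peak_bytes(lambda: sum([x*x for x in range(N)]))
# The generator expression hands the squares to sum() one at a time.
gen_peak = peak_bytes(lambda: sum(x*x for x in range(N)))

print(list_peak > gen_peak)
```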

Generator expressions can be used for defining loops


In [58]:
for x in (i*2 for i in range(10)):
    print(x, end=' ')


0 2 4 6 8 10 12 14 16 18 

Nesting generator expressions


In [62]:
import time
c = 0
now = time.time()
# Note that the generator expression yields one prime at a time, whereas a list comprehension would build the whole list first.
for i in (x for x in range(2, 2000) if all(x % y != 0 for y in range(2, int(x ** 0.5) + 1))):
    c += 1
    print(i, end=' ')
print('\n{} primes found in {} seconds'.format(c,time.time() - now))


2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997 1009 1013 1019 1021 1031 1033 1039 1049 1051 1061 1063 1069 1087 1091 1093 1097 1103 1109 1117 1123 1129 1151 1153 1163 1171 1181 1187 1193 1201 1213 1217 1223 1229 1231 1237 1249 1259 1277 1279 1283 1289 1291 1297 1301 1303 1307 1319 1321 1327 1361 1367 1373 1381 1399 1409 1423 1427 1429 1433 1439 1447 1451 1453 1459 1471 1481 1483 1487 1489 1493 1499 1511 1523 1531 1543 1549 1553 1559 1567 1571 1579 1583 1597 1601 1607 1609 1613 1619 1621 1627 1637 1657 1663 1667 1669 1693 1697 1699 1709 1721 1723 1733 1741 1747 1753 1759 1777 1783 1787 1789 1801 1811 1823 1831 1847 1861 1867 1871 1873 1877 1879 1889 1901 1907 1913 1931 1933 1949 1951 1973 1979 1987 1993 1997 1999 
303 primes found in 0.0437009334564209 seconds
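
Because nested generator expressions are evaluated lazily, the same filter can be applied to an unbounded source. The following is a sketch under that assumption, using count() and islice() from the standard itertools module; the names candidates, primes and first_primes are introduced here for illustration.

```python
from itertools import count, islice

# An unbounded stream of candidates; nothing is computed yet.
candidates = count(2)
primes = (x for x in candidates
          if all(x % y != 0 for y in range(2, int(x ** 0.5) + 1)))

# islice() pulls only the first five values through the pipeline.
first_primes = list(islice(primes, 5))
print(first_primes)
```

A list comprehension over count(2) would never finish, since it would try to build the whole list before returning.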

Relationship between generators and generator expressions

A generator expression is a compact representation of a generator.


In [52]:
# x**2 produces squares, so the name reflects that.
squares_generator_expression = (x**2 for x in range(10))
print(next(squares_generator_expression))
print(next(squares_generator_expression))
print(next(squares_generator_expression))


0
1
4

In [53]:
def generate_squares(iterable):
    # Equivalent generator function: yields the square of each input item.
    for x in iterable:
        yield x**2
g = generate_squares(iter(range(10)))
print(next(g))
print(next(g))
print(next(g))


0
1
4

Default behaviour

When we iterate over a generator function or a generator expression with a for loop, the next() function is invoked automatically until the generator is exhausted.


In [66]:
powers_of_myself_generator_expression = (x**x for x in range(10))

In [67]:
for i in powers_of_myself_generator_expression:
    print(i)


1
1
4
27
256
3125
46656
823543
16777216
387420489

In [68]:
def generate_powers_of_myself(exp):
    for x in exp:
        yield x**x
for i in generate_powers_of_myself(iter(range(10))):
    print(i)


1
1
4
27
256
3125
46656
823543
16777216
387420489
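
What the for loop does with a generator can be sketched by hand. The function manual_for() below is a hypothetical helper written for this illustration, not part of Python: it obtains an iterator with iter(), calls next() repeatedly, and stops when StopIteration is raised.

```python
def manual_for(iterable):
    """Collect the items of iterable in the order a for loop would visit them."""
    it = iter(iterable)      # A generator returns itself here.
    seen = []
    while True:
        try:
            item = next(it)  # The call that the for loop makes implicitly.
        except StopIteration:
            break            # Exhaustion ends the loop silently.
        seen.append(item)
    return seen

result = manual_for(x**x for x in range(4))
print(result)
```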

Generators and Coroutines

Coroutines can be viewed as generators that also consume data (and, as expected, produce some data). Equivalently, generators (and generator expressions) can be viewed as coroutines that do not consume data.


In [69]:
def minimize():
    current = yield
    while True:
        value = yield current  # Receives "value" and returns "current"
        current = min(value, current)
        
it = minimize()
next(it)  # "Prime" the coroutine: advance to the first yield so that send() can deliver a value.
print(it.send(10))
print(it.send(4))
print(it.send(22))
print(it.send(-1))


10
4
4
-1
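
The same send() protocol supports other accumulators. The following is a minimal sketch of a running-average coroutine; running_average() is a hypothetical example written for this illustration. It also shows close(), which stops a coroutine that would otherwise loop forever.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # Yields the current average, then waits for send().
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)             # Prime: advance to the first yield.
results = [avg.send(v) for v in (10, 20, 60)]
print(results)
avg.close()           # Raises GeneratorExit inside the coroutine to stop it.
```

After close(), any further send() call raises StopIteration.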