- Iterators are easy to understand


In [ ]:
spam = [0, 1, 2, 3, 4]

for item in spam:
    print item
else:
    print "Looped whole list"

In [ ]:
# What is really happening here?


it = iter(spam)                            # Obtain an iterator
try:
    item = it.next()                      # Retrieve first item through the iterator
    while True:
        # Body of the for loop goes here
        print item
        item = it.next()                  # Retrieve next item through the iterator
except StopIteration:                     # Capture iterator exception
    # Body of the else clause goes here
    print "Looped whole list"

In [ ]:
# Another example

spam = "spam"
it = iter(spam)

In [ ]:
print it.next()
print it.next()
print it.next()
print it.next()

In [ ]:
# Once StopIteration is raised the iterator is exhausted; there is no 'restart'
print it.next()
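
The for loop only needs two things from an object: iter() must return an iterator, and that iterator must hand out items until it raises StopIteration. As a minimal sketch, the hypothetical class CountUpTo below (not part of the original notebook) implements that protocol by hand. It defines both __next__ and next and uses only single-argument print calls, so it should run on Python 2 and Python 3.

```python
class CountUpTo(object):
    """Hypothetical iterator producing 0, 1, ..., limit - 1."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration   # Signals the end of the iteration
        value = self.current
        self.current += 1
        return value

    next = __next__  # Python 2 spells the method 'next'


counter = CountUpTo(3)
items = [item for item in counter]   # The loop machinery calls next() for us
leftover = list(counter)             # Already exhausted: nothing is left
print(items)
print(leftover)
```

The second traversal is empty because the iterator keeps its position and cannot be restarted; a fresh CountUpTo(3) would be needed.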

Python Generators


In [ ]:
# Generator expression

spam = [0, 1, 2, 3, 4]
fooo = (2 ** s for s in spam)  # Syntax similar to list comprehension but between parentheses
print fooo

In [ ]:
print fooo.next()
print fooo.next()
print fooo.next()
print fooo.next()
print fooo.next()

In [ ]:
# Generator is exhausted
print fooo.next()
  • Generators are a simple and powerful tool for creating iterators.
  • Each item is computed on demand, only when it is requested
  • They are usually more efficient than list comprehensions or loops that build a full list
    • When the whole sequence is not traversed
      • When looking for a certain element
      • When an exception is raised
    • In those cases they save computing power and memory
  • Useful for I/O and for large amounts of data (e.g. DB queries)
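
The "computed on demand" point can be made visible by counting how many values are actually calculated. This is a small sketch under assumptions: the helper square and the list computed are hypothetical names introduced only for this illustration, and the code is written to run on both Python 2 and Python 3.

```python
computed = []  # Records every value that was actually computed


def square(x):
    computed.append(x)
    return x * x


numbers = range(1000)

# List comprehension: all 1000 squares are computed before the search starts
squares_list = [square(n) for n in numbers]
first_from_list = next(s for s in squares_list if s > 50)
eager_calls = len(computed)

del computed[:]  # Reset the record

# Generator expression: squares are computed only until the match is found
squares_gen = (square(n) for n in numbers)
first_from_gen = next(s for s in squares_gen if s > 50)
lazy_calls = len(computed)

print(first_from_list)
print(first_from_gen)
print(eager_calls)
print(lazy_calls)
```

Both searches find the same element, but the generator version stops calling square as soon as the first square above 50 shows up.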

yield


In [ ]:
def countdown(n):
    while n > 0:
        yield n
        n -= 1

In [ ]:
gen_5 = countdown(5)
gen_5

In [ ]:
# where is the sequence?
print gen_5.next()
print gen_5.next()
print gen_5.next()
print gen_5.next()
print gen_5.next()

In [ ]:
gen_5.next()

In [ ]:
for i in countdown(5):
    print i,
  • yield turns a function into a generator function
  • the function body only runs when next is called (easier than implementing the iterator protocol by hand)
  • each yield produces a value and suspends the execution of the function
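
To see that the body only runs on next and is suspended at each yield, the sketch below records what the generator does and when. The names countdown_logged and events are hypothetical and exist only for this illustration; the code should run on Python 2 and Python 3.

```python
events = []  # Records what the generator body does and when


def countdown_logged(n):
    events.append("started")
    while n > 0:
        events.append("yielding %d" % n)
        yield n
        n -= 1
    events.append("finished")


gen = countdown_logged(2)
events_after_call = list(events)      # Calling the function runs nothing yet

first = next(gen)
events_after_first = list(events)     # Ran up to the first yield, then suspended

rest = list(gen)                      # Resume until the generator finishes
print(events_after_call)
print(events_after_first)
print(rest)
print(events)
```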

In [ ]:
# Another example with yield: emulating tail -f (grep comes later)
import time


def follow(thefile):
    thefile.seek(0, 2)  # Go to the end of the file
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)    # Sleep briefly
            continue
        yield line

In [ ]:
logfile = open("fichero.txt")
for line in follow(logfile):
    print line,

In [ ]:
# Ensure logfile is closed
if logfile and not logfile.closed:
    logfile.close()

Using generators to build a Unix-style pipeline (tail + grep)


In [ ]:
def grep(pattern, lines):
    for line in lines:
        if pattern in line:
            yield line

# TODO: use a generator expression
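
One possible way to address the TODO above, as a sketch only: the same filter can be written as a generator expression. The name grep_expr and the sample lines are hypothetical; the code should run on Python 2 and Python 3.

```python
def grep_expr(pattern, lines):
    # Same filtering as grep(), written as a generator expression
    return (line for line in lines if pattern in line)


sample = ["python rocks\n", "nothing here\n", "more python\n"]
matches = list(grep_expr("python", sample))
print(matches)
```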

In [ ]:
# Set up a processing pipe : tail -f | grep "python"
logfile = open("fichero.txt")
loglines = follow(logfile)
pylines = grep("python", loglines)

In [ ]:
# Nothing has been executed so far: generators are lazy
# Pull results out of the processing pipeline
for line in pylines:
    print line,

In [ ]:
# Ensure logfile is closed
if logfile and not logfile.closed:
    logfile.close()

In [ ]:
# Yield can be used as an expression too

def g_grep(pattern):
    print "Looking for %s" % pattern
    while True:
        line = (yield)
        if pattern in line:
            print line,

Coroutines

  • Using yield this way we get a coroutine
  • The function does not just return values; it can also consume values that we send to it

In [ ]:
g = g_grep("python")
g.next()

In [ ]:
g.send("Test to see if we find anything")
g.send("We have received python")
  • Sent values are returned by the (yield) expression
  • Execution works as in a generator function
  • Coroutines respond to next and send
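
A minimal sketch of these points, assuming a hypothetical variant named collecting_grep that stores matches in a list instead of printing them. It also shows why the first next call is needed: sending a real value to a coroutine that has not reached its first (yield) raises TypeError. The code should run on Python 2 and Python 3.

```python
def collecting_grep(pattern, found):
    # 'found' is a list supplied by the caller; matches are appended to it
    while True:
        line = (yield)          # Suspends here until send() delivers a value
        if pattern in line:
            found.append(line)


found = []
g = collecting_grep("python", found)
next(g)                          # Prime: advance to the first (yield)
g.send("nothing to see here")
g.send("python is here")
g.send("more python")
print(found)

# Sending before priming is an error
unprimed = collecting_grep("python", [])
send_failed = False
try:
    unprimed.send("too early")
except TypeError:
    send_failed = True
print(send_failed)
```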

In [ ]:
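# Avoid the initial next() call by priming the coroutine in a decorator
import functools


def coroutine(func):
    @functools.wraps(func)  # Keep the name and docstring of the wrapped function
    def wrapper(*args, **kwargs):
        cr = func(*args, **kwargs)
        cr.next()  # Advance to the first (yield)
        return cr
    return wrapper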

In [ ]:
@coroutine
def cool_grep(pattern):
    print "Looking for %s" % pattern
    while True:
        line = (yield)
        if pattern in line:
            print line,

In [ ]:
g = cool_grep("python")

In [ ]:
# No need to call next first
g.send("Test to see if we find anything")
g.send("Test to see if python is cool")

In [ ]:
# Use close() to shut down a coroutine (it could otherwise run forever)

In [ ]:
@coroutine
def last_grep(pattern):
    print "Looking for %s" % pattern
    try:
        while True:
            line = (yield)
            if pattern in line:
                print line,
    except GeneratorExit:
        print "Going away. Goodbye"

In [ ]:
# Exceptions can be thrown inside a coroutine
g = last_grep("python")
g.send("Test to see if we find anything")
g.send("Test to see if python is cool")

In [ ]:
g.close()

In [ ]:
g.send("Test to see if python is cool")  # Raises StopIteration: the coroutine was closed

In [ ]:
# Exceptions can also be thrown into a coroutine with throw()
g.throw(RuntimeError, "Throw an exception")
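
The behaviour of close and throw can be checked without reading console output. This is a sketch under assumptions: closing_grep and the events list are hypothetical names, and throw is called with an exception instance so that the code runs on both Python 2 and Python 3.

```python
events = []


def closing_grep(pattern):
    try:
        while True:
            line = (yield)
            if pattern in line:
                events.append(line)
    except GeneratorExit:
        events.append("closed")   # Cleanup hook; the coroutine then ends


g = closing_grep("python")
next(g)
g.send("python is cool")
g.close()                         # Raises GeneratorExit at the (yield)

try:
    g.send("python again")        # A closed coroutine accepts no more values
except StopIteration:
    events.append("send after close failed")

h = closing_grep("python")
next(h)
try:
    h.throw(RuntimeError("boom"))  # Raised at the (yield), not handled inside
except RuntimeError as error:
    events.append("propagated: %s" % error)

print(events)
```

An exception that the coroutine does not catch travels back to the caller of throw, and the coroutine is finished afterwards.
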
  • Generators produce values; coroutines mostly consume them
  • DO NOT mix the two concepts, or your head will explode
  • Coroutines are not for iteration

In [ ]:
def countdown_bug(n):
    print "Counting down from", n
    while n >= 0:
        newvalue = (yield n)
        # If a new value got sent in, reset n with it
        if newvalue is not None:
            n = newvalue
        else:
            n -= 1

In [ ]:
c = countdown_bug(5)
for n in c:
    print n
    if n == 5:
        c.send(3)

What has happened here?
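
One way to investigate is to record what the for loop receives and what send returns. The sketch below repeats countdown_bug without its print call and stores the values in two hypothetical lists, seen_by_loop and returned_by_send; it should run on Python 2 and Python 3.

```python
def countdown_bug(n):
    while n >= 0:
        newvalue = (yield n)
        # If a new value got sent in, reset n with it
        if newvalue is not None:
            n = newvalue
        else:
            n -= 1


seen_by_loop = []
returned_by_send = []

c = countdown_bug(5)
for n in c:
    seen_by_loop.append(n)
    if n == 5:
        # send() resumes the generator AND returns the next yielded value
        returned_by_send.append(c.send(3))

print(seen_by_loop)
print(returned_by_send)
```

The value 3 is yielded as the result of send, so the for loop never sees it and continues from 2. Mixing iteration and send on the same object makes values disappear like this.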

  • Chain coroutines together and push data through the pipeline using send()
  • You need a source, which is normally not a coroutine
  • You also need pipeline sinks (end-points) that consume and process the data
  • Don't mix the concepts too much
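
The three roles (source, filter, sink) can be sketched with an in-memory list instead of a file, so that the pipeline terminates. The names push_lines, grep_filter and collector are hypothetical, and the priming decorator is repeated with the built-in next() so the block is self-contained and runs on Python 2 and Python 3.

```python
import functools


def coroutine(func):
    # Prime the coroutine automatically so it is ready for send()
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return wrapper


def push_lines(lines, target):
    # Source: a plain function (not a coroutine) that drives the pipeline
    for line in lines:
        target.send(line)


@coroutine
def grep_filter(pattern, target):
    # Filter: forwards only matching lines to the next stage
    while True:
        line = (yield)
        if pattern in line:
            target.send(line)


@coroutine
def collector(results):
    # Sink: end-point that consumes the data
    while True:
        results.append((yield))


results = []
push_lines(["python one", "spam", "two python"],
           grep_filter("python", collector(results)))
print(results)
```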

Let's go back to tail -f and grep

Our source is tail -f


In [ ]:
import time
def c_follow(thefile, target):
    thefile.seek(0, 2)  # Go to the end of the file
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)    # Sleep briefly
        else:
            target.send(line)

In [ ]:
# a sink: just print
@coroutine
def printer(name):
    while True:
        line = (yield)
        print name + " : " + line,

In [ ]:
# example
f = open("fichero.txt")
c_follow(f, printer("uno"))

In [ ]:
# Ensure f is closed
if f and not f.closed:
    f.close()

In [ ]:
# Pipeline filters: grep
@coroutine
def c_grep(pattern, target):
    while True:
        line = (yield)  # Receive a line
        if pattern in line:
            target.send(line)  # Send to the next stage

In [ ]:
# Exercise: tail -f "fichero.txt" | grep "python"
# Do not forget the final printer as the sink

In [ ]:
# The result is the same as before, but the direction is reversed:
# with iterators we pull data through the pipeline by iterating,
# with coroutines we push data through it with send()
# BROADCAST
@coroutine
def broadcast(targets):
    while True:
        item = (yield)
        for target in targets:
            target.send(item)

In [ ]:
f = open("fichero.txt")
c_follow(f,
         broadcast([c_grep('python', printer("uno")),
                    c_grep('hodor', printer("dos")),
                    c_grep('hold', printer("tres"))])
)
  • Coroutines add routing
  • Complex arrangements of pipes, branches, merging...
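
Branching and merging can be sketched with in-memory data so that the example terminates. The names grep_filter and tagger and the sample lines are hypothetical; tagger plays the role of printer but appends to a list. The block repeats the priming decorator with the built-in next() so it is self-contained and runs on Python 2 and Python 3.

```python
import functools


def coroutine(func):
    # Prime the coroutine automatically so it is ready for send()
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return wrapper


@coroutine
def broadcast(targets):
    while True:
        item = (yield)
        for target in targets:
            target.send(item)


@coroutine
def grep_filter(pattern, target):
    while True:
        line = (yield)
        if pattern in line:
            target.send(line)


@coroutine
def tagger(name, results):
    # Sink: stores "name : line" instead of printing it
    while True:
        line = (yield)
        results.append("%s : %s" % (name, line))


lines = ["python", "hodor", "python and hodor", "spam"]

# Branching: each filter feeds its own sink
branch_results = []
branches = broadcast([grep_filter("python", tagger("uno", branch_results)),
                      grep_filter("hodor", tagger("dos", branch_results))])
for line in lines:
    branches.send(line)

# Merging: both filters feed the same sink
merged_results = []
shared = tagger("merged", merged_results)
merged = broadcast([grep_filter("python", shared),
                    grep_filter("hodor", shared)])
for line in lines:
    merged.send(line)

print(branch_results)
print(merged_results)
```

A line matching both patterns reaches the shared sink twice, once per branch, which is worth keeping in mind when merging.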

In [ ]:
if f and not f.closed:
    f.close()
f = open("fichero.txt")
p = printer("uno")
c_follow(f,
         broadcast([c_grep('python', p),
                    c_grep('hodor', p),
                    c_grep('hold', p)])
)

In [ ]:
if f and not f.closed:
    f.close()