Iterator and generator

4.2 Delegating Iteration (python3-cookbook 3.0.0 documentation)



In [1]:

    
# Display the result of every expression in a cell, not just the last one
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"



In [74]:

    
import itertools
item = iter([1, 2, 3, 4])



In [3]:

    
item









    Out[3]:





<list_iterator at 0x7f95e4545d68>



In [4]:

    
for i in item:
    print(i)

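The iter() function simplifies the code: iter(s) simply returns the underlying iterator by calling s.__iter__(), in the same way that len(s) calls s.__len__().
The iterator protocol requires __iter__() to return an iterator object that implements a __next__() method.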
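
As a minimal sketch of the protocol described above, the loop below does by hand roughly what a for statement does: it obtains an iterator with iter(), pulls values with next(), and stops on StopIteration. The names items, it, and collected are illustrative only.

```python
# Roughly what a for loop does under the hood, written out by hand.
items = [1, 2, 3]
it = iter(items)              # calls items.__iter__()
collected = []
while True:
    try:
        value = next(it)      # calls it.__next__()
    except StopIteration:
        break                 # the iterator is exhausted
    collected.append(value)

print(collected)
```

Once StopIteration has been raised, the iterator stays exhausted; a fresh call to iter(items) is needed to walk the list again.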



In [5]:

    
def frange(start, stop, increment):
    x = start
    while x < stop:
        yield x
        x += increment



In [6]:

    
for i in frange(1, 19, 2):
    print(i)

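A single yield statement anywhere in a function body is enough to turn it into a generator function. Unlike a normal function, a generator runs only in response to iteration.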
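
The following sketch illustrates that lazy behaviour. The countdown function is a hypothetical example, not part of the notes above; the print calls only exist to show when the body actually runs.

```python
def countdown(n):
    print('Starting countdown from', n)
    while n > 0:
        yield n
        n -= 1
    print('Done')

gen = countdown(3)     # nothing is printed yet: the body has not started
first = next(gen)      # runs up to the first yield and prints the start message
rest = list(gen)       # drains the remaining values and prints 'Done'
print(first, rest)
```

Calling the generator function only creates the generator object; each next() call resumes the body until the next yield.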

class Node:
    def __init__(self, value):
        self._value = value
        self._children = []

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, node):
        self._children.append(node)

    def __iter__(self):
        return iter(self._children)

    def depth_first(self):
        yield self
        for c in self:
            yield from c.depth_first()

To support iteration, a custom container must implement the __iter__() method.
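
A short usage sketch for the Node class follows. The class is repeated so the block runs on its own; the tree shape (root, child1, child2) is made up for illustration.

```python
class Node:
    def __init__(self, value):
        self._value = value
        self._children = []

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, node):
        self._children.append(node)

    def __iter__(self):
        # Delegate iteration to the internal list of children
        return iter(self._children)

    def depth_first(self):
        yield self
        for c in self:
            yield from c.depth_first()

# Build a small illustrative tree
root = Node(0)
child1 = Node(1)
child2 = Node(2)
root.add_child(child1)
root.add_child(child2)
child1.add_child(Node(3))
child1.add_child(Node(4))
child2.add_child(Node(5))

direct = [repr(n) for n in root]                 # direct children only
visited = [repr(n) for n in root.depth_first()]  # whole tree, depth first
print(direct)
print(visited)
```

Iterating over root goes through __iter__() and sees only the direct children, while depth_first() walks the whole tree.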

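To support reverse iteration on a custom class, implement the __reversed__() method.
Reverse iteration only works if the object has a size that can be determined in advance (that is, it supports the sequence protocol through __len__() and __getitem__()) or if it implements the __reversed__() special method. If neither is true, you have to convert the object to a list first.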

class Countdown:
    def __init__(self, start):
        self.start = start

    # Forward iterator
    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1

    # Reverse iterator
    def __reversed__(self):
        n = 1
        while n <= self.start:
            yield n
            n += 1

for rr in reversed(Countdown(30)):
    print(rr)
for rr in Countdown(30):
    print(rr)
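
For contrast, here is a minimal sketch of the fallback case: a plain generator has neither a length nor __reversed__(), so reversed() rejects it and the data has to be copied into a list first. frange is the generator defined earlier in these notes, repeated so the block is self-contained.

```python
def frange(start, stop, increment):
    x = start
    while x < stop:
        yield x
        x += increment

try:
    reversed(frange(0, 3, 1))
    failed = False
except TypeError:
    failed = True          # generators are not reversible

# Materialise the values first, then reverse the list
backwards = list(reversed(list(frange(0, 3, 1))))
print(failed, backwards)
```

Converting to a list holds every item in memory at once, which is why defining __reversed__() is preferable for large data.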



In [7]:

    
a = [1, 2, 3, 4]
for x in reversed(a):
    print(x)

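Numba speeds up loops over large NumPy arrays. On small inputs the benefit is typically poor, because the JIT compilation cost of the first call outweighs the time saved.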



In [40]:

    
import numba
@numba.jit
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result



In [45]:

    
import numpy as np
xx = np.random.rand(4000, 4000) * 80000



In [46]:

    
sum2d(xx)









    Out[46]:





639777260772.9878

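Slicing an iterator

The itertools.islice() function is well suited to slicing iterators and generators.
An important point is that islice() consumes data from the iterator passed to it. Iterators cannot be rewound, so if you need to access the data again later, store it in a list first.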



In [49]:

    
def count(n):
    while True:
        yield n
        n += 1



In [50]:

    
c = count(0)



In [51]:

    
# Now using islice()
import itertools
for x in itertools.islice(c, 10, 20):
    print(x)
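
The sketch below makes the consumption visible: after slicing items 10 to 19, the next value pulled from the same generator is 20, because everything before it has already been used up. The variable names sliced, after, and saved are illustrative.

```python
import itertools

def count(n):
    while True:
        yield n
        n += 1

c = count(0)
sliced = list(itertools.islice(c, 10, 20))   # skips 0..9, yields 10..19
after = next(c)                              # the generator has moved on

# To reuse the data, keep it in a list instead of re-slicing the generator
saved = list(itertools.islice(count(0), 5))
print(sliced, after, saved)
```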

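Iterating over permutations and combinations

The itertools module provides three functions for this kind of problem. The first, itertools.permutations(), takes a collection and produces a sequence of tuples, each of which is one possible ordering of all the items in the collection.
itertools.combinations() produces every combination of a given length drawn from the input collection. Order does not matter, so ('a', 'b') and ('b', 'a') count as the same combination.
itertools.combinations_with_replacement() allows the same item to be chosen more than once, that is, sampling with replacement.

Permutations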



In [53]:

    
items = ['a', 'b', 'c']
from itertools import permutations
for p in permutations(items):
    print(p)









    



('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')



In [54]:

    
# All permutations of a given length
for p in permutations(items, 2):
    print(p)









    



('a', 'b')
('a', 'c')
('b', 'a')
('b', 'c')
('c', 'a')
('c', 'b')



In [59]:

    
list(permutations(range(4), 2))









    Out[59]:





[(0, 1),
 (0, 2),
 (0, 3),
 (1, 0),
 (1, 2),
 (1, 3),
 (2, 0),
 (2, 1),
 (2, 3),
 (3, 0),
 (3, 1),
 (3, 2)]

Combinations



In [61]:

    
from itertools import combinations
for c in combinations(items, 3):
    print(c)









    



('a', 'b', 'c')



In [62]:

    
for c in combinations(items, 2):
    print(c)









    



('a', 'b')
('a', 'c')
('b', 'c')



In [63]:

    
for c in combinations(items, 1):
    print(c)









    



('a',)
('b',)
('c',)



In [64]:

    
for c in combinations(range(5), 2):
    print(c)









    



(0, 1)
(0, 2)
(0, 3)
(0, 4)
(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)

Combinations with replacement



In [67]:

    
from itertools import combinations_with_replacement
for c in combinations_with_replacement(items, 3):
    print(c)









    



('a', 'a', 'a')
('a', 'a', 'b')
('a', 'a', 'c')
('a', 'b', 'b')
('a', 'b', 'c')
('a', 'c', 'c')
('b', 'b', 'b')
('b', 'b', 'c')
('b', 'c', 'c')
('c', 'c', 'c')
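
As a sanity check on the outputs above, the number of results from each function can be compared against the usual counting formulas. This is only a sketch; it assumes Python 3.8 or later for math.perm() and math.comb().

```python
import math
from itertools import permutations, combinations, combinations_with_replacement

items = ['a', 'b', 'c']
n = len(items)

n_perm_all = len(list(permutations(items)))                  # n!
n_perm_2 = len(list(permutations(items, 2)))                 # n! / (n - 2)!
n_comb_2 = len(list(combinations(items, 2)))                 # C(n, 2)
n_cwr_3 = len(list(combinations_with_replacement(items, 3))) # C(n + 3 - 1, 3)

print(n_perm_all, n_perm_2, n_comb_2, n_cwr_3)
```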

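Iterating over items in separate containers

itertools.chain() simplifies this task. It takes the iterables as separate arguments and returns a single iterator that hides the details of iterating over several containers.
itertools.chain() accepts one or more iterables as arguments and creates an iterator that returns the items of each iterable in turn. This is considerably more efficient than concatenating the sequences first and then iterating, because no combined copy is built.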



In [70]:

    
from itertools import chain
a = [1, 2, 3, 4]
b = ['x', 'y', 'z']
# Iterate over a, then over b
for x in chain(a, b):
    print(x)



In [73]:

    
a = [1, 2, 3, 4]
b = ('x', 'y', 'z')
# Iterate over a, then over b
# a and b only need to be iterable; they can be of different types
for x in chain(a, b):
    print(x)
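
To show why chain() is more flexible than concatenation, the sketch below tries both on a list and a tuple. The + operator fails on mixed types, while chain() only requires that each argument is iterable. The names concat_failed and merged are illustrative.

```python
from itertools import chain

a = [1, 2, 3, 4]
b = ('x', 'y', 'z')

try:
    a + b                  # list + tuple is not supported
    concat_failed = False
except TypeError:
    concat_failed = True

# chain() walks a, then b, without building a combined sequence first
merged = list(chain(a, b))
print(concat_failed, merged)
```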

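Flattening a nested sequence

You can solve this easily by writing a recursive generator that uses yield from.
The extra ignore_types argument and the isinstance(x, ignore_types) check keep strings and bytes from being treated as iterables and expanded into individual characters. As a result, a nested list of strings flattens to the expected result.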



In [76]:

    
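from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10

def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        # Recurse into nested iterables, but treat strings and bytes as atoms
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            # Pass ignore_types down so a custom value applies at every level
            yield from flatten(x, ignore_types)
        else:
            yield x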

items = [1, 2, [3, 4, [5, 6], 7], 8]
for x in flatten(items):
    print(x)
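
The effect of ignore_types is easiest to see on nested strings. The sketch below repeats flatten() so it runs on its own; the names list is made-up sample data.

```python
from collections.abc import Iterable

def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            yield from flatten(x, ignore_types)
        else:
            yield x

names = ['Dave', 'Paula', ['Thomas', 'Lewis']]
flat = list(flatten(names))
print(flat)
```

Without the ignore_types check, each string would be treated as an iterable of one-character strings, which are themselves iterable, so the recursion would never terminate normally.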

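A little-known feature of the built-in iter() function is that it optionally accepts a zero-argument callable and a sentinel (terminating) value. Used this way, it creates an iterator that calls the callable repeatedly until the return value equals the sentinel.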
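
A minimal sketch of this two-argument form, assuming an in-memory io.BytesIO object as a stand-in for a file or socket: functools.partial turns stream.read(4) into a zero-argument callable, and the empty bytes object b'' serves as the sentinel that signals end of data.

```python
import io
from functools import partial

stream = io.BytesIO(b'abcdefghij')   # stand-in for a real file or socket

chunks = []
# Call stream.read(4) repeatedly until it returns b'' (end of stream)
for chunk in iter(partial(stream.read, 4), b''):
    chunks.append(chunk)

print(chunks)
```

This replaces the usual while True loop with an explicit break on empty data.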


