Iterators

In Python, an Iterator is an instance of any class that defines the __iter__() and __next__() (next() in Python 2) magic methods.

iter(iterator) is the same as iterator.__iter__(), and should always return iterator itself, to indicate that the object is its own iterator.

next(iterator) is the same as iterator.__next__(), and value = next(iterator, default) is the same as

try:
    value = iterator.__next__()
except StopIteration:
    value = default

__next__() computes and returns the next element of the iterator. When no elements remain, it must raise StopIteration. Because __next__() mutates the internal state of the iterator, iterators are, by default and by convention, one-time use only. All the usual rules for passing around mutable objects apply: if you pass an iterator to a function, that function might consume some or all of the remaining elements, and those elements will not be seen by the next piece of code that reads from the iterator.
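As a small illustration of the sharing hazard described above, the sketch below passes a built-in list iterator to a function and then reads from it again. The helper take_two is hypothetical and exists only for this example.

```python
def take_two(iterator):
    """Hypothetical helper: consume up to two elements from an iterator."""
    return [next(iterator, None), next(iterator, None)]


numbers = iter([10, 20, 30])    # a one-time-use list iterator
first_two = take_two(numbers)   # advances the shared iterator
remaining = list(numbers)       # only sees the elements that are left

print(first_two)                   # [10, 20]
print(remaining)                   # [30]
print(next(numbers, 'exhausted'))  # exhausted
```

The caller and take_two hold the same object, so elements consumed inside the function are gone for everyone else.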


In [1]:
# Example iterator.

import collections.abc


class RangeIterator(collections.abc.Iterator):
    """Naive implementation of a builtins.range() iterator."""
    def __init__(self, stop):
        if not isinstance(stop, int):
            raise TypeError('stop must be an int')
        if stop < 0:
            raise ValueError('stop must be >= 0')
        super().__init__()
        self.stop = stop
        self.next_item = 0 if (stop > 0) else StopIteration()

    def __repr__(self):
        return f"<{self.__class__.__name__}({self.stop!r}): next_item={self.next_item!r}>"

    # __iter__ is already defined in `collections.abc.Iterator` as
    #
    # def __iter__(self):
    #     return self
    
    def __next__(self):
        item = self.next_item
        if isinstance(item, StopIteration):
            raise StopIteration
        self.next_item += 1
        if self.next_item >= self.stop:
            self.next_item = StopIteration()
        return item

In [2]:
range_iterator = RangeIterator(2)
range_iterator


Out[2]:
<RangeIterator(2): next_item=0>

In [3]:
iter(range_iterator), iter(range_iterator) is range_iterator


Out[3]:
(<RangeIterator(2): next_item=0>, True)

In [4]:
next(range_iterator), range_iterator


Out[4]:
(0, <RangeIterator(2): next_item=1>)

In [5]:
next(range_iterator), range_iterator


Out[5]:
(1, <RangeIterator(2): next_item=StopIteration()>)

In [6]:
import traceback

try:
    next(range_iterator)
except StopIteration:
    traceback.print_exc()


Traceback (most recent call last):
  File "<ipython-input-6-e6a2b2ed7925>", line 4, in <module>
    next(range_iterator)
  File "<ipython-input-1-ea2a2b1cf1f9>", line 28, in __next__
    raise StopIteration
StopIteration

In [7]:
next(range_iterator, 2)


Out[7]:
2

Iterables

In Python, an Iterable is an instance of any class that defines the __iter__() magic method. Iterator is a subclass of Iterable.
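The relationship between the two concepts can be checked with the abstract base classes in collections.abc. This is a minimal sketch using a built-in list and its iterator; the variable names are chosen only for this example.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]               # a list is an iterable, but not an iterator
numbers_iterator = iter(numbers)  # its iterator is both

print(issubclass(Iterator, Iterable))          # True
print(isinstance(numbers, Iterable))           # True
print(isinstance(numbers, Iterator))           # False
print(isinstance(numbers_iterator, Iterator))  # True
print(isinstance(numbers_iterator, Iterable))  # True
```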

iter(iterable) is the same as iterable.__iter__(), and should always return an iterator for the iterable. This iterator can then be iterated over, in order to retrieve the elements of the iterable.

Iterables (including all iterators) can be one-time use only, but they can also be reusable. When they are reusable, __iter__() will return a brand-new iterator that starts at the first item, and advancing the iterator will have no effect on the original iterable.
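The difference between reusable and one-time-use iterables can be seen with built-in types alone. In this sketch, a list stands in for a reusable iterable and its iterator for a one-time-use one; the variable names are illustrative.

```python
reusable = [1, 2, 3]        # a list hands out a fresh iterator every time
one_time = iter(reusable)   # a list iterator is its own iterator

first_pass = list(reusable)
second_pass = list(reusable)   # starts again from the first item

first_drain = list(one_time)
second_drain = list(one_time)  # nothing left to read

print(first_pass, second_pass)    # [1, 2, 3] [1, 2, 3]
print(first_drain, second_drain)  # [1, 2, 3] []
print(iter(reusable) is reusable) # False
print(iter(one_time) is one_time) # True
```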


In [8]:
# Example iterable.

import collections.abc


class RangeIterable(collections.abc.Iterable):
    """Naive implementation of a builtins.range() iterable."""
    def __init__(self, stop):
        super().__init__()
        self.stop = stop
        
    def __repr__(self):
        return f"{self.__class__.__name__}({self.stop!r})"
    
    def __iter__(self):
        return RangeIterator(stop=self.stop)

In [9]:
range_iterable = RangeIterable(2)
range_iterable


Out[9]:
RangeIterable(2)

In [10]:
import traceback

try:
    next(range_iterable)
except TypeError:
    traceback.print_exc()


Traceback (most recent call last):
  File "<ipython-input-10-4644b570951a>", line 4, in <module>
    next(range_iterable)
TypeError: 'RangeIterable' object is not an iterator

In [11]:
iter(range_iterable)


Out[11]:
<RangeIterator(2): next_item=0>

In [12]:
iter(range_iterable) is range_iterable


Out[12]:
False

In [13]:
iter(range_iterable) is iter(range_iterable)


Out[13]:
False

In [14]:
next(iter(range_iterable))


Out[14]:
0

Syntax for iteration: for-loops

Python code can, and often does, manually iterate through iterators and iterables, via iter() and next(). However, the Python language syntax has built-in support for automatically iterating over the elements of an iterable. The for-loop, which is common in most modern programming languages (though Python's for-loop is really a for-each loop), runs through an iterable one element at a time, making use of the current element in a block of code.

for item in iterable:
    # do something with item, such as
    print(item)

If we ignore the semantics of continue, break, and else, a for-loop generally looks like

for TARGET in ITER:
    BLOCK

and is syntactic sugar for something like

iterable = (ITER)
iterator = iter(iterable)
running = True
while running:
    try:
        TARGET = next(iterator)
    except StopIteration:
        running = False
    else:
        BLOCK

Note that the for-loop construct has special handling for StopIteration. The for-loop construct is intimately aware of the iterator protocol, and knows to catch StopIteration and interpret it as the end of iteration.
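The simplified expansion above deliberately leaves out break and else. As a rough sketch of how those two fit in (an approximation, not the interpreter's actual implementation), the hypothetical function find_first below hand-writes a for-loop that uses both.

```python
def find_first(iterable, predicate):
    """Hand-written approximation of:

        for item in iterable:
            if predicate(item):
                result = item
                break
        else:
            result = None
    """
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            result = None  # the else clause: runs only when the iterator is exhausted
            break
        if predicate(item):
            result = item
            break          # the for-loop's break: leaves the loop, skipping the else clause
    return result


print(find_first([1, 2, 3, 4], lambda x: x > 2))  # 3
print(find_first([1, 2], lambda x: x > 5))        # None

leftover = iter([1, 2, 3, 4])
print(find_first(leftover, lambda x: x >= 2))     # 2
print(list(leftover))                             # [3, 4]
```

The last two lines show that leaving the loop early does not exhaust the iterator: the unread elements are still available afterwards.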


In [15]:
for item in RangeIterable(2):
    print(item)


0
1

In [16]:
def manual_simplified_for_loop(iterable, function):
    iterator = iter(iterable)
    running = True
    while running:
        try:
            item = next(iterator)
        except StopIteration:
            running = False
        else:
            function(item)

In [17]:
manual_simplified_for_loop(RangeIterable(2), print)


0
1

Other uses of iterables

Many functions accept iterables and then iterate over them, either manually or with for-loops.


In [18]:
list(RangeIterable(5))


Out[18]:
[0, 1, 2, 3, 4]

In [19]:
list(filter(None, RangeIterable(5)))


Out[19]:
[1, 2, 3, 4]
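A few more built-in operations that accept arbitrary iterables are sketched below. The examples use a list and its iterator rather than RangeIterable so that the block runs on its own; the variable names are illustrative.

```python
values = [3, 1, 2]

total = sum(values)                # built-in functions consume iterables
ordered = sorted(iter(values))     # an iterator works just as well as a list
largest = max(iter(values))
a, b, c = iter(values)             # unpacking iterates over the right-hand side
pairs = list(zip(values, 'abc'))   # zip advances several iterables in step
found = 2 in iter(values)          # membership tests fall back to iteration

print(total)      # 6
print(ordered)    # [1, 2, 3]
print(largest)    # 3
print(a, b, c)    # 3 1 2
print(pairs)      # [(3, 'a'), (1, 'b'), (2, 'c')]
print(found)      # True
```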

License

License: Apache License, Version 2.0
Jordan Moldow, 2017

Copyright 2017 Jordan Moldow

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.