Keeping the Last N Items

Problem

  • You want to make a list of the largest or smallest N items in a collection, or keep a limited history of the last N items seen while iterating.

Solution

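  • The heapq module has two functions, nlargest() and nsmallest(), that return the N largest or smallest items of a collection.
  • A collections.deque created with maxlen=N keeps only the last N items appended to it.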

Example 1


In [1]:
import heapq
nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2] 
print(heapq.nlargest(3, nums)) # Prints [42, 37, 23] 
print(heapq.nsmallest(3, nums)) # Prints [-4, 1, 2]


[42, 37, 23]
[-4, 1, 2]
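As a quick sanity check on what these two functions return, the sketch below compares them with the sort-and-slice equivalents given in the heapq documentation. The names n, largest_by_sort and smallest_by_sort are introduced here for illustration only.

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
n = 3

# Sort-and-slice equivalents of nlargest() and nsmallest()
largest_by_sort = sorted(nums, reverse=True)[:n]
smallest_by_sort = sorted(nums)[:n]

assert heapq.nlargest(n, nums) == largest_by_sort    # [42, 37, 23]
assert heapq.nsmallest(n, nums) == smallest_by_sort  # [-4, 1, 2]

# For a single item, min() and max() are the simpler choice
print(max(nums), min(nums))  # 42 -4
```

The heapq functions tend to pay off when N is small relative to the collection. For N == 1, min() and max() are simpler, and when N is close to the size of the collection, sorting and slicing is usually the better fit.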

Example 2

  • Both functions accept a key parameter, which lets them rank more complicated data structures such as dictionaries

In [2]:
portfolio = [
       {'name': 'IBM', 'shares': 100, 'price': 91.1},
       {'name': 'AAPL', 'shares': 50, 'price': 543.22},
       {'name': 'FB', 'shares': 200, 'price': 21.09},
       {'name': 'HPQ', 'shares': 35, 'price': 31.75},
       {'name': 'YHOO', 'shares': 45, 'price': 16.35},
       {'name': 'ACME', 'shares': 75, 'price': 115.65}
]

cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])

print(cheap)
print(expensive)


[{'name': 'YHOO', 'price': 16.35, 'shares': 45}, {'name': 'FB', 'price': 21.09, 'shares': 200}, {'name': 'HPQ', 'price': 31.75, 'shares': 35}]
[{'name': 'AAPL', 'price': 543.22, 'shares': 50}, {'name': 'ACME', 'price': 115.65, 'shares': 75}, {'name': 'IBM', 'price': 91.1, 'shares': 100}]
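The key argument can be any one-argument callable, not only a lambda. The sketch below uses a shortened copy of the portfolio, operator.itemgetter from the standard library, and a helper named position_value that is made up for this illustration.

```python
import heapq
from operator import itemgetter

portfolio = [
    {'name': 'IBM', 'shares': 100, 'price': 91.1},
    {'name': 'AAPL', 'shares': 50, 'price': 543.22},
    {'name': 'FB', 'shares': 200, 'price': 21.09},
]

# itemgetter('price') behaves like lambda s: s['price']
cheapest = heapq.nsmallest(2, portfolio, key=itemgetter('price'))
print([s['name'] for s in cheapest])  # ['FB', 'IBM']

# Hypothetical helper: rank by total position value instead of unit price
def position_value(stock):
    return stock['shares'] * stock['price']

biggest = heapq.nlargest(1, portfolio, key=position_value)
print(biggest[0]['name'])  # AAPL
```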

Example 3

  • The following code performs a simple text match on a sequence of lines and yields the matching line along with the previous N lines of context when found

In [34]:
import os

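# __file__ is not defined in a notebook, so the quoted string is used instead:
# realpath() resolves it against the current working directory
file_dir = os.path.dirname(os.path.realpath('__file__'))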
filename = os.path.abspath(os.path.join(file_dir, "..", "code/src/1/keeping_the_last_n_items/somefile.txt"))

In [36]:
!head $filename

In [37]:
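from collections import deque

def search(lines, pattern, history=5):
    """Yield (line, previous_lines) for each line that contains pattern."""
    # A deque with maxlen drops its oldest entry once it is full
    previous_lines = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            # previous_lines is the live deque: use it before resuming the generator
            yield line, previous_lines
        previous_lines.append(line)

# Example use on a file
if __name__ == '__main__':
    with open(filename) as f:
        for line, prevlines in search(f, 'python', 5):
            for pline in prevlines:
                print(pline, end='')
            print(line, end='')
            print('-' * 20)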
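Because search() accepts any iterable of strings, it can also be tried without a file. The sketch below is a variant, not the original function: it yields a list copy of the history so that results can be collected and inspected later, and the sample_lines data is made up for illustration.

```python
from collections import deque

def search(lines, pattern, history=5):
    previous_lines = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            # Snapshot the history; the deque itself keeps changing
            yield line, list(previous_lines)
        previous_lines.append(line)

# Made-up sample data standing in for the lines of a file
sample_lines = [
    "intro\n",
    "setup\n",
    "python basics\n",
    "more notes\n",
    "advanced python\n",
]

matches = list(search(sample_lines, 'python', history=2))
for line, prevlines in matches:
    print(repr(line), prevlines)
```

Yielding the deque itself is fine when each match is printed immediately. The copy only matters when matches are stored, because every stored entry would otherwise point at the same, still-changing deque.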


Keeping a limited history is a perfect use for a collections.deque. When it is created with maxlen=N, appending to a full deque automatically discards the oldest item, which is how search() keeps only the previous N lines of context.
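The bounded-history behavior that search() relies on can be seen in isolation. This is a minimal sketch using only collections.deque from the standard library; the variable name q is arbitrary.

```python
from collections import deque

# A fixed-size queue that holds at most three items
q = deque(maxlen=3)
for item in [1, 2, 3]:
    q.append(item)
print(q)  # deque([1, 2, 3], maxlen=3)

# Appending to a full deque drops the oldest item from the other end
q.append(4)
print(q)  # deque([2, 3, 4], maxlen=3)
q.append(5)
print(q)  # deque([3, 4, 5], maxlen=3)
```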