Python 中内置的 heapq
库和 queue
分别提供了堆和优先队列结构,其中优先队列 queue.PriorityQueue
本身也是基于 heapq
实现的,因此我们这次重点看一下 heapq
。
堆(Heap)是一种特殊形式的完全二叉树,其中父节点的值总是大于子节点,根据其性质,Python 中可以用一个满足 heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2]
的列表来实现(heapq
也确实是这么做的)。堆可以用于实现调度器(例见:Python 3.5 之协程),更常用的是优先队列(例如:ImageColorTheme)。
heapq
提供了下面这些方法:
In [1]:
import heapq
print(heapq.__all__)
由于 Heap 是通过列表实现的,我们可以直接用列表创建:
In [2]:
from heapq import *
heap = []
heappush(heap, 3)
heappush(heap, 2)
heappush(heap, 1)
print(heap)
In [3]:
heap = list(reversed(range(5)))
print("List: ", heap)
heapify(heap)
print("Heap: ", heap)
每次从 Heap 中 pop
出来的元素都是最小的(因而可以据此实现堆排序):
In [4]:
heap = [5,4,3,2,1]
heapify(heap)
print(heappop(heap))
print(heappop(heap))
print(heappop(heap))
In [5]:
from queue import PriorityQueue as PQueue
pq = PQueue()
pq.put((5 * -1, 'Python'))
pq.put((4 * -1, 'C'))
pq.put((3 * -1, 'Js'))
print("Inside PriorityQueue: ", pq.queue) # 内部存储
while not pq.empty():
print(pq.get()[1])
In [6]:
import random
lst = [random.randrange(1, 100) for _ in range(5)]
lst.sort()
print("List: ", lst)
print("Poped: ", heappop(lst))
heappush(lst, 4)
print("Heap: ", lst)
In [7]:
heap = [random.randrange(1, 1000) for _ in range(1000)]
heapify(heap)
print("N largest: ", nlargest(10, heap))
print("N smallest: ", nsmallest(10, heap))
print(len(heap)) # 不原地修改
In [8]:
heapA = sorted([random.randrange(1, 100) for _ in range(3)])
heapB = sorted([random.randrange(1, 100) for _ in range(3)])
merged = []
for i in merge(heapA, heapB):
merged.append(i)
print(merged)
最后两个方法 heapreplace
和 heappushpop
分别相当于:
In [9]:
lstA = [1,2,3,4,5]
lstB = [1,2,3,4,5]
poped = heapreplace(lstA, 0)
print("lstA: ", lstA, "poped: ", poped)
# is equal to...
poped = heappop(lstB)
heappush(lstB, 0)
print("lstB: ", lstA, "poped: ", poped)
print("*"*30)
poped = heappushpop(lstA, 9)
print("lstA: ", lstA, "poped: ", poped)
# is equal to...
heappush(lstB, 9)
poped = heappop(lstB)
print("lstB: ", lstB, "poped: ", poped)
这两个方法的执行效率要比分开写的方法高,但要注意 heapreplace
要取代的值是否比 heap[0]
大,如果不是,可以用更有效的方法:
In [10]:
item = 0
lstA = [1,2,3,4,5]
if item < lstA[0]:
# replace
poped = lstA[0]
lstA[0] = item
print("lstA: ", lstA, "poped: ", poped)