In [1]:
# %load ../../preconfig.py
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
plt.rcParams['axes.grid'] = False
#import numpy as np
#import pandas as pd
#import itertools
import logging
logger = logging.getLogger()
A disjoint-set data structure maintains a collection $\delta = {S_1, S_2, \dotsc, \S_k}$ of disjoint dynamic sets.
We identify each set by a representative, which is some member of the set.
Letting $x$ denote an object, we wish to support the following operations:
MAKE-SET(x): $S_x = {x} : S_x \in \delta$
UNION(x,y): $S_x \cup S_y : x \in S_x, y \in S_y$
FIND-SET(x): $S_x : x \in S_x$
analyze the running times in terms of two parameters:
$n$, the number of MAKE-SET operations,
$m$, the total number of all kinds of operations.
The number of UNION operations must be less than MAKE-SET's, namely, $< n$. And we also have $m \geq n$.
An application: determining the connected components of an undirected graph.
In [3]:
plt.imshow(plt.imread('./res/find_connected.png'))
Out[3]:
In [5]:
plt.figure(figsize=(12,8))
plt.imshow(plt.imread('./res/fig21_1.png'))
Out[5]:
In [6]:
# Exercises
In [7]:
plt.figure(figsize=(15,8))
plt.imshow(plt.imread('./res/fig21_2.png'))
Out[7]:
MAKE-SET and FIND-SET requires $O(1)$ time.
note: FIND-SET(x), where $x$ is the pointer to the object expected.
UNION:
randomly merge: append one to another list, and update the related pointer. $O(n)$.
weighted-union heuristic: append the shortest list onto the longer. $\Omega(n)$
each list need to include the length of the list.
construct a sequence of $m$ operations on $n$ objects that requires $O(m + n \lg n)$ time.
In [8]:
# Exercises
Disjoint-set forests: We represent sets by rooted trees, with each node containing one member and each tree representing one set.
For each node, we maintain a rank, which is an upper bound on the height of the node.
UNION operation: We make the root with smaller rank point to the root with larger rank.
In [2]:
plt.imshow(plt.imread('./res/fig21_4.png'))
Out[2]:
In [3]:
plt.imshow(plt.imread('./res/fig21_5.png'))
Out[3]:
In [4]:
# Pseudocode for disjoint-set forests
plt.imshow(plt.imread('./res/forest.png'))
Out[4]:
When we use both union by rank and path compression, the worst-case running time is $O(m \alpha(n))$, where $\alpha(n)$ is a very slowly growing function, which we define in Section 21.4.
In [5]:
# Exercises