Definition(s)

The closest pair of points problem or closest pair problem is a problem of computational geometry: given n points in metric space, find a pair of points with the smallest distance between them.

The closest pair problem for points in the Euclidean plane was among the first geometric problems that were treated at the origins of the systematic study of the computational complexity of geometric algorithms.

Algorithm(s)


In [1]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from operator import itemgetter

%matplotlib inline

In [2]:
def euclid_distance(p, q):
    return np.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

In [3]:
def search(points, st, dr):
    if st >= dr:
        return np.inf, None, None
    elif st + 1 == dr:
        # sort on y
        if points[st][1] > points[dr][1]:
            points[st], points[dr] = points[dr], points[st]
        return euclid_distance(points[st], points[dr]), points[st], points[dr]

    m = (st + dr) // 2
    median_x = points[m][0]

    dleft, pleft, qleft = search(points, st, m)
    dright, pright, qright = search(points, m + 1, dr)

    if dleft < dright:
        d, p, q = dleft, pleft, qleft
    else:
        d, p, q = dright, pright, qright

    # merge the two halves on y
    aux = []
    i, j = st, m + 1

    while i <= m and j <= dr:
        if points[i][1] <= points[j][1]:
            aux.append(points[i])
            i += 1
        else:
            aux.append(points[j])
            j += 1

    while i <= m:
        aux.append(points[i])
        i += 1

    while j <= dr:
        aux.append(points[j])
        j += 1

    # copy back the points
    points[st:dr+1] = aux

    # select a set of points to be tested
    good_points = []

    for i in range(st, dr + 1):
        if abs(points[i][0] - median_x) <= d:
            good_points.append(points[i])
            
    for i in range(len(good_points)):
        j, cnt = i - 1, 8

        # go for at most 8 steps
        while j >= 0 and cnt > 0:
            tmp_d = euclid_distance(aux[i], aux[j])
            if tmp_d < d:
                d, p, q = tmp_d, aux[i], aux[j]

            j -= 1
            cnt -= 1

    return d, p, q

def closest_pair(points):
    points.sort(key = itemgetter(0))
    
    return search(points, 0, len(points) - 1)

Naive implementation of closest_pair


In [4]:
def naive_closest_pair(points):
    best, p, q = np.inf, None, None
    n = len(points)

    for i in range(n):
        for j in range(i + 1, n):
            d = euclid_distance(points[i], points[j])

            if d < best:
                best, p, q = d, points[i], points[j]

    return best, p, q

Draw points (with closest-pair marked as red)


In [5]:
def draw_points(points, p, q):
    xs, ys = zip(*points)
    plt.figure(figsize=(10,10))
    plt.scatter(xs, ys)
    plt.scatter([p[0], q[0]], [p[1], q[1]], s=100, c='red')
    plt.plot([p[0], q[0]], [p[1], q[1]], 'k', c='red')
    plt.show()

Run(s)


In [6]:
points = [(26, 77), (12, 37), (14, 18), (19, 96), (71, 95), (91, 9), (98, 43), (66, 77), (2, 75), (94, 91)]

xs, ys = zip(*points)

d, p, q = closest_pair(points)
assert d == naive_closest_pair(points)[0]

print("The closest pair of points is ({0}, {1}) at distance {2}".format(p, q, d))

draw_points(points, p, q)


The closest pair of points is ((66, 77), (71, 95)) at distance 18.681541692269406

In [8]:
N = 10
x = np.random.rand(N) * 100
y = np.random.rand(N) * 100

points = list(zip(x, y))

d, p, q = closest_pair(points)
assert d == naive_closest_pair(points)[0]

print("The closest pair of points is ({0}, {1}) at distance {2}".format(p, q, d))

draw_points(points, p, q)


The closest pair of points is ((71.074721300595129, 24.832192386236439), (74.09492734061395, 14.947443509594061)) at distance 10.335855304639715

In [9]:
N = 20
x = np.random.randint(100, size=N)
y = np.random.randint(100, size=N)

points = list(zip(x, y))

d, p, q = closest_pair(points)
assert d == naive_closest_pair(points)[0]

print("The closest pair of points is ({0}, {1}) at distance {2}".format(p, q, d))

draw_points(points, p, q)


The closest pair of points is ((71, 59), (76, 61)) at distance 5.385164807134504

In [13]:
N = 20
x = np.random.rand(N)
y = np.random.rand(N)

points = list(zip(x, y))

d, p, q = closest_pair(points)
assert d == naive_closest_pair(points)[0]

print("The closest pair of points is ({0}, {1}) at distance {2}".format(p, q, d))

draw_points(points, p, q)


The closest pair of points is ((0.90216690037000002, 0.040293494361602344), (0.94031501471282197, 0.04016455177547773)) at distance 0.038148332258481944