Closest Pair of Points in Python
We are given an array of n points in the plane, and our task is to find the pair of points in the array that are the closest to each other. This problem arises in a number of applications. In air traffic control, for example, you might want to keep an eye out for planes flying too close together, as this could indicate a collision.
The most basic brute force solution is to compute the distance between each pair and return the smallest value. We can calculate the shortest distance using the Divide and Conquer strategy in O(N log N) time. This post goes over an O(n x (Logn)2) method. We will discuss an O(N log N) approach in a separate post.
Here is a code with appropriate comments to help you guide through the entire code and approach.
The algorithm calculates the distance between the nearest pair of points among the n points given.
The strategy employed -> Divide and conquer
The points are sorted first based on Xco-ords and then separately based on Yco-ords.
And, using the divide and conquer method, the shortest distance is obtained recursively.
The closest points may be on opposite sides of the partition.
This case was handled by forming a strip of points whose Xco-ords distance from the mid-points. Xco-ords is less than closest pair dis. To reduce sorting time, points sorted by Yco-ords are used in this step.
The closest pair distance is found in the point strip.
(closest_in_strip)
min(closest_pair_dis, closest_in_strip) would be the final answer.
Time complexity: O(n * log n)
"""
def euclidean_distance_sqr(point1, point2):
return (point1[0] - point2[0]) ** 2 + (point1[1] - point2[1]) ** 2
def column_based_sort(array, column=0):
return sorted(array, key=lambda x: x[column])
def dis_between_closest_pair(points, points_counts, min_dis=float("inf")):
for i in range(points_counts - 1):
for j in range(i + 1, points_counts):
current_dis = euclidean_distance_sqr(points[i], points[j])
if current_dis < min_dis:
min_dis = current_dis
return min_dis
def dis_between_closest_in_strip(points, points_counts, min_dis=float("inf")):
for i in range(min(6, points_counts - 1), points_counts):
for j in range(max(0, i - 6), i):
current_dis = euclidean_distance_sqr(points[i], points[j])
if current_dis < min_dis:
min_dis = current_dis
return min_dis
def closest_pair_of_points_sqr(points_sorted_on_x, points_sorted_on_y, points_counts):
# base case
if points_counts <= 3:
return dis_between_closest_pair(points_sorted_on_x, points_counts)
# recursion
mid = points_counts // 2
closest_in_left = closest_pair_of_points_sqr(
points_sorted_on_x, points_sorted_on_y[:mid], mid
)
closest_in_right = closest_pair_of_points_sqr(
points_sorted_on_y, points_sorted_on_y[mid:], points_counts - mid
)
closest_pair_dis = min(closest_in_left, closest_in_right)
cross_strip = []
for point in points_sorted_on_x:
if abs(point[0] - points_sorted_on_x[mid][0]) < closest_pair_dis:
cross_strip.append(point)
closest_in_strip = dis_between_closest_in_strip(
cross_strip, len(cross_strip), closest_pair_dis
)
return min(closest_pair_dis, closest_in_strip)
def closest_pair_of_points(points, points_counts):
points_sorted_on_x = column_based_sort(points, column=0)
points_sorted_on_y = column_based_sort(points, column=1)
return (
closest_pair_of_points_sqr(
points_sorted_on_x, points_sorted_on_y, points_counts
)
) ** 0.5
if __name__ == "__main__":
points = [(2, 3), (12, 30), (40, 50), (5, 1), (12, 10), (3, 4)]
print("Distance:", closest_pair_of_points(points, len(points)))
Time Complexity
Let T be the time complexity of the preceding algorithm (n). Assume we're using an O(N log N) sorting algorithm. The preceding algorithm divides all points into two sets and calls for two sets recursively. It finds the strip in O(n) time after dividing, sorts the strip in O(N log N) time, and finally finds the closest points in the strip in O(n) time. As a result, T(n) can be written as follows.
- T(n) = 2T(n/2) + O(n) + O(nLogn) + O(n)
- T(n) = 2T(n/2) + O(nLogn)
- T(n) = T(n x Logn x Logn)