How can I index a list of points in Python to search for nearby points faster?


For a list of (x, y) points, I am trying to find the nearby points of each point:

from collections import defaultdict
from math import sqrt
from random import randint

# Generate a list of random (x, y) points
points = [(randint(0, 100), randint(0, 100)) for _ in range(1000)]

def is_nearby(point_a, point_b, max_distance=5):
    """Two points are nearby if their Euclidean distance is less than max_distance"""
    distance = sqrt((point_b[0] - point_a[0])**2 + (point_b[1] - point_a[1])**2)
    return distance < max_distance

# For each point, find nearby points that are within a radius of 5
nearby_points = defaultdict(list)
for point in points:
    for neighbour in points:
        if point != neighbour:
            if is_nearby(point, neighbour):
                nearby_points[point].append(neighbour)

Is there any way to index points to make the search above faster? I feel there must be a faster way than O(len(points)**2).


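Edit: in general the points can be floats, not just integers.

Here is a version with a fixed grid in which each grid point holds the number of samples at that position.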

The search can then be narrowed to the region around the point in question.

from random import randint
import math

N = 100
N_SAMPLES = 1000

# create the grid
grd = [[0 for _ in range(N)] for __ in range(N)]

# set the number of points at a given gridpoint
for _ in range(N_SAMPLES):
    grd[randint(0, 99)][randint(0, 99)] += 1

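def find_neighbours(grid, point, distance):
    """Return {(x, y): count} for occupied grid points within distance of point."""

    # this will be: (x, y): number of points there
    points = {}

    # + 1 so that the far edge of the search window is included
    for x in range(point[0]-distance, point[0]+distance+1):
        if x < 0 or x > N-1:
            continue
        for y in range(point[1]-distance, point[1]+distance+1):
            if y < 0 or y > N-1:
                continue
            dst = math.hypot(point[0]-x, point[1]-y)
            if dst > distance:
                continue
            # read from the grid argument rather than the global grd
            if grid[x][y] > 0:
                points[(x, y)] = grid[x][y]
    return points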

print(find_neighbours(grid=grd, point=(45, 36), distance=5))
# -> {(44, 37): 1, (45, 33): 1, ...}
# meaning: there is one neighbour at (44, 37) etc...
# (the exact result depends on the random samples)

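With float points stored in each grid cell, the output looks like this (each cell is followed by the points it holds):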

(44, 36)
   Point(x=44.03, y=36.93)
(41, 36)
   Point(x=41.91, y=36.55)
   Point(x=41.73, y=36.53)
   Point(x=41.56, y=36.88)
...

For convenience, Point is now a class, although that may not be necessary.

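Depending on how dense or sparse the points are, the grid could also be represented as a dictionary that maps each cell to a list of points.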

Also, in this version the find_ function only accepts a starting point made up of ints. This could be improved as well.
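
The dictionary-based grid and the float starting point mentioned above can be sketched as follows. This is only a minimal illustration, not the answer's own code: the names build_grid, find_nearby and CELL are hypothetical, and the cell size is assumed to equal the search radius. As in the question, a point is not reported as its own neighbour.

```python
from collections import defaultdict
from math import ceil, floor, hypot
from random import seed, uniform

CELL = 5.0  # assumed cell size; here chosen equal to the search radius


def build_grid(points, cell=CELL):
    """Map each integer cell (i, j) to the list of float points inside it."""
    grid = defaultdict(list)
    for x, y in points:
        grid[(floor(x / cell), floor(y / cell))].append((x, y))
    return grid


def find_nearby(grid, point, max_distance, cell=CELL):
    """Return stored points closer than max_distance to point (floats allowed)."""
    px, py = point
    cx, cy = floor(px / cell), floor(py / cell)
    reach = ceil(max_distance / cell)  # how many cells to look at in each direction
    found = []
    for i in range(cx - reach, cx + reach + 1):
        for j in range(cy - reach, cy + reach + 1):
            # .get avoids creating empty cells in the defaultdict
            for q in grid.get((i, j), ()):
                if q != point and hypot(q[0] - px, q[1] - py) < max_distance:
                    found.append(q)
    return found


seed(0)
points = [(uniform(0, 100), uniform(0, 100)) for _ in range(1000)]
grid = build_grid(points)
nearby_points = {p: find_nearby(grid, p, 5.0) for p in points}
```

Each query only inspects the cells around the starting point, so for roughly evenly spread points the work per query depends on the local density rather than on len(points).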


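There is still a lot of room for improvement: trigonometry could be used to limit the range along the y axis. Points well inside the circle need no individual check; a detailed check is only needed near the outer edge of the circle.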
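
One way to limit the y range, sketched here with Pythagoras' theorem rather than explicit trigonometric functions: for each column x, the largest allowed vertical offset follows directly from the horizontal offset, so no per-cell distance check is needed. The function name find_neighbours_circle is hypothetical, the grid is assumed to be a square list of lists of counts as in the answer, and distance is assumed to be an integer (math.isqrt needs Python 3.8 or later).

```python
from math import isqrt
from random import randint, seed

N = 100


def find_neighbours_circle(grid, point, distance):
    """Return {(x, y): count} for occupied cells with dx**2 + dy**2 <= distance**2."""
    n = len(grid)
    px, py = point
    found = {}
    # clamp the x range to the grid instead of skipping out-of-range values
    for x in range(max(0, px - distance), min(n - 1, px + distance) + 1):
        dx = x - px
        # largest integer dy that still lies inside the circle for this column
        half = isqrt(distance * distance - dx * dx)
        for y in range(max(0, py - half), min(n - 1, py + half) + 1):
            if grid[x][y] > 0:
                found[(x, y)] = grid[x][y]
    return found


seed(1)
grd = [[0] * N for _ in range(N)]
for _ in range(1000):
    grd[randint(0, N - 1)][randint(0, N - 1)] += 1

result = find_neighbours_circle(grd, (45, 36), 5)
```

This visits only cells inside the circle, which is roughly three quarters of the cells in the bounding square.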

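If the grid is only 100*100, the points could be arranged inside that grid. That would greatly reduce the search space.

Thanks. What if the points are floats rather than integers? This approach only works if we round the floats to ints.

I think adapting the approach above would work. Instead of searching on a fixed grid, search between the positions given by bisect_left((point[0] +/- distance, point[1] +/- distance), points).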
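
A possible reading of the bisect suggestion, as a sketch under stated assumptions: sort the points once, then use the standard-library bisect module to find the slice whose x coordinates lie within the search radius, and check only that slice. The names build_index and find_nearby_sorted are hypothetical. Because sorted tuples are ordered by x first, this narrows the search along x only; every point in the vertical band still gets a distance check.

```python
from bisect import bisect_left, bisect_right
from math import hypot
from random import seed, uniform


def build_index(points):
    """Sort the points once and keep a parallel list of x coordinates for bisect."""
    sorted_points = sorted(points)
    xs = [p[0] for p in sorted_points]
    return sorted_points, xs


def find_nearby_sorted(index, point, max_distance):
    """Return points closer than max_distance, scanning only the band around point's x."""
    sorted_points, xs = index
    px, py = point
    lo = bisect_left(xs, px - max_distance)   # first point with x >= px - max_distance
    hi = bisect_right(xs, px + max_distance)  # one past the last point with x <= px + max_distance
    return [q for q in sorted_points[lo:hi]
            if q != point and hypot(q[0] - px, q[1] - py) < max_distance]


seed(0)
points = [(uniform(0, 100), uniform(0, 100)) for _ in range(1000)]
index = build_index(points)
nearby_points = {p: find_nearby_sorted(index, p, 5.0) for p in points}
```

This works for float coordinates without any rounding, at the cost of scanning a full vertical band per query.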