Python: finding the largest cross shape in an array of coordinates

python numpy optimization

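Find the centroid coordinates based on the maximum product of the left/right/up/down sides

The code below works, but it never finishes on larger arrays. How can I optimize it?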

(If numpy matters: I find the centroid from region, where region = region_coordinates.tolist().)

For testing:

array_test = ([[0, 0], [1, 0], [1, 1], [0, 1], [2, 1], [2, 2], [3, 1], [3, 0], [2, 0], [3, 2]])
print(find_centroid(array_test))

Infinite loop explained

If region is a numpy array, this part of the code gets you stuck in an infinite loop:

while True:
    if [x, y] in region:
        ...
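
This is because, when used on a numpy array, the in operator returns True as soon as any element of the list equals the element at the same position in any row of the array.

Instead, you can use the array's all and any methods:

if (np.array(region) == [x, y]).all(axis=1).any(axis=0):
    ...

all(axis=1) checks, for each row, that both values match in the right order, which gives an array of booleans. If any of those booleans is True, there is at least one match. Casting either of the two operands to a numpy array is enough to make this test possible.

But if...

If both operands are lists, the in operator works as expected, but then you must make sure that region and each of its sub-lists really are lists, not numpy arrays. Casting the array with list() will not work, because the result is a list of arrays. Here is why:

import numpy as np
array_test = [[0, 0], [1, 0]]
print([1,1] in array_test) # Prints False, as expected
# numpy always compares element-wise when both operands have the same length
print([1,1] == np.array([1,0])) # Prints [ True False]
print(np.array([1,1]) == np.array([1,0])) # [Line 6] Prints [ True False]
# "in" on arrays: either misleading or an error
print([1,1] == np.array(array_test)) # Prints [[False False] [ True False]]
print([1,1] in np.array(array_test)) # Prints True as explained, because we have at least one True
print([1,1] in list(np.array(array_test))) # Raises ValueError: the element-wise result from line 6 cannot be reduced to a single bool

Another version

Here is my approach. There may be a better way; this is just my two cents.

Pre-filtering potential centroids

First, I collect every possible intersection in the region (from now on I call them "centers"). To do that, I count how often each x coordinate and each y coordinate occurs. For simplicity, I use numpy.

import numpy as np
from itertools import product

array_test = np.asarray(array_test)  # the column slicing below needs an array, not a list
# We count every x value. We keep those that are present at least twice.
x_counts = dict(zip(*np.unique(array_test[:,0], return_counts=True)))
y_counts = dict(zip(*np.unique(array_test[:,1], return_counts=True)))
# If an x is present once, then there cannot be any center in this column.
x_inter = [coord for coord, count in x_counts.items() if count>=2]
# Same with y and rows.
y_inter = [coord for coord, count in y_counts.items() if count>=2]
# Next, we create all combinations of (x, y)
# and keep only the combinations present in our region.
centers = np.array([(x,y) for x,y in product(x_inter, y_inter)
                    if (array_test==np.array([x,y])).all(axis=1).any()])

Measuring arm length

To compute the power of our centers, we first need a function that measures arm length. Let's parameterize it with a direction argument.

# Since we are in 2D and we have no diagonal, there are four possible directions.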
directions = np.array([[0,1], [0, -1], [1, 0], [-1, 0]])

def get_arm_length(center, direction):
    # region is the (N, 2) numpy array of cells
    position = center+direction # going one step in the direction
    # We keep track of the length in the direction.
    # (The complete function below starts from 1 instead of 0.)
    length = 0
    # adding 1 as long as the next step in direction is in region
    while (region==position).all(axis=1).any():
        position += direction
        length+=1
    return length

Measuring each potential centroid

Now we can test the four directions for each potential centroid (selected earlier) and keep the best one as we go.

best_center=(0,[-1, -1]) # => (power, center_coords)
for center in centers:
    # Setting to 1, which is the identity element of the product (x * 1 == x)
    power = 1
    for direction in directions:
        # We multiply by the arm length along each of the four directions.
        power *= get_arm_length(center, direction)
    # If a more powerful center is found, we store its power and coords.
    if power > best_center[0]:
        best_center = power, center
# At this point, we have found the most powerful center, which is our centroid.

Putting it all together

Here is the complete code:

import numpy as np
from itertools import product

def find_centroid2(region):
    region = np.array(region)
    # Directions:
    directions = np.array([[0,1], [0, -1], [1, 0], [-1, 0]])
    def get_arm_length(center, direction):
        position = center+direction
        length = 1
        while (region==position).all(axis=1).any(axis=0):
            position+= direction
            length+=1
        return length
    # Intersections:
    x_counts = dict(zip(*np.unique(region[:,0], return_counts=True)))
    y_counts = dict(zip(*np.unique(region[:,1], return_counts=True)))
    x_inter = [coord for coord, count in x_counts.items() if count>=2]
    y_inter = [coord for coord, count in y_counts.items() if count>=2]
    centers = np.array([(x,y) for x,y in product(x_inter, y_inter)
                        if (region==np.array([x,y])).all(axis=1).any()])
    # Measuring each center's "power":
    best_center=(0,[-1, -1]) # => (power, center_coords)
    for center in centers:
        power = 1
        for direction in directions:
            power *= get_arm_length(center, direction)
        if power > best_center[0]:
            best_center = power, center
    return best_center[1]

Optimizing the optimization

Instead of building every virtual center and testing whether it belongs to our region, we can filter the region itself and keep the cells whose coordinates are counted twice or more.

def find_centroid3(region):
    region = np.array(region)
    # Directions:
    directions = np.array([[0,1], [0, -1], [1, 0], [-1, 0]])
    def get_arm_length(center, direction):
        position = center+direction
        length = 1
        while (region==position).all(axis=1).any(axis=0):
            position+= direction
            length+=1
        return length
    # Intersections:
    # It's better to filter the cells instead of computing and testing all combinations
    x_counts = [x[0] for x in zip(*np.unique(region[:,0], return_counts=True)) if x[1]>=2]
    y_counts = [y[0] for y in zip(*np.unique(region[:,1], return_counts=True)) if y[1]>=2]
    centers = [[x,y] for x,y in region if x in x_counts or y in y_counts]
    # Measuring each center's "power":
    best_center=(0,[-1, -1]) # => (power, center_coords)
    for center in centers:
        power = 1
        for direction in directions:
            power *= get_arm_length(center, direction)
        if power > best_center[0]:
            best_center = power, center
    return best_center[1]

Comparison V2

Preparing a random region with many cells:

# Keeping the grid fairly big and filled
# 150*150 grid (22'500 cells) with 15'000 filled cells max.
array_test = np.random.randint(150, size=(15000, 2)) # => len = 15'000
# Getting rid of duplicates, else they will mess with the counting.
# Assuming your own grids also don't have any
new_array = [list(array_test[0])]
for elem in array_test[1:]:
    if (elem != np.array(new_array)).any(axis=1).all():
        new_array.append(elem)
array_test = np.array(new_array) # => len = 10'959, all are unique cells

Results:

find_centroid(array_test) # Original version. Result = [64 127]
# 16 s ± 117 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
find_centroid2(array_test) # Proposed version 1. Result = [61 127]
# 13.1 s ± 87.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
find_centroid3(array_test) # Proposed version 2. Result = [61, 127]
# 9.49 s ± 47.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I tried several grid sizes, keeping them at most half filled.

Comparison V1 [outdated]

Your original code (corrected to handle the infinite loop):

%%timeit
find_centroid(array_test) # Result => array([73, 16])
# 21.4 s ± 397 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Proposed code:

%%timeit
find_centroid2(array_test) # Result => array([73, 16])
# 17.2 s ± 76.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This is not a huge optimization, but it is an optimization nonetheless. Maybe other comments and ideas can make it better.

For anyone who needs a better (possibly perfect) answer: use a small shape and apply erosion to the array in a while loop, removing the border until you find:
- one cross: its center coordinates
- multiple crosses: compare the best crosses individually

Wow. Thanks for the guidance ;) Really glad to see such a description. Let me replace my code with yours and I'll come back with an answer.

OK, I already had a mistake about the "in" operator. The case I described is when "in" is used on a numpy array, not on a Python list. For example, [1,2] in array([[1,0],[0,0]]) returns True, but [1,2] in [[1,0,2]] returns False. Let me correct that.

I forgot an old variable name. Corrected, sorry! That is what happens when you program in a notebook.

As the benchmark shows, it is faster. It still takes too long on larger regions. I think the problem is that the calculation runs for all "centers". What if we sorted the region by identical x or y values and ran the calculation only once, or picked the most frequent values with a built-in such as most_common? Am I right?

To answer you: I found a better solution. As a heuristic, sorting is a good idea. But if we want to be 100% sure that we found the centroid, sorting does not help. For example, 7*5*7*5 = 1225 is less than 6*6*6*6 = 1296. If we were asked to find the centroid with the center of gravity given in advance, that would be a huge optimization. As for Counter, it is a clever and simple idea, but it seems to be about 4 times slower than the dict(zip(*unique)) approach.
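The comments above report that larger regions still take too long. One likely contributor is that every step of get_arm_length compares the position against the whole region array. The following is a minimal sketch of an alternative, not part of the original answer: it stores the cells in a Python set so each membership test is a hash lookup. The name find_centroid_set is hypothetical, the sketch assumes integer (x, y) cells, and it skips the pre-filtering step (every cell is scored). It has not been timed against the figures above.

```python
def find_centroid_set(region):
    """Return the (x, y) cell whose four arm lengths have the largest product."""
    # Assumption: region is an iterable of integer (x, y) pairs (list or numpy array).
    cells = {(int(x), int(y)) for x, y in region}
    directions = ((0, 1), (0, -1), (1, 0), (-1, 0))

    def arm_length(x, y, dx, dy):
        # Same convention as find_centroid2: counting starts at 1.
        length = 1
        x, y = x + dx, y + dy
        while (x, y) in cells:  # set lookup instead of scanning the whole array
            x, y = x + dx, y + dy
            length += 1
        return length

    best_power, best_cell = 0, (-1, -1)
    for cell in region:
        x, y = int(cell[0]), int(cell[1])
        power = 1
        for dx, dy in directions:
            power *= arm_length(x, y, dx, dy)
        if power > best_power:  # strict comparison: the first maximum wins
            best_power, best_cell = power, (x, y)
    return best_cell


array_test = [[0, 0], [1, 0], [1, 1], [0, 1], [2, 1],
              [2, 2], [3, 1], [3, 0], [2, 0], [3, 2]]
print(find_centroid_set(array_test))  # (2, 1)
```

On the question's test array, the cell (2, 1) has arm lengths 2, 2, 2 and 3, so its power is 24, the largest in the region. Unlike find_centroid2 and find_centroid3, this sketch returns a plain tuple rather than an array or list.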