Python: find the largest cross shape in an array of coordinates

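Find the centroid coordinates according to the largest product of the left/right/up/down arm lengths.

The code below works, but it never finishes on larger arrays. How can I optimize it?

(If numpy matters: I get the region as a list with region = region_coordinates.tolist() before looking for the centroid.)

For testing: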

array_test = ([[0, 0], [1, 0], [1, 1], [0, 1], [2, 1], [2, 2], [3, 1], [3, 0], [2, 0], [3, 2]])
print(find_centroid(array_test))

Infinite loop explained

If region is a numpy array, this part of the code will trap you in an infinite loop:

while True:
    if [x, y] in region:
    ...
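This is because, when used on a numpy array, the in operator returns True as soon as any element of the list equals the element at the same position in any row of the array. The test on [x, y] can therefore stay True even when that cell is not in the region.

Instead, you can combine numpy's all and any methods: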

if (np.array(region)==[x,y]).all(axis=1).any(axis=0):
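  • all(axis=1)
    checks, for each sub-list, that both values are equal, in the right order.

  • any(axis=0)
    then reduces the resulting array of booleans: if any of them is True, there is at least one match.

  • Casting either of the two operands to a numpy array is enough to make this test possible.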
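
As a small illustration of the points above, the row test can be wrapped in a helper. The name contains_row is made up for this sketch and is not part of the question's code:

```python
import numpy as np

def contains_row(region, point):
    """Return True if `point` is exactly one of the rows of `region`."""
    region = np.asarray(region)
    # Compare every row with `point`, require both coordinates to match,
    # then check whether at least one row passed.
    return bool((region == np.asarray(point)).all(axis=1).any())

region = np.array([[0, 0], [1, 0]])
print([1, 1] in region)               # True: one matching element is enough
print(contains_row(region, [1, 1]))   # False: no row equals [1, 1]
print(contains_row(region, [1, 0]))   # True
```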

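But if...

If both operands are lists, the in operator works as expected, but in that case you must make sure that region and each of its sub-lists are lists, not numpy arrays. Casting the outer array with list() is not enough. Here is why: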

    import numpy as np
    array_test = [[0, 0], [1, 0]]
    print([1,1] in array_test) # prints False, as expected
    
    # numpy always compares element-wise when both operands have the same length
    print([1,1] == np.array([1,0])) # Prints [ True False]
    print(np.array([1,1]) == np.array([1,0])) # Prints [ True False]

    # "in" on a numpy array is True as soon as one element matches
    print([1,1] == np.array(array_test)) # Prints [[False False] [ True False]]
    print([1,1] in np.array(array_test)) # Prints True, as explained, because we have at least one True

    # Raises ValueError: list(...) only converts the outer level, so "in" compares
    # [1,1] with each numpy row and cannot reduce [False False] to a single boolean
    print([1,1] in list(np.array(array_test)))
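
For comparison, here is a minimal sketch of the conversion that does work with the in operator: tolist() (the method mentioned in the question) converts every level to plain Python lists, unlike list().

```python
import numpy as np

region = np.array([[0, 0], [1, 0]])

# tolist() converts the rows as well, giving [[0, 0], [1, 0]]
as_lists = region.tolist()

print([1, 1] in as_lists)  # False
print([1, 0] in as_lists)  # True
```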
    
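Another version

Here is my approach. There may be better ways; this is just my two cents.

Pre-filtering potential centroids

First, I build all the possible intersections of the region (which I will call "centers" from now on). To do so, I count every x coordinate and every y coordinate. For simplicity, I use numpy.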

    import numpy as np
    from itertools import product

    array_test = np.asarray(array_test)  # the column slicing below needs a numpy array
    # We count every x value and keep those that are present at least twice.
    x_counts = dict(zip(*np.unique(array_test[:,0], return_counts=True)))
    y_counts = dict(zip(*np.unique(array_test[:,1], return_counts=True)))
    # If an x is present only once, there cannot be any center in this column.
    x_inter = [coord for coord, count in x_counts.items() if count>=2]
    # Same with y and rows.
    y_inter = [coord for coord, count in y_counts.items() if count>=2]
    # Next, we create all combinations of (x, y)
    # and keep only the combinations present in our region.
    centers = np.array([(x,y) for x,y in product(x_inter, y_inter)
            if (array_test==np.array([x,y])).all(axis=1).any()])
    
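Measuring arm length

To compute the power of a center, we first need a function that measures the length of an arm. Let's parameterize it with a direction argument.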

    # Since we are in 2D and we have no diagonal, there are four possible directions.
    directions = np.array([[0,1], [0, -1], [1, 0], [-1, 0]])
    
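    def get_arm_length(center, direction):
        # region is the numpy array of filled cells (array_test above)
        position = center+direction # going one step in the direction
        # We keep track of the length in the direction, starting at 1
        # so that the center cell itself is counted, as in the complete code.
        length = 1
        # adding 1 as long as the next step in direction is in region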
        while (region==position).all(axis=1).any():
            position += direction
            length+=1
        return length
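
Each membership test in the loop above scans the whole region array. As a possible alternative (not the method used in this answer), the cells can be stored in a Python set of tuples so that every test is a hash lookup. The sketch below assumes integer coordinates and counts the center cell itself; the names arm_length and cells are made up for the illustration.

```python
def arm_length(cells, center, direction):
    """Count the center plus the consecutive filled cells in `direction`."""
    x, y = center
    dx, dy = direction
    length = 1                      # the center cell itself
    x, y = x + dx, y + dy
    while (x, y) in cells:          # set lookup instead of scanning an array
        length += 1
        x, y = x + dx, y + dy
    return length

region = [[0, 0], [1, 0], [1, 1], [0, 1], [2, 1],
          [2, 2], [3, 1], [3, 0], [2, 0], [3, 2]]
cells = {(x, y) for x, y in region}

print(arm_length(cells, (2, 1), (-1, 0)))  # 3: cells (2,1), (1,1), (0,1)
```

If the region is a numpy array, converting it with tolist() first keeps the tuples made of plain Python integers.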
    
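Measuring each potential centroid

We can now test the four directions for each potential centroid selected earlier, keeping track of the best one as we go.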

    best_center=(0,[-1, -1]) # => (power, center_coords)
    for center in centers:
        # Setting to 1, which is the identity element of the product (x * 1 == x)
        power = 1
        for direction in directions:
            # We multiply by the power along the four axes.
            power *= get_arm_length(center, direction)
        # If a more powerful center is found, we store its power and coords.
        if power > best_center[0]:
            best_center = power, center
    # At this point, we have found the most powerful center, which is our centroid.
    
Putting it all together

Here is the complete code:

    import numpy as np
    from itertools import product

    def find_centroid2(region):

        region = np.array(region)
    
        # Directions:
        directions = np.array([[0,1], [0, -1], [1, 0], [-1, 0]])
    
        def get_arm_length(center, direction):
            position = center+direction
            length = 1
            while (region==position).all(axis=1).any(axis=0):
                position+= direction
                length+=1
            return length
    
        # Intersections: 
        x_counts = dict(zip(*np.unique(region[:,0], return_counts=True))) 
        y_counts = dict(zip(*np.unique(region[:,1], return_counts=True)))
        x_inter = [coord  for coord, count in x_counts.items() if count>=2]
        y_inter = [coord  for coord, count in y_counts.items() if count>=2]
        centers = np.array([(x,y) for x,y in product(x_inter, y_inter) if (region==np.array([x,y])).all(axis=1).any()])
        # Measuring each center's "power":
        best_center=(0,[-1, -1]) # => (power, center_coords)
        for center in centers:
            power = 1
            for direction in directions:
                power *= get_arm_length(center, direction)
            if power > best_center[0]:
                best_center = power, center
        return best_center[1]
    
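Optimizing the optimization

Instead of building all the virtual centers and testing which of them belong to our region, we can filter the region itself and keep the cells whose x or y coordinate occurs at least twice.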

    def find_centroid3(region):    
    
        region = np.array(region)
    
        # Directions:
        directions = np.array([[0,1], [0, -1], [1, 0], [-1, 0]])
    
        def get_arm_length(center, direction):
            position = center+direction
            length = 1
            while (region==position).all(axis=1).any(axis=0):
                position+= direction
                length+=1
            return length
    
        # Intersections: 
        # It's better to filter the cells instead of computing and testing all combinations
        x_counts = [x[0] for x in zip(*np.unique(region[:,0], return_counts=True)) if x[1]>=2]
        y_counts = [y[0] for y in zip(*np.unique(region[:,1], return_counts=True)) if y[1]>=2]
        centers = [[x,y] for x,y in region if x in x_counts or y in y_counts]
    
        # Measuring each center's "power":
        best_center=(0,[-1, -1]) # => (power, center_coords)
        for center in centers:
            power = 1
            for direction in directions:
                power *= get_arm_length(center, direction)
            if power > best_center[0]:
                best_center = power, center
        return best_center[1]
    
Comparison V2

Preparing a random region with many cells:

    # Keeping the grid fairly big and filled
    # 150*150 grid (22'500 cells) with 15'000 filled cells max.
    array_test = np.random.randint(150, size=(15000, 2)) # => len = 15'000
    # Getting rid of duplicates, else they will mess with the counting.
    # Assuming your own grids also don't have any
    new_array = [list(array_test[0])]
    for elem in array_test[1:]:
        if (elem != np.array(new_array)).any(axis=1).all():
            new_array.append(elem)
    array_test = np.array(new_array) # => len = 10'959, all are unique cells
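
The duplicate-removal loop above is itself slow on large inputs. One possible shortcut, assuming the order of the cells does not matter, is np.unique with axis=0, which returns the distinct rows in sorted order. The seed and variable names below are only for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)                # fixed seed, only for reproducibility
cells = rng.integers(150, size=(15000, 2))    # 15'000 random cells on a 150*150 grid

# Keep each (x, y) row once; rows come back sorted, not in their original order.
unique_cells = np.unique(cells, axis=0)

print(len(cells), len(unique_cells))
```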
    
Results:

    find_centroid(array_test) # Original version. Result = [64 127]
    # 16 s ± 117 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    find_centroid2(array_test) # Proposed version 1. Result = [61 127]
    # 13.1 s ± 87.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    find_centroid3(array_test) # Proposed version 2. Result = [61, 127]
    # 9.49 s ± 47.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
I tried several grid sizes, keeping them at most half filled.

Comparison V1 [outdated]

Your original code (corrected to handle the infinite loop):

Proposed code:

    %%timeit
    find_centroid2(array_test) # Result => array([73, 16])
    # 17.2 s ± 76.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
This is not a huge optimization, but it is an optimization nonetheless. Maybe other comments and ideas can make it better.
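
One such idea, offered only as a sketch and not benchmarked here, is to paint the region onto a dense boolean grid and compute all four arm lengths for every cell at once with running counts. It assumes non-negative integer coordinates and a bounding box that fits in memory. The names run_lengths and find_centroid_grid are invented for the example, and ties or regions without a full cross may be resolved differently than in the functions above.

```python
import numpy as np

def run_lengths(grid):
    """For each cell, count the consecutive filled cells that end at it
    when walking along axis 0 in increasing index order (0 for empty cells)."""
    runs = np.zeros(grid.shape, dtype=np.int64)
    runs[0] = grid[0]
    for i in range(1, grid.shape[0]):
        # Extend the run from the previous line, or reset to 0 on empty cells.
        runs[i] = (runs[i - 1] + 1) * grid[i]
    return runs

def find_centroid_grid(region):
    region = np.asarray(region)
    grid = np.zeros(tuple(region.max(axis=0) + 1), dtype=bool)
    grid[region[:, 0], region[:, 1]] = True

    # Arm lengths in the four directions, each including the cell itself.
    left = run_lengths(grid)                  # toward decreasing x
    right = run_lengths(grid[::-1])[::-1]     # toward increasing x
    down = run_lengths(grid.T).T              # toward decreasing y
    up = run_lengths(grid.T[::-1])[::-1].T    # toward increasing y

    power = left * right * down * up          # 0 for every empty cell
    best = np.unravel_index(np.argmax(power), power.shape)
    return [int(best[0]), int(best[1])]

array_test = [[0, 0], [1, 0], [1, 1], [0, 1], [2, 1],
              [2, 2], [3, 1], [3, 0], [2, 0], [3, 2]]
print(find_centroid_grid(array_test))  # [2, 1], with power 3 * 2 * 2 * 2 = 24
```

The Python loop runs once per grid line rather than once per cell and direction, which is where the expected saving comes from.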


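For anyone who needs a better (possibly perfect) answer: take a small shape and apply erosion to the array in a while loop, removing the border each time, until you find:
- a single cross: its center gives the coordinates
- several crosses: compare the best crosses individually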
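
The answer above gives no code, so the following is only one possible reading of it: a plus-shaped erosion written in plain numpy and repeated until the next step would empty the grid. It assumes non-negative integer coordinates, and the names erode_plus and last_survivors are invented here. The cells that survive longest are the ones deepest inside the region, which need not be the cells with the largest arm-length product, so the survivors would still have to be compared as described.

```python
import numpy as np

def erode_plus(grid):
    """One erosion step: a cell survives only if it and its four
    direct neighbours are all filled."""
    padded = np.pad(grid, 1, constant_values=False)
    return (padded[1:-1, 1:-1]     # the cell itself
            & padded[:-2, 1:-1]    # neighbour at x - 1
            & padded[2:, 1:-1]     # neighbour at x + 1
            & padded[1:-1, :-2]    # neighbour at y - 1
            & padded[1:-1, 2:])    # neighbour at y + 1

def last_survivors(region):
    """Erode repeatedly and return the cells left just before the grid empties."""
    region = np.asarray(region)
    grid = np.zeros(tuple(region.max(axis=0) + 1), dtype=bool)
    grid[region[:, 0], region[:, 1]] = True
    while True:
        eroded = erode_plus(grid)
        if not eroded.any():
            return np.argwhere(grid)
        grid = eroded

array_test = [[0, 0], [1, 0], [1, 1], [0, 1], [2, 1],
              [2, 2], [3, 1], [3, 0], [2, 0], [3, 2]]
print(last_survivors(array_test))  # [[2 1]]
```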

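
Wow. Thanks for the guidance ;) Really glad to see such a description. Let me replace my code with yours and I will come back with an answer.

OK, I already had one mistake, about the "in" operator. The case I described is using "in" on a numpy array, not on Python lists. For example, [1,2] in array([[1,0],[0,0]]) returns True, but [1,2] in [[1,0,2]] returns False. Let me correct that.

I forgot an old variable name. Corrected, sorry! That is what happens when you program in a notebook.

As the timings show, it is faster. Computing with bigger regions still takes too long. I think the problem is that the calculation is done for all the "centers". What if we did it only once, after sorting the region by identical x or y values, or after selecting the most frequent occurrences with a built-in such as most_common? Am I right?

To answer you, I found a better solution. Sorting is a good idea for a heuristic approach. But if we want to be 100% sure that we found the centroid, sorting will not help us. For example, 7*5*7*5 is less than 6*6*6*6. If we were asked to find the centroid with the center of gravity given beforehand, it would be a huge optimization. As for Counter, it is a clever and simple idea, but it seems to be about 4 times slower than the dict(zip(*unique)) approach.
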
Timing of the original code (Comparison V1):

    %%timeit
    find_centroid(array_test) # Result => array([73, 16])
    # 21.4 s ± 397 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)