Python: detecting rectangles (sub-arrays of identical element values) in a 2D list
A rectangle is defined as any rectangular block of zeros within a 2D array of ones and zeros. A typical example:
[
[1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 1, 1, 1, 1, 1, 1],
[1, 0, 1, 0, 0, 1, 1, 1, 1],
[1, 1, 1, 0, 0, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1],
]
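In this example, there are three such arrays.
My goal is to determine the coordinates of each array (the 3 outer extremes).
I start by converting the 2D list to a numpy array: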
image_as_np_array = np.array(two_d_list)
I can then get the coordinates of all the zeros like this:
np.argwhere(image_as_np_array == 0)
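But this is merely a shortcut for getting the indices that I could also obtain by iterating over each row, calling .index(), and combining the result with the index of that row in the 2D list.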
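To make that comparison concrete, here is a minimal sketch of the manual approach. The function name zero_coordinates and the small grid are hypothetical, chosen only for illustration; the pairs it returns should match the rows of np.argwhere(image_as_np_array == 0), which lists indices in row-major order.

```python
def zero_coordinates(two_d_list):
    """Return (row, col) pairs of every 0, scanning rows top to bottom."""
    coords = []
    for r, row in enumerate(two_d_list):
        start = 0
        while True:
            try:
                c = row.index(0, start)  # next 0 at or after `start`
            except ValueError:
                break  # no more zeros in this row
            coords.append((r, c))
            start = c + 1
    return coords


# Hypothetical sample grid, not taken from the question
grid = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
print(zero_coordinates(grid))  # [(0, 2), (1, 0), (2, 1), (2, 2)]
```

Either way, the result is only a flat list of coordinates; grouping them into rectangles is the part that still needs an algorithm.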
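My idea now is to remove any elements of np.argwhere() (or np.where()) where only a single 0 occurs (in effect ignoring any row that cannot form part of a rectangle) and then try to align the contiguous coordinates. But I am stuck on how to handle the case where one row contains parts of more than one rectangle (as in rows 3 and 4 above). Is there a numpy function, or several, that I can leverage?

I have written a simple algorithm (the sweep function below). The idea is to go through the array column by column and treat each series of zeros as a potential new rectangle. In every column you have to check whether the previously detected rectangles have ended and, if so, add them to the result.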
import numpy as np
from collections import namedtuple

example = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
])

Rectangle = namedtuple("Rectangle", "left top bottom right")


def sweep(A):
    height = A.shape[0]
    length = A.shape[1]
    rectangles = dict()  # detected rectangles {(rowstart, rowend): col}
    result = []
    # sweep the matrix column by column
    for i in range(length):
        column = A[:, i]
        # for currently detected rectangles check if we should extend them or end
        # (iterate over a copy of the keys, because entries are deleted in the loop)
        for r in list(rectangles.keys()):
            span = column[r[0]:r[1] + 1]
            # detect non-rectangular shapes, as requested in the question edit,
            # and delete those rectangles (>= 0 so that row 0 is checked as well)
            if all(x == 0 for x in span) and (
                    (r[0] - 1 >= 0 and column[r[0] - 1] == 0)
                    or (r[1] + 1 < height and column[r[1] + 1] == 0)):
                del rectangles[r]
            elif any(x == 0 for x in span) and not all(x == 0 for x in span):
                del rectangles[r]
            # special case in the last column - add detected rectangles
            elif i == length - 1 and all(x == 0 for x in span):
                result.append(Rectangle(rectangles[r], r[0], r[1], i))
            # if detected rectangle is not extended - add to result and del from list
            elif all(x == 1 for x in span):
                result.append(Rectangle(rectangles[r], r[0], r[1], i - 1))
                del rectangles[r]
        newRectangle = False
        start = 0
        # go through the column and check if any new rectangles appear
        for j in range(height):
            # new rectangle in column detected
            if column[j] == 0 and not newRectangle and j + 1 < height and column[j + 1] == 0:
                start = j
                newRectangle = True
            # new rectangle in column ends
            elif column[j] == 1 and newRectangle:
                # check if new detected rectangle is already on the list
                if (start, j - 1) not in rectangles:
                    rectangles[(start, j - 1)] = i
                newRectangle = False
    # delete single column rectangles
    resultWithout1ColumnRectangles = []
    for r in result:
        if r[0] != r[3]:
            resultWithout1ColumnRectangles.append(r)
    return resultWithout1ColumnRectangles


print(example)
print(sweep(example))
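The building block of the column sweep is finding runs of consecutive zeros inside a single column. The sketch below isolates that step in plain Python. The helper name zero_runs and its min_length parameter are hypothetical and not part of the answer's code. Unlike the inner loop of sweep as posted, it also closes a run that reaches the last row, which is an assumption about the intended behaviour.

```python
def zero_runs(column, min_length=2):
    """Return inclusive (start, end) index pairs for runs of 0s in `column`."""
    runs = []
    start = None  # index where the current run of zeros began, if any
    for j, value in enumerate(column):
        if value == 0 and start is None:
            start = j
        elif value != 0 and start is not None:
            if j - start >= min_length:
                runs.append((start, j - 1))
            start = None
    # a run that touches the bottom edge is closed here
    if start is not None and len(column) - start >= min_length:
        runs.append((start, len(column) - 1))
    return runs


# Column 3 of the example array from the question
column_3 = [1, 1, 0, 0, 1, 0, 0, 1]
print(zero_runs(column_3))  # [(2, 3), (5, 6)]
```

Each pair corresponds to a (rowstart, rowend) key of the kind sweep stores in its rectangles dictionary.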
I don't know numpy, so here is a plain Python solution:
from collections import namedtuple

Rectangle = namedtuple("Rectangle", "top bottom left right")


def find_rectangles(arr):
    # Deeply copy the array so that it can be modified safely
    arr = [row[:] for row in arr]
    rectangles = []
    for top, row in enumerate(arr):
        start = 0
        # Look for rectangles whose top row is here
        while True:
            try:
                left = row.index(0, start)
            except ValueError:
                break
            # Set start to one past the last 0 in the contiguous line of 0s
            try:
                start = row.index(1, left)
            except ValueError:
                start = len(row)
            right = start - 1
            if (  # Width == 1
                    left == right or
                    # There are 0s above
                    top > 0 and not all(arr[top-1][left:right + 1])):
                continue
            bottom = top + 1
            while (bottom < len(arr) and
                   # No extra zeroes on the sides
                   (left == 0 or arr[bottom][left-1]) and
                   (right == len(row) - 1 or arr[bottom][right + 1]) and
                   # All zeroes in the row
                   not any(arr[bottom][left:right + 1])):
                bottom += 1
            # The loop ends when bottom has gone too far, so backtrack
            bottom -= 1
            if (  # Height == 1
                    bottom == top or
                    # There are 0s beneath
                    (bottom < len(arr) - 1 and
                     not all(arr[bottom + 1][left:right+1]))):
                continue
            rectangles.append(Rectangle(top, bottom, left, right))
            # Remove the rectangle so that it doesn't affect future searches
            for i in range(top, bottom+1):
                arr[i][left:right+1] = [1] * (right + 1 - left)
    return rectangles
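The result for the example input is correct: as the comments indicate, the "rectangle" on the right is not to be counted, because an extra 0 sticks out of it. I suggest you add more test cases, though.
I expect it to be reasonably fast, since much of the low-level iteration is done through calls to index and any, so it makes decent use of C code even without numpy's help.

Must every rectangle have at least two rows and two columns? Say I stick a 2x2 square onto the side of a 3x3 square, giving a shape that looks like a 3x5 rectangle with two corner 0s cut out. Do I count it as the two squares just mentioned, or as the larger 2x5 rectangle it contains, "wasting" the other three 0s?
Is a 1 x n or n x 1 block a rectangle or not? From the example, I assume not.
So a rectangle of 0s must be completely surrounded by 1s? In that case the rightmost rectangle in the example is invalid.
But there is an extra 0 in the top right corner. The rectangle is contained in a contiguous, non-rectangular shape. The top right corner of the rectangle has four 0 neighbours.
What do the six numbers in each returned list represent? I suggest using a namedtuple as I did. You should also read the comments: we concluded that one of the three rectangles should not be counted.
It's not that a rectangle cannot end in the last column; it's that it cannot have any protruding parts. Please read the comments carefully.
@Tony Thanks. Following Alex's correct observation, I have updated the question to show a new image. Only two rectangles are valid, leaving aside single-width and single-height ones.
What is the complexity of your solution?
It's a bit difficult to work out the exact complexity, but I think it's pretty good. Probably O(m*n), at least approximately. I have eliminated the obvious weakness of repeatedly checking for inclusion in previously found rectangles.
As I understand it: when you find a 0 you try to extend it into a rectangle as far as possible, and then mask it out with 1s, so it's O(n*m). For larger arrays, though, the linear memory complexity might be a problem.
If you can hold the whole array in memory, you can probably hold it twice. If this really is a problem, you can avoid copying the array and revert the changes at the end: that is easy to do because all the rectangles are recorded. Or just leave the array modified; depending on the use case, this may not matter.
@Wedoso is right about the question: rectangles of width/length 1 are ignored, see the comments.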
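The suggestion above, masking in place and reverting at the end instead of copying the array, could look roughly like the following sketch. The helper names mask_rectangle and restore_rectangles and the 4x4 grid are hypothetical; they are not part of the posted answer, which works on a copy.

```python
from collections import namedtuple

Rectangle = namedtuple("Rectangle", "top bottom left right")


def mask_rectangle(arr, rect):
    """Overwrite the rectangle with 1s in place, as find_rectangles does on its copy."""
    for i in range(rect.top, rect.bottom + 1):
        arr[i][rect.left:rect.right + 1] = [1] * (rect.right + 1 - rect.left)


def restore_rectangles(arr, rectangles):
    """Undo the masking by writing 0s back into every recorded rectangle."""
    for rect in rectangles:
        for i in range(rect.top, rect.bottom + 1):
            arr[i][rect.left:rect.right + 1] = [0] * (rect.right + 1 - rect.left)


# Hypothetical sample grid with a single 2x2 rectangle of zeros
grid = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
original = [row[:] for row in grid]  # kept only to verify the round trip
found = [Rectangle(top=1, bottom=2, left=1, right=2)]

mask_rectangle(grid, found[0])
masked = [row[:] for row in grid]
restore_rectangles(grid, found)
print(grid == original)  # True
```

Only the accepted rectangles are ever masked, so writing zeros back into the recorded ones is enough to recover the input.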
For the example input, find_rectangles returns:
[Rectangle(top=2, bottom=3, left=3, right=5),
Rectangle(top=5, bottom=6, left=3, right=4)]
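The rule the comments converge on (no protruding zeros, and width or height 1 does not count) can also be expressed with connected components: a group of 4-connected zeros is a valid rectangle exactly when it fills its own bounding box. The sketch below is an alternative technique, not the method of either answer, and the name rectangles_by_components is hypothetical. It assumes a non-empty list of equally long rows and may differ from find_rectangles on edge cases not covered by the example.

```python
from collections import deque, namedtuple

Rectangle = namedtuple("Rectangle", "top bottom left right")


def rectangles_by_components(grid):
    """Find zero regions that are exact rectangles at least 2 wide and 2 tall."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    found = []
    for r0 in range(height):
        for c0 in range(width):
            if grid[r0][c0] != 0 or seen[r0][c0]:
                continue
            # breadth-first flood fill over 4-connected zeros
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            cells = []
            while queue:
                r, c = queue.popleft()
                cells.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and grid[nr][nc] == 0 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            top = min(r for r, _ in cells)
            bottom = max(r for r, _ in cells)
            left = min(c for _, c in cells)
            right = max(c for _, c in cells)
            rows, cols = bottom - top + 1, right - left + 1
            # the region is a rectangle only if it fills its bounding box
            if rows >= 2 and cols >= 2 and len(cells) == rows * cols:
                found.append(Rectangle(top, bottom, left, right))
    return found


# The example array from the question
example = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]
print(rectangles_by_components(example))
```

On the example this yields the same two rectangles as listed above: the shape on the right is rejected because its five zeros do not fill its 3x2 bounding box, and the 3x1 column is rejected for being one cell wide.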