Python 3.x: fastest way to find patterns in a multivariate time series / matrix
I have the following code:
a = [[1,2,3], [4,5,6], [7,8,9]]
patterns = ('a',[[1,2],[8,9]])
for pattern in patterns[1]:
    for x in a:
        for index in range(len(x)):
            if x[index:index+len(pattern)]==pattern:
                x[index:index+len(pattern)]=[patterns[0] for p in pattern]
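This code finds patterns that span multiple rows, but it does not take the column alignment of the pattern rows into account. It should also find the complete pattern before changing anything in the matrix. How to do that currently eludes me.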
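To make the two shortcomings concrete, the loop above can be wrapped in a function and run on two small inputs. The name `naive_replace` and its signature are only illustrative; the logic is meant to mirror the snippet, not to improve on it.

```python
def naive_replace(matrix, label, pattern_rows):
    """Replace every pattern row wherever it occurs, independently of the
    other pattern rows (same behaviour as the loop in the question)."""
    for pattern in pattern_rows:
        for row in matrix:
            for index in range(len(row)):
                if row[index:index + len(pattern)] == pattern:
                    row[index:index + len(pattern)] = [label] * len(pattern)
    return matrix


# [1, 2] starts in column 0 and [8, 9] starts in column 1,
# yet both are replaced: alignment is never checked.
print(naive_replace([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 'a', [[1, 2], [8, 9]]))
# [['a', 'a', 3], [4, 5, 6], [7, 'a', 'a']]

# [8, 9] never occurs, but [1, 2] is replaced anyway:
# the matrix is modified before the full pattern has been found.
print(naive_replace([[1, 2, 3], [4, 5, 6]], 'a', [[1, 2], [8, 9]]))
# [['a', 'a', 3], [4, 5, 6]]
```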
Formally, the problem is as follows. I have a matrix
matrix=[
[1,2,3],
[4,5,6],
[7,8,9]]
and I would like to find a pattern like
[1,2]
[any,5]
That is, the row [1,2] and, in any row below it, any value in the first position and 5 in the second
or
[1]
[4]
That is, a 1 and a 4 in the same column
or
[2,3]
[8,9]
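That is, 2 and 3 are adjacent within a row, 8 and 9 are adjacent within a row, 2 and 8 are in the same column, and 3 and 9 are in the same column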
and then transform the matrix into the following (using the first pattern and replacing it with 'a'):
output = [
[a,a,3],
[4,a,6],
[7,8,9]]
I have looked at related questions, but either I am not searching with the right keywords or this problem is new.
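On my own I would use something like
if matrix[index:index+len(pattern)]==pattern
with additional checks on the lower rows once a pattern row has been found, but this is too slow: each row is tens of thousands of entries long and there are almost a thousand rows.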
I need to repeat this search-and-replace operation several times on the same matrix, which should produce a matrix like this:
Given:
input = [
[1,2,3],
[4,5,6],
[1,8,9]]
and
a=[[1,2,any],
[any,8,9]]
b=[[3],
[6]]
c=[[4,5],
[1,any]]
Output = [
[a,a,b],
[c,c,b],
[c,a,a]]
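Thank you for your attention. If there are any errors in my formatting, please let me know; this is my first post on Stack.

Here is a possible approach. It tries to match the pattern one row at a time; when a row matches, it moves on to the next pattern row until the whole pattern has matched. If that happens, it replaces the saved indices with the replacement string passed in.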
def replace_pattern(matrix, pattern_list, replacement, start_row, col=None, idxs=None):
    # idxs collects the (row, [cols]) pairs matched so far
    if idxs is None:
        idxs = []
    # if the pattern list is empty we found our pattern, let's replace the idxs
    if not pattern_list:
        replace_idx(matrix, replacement, idxs)
        return True
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    pattern = pattern_list[0]
    pattern_size = len(pattern)
    # impossible to complete pattern if we have more lines remaining in the pattern than in the matrix
    if start_row + len(pattern_list) > n_rows:
        return False
    for row in range(start_row, n_rows):
        # if we already found part of the pattern previously we only need to check a fixed position
        if col is not None:
            new_idxs = idxs + [(row, filter_idx(pattern, col))]
            if (match(matrix[row][col : (col + pattern_size)], pattern)
                    and replace_pattern(matrix, pattern_list[1:], replacement, row + 1, col, new_idxs)):
                return True
        # if we have not found part of the pattern yet, we can search in every position of the current line
        else:
            for pos in range(0, n_cols - pattern_size + 1):
                new_idxs = idxs + [(row, filter_idx(pattern, pos))]
                if (match(matrix[row][pos : (pos + pattern_size)], pattern)
                        and replace_pattern(matrix, pattern_list[1:], replacement, row + 1, pos, new_idxs)):
                    return True
    return False
This function uses a few helper functions.
This one determines whether a pattern matches some values:
def match(values, pattern):
    for i in range(len(values)):
        # 'any' on either side is a wildcard
        if values[i] == 'any' or pattern[i] == 'any':
            continue
        else:
            if values[i] != pattern[i]:
                return False
    return True
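One edge case worth noting: `match` loops over `len(values)`, so a slice cut short by the end of a row is compared only against the first few pattern entries and can still count as a match. That can only happen when a later pattern row is wider than the first one. A stricter variant, sketched here under the hypothetical name `match_strict`, rejects slices whose length differs from the pattern's:

```python
def match_strict(values, pattern):
    """Like match, but a slice whose length differs from the pattern never matches."""
    if len(values) != len(pattern):
        return False
    # 'any' on either side is a wildcard, exactly as in match
    return all(v == 'any' or p == 'any' or v == p
               for v, p in zip(values, pattern))


print(match_strict([1, 2, 3], [1, 2, 'any']))  # True
print(match_strict([7, 8, 9], [1, 2, 'any']))  # False
print(match_strict([6], [6, 7]))               # False: slice shorter than pattern
```

For patterns whose rows all have the same width, as in the examples here, the two versions should behave the same.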
This one filters out the pattern indices marked 'any', because you do not want to replace those cells:
def filter_idx(pattern, col):
    pattern_size = len(pattern)
    l = []
    for i in range(col, col + pattern_size):
        # keep only the columns whose pattern entry is a concrete value
        if pattern[i - col] != 'any':
            l.append(i)
    return l
The last one replaces the (row, [cols]) pairs with the replacement string passed in:
def replace_idx(matrix, replacement, idxs):
    # each entry is a (row, [cols]) pair
    for entry in idxs:
        row = entry[0]
        for col in entry[1]:
            matrix[row][col] = replacement
With the following input:
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
pattern_a = [[1, 2, 'any'],['any', 8, 9]]
pattern_b =[[3], [6]]
pattern_c = [[4, 5], [7, 'any']]
replace_pattern(m, pattern_a, 'a', 0)
replace_pattern(m, pattern_b, 'b', 0)
replace_pattern(m, pattern_c, 'c', 0)
print(m)
I get the output:
[['a', 'a', 'b'], ['c', 'c', 'b'], ['c', 'a', 'a']]
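Since the question mentions rows tens of thousands of entries long, the per-position checks could probably be vectorized. The following is only a sketch under several assumptions that are not part of the answer above: NumPy is available, the matrix is a 2-D integer array, the wildcard is `None` (named `ANY` here), and all pattern rows have the same width. The names `row_hits` and `find_first` are hypothetical. It only locates the first match, in the same search order as `replace_pattern`, and leaves the replacement to the caller.

```python
import numpy as np

ANY = None  # assumed wildcard marker for this sketch


def row_hits(matrix, pattern_row):
    """hits[r, c] is True where pattern_row matches matrix[r, c:c + width]."""
    n_rows, n_cols = matrix.shape
    n_pos = n_cols - len(pattern_row) + 1
    if n_pos <= 0:
        return np.zeros((n_rows, 0), dtype=bool)
    hits = np.ones((n_rows, n_pos), dtype=bool)
    for offset, value in enumerate(pattern_row):
        if value is not ANY:
            # compare one pattern entry against every start position at once
            hits &= matrix[:, offset:offset + n_pos] == value
    return hits


def find_first(matrix, pattern):
    """Return (rows, col) of the first match, or None if there is none.

    Pattern rows must appear in order, in the same columns, on strictly
    increasing (not necessarily adjacent) matrix rows.
    """
    hits = [row_hits(matrix, p) for p in pattern]
    # feasible[i][r, c]: pattern rows i.. can be matched with row i at (r, c)
    feasible = [None] * len(pattern)
    feasible[-1] = hits[-1]
    for i in range(len(pattern) - 2, -1, -1):
        # below[r, c]: some row >= r is feasible for pattern row i + 1
        below = np.logical_or.accumulate(feasible[i + 1][::-1], axis=0)[::-1]
        strictly_below = np.zeros_like(below)
        strictly_below[:-1] = below[1:]
        feasible[i] = hits[i] & strictly_below
    starts = np.argwhere(feasible[0])  # sorted by row, then column
    if len(starts) == 0:
        return None
    rows, col = [int(starts[0][0])], int(starts[0][1])
    for i in range(1, len(pattern)):
        # earliest later row that still allows the rest of the pattern
        step = np.flatnonzero(feasible[i][rows[-1] + 1:, col])[0]
        rows.append(rows[-1] + 1 + int(step))
    return rows, col


m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(find_first(m, [[1, 2, ANY], [ANY, 8, 9]]))  # ([0, 2], 0)
print(find_first(m, [[3], [6]]))                  # ([0, 1], 2)
print(find_first(m, [[9], [1]]))                  # None
```

Because the matrix stays an integer array, the labels 'a', 'b', 'c' would have to go into a separate array, or already replaced cells could be overwritten with a sentinel value so that later patterns do not match them again. Whether this is actually faster on the real data would need to be measured.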
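This is a requirements dump. Please show what you have done, and use real notation, because what you have written so far is ambiguous or meaningless.
What do you mean by real notation? That term sounds very vague.
For example, [,5]: what is that?
Ah, right, my bad, I will replace it with NA.
Write actual Python, and show your attempt.
This is perfect, thank you. It is really clever how you solved it. Now that I see it, it looks simple, but I was stuck on this for days, so thanks again.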