Searching for and replacing multiple specific element sequences in a Python list/array


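I currently have 6 separate for loops that iterate over a list of numbers, looking for specific number sequences inside a larger sequence and replacing them as follows: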

[...0,1,0...] => [...0,0,0...]
[...0,1,1,0...] => [...0,0,0,0...]
[...0,1,1,1,0...] => [...0,0,0,0,0...]

And the opposite:

[...1,0,1...] => [...1,1,1...]
[...1,0,0,1...] => [...1,1,1,1...]
[...1,0,0,0,1...] => [...1,1,1,1,1...]
My existing code looks like this:

for i in range(len(output_array)-2):
    if output_array[i] == 0 and output_array[i+1] == 1 and output_array[i+2] == 0:
        output_array[i+1] = 0

for i in range(len(output_array)-3):
    if output_array[i] == 0 and output_array[i+1] == 1 and output_array[i+2] == 1 and output_array[i+3] == 0:
        # Assign both positions explicitly; "a, b = 0" raises a TypeError.
        output_array[i+1], output_array[i+2] = 0, 0

In total, I iterate over the same output array 6 times using brute-force checks. Is there a faster way?


# I would create a map between the string searched for and the new one.

patterns = {}
patterns['010'] = '000'
patterns['0110'] = '0000'
patterns['01110'] = '00000'

# I would loop over the lists.

lists = [[0,1,0,0,1,1,0,0,1,1,1,0]]

for lista in lists:

    # Join the list elements into a single string.
    string_list = ''.join(map(str, lista))

    # Loop over the patterns.
    for pattern, value in patterns.items():

        # If a pattern is detected, replace it.
        string_list = string_list.replace(pattern, value)

    # Convert the characters back to integers once all patterns are applied.
    lista = [int(ch) for ch in string_list]
    print(lista)
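
The string-based idea above can also be expressed with regular expressions, which may reduce the six passes to two. The following is a minimal sketch, not part of the original answer: the function name `fill_short_runs` and the `max_run` parameter are hypothetical, every element is assumed to be the integer 0 or 1, and the 0-1-0 family is assumed to be applied before the 1-0-1 family. Whether it is actually faster on your data should be measured.

```python
import re


def fill_short_runs(values, max_run=3):
    """Flip short runs of 1s enclosed by 0s, then short runs of 0s enclosed by 1s.

    Assumes every element is the integer 0 or 1. Only runs of at most
    `max_run` elements are flipped; longer runs are left untouched.
    """
    text = "".join(map(str, values))
    # Pass 1: 0,1,0 / 0,1,1,0 / 0,1,1,1,0 become all zeros.
    # The lookbehind and lookahead check the enclosing 0s without consuming them,
    # so a 0 shared by two neighbouring matches is handled correctly.
    text = re.sub(r"(?<=0)1{1,%d}(?=0)" % max_run,
                  lambda m: "0" * len(m.group()), text)
    # Pass 2: the opposite family, 1,0,1 / 1,0,0,1 / 1,0,0,0,1 become all ones.
    text = re.sub(r"(?<=1)0{1,%d}(?=1)" % max_run,
                  lambda m: "1" * len(m.group()), text)
    return [int(ch) for ch in text]


print(fill_short_runs([0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]))  # twelve zeros
print(fill_short_runs([1, 1, 0, 0, 1, 1, 1, 1, 0, 1]))        # ten ones
```

Because the pattern is written as a bounded repetition, changing `max_run` covers longer runs without adding more patterns.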

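Although this question is related to other questions, the OP's question is about quickly searching for multiple sequences at once. The accepted answer works well, but we may not want to loop over all the search sequences for every sub-iteration of the base sequence.

Below is an algorithm that checks a sequence of i integers only when the sequence of its first (i-1) integers is present in the base sequence.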

# This is the driver function. It takes a) the search sequences and their
# replacements as a dictionary, e.g. {"0,1,0": [0, 0, 0]}, and b) the full
# sequence list in which to search.

def findSeqswithinSeq(searchSequences, baseSequence):
    seqkeys = [[int(i) for i in elem.split(",")] for elem in searchSequences]
    maxlen = max(len(elem) for elem in seqkeys)
    decisiontree = getdecisiontree(seqkeys)
    i = 0
    while i < len(baseSequence):
        (increment, replacement) = get_increment_replacement(decisiontree, baseSequence[i:i+maxlen])
        if replacement != -1:
            baseSequence[i:i+len(replacement)] = searchSequences[",".join(map(str, replacement))]
        i += increment
    return baseSequence

# The following function gives the dictionary of intermediate sequences allowed.
# A value of True marks a complete search sequence, False a proper prefix of one.
def getdecisiontree(searchsequences):
    dtree = {}
    for elem in searchsequences:
        for i in range(len(elem)):
            key = ",".join(map(str, elem[:i+1]))
            # Keep an existing True so a longer sequence cannot unmark a shorter one.
            dtree[key] = dtree.get(key, False) or i+1 == len(elem)
    return dtree

# The following function does most of the work. It gives us a) how many
# positions we can skip in the search and b) the search sequence that was
# found, or -1 if none was found.
def get_increment_replacement(decisiontree, sequence):
    if str(sequence[0]) not in decisiontree:
        return (1, -1)
    for i in range(1, len(sequence)):
        key = ",".join(map(str, sequence[:i+1]))
        if key not in decisiontree:
            return (1, -1)
        elif decisiontree[key]:
            key = [int(i) for i in key.split(",")]
            return (len(key), key)
    return (1, -1)

The proposed solution represents the sequences to search for as a decision tree.

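Because many search points are skipped, we should be able to do better than O(m*n) with this approach (where m is the number of search sequences and n is the length of the base sequence).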
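
The prefix idea can also be sketched with a nested-dictionary trie instead of comma-joined string keys. This is an illustrative variant, not the answer's code: the names `END`, `build_trie` and `replace_sequences` are hypothetical, patterns are given as tuples, and each replacement is assumed to have the same length as its pattern. Unlike the code above, it resumes scanning at the last element of a match, so that overlapping occurrences such as the two 0,1,0 in 0,1,0,1,0 are both replaced; that choice is an assumption about what the asker wants.

```python
END = object()  # sentinel key marking "a complete pattern ends at this node"


def build_trie(replacements):
    """Build a nested-dict trie from {pattern_tuple: replacement_list}."""
    root = {}
    for pattern, replacement in replacements.items():
        node = root
        for item in pattern:
            node = node.setdefault(item, {})
        node[END] = list(replacement)
    return root


def replace_sequences(base, replacements):
    """Return a copy of `base` with every matched pattern replaced.

    A prefix of length i is only examined if the prefix of length i-1 was
    found in the trie. The shortest pattern starting at a position wins.
    """
    trie = build_trie(replacements)
    result = list(base)
    i = 0
    while i < len(result):
        node, j, replacement = trie, i, None
        while j < len(result) and result[j] in node:
            node = node[result[j]]
            j += 1
            if END in node:
                replacement = node[END]
                break
        if replacement is None:
            i += 1
        else:
            result[i:j] = replacement
            # Resume at the last matched element so it can start the next match
            # (assumes the replacement has the same length as the pattern).
            i += max(j - i - 1, 1)
    return result


zero_fills = {(0, 1, 0): [0] * 3, (0, 1, 1, 0): [0] * 4, (0, 1, 1, 1, 0): [0] * 5}
one_fills = {(1, 0, 1): [1] * 3, (1, 0, 0, 1): [1] * 4, (1, 0, 0, 0, 1): [1] * 5}

step1 = replace_sequences([0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0], zero_fills)
print(replace_sequences(step1, one_fills))
```

Each family of patterns is applied in its own call here, mirroring the order of the loops in the question; putting both families into one trie would give a single pass, but its result can differ from applying them one after the other.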


Edit: Changed the answer based on the added clarity of the edited question.

Your question is unclear. Share some code and input/output examples.

Given the concern about time complexity, I would start with the timeit module to measure different implementations. If you need more detail on which functions take the most time, use a profiler. Theoretical complexity is often not a great guide, because it depends heavily on the assumed implementations and their complexity. Python's built-in data structures are highly optimized and may use a completely different implementation from the one you assume.

Can you add some examples? For instance, what should 01011010 become? Is it 00011110?

"The fastest way to implement this" - implement what?

OP is unclear about the intent of the replacements, but it is clear that they should be applied to another list.

@ugotchi, does the code above work for what you need?

I like this approach; it looks like it works and is more efficient. Consider iterating over patterns.items() to avoid the explicit patterns[pattern] lookup inside the loop.
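
To act on the timeit suggestion from the comments, a minimal timing harness might look like the following. The two functions are hypothetical stand-ins that handle only the 0-1-0 family of patterns (runs of one to three 1s enclosed by 0s), the input is random data, and the printed timings depend entirely on the machine, so treat this as a sketch of the measuring method rather than as a benchmark result.

```python
import random
import re
import timeit


def brute_force(values):
    """One pass per pattern length, in the style of the loops in the question."""
    out = list(values)
    for ones in (1, 2, 3):
        pattern = [0] + [1] * ones + [0]
        for i in range(len(out) - len(pattern) + 1):
            if out[i:i + len(pattern)] == pattern:
                out[i + 1:i + 1 + ones] = [0] * ones
    return out


def regex_based(values):
    """Single regular-expression pass over the digits joined into a string."""
    text = "".join(map(str, values))
    text = re.sub(r"(?<=0)1{1,3}(?=0)", lambda m: "0" * len(m.group()), text)
    return [int(ch) for ch in text]


random.seed(0)  # fixed seed so repeated runs time the same input
data = [random.randint(0, 1) for _ in range(10_000)]

# Check that both implementations agree before comparing their speed.
assert brute_force(data) == regex_based(data)

for func in (brute_force, regex_based):
    seconds = timeit.timeit(lambda: func(data), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```

Checking that the candidates return the same result first keeps the comparison honest: a faster function that gives a different answer is not a valid replacement.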