Python: How to split a string into repeated substrings

I have some strings, each of which is one or more copies of some string. For example:

L = "hellohellohello"
M = "good"
N = "wherewhere"
O = "antant"

I would like to split such strings into a list in which each element is just the part that is repeated. For example:

splitstring(L) ---> ["hello", "hello", "hello"]
splitstring(M) ---> ["good"]
splitstring(N) ---> ["where", "where"]
splitstring(O) ---> ["ant", "ant"]

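Each string is about 1000 characters long, so it would be great if the solution were also reasonably fast.

Note that in my case the repetitions always start at the beginning of the string and there are no gaps between them, so this is much simpler than the general problem of finding maximal repeats in a string.

How can I do this?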

The approach I would use:

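import re

L = "hellohellohello"
M = "good"
N = "wherewhere"

cnt = 0
result = ''
for i in range(1, len(L) + 1):
    prefix = L[:i]
    # re.escape stops regex metacharacters in the input from being interpreted
    matches = re.findall(re.escape(prefix), L)
    # only accept a prefix whose non-overlapping copies cover the whole string
    if len(matches) * i == len(L) and cnt < len(matches):
        cnt = len(matches)
        result = prefix

print(result)  # hello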

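Try this. Instead of cutting up the string, it concentrates on finding the shortest pattern and then builds a new list by repeating that pattern the appropriate number of times.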

def splitstring(s):
    if not s:
        return []
    # search for the shortest prefix whose repetition rebuilds the whole string
    proposed_pattern = s[:1]
    for c in s[1:]:
        n = len(proposed_pattern)
        if proposed_pattern * (len(s) // n) == s:
            # found it
            break
        proposed_pattern += c
    # if the loop never breaks, proposed_pattern has grown into the whole
    # string, so an unrepeated input such as 'good' gives ['good']
    n = len(proposed_pattern)
    return [proposed_pattern] * (len(s) // n)


if __name__ == '__main__':
    L = 'hellohellohellohello'
    print(splitstring(L))  # prints ['hello', 'hello', 'hello', 'hello']

Use a regular expression to find the repeated word, then simply create a list of the appropriate length:

import re

def splitstring(string):
    # lazy group 1 is the shortest prefix whose copies reach the end of the string
    match = re.match(r'(.*?)(?:\1)*$', string)
    word = match.group(1)
    return [word] * (len(string) // len(word))
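
To make the regex idea easier to follow, here is a minimal sketch of the same technique run against the examples from the question. The name `splitstring_regex`, the use of `re.fullmatch` with `re.DOTALL`, and the empty-string guard are my own assumptions for illustration, not part of the answer above.

```python
import re

# Group 1 is lazy, so it grows one character at a time. After each step,
# \1* tries to consume the rest of the string with copies of group 1.
# fullmatch only succeeds once those copies reach the very end.
# (Assumption: DOTALL is added so inputs containing newlines also work.)
PATTERN = re.compile(r'(.+?)\1*', re.DOTALL)


def splitstring_regex(s):
    """Hypothetical helper: list of copies of the shortest repeated prefix."""
    if not s:
        return []  # assumption: an empty input gives an empty list
    word = PATTERN.fullmatch(s).group(1)
    return [word] * (len(s) // len(word))


for text in ("hellohellohello", "good", "wherewhere", "antant"):
    print(text, "->", splitstring_regex(text))
```

For a non-empty string the match cannot fail, because in the worst case group 1 ends up being the whole string, which is the `["good"]` case.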

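Assuming the repeated word is longer than one character, you can test prefixes of increasing length:

a = "hellohellohello"

def splitstring(string):
    for number in range(1, len(string)):
        # the prefix must rebuild the whole string, not just match the next block
        if string[:number] * (len(string) // number) == string:
            return [string[:number]] * (len(string) // number)
    # in case there is no repetition
    return [string]

print(splitstring(a))  # ['hello', 'hello', 'hello']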
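
Since the repeats start at the beginning and have no gaps, another option is the well-known "doubled string" trick: the first position after 0 at which `s` occurs inside `s + s` is the length of the shortest repeating block. This is only a sketch and is not taken from any of the answers here; the name `splitstring_doubling` is made up for illustration.

```python
def splitstring_doubling(s):
    """Hypothetical helper: split s into copies of its shortest repeating block."""
    if not s:
        return []  # assumption: an empty input gives an empty list
    # s always occurs in s + s at index len(s); an earlier hit at index p > 0
    # means s is unchanged when rotated by p, which implies s == s[:p] * (len(s) // p).
    period = (s + s).find(s, 1)
    word = s[:period]
    return [word] * (len(s) // period)


print(splitstring_doubling("hellohellohello"))
print(splitstring_doubling("aabaab"))
```

The search is done by the built-in `str.find`, so there is no explicit Python loop over candidate lengths.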

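# -*- coding: utf-8 -*-
import re

'''
See Gábor Erds's code below
'''

N = "wherewhere"
cnt = 0
result = ''
countN = 0
showresult = []
for i in range(1, len(N) + 1):
    ...  # the rest of the loop is cut off in the source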

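Have a look at this, I think you are looking for something similar? Also, the complexity of that method is O(n), so it should be very fast, as you require.

@MridulKashyap My problem is much simpler, because my repeats start at the beginning of the string and there are no gaps in between.

Good idea, I was thinking of doing something similar too.

I didn't know any of those three things. Thank you, sir.

I will test this and edit.

Fails on "aabaab".

Please add an explanation of why your code ...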
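
One of the comments mentions an O(n) method without showing it. I can't tell which method the commenter meant, so the following is only a sketch of one standard linear-time approach, based on the prefix function from the Knuth-Morris-Pratt algorithm. The names `smallest_period` and `splitstring_linear` are hypothetical.

```python
def smallest_period(s):
    """Length of the shortest block whose repetition gives exactly s (0 for '')."""
    n = len(s)
    if n == 0:
        return 0
    # pi[i] = length of the longest proper prefix of s[:i+1] that is also its suffix
    pi = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    candidate = n - pi[-1]
    # the candidate is a real period only if it divides the length evenly
    return candidate if n % candidate == 0 else n


def splitstring_linear(s):
    """Hypothetical helper: split s into copies of its shortest repeating block."""
    if not s:
        return []  # assumption: an empty input gives an empty list
    p = smallest_period(s)
    return [s[:p]] * (len(s) // p)


print(splitstring_linear("wherewhere"))
print(splitstring_linear("aabaab"))
```

For strings of about 1000 characters the simpler answers above are probably fast enough; this version mainly matters if the inputs get much longer.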