Python: remove multiple character sequences from a string (tags: python, regex)


If I have a string like this:

my_string = 'this is is is is a string'

How can I remove the repeated 'is' words so that only one remains?

The string can contain any number of 'is' words, for example:

my_string = 'this is is a string'
other_string = 'this is is is is is is is is a string'

I think a regex solution is possible, but I don't know how to write it. Thanks.

If you want to remove all duplicates that directly follow one another, you can try:

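words = my_string.split()
tmp = words[:1]  # start with the first word (empty list if there are no words)
for word in words[1:]:
    # keep a word only if it differs from the previously kept word
    if word != tmp[-1]:
        tmp.append(word)
my_string = ' '.join(tmp)  # join without leaving a trailing space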
Of course, if you want it to be smarter than this, it gets more complicated.

As a one-liner:

>>> import itertools
>>> my_string = 'this is is a string'
>>> " ".join([k for k, g in itertools.groupby(my_string.split())])
'this is a string'
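
To show why the one-liner works, here is a minimal sketch of what `itertools.groupby` produces for the split words. The variable names `words` and `runs` are only illustrative. `groupby` groups adjacent equal items, so each run of a repeated word yields a single key:

```python
import itertools

words = 'this is is is is a string'.split()

# One (word, run_length) pair per run of adjacent identical words.
runs = [(key, len(list(group))) for key, group in itertools.groupby(words)]
print(runs)  # [('this', 1), ('is', 4), ('a', 1), ('string', 1)]

# Keeping only the keys collapses every run to a single word.
result = ' '.join(key for key, _count in runs)
print(result)  # this is a string
```

Because only adjacent items are grouped, a word that reappears later in the string (not back to back) is left alone.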
My approach is as follows:

my_string = 'this is is a string'
other_string = 'this is is is is is is is is a string'
def getStr(s):
    res = []
    for word in s.split():
        # keep only the first occurrence of each word
        if word not in res:
            res.append(word)
    return ' '.join(res)

print(getStr(my_string))
print(getStr(other_string))
Output:

this is a string
this is a string
Update: a regex way to attack it:

import re
# findall returns the captured word once for each run of adjacent repeats
print(' '.join(re.findall(r'(?:^|)(\w+)(?:\s+\1)*', other_string)))
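
One caveat with the pattern above: it has no word boundaries, so a word that is a prefix of the next one (for example 'is island') appears to be treated as a repeat. A possible variant with `\b` is sketched below; the helper name `dedupe_findall` is an assumption made for illustration, not part of the original answer.

```python
import re

def dedupe_findall(text):
    # \b(\w+)\b      captures one whole word
    # (?:\s+\1\b)*   swallows any adjacent whole-word repeats of it
    return ' '.join(re.findall(r'\b(\w+)\b(?:\s+\1\b)*', text))

print(dedupe_findall('this is is is is is is is is a string'))  # this is a string
print(dedupe_findall('this is island'))                         # this is island
```

Like the original, this keeps only `\w+` tokens, so punctuation is dropped and all whitespace is normalized to single spaces.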

Regex to the rescue!

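Count the occurrences of the 'is' string and remove the extras when the counter is > 1.
@MohitSharma Surely there is a more efficient solution?
Do you want to remove only 'is', or any repeated occurrence? Like 'this is a string' or 'this a string'?
@PSidhu Most of the time you should post your attempt.
Your non-regex approach will remove any later occurrence of a word in the string: getStr('this string is string') --> 'this string is'. Although the question is unclear, I think this may not be what the OP had in mind.
You're right. It works for the OP's question. I think the regex approach is more reliable.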
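
To make the distinction raised in the comments concrete, here is a small sketch comparing the two behaviours side by side. The function names `drop_all_repeats` and `drop_adjacent_repeats` are made up for this illustration.

```python
import itertools

def drop_all_repeats(text):
    # Keeps the first occurrence of every word, wherever it appears.
    seen = []
    for word in text.split():
        if word not in seen:
            seen.append(word)
    return ' '.join(seen)

def drop_adjacent_repeats(text):
    # Only collapses words that are repeated back to back.
    return ' '.join(key for key, _group in itertools.groupby(text.split()))

sample = 'this string is is a string'
print(drop_all_repeats(sample))       # this string is a
print(drop_adjacent_repeats(sample))  # this string is a string
```

Which of the two is wanted depends on how the question is read; the examples in the question only show back-to-back repeats.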
((\b\w+\b)\s*\2\s*)+
# (            capturing group
#   (\b\w+\b)  inner capturing group, consisting of a word boundary, at least ONE word character and another boundary
#   \s*        followed by whitespaces
#   \2\s*      and the formerly captured group (aka the inner group)
# )+           the whole pattern needs to be present at least once, but can be there multiple times
import re

string = """
this is is is is is is is is a string
and here is another another another another example
"""
rx = r'((\b\w+\b)\s*\2\s*)+'

# replace each matched run with a single copy of the word (group 2) plus a space
string = re.sub(rx, r'\2 ', string)
print(string)
# this is a string
# and here is another example
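
The pattern above consumes repeated words in pairs, so a run with an odd number of copies (for example three 'is') appears to leave one duplicate behind. A possible alternative is sketched here, assuming that only whole-word, back-to-back repeats should be collapsed; the helper name `collapse_repeats` is hypothetical.

```python
import re

def collapse_repeats(text):
    # \b(\w+)\b      one whole word
    # (?:\s+\1\b)+   one or more further copies of it, each preceded by whitespace
    return re.sub(r'\b(\w+)\b(?:\s+\1\b)+', r'\1', text)

print(collapse_repeats('this is is is a string'))           # this is a string
print(collapse_repeats('here is another another example'))  # here is another example
print(collapse_repeats('this is island'))                   # this is island
```

Since the replacement is just `\1`, the whitespace that follows the run is left untouched, and text outside the repeated words is not altered.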