Python: splitting a string into strings of repeated elements


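I want to split a string like this:

'aaabbccccabbb'

into

['aaa', 'bb', 'cccc', 'a', 'bbb']

What is an elegant way to do this in Python? If it makes things easier, you can assume the string will only contain a, b and c.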

This is the use case for the following :)

You can write a generator, without having to get clever just to keep the code short and unreadable:

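def yield_same(string):
    """Yield each run of identical consecutive characters in `string`."""
    it_str = iter(string)
    try:
        result = next(it_str)  # next(it) replaces Python 2's it.next()
    except StopIteration:
        return  # empty input: there are no runs to yield
    for next_chr in it_str:
        if next_chr != result[0]:
            # The current run has ended: emit it and start a new one.
            yield result
            result = ""
        result += next_chr
    yield result  # emit the final run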


>>> list(yield_same("aaaaaabcbcdcdccccccdddddd"))
['aaaaaa', 'b', 'c', 'b', 'c', 'd', 'c', 'd', 'cccccc', 'dddddd']

Edit

OK, so there is itertools.groupby, which probably does something like this.
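As a minimal sketch of what the itertools.groupby approach mentioned above might look like: groupby, called without a key function, groups consecutive equal items, so each group is one run of a repeated character. The function name split_runs is hypothetical and chosen only for this illustration.

```python
from itertools import groupby


def split_runs(text):
    """Return the runs of identical consecutive characters in `text`."""
    # groupby yields (key, group_iterator) once per run of equal items;
    # joining each group rebuilds the run as a string.
    return ["".join(group) for _, group in groupby(text)]


print(split_runs("aaabbccccabbb"))  # ['aaa', 'bb', 'cccc', 'a', 'bbb']
```

Because groupby only compares neighbouring items, the later 'a' and 'bbb' stay separate from the earlier runs of the same letters, which is what the question asks for.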

Here is the best way I could find using regular expressions (s is the string to split):

import re
print([a for a, b in re.findall(r"((\w)\2*)", s)])

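I knew there was a simple way to do this!
Possible duplicate?
Nobody suggested regular expressions? I am both touched and saddened.
Yes, this is a duplicate of Ethan's question. But that question does not have a useful title, in my opinion.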

>>> import re
>>> s = 'aaabbccccabbb'
>>> [m.group() for m in re.finditer(r'(\w)(\1*)',s)]
['aaa', 'bb', 'cccc', 'a', 'bbb']
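
One caveat with the patterns above: \w only matches word characters, so runs of spaces or punctuation are silently skipped. That is fine under the question's assumption that the string contains only a, b and c. If arbitrary characters should be kept, a variation along these lines might work; the name split_runs_regex is hypothetical, and the use of (.) with re.DOTALL is a swapped-in assumption rather than part of the answers above.

```python
import re


def split_runs_regex(text):
    """Split `text` into runs of one repeated character, whatever the character."""
    # (.) captures a single character and \1* matches further copies of it.
    # re.DOTALL lets '.' match newline characters as well.
    return [m.group() for m in re.finditer(r"(.)\1*", text, re.DOTALL)]


print(split_runs_regex("aa  bb!!c"))  # ['aa', '  ', 'bb', '!!', 'c']
```

With the \w version, the same input would give only ['aa', 'bb', 'c'].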