Python: splitting a string based on a semi-consistent feature


python regex


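I have a text file representing a transcript. I need a way to split its contents so that I end up with a list of strings, one for each thing a person says. So this: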

mystr = '''Bob: Hello there, how are you? 

           Alice: I am fine how are you?'''

becomes this:

mylist= ['Bob: Hello there, how are you?','Alice: I am fine how are you?']

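I'm not familiar with regular expressions, but I recognize that they may be the way to go. The catch is that I want this to work when the names vary (e.g. John, Paul, George, Ringo, etc.). What stays consistent is a single word (the speaker), followed by a colon, followed by a space.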

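import re
# One match per utterance: a non-space character, then text up to a colon, then the rest of the line
re.findall(r"\S[^:]+.*", mystr)
#-> ['Bob: Hello there, how are you? ', 'Alice: I am fine how are you?']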

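If the colon is missing after a name, the regex below should be preferred over the previous one. In the previous pattern, [^:]+ also matches newlines, so a line without a colon gets merged with the following line into a single match. The pattern below only matches text that starts with a word immediately followed by a colon: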

mystr = '''Bob Hello there, how are you? 

           Alice: I am fine how are you?'''
[m.group(0).strip() for m in re.finditer(r"\w+:+.*", mystr)]
#-> ['Alice: I am fine how are you?']
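
Since the question asks for a split rather than a search, one alternative worth sketching is re.split with a lookahead on the "word, colon, space" feature described in the question. The names SPEAKER_BOUNDARY and split_turns are hypothetical, introduced only for this illustration. It assumes a speaker name is a single word and that "word: " does not appear inside an utterance; text such as "Note: " in the middle of a sentence would trigger a false split.

```python
import re

# Hypothetical pattern: a run of whitespace that is immediately followed by
# "word: " (the speaker label). The lookahead keeps the label in the result.
SPEAKER_BOUNDARY = re.compile(r"\s+(?=\w+: )")

def split_turns(text):
    """Split a transcript into one string per speaker turn."""
    return SPEAKER_BOUNDARY.split(text.strip())

mystr = '''Bob: Hello there, how are you? 

           Alice: I am fine how are you?'''

mylist = split_turns(mystr)
print(mylist)
```

Because the split consumes the whitespace between turns, no extra strip() call is needed on each item, and a continuation line without a speaker label stays attached to the preceding turn.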
