Python: print all substrings matching a regular expression

python regex python-3.x string printing

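I am trying to print all the substrings found in a text. The problem is that findall() does not return the matched substrings but the captured groups, such as ('H', 'dog'). I want it to return a single string instead, such as 'Her dog eats'.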

Any help is greatly appreciated.

You can use

which gives:

<re.Match object; span=(0, 12), match='Her dog eats'>
<re.Match object; span=(14, 27), match='Her bird eats'>
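
Output of this shape is what printing the match objects yielded by finditer looks like. A minimal sketch, assuming that is the call this answer relied on; the sample text is hypothetical and was chosen only so that the spans line up with the output above:

```python
import re

# Hypothetical sample text, chosen so that the spans match the output above
text = "Her dog eats. Her bird eats."

pattern = re.compile(r"(H|h)er\s+(dog|cat|bird)\s+\w+")

matches = list(pattern.finditer(text))
for m in matches:
    print(m)           # the match object, with its span and matched text
    print(m.group(0))  # only the matched substring
```

Calling m.group(0) (or simply m.group()) on each match object gives the plain substring the question asks for.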

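OK, so I think the problem is that you are using findall, which returns only tuples of the captured groups when the pattern contains capture groups. If you use finditer instead, you get the full match objects.

Try this: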

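import re

# Read the whole file; the with block closes it afterwards
with open("text_file_thing.txt", "r") as f:
    text = f.read()

# Raw string so that \s and \w reach the regex engine unchanged
regex_string = r"(H|h)er\s+(dog|cat|bird)\s+\w+"
regex = re.compile(regex_string)
match_array = regex.finditer(text)

# Now you can either just loop through the iterator or
# convert it to a list if you need to keep the objects and not
# just print them
match_list = list(match_array)

for m in match_list:
    # group(0) is the matched substring; m.string would be the whole input text
    print(m.group(0))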

Your pattern defines capturing groups, so findall returns those groups. Use non-capturing groups to get the whole match:

import re

text = """Her pig groans
Her    dog swoons.
her bird feeds.
Her cat purrs."""
regex_string = r"(?:H|h)er\s+(?:dog|cat|bird)\s+\w+"
regex = re.compile(regex_string)
match_array = regex.findall(text)
print(match_array)

Output:

['Her    dog swoons', 'her bird feeds', 'Her cat purrs']

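See the re documentation:

(?:...): A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.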
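
The difference between the two kinds of groups is easiest to see by running findall with both patterns on the same input. A small sketch, using a hypothetical sample text:

```python
import re

# Hypothetical sample text
text = "Her dog eats. Her bird eats."

# With capturing groups, findall returns one tuple of groups per match
capturing = re.findall(r"(H|h)er\s+(dog|cat|bird)\s+\w+", text)
print(capturing)      # [('H', 'dog'), ('H', 'bird')]

# With non-capturing groups, findall returns the whole matched substring
non_capturing = re.findall(r"(?:H|h)er\s+(?:dog|cat|bird)\s+\w+", text)
print(non_capturing)  # ['Her dog eats', 'Her bird eats']
```

The grouping still works the same way for alternation; only what findall hands back changes.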

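I assume text is a multi-line file? When a match occurs, should the whole line be returned? Once the regex has matched and consumed part of the text, it will not revisit it.

@PyPingu I only need the substring, not the whole line. It is a multi-line file.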
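
The remark about consumed text can be illustrated with a toy input: findall and finditer return non-overlapping matches, so text covered by one match is not scanned again. The lookahead variant below is not part of the answers above; it is shown only for contrast, under the assumption that overlapping matches were wanted.

```python
import re

text = "aaaa"

# Matches are non-overlapping: each match consumes the text it covers
consumed = re.findall(r"aa", text)
print(consumed)     # ['aa', 'aa']

# A lookahead consumes nothing, so the scan advances one character at a time
overlapping = re.findall(r"(?=(aa))", text)
print(overlapping)  # ['aa', 'aa', 'aa']
```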
