Python regular expression for a paragraph


Hi, I have this as my test string:

<image>
<title>CNN.com - Technology</title>
<link>http://www.cnn.com/TECH/index.html?eref=rss_tech</link>



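I want to use a Python regex to select "Technology" from it, but I need it to be specific, so that the match is anchored on <image> and <link>. So far, my expression is: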

'<title[^>]*>CNN.com - (.*?)</title>'


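This expression does select "Technology", which is correct, but I am not sure how to use <image> and <link> in the expression to make it more specific. For example, I need something like a regular expression that also spans the <image> and <link> tags around the title and still produces the same result, "Technology".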

How about something like this:

(<image>\n<title>CNN.com - )(.*?)(<\/title>\n.*)


Group 2 is

Technology
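As a rough check of the three-group pattern above, here is a minimal sketch. It assumes the test string uses plain "\n" line endings (not "\r\n"); the variable names are only illustrative. The backslash before the slash in the original pattern is dropped here because / is not special in Python regexes.

```python
import re

# Test string from the question, with "\n" between the tags (assumption).
s = """<image>
<title>CNN.com - Technology</title>
<link>http://www.cnn.com/TECH/index.html?eref=rss_tech</link>"""

# Group 1: everything up to the topic, group 2: the topic,
# group 3: the closing tag, the newline, and the rest of the next line.
pattern = re.compile(r"(<image>\n<title>CNN.com - )(.*?)(</title>\n.*)")

m = pattern.search(s)
topic = m.group(2) if m else None
print(topic)
```

Because the newlines are written literally as \n, this pattern stops matching if the input has extra blank lines or indentation between the tags.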

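Your regexp is fine. Escaping the slash in </title> with a backslash is harmless in Python but not required, since / is not a regex metacharacter. A pattern that spans <image> and <link> fails to match because the string contains newlines between the tags.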

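Newlines are whitespace, like spaces and tabs. When the UNICODE flag is not set, \s is equivalent to [ \t\n\r\f\v] (in Python 3 that is the behavior under re.ASCII; by default \s on str patterns also matches other Unicode whitespace), so you can use \s to match them.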
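A small sketch of what \s covers, assuming Python 3 and str (not bytes) patterns; the variable names are only for illustration:

```python
import re

# \s matches every character of the ASCII whitespace class, newline included.
ascii_ws_ok = re.fullmatch(r"\s+", " \t\n\r\f\v") is not None

# \s* therefore bridges a line break between two tags.
bridged = re.search(r"<image>\s*<title>", "<image>\n<title>") is not None

# By default a str pattern also treats U+00A0 (no-break space) as whitespace;
# with re.ASCII only [ \t\n\r\f\v] counts.
nbsp_default = re.fullmatch(r"\s", "\u00a0") is not None
nbsp_ascii = re.fullmatch(r"\s", "\u00a0", re.ASCII) is not None

print(ascii_ws_ok, bridged, nbsp_default, nbsp_ascii)
```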

I assume you are using Python 3, but that does not matter here.

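import re

# Test string from the question; the tags are separated by newlines.
s = """<image>
<title>CNN.com - Technology</title>
<link>http://www.cnn.com/TECH/index.html?eref=rss_tech</link>"""
# [\s]* absorbs the newlines (or any other whitespace) between the tags.
r = r"<image>[\s]*<title[^>]*>CNN.com - (.*?)<\/title>[\s]*<link>"
m = re.search(r, s)
print(m.group(0))  # the whole match, from <image> to <link>
print(m.group(1))  # the captured topic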


m.group(1) is "Technology".

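If you enable the "single line" option for your regex (re.DOTALL, also spelled re.S, in Python), the dot (.) also matches a newline. You can then write: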

<image>.<title[^>]*>CNN.com - (.*?)</title>.<link>
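A minimal sketch of that pattern with and without the flag, assuming exactly one "\n" between the tags (each dot consumes a single character); the variable names are illustrative:

```python
import re

s = """<image>
<title>CNN.com - Technology</title>
<link>http://www.cnn.com/TECH/index.html?eref=rss_tech</link>"""

pattern = r"<image>.<title[^>]*>CNN.com - (.*?)</title>.<link>"

# Without re.DOTALL the dot refuses to match "\n", so the search fails.
without_flag = re.search(pattern, s)

# With re.DOTALL the dot matches the newline between the tags.
with_flag = re.search(pattern, s, re.DOTALL)
topic = with_flag.group(1) if with_flag else None

print(without_flag)
print(topic)
```

If the input might contain "\r\n" or indentation between the tags, \s* is probably the more forgiving choice than a single dot.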


At this point, I suggest trying an online regex tester that generates Python code.
