Python: how to get the rightmost match with a regular expression?

python regex python-3.x

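I think this is a common problem, but I haven't found a satisfactory answer elsewhere.

Suppose I extract some links from a website. The links look like this: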

http://example.com/goto/http://example1.com/123.html
http://example1.com/456.html
http://example.com/yyy/goto/http://example2.com/789.html
http://example3.com/xxx.html

I want to use a regular expression to convert them into the real links:

http://example1.com/123.html
http://example1.com/456.html
http://example2.com/789.html
http://example3.com/xxx.html

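However, I can't get this to work because regular expressions are greedy. The pattern

'http://.*$'

just matches the whole string. Then I tried

'http://.*?$'

but that didn't work either, and neither did re.findall. So, is there another way to do this?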

Yes, I can do it with str.split or str.index. But I'm still curious whether there is a regex solution.
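
To see why both attempts fail, here is a minimal check (the variable names are mine, not from the question). A regex engine reports the leftmost match, so the match always starts at the first http://. Laziness only changes how far the match extends, and $ pushes it to the end of the string anyway.

```python
import re

link = "http://example.com/goto/http://example1.com/123.html"

# Greedy: .* runs to the end of the string, so the match is the whole link.
greedy = re.search(r"http://.*$", link).group(0)

# Lazy: .*? prefers short matches, but the match still starts at the first
# "http://" and "$" forces it to extend to the end of the string.
lazy = re.search(r"http://.*?$", link).group(0)

# re.findall scans left to right and returns non-overlapping matches,
# so it also reports the single full-length match.
found = re.findall(r"http://.*?$", link)

print(greedy == link, lazy == link, found == [link])
```

All three comparisons print True, which is the behaviour described in the question.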

You don't need a regular expression. You can use str.split() to split each link on //, then take the last part and prepend http:// to it:

>>> s="""http://example.com/goto/http://example1.com/123.html
... http://example1.com/456.html
... http://example.com/yyy/goto/http://example2.com/789.html
... http://example3.com/xxx.html"""
>>> ['http://'+link.split('//')[-1] for link in s.split('\n')]
['http://example1.com/123.html', 'http://example1.com/456.html', 'http://example2.com/789.html', 'http://example3.com/xxx.html']
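
The question also mentions plain string methods. As a sketch of that route, str.rfind can locate the last http:// directly. This assumes every real link starts with http://; last_url and its scheme parameter are hypothetical names used only for illustration.

```python
def last_url(link, scheme="http://"):
    """Return the part of link that starts at the last occurrence of scheme."""
    start = link.rfind(scheme)
    # rfind returns -1 when scheme is absent; leave such links untouched.
    return link if start == -1 else link[start:]


links = [
    "http://example.com/goto/http://example1.com/123.html",
    "http://example1.com/456.html",
    "http://example.com/yyy/goto/http://example2.com/789.html",
    "http://example3.com/xxx.html",
]

real_links = [last_url(link) for link in links]
print(real_links)
```

This only looks for http://, so links using another scheme would need a different search string.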

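With a regex, you just need to replace all the characters between the two // with an empty string, but since one of the // has to be kept for the first part:

>>> [re.sub(r'(? …

Use this pattern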
^(.*?[^/])(?=\/[^/]).*?([^/]+)$

and replace with $1/$2




After reading the comments below, use this pattern to capture what you want:
(http://(?:[^h]|h(?!ttp:))*)$



Or this pattern:
(http://(?:(?!http:).)*)$



Or this pattern:
http://.*?(?=http://)

and replace it with nothing.
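
The answer gives the patterns without a host language. Below is a sketch of how the three later patterns could be used from Python's re module; the names p1, p2, p3 and the list names are mine. (For the first pattern, Python would write the replacement as \1/\2 rather than $1/$2.)

```python
import re

links = [
    "http://example.com/goto/http://example1.com/123.html",
    "http://example1.com/456.html",
    "http://example.com/yyy/goto/http://example2.com/789.html",
    "http://example3.com/xxx.html",
]

# Capture from the last "http://" to the end of the string. The alternation
# accepts any character except an "h" that starts another "http:".
p1 = re.compile(r"(http://(?:[^h]|h(?!ttp:))*)$")

# Same idea with a negative lookahead checked before every character.
p2 = re.compile(r"(http://(?:(?!http:).)*)$")

# Match the wrapper prefix (everything up to the next "http://") so it can
# be replaced with an empty string.
p3 = re.compile(r"http://.*?(?=http://)")

by_p1 = [p1.search(link).group(1) for link in links]
by_p2 = [p2.search(link).group(1) for link in links]
by_p3 = [p3.sub("", link) for link in links]

print(by_p1 == by_p2 == by_p3)
print(by_p1)
```

The first two rely on re.search moving the start position forward until the rest of the string contains no further http:. The third leaves links without a wrapper unchanged because the lookahead never succeeds on them.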

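Well, I know I can do it with some string operations. But I still want a one-step re solution.

@user2923419 OK, I added another way using regex! ;) But I still recommend the first way!

@Kasra Thanks. I hadn't considered using sub. But sub is very useful here, because it returns the string unchanged when there is no match. I found another solution that doesn't use sub: re.match("(?:http://.*)(http://.+)$", original_link).group(1)

@user2923419 Glad to hear that! So how about accepting my answer? ;)

@JanneKarila, I misunderstood the request; I've updated the answer above.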
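
The comments contrast re.sub with re.match. The re.sub call in the first answer is cut off in this copy, so the sketch below uses a look-behind pattern, (?<=//).*//, as my own assumption of one way to do what that answer describes, not as its actual code. The re.match pattern is the one quoted in the comment; the variable names are hypothetical.

```python
import re

nested = "http://example.com/goto/http://example1.com/123.html"
plain = "http://example1.com/456.html"

# Assumed pattern: delete everything after the first "//" up to and
# including the last "//". The look-behind keeps the first "//" in place.
strip_wrapper = re.compile(r"(?<=//).*//")

sub_nested = strip_wrapper.sub("", nested)
sub_plain = strip_wrapper.sub("", plain)  # no match, so returned unchanged

# Pattern quoted in the comment: it requires two "http://" in the string.
comment_pattern = re.compile(r"(?:http://.*)(http://.+)$")

m_nested = comment_pattern.match(nested)
m_plain = comment_pattern.match(plain)  # None: there is no wrapper prefix

print(sub_nested, sub_plain)
print(m_nested.group(1), m_plain)
```

Calling .group(1) directly on the re.match result raises AttributeError for a link without a wrapper, since match returns None there, so that variant needs a None check, while sub handles both kinds of link.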