Python: the perfect regular expression for extracting URLs with re.findall()


python regex python-3.x


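I searched Google for a regular expression to extract URLs, but every one I found either fails on one particular example or makes the Python interpreter hang.

The URL is “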

A regular expression for URLs, for use with re.findall in Python:

http[s]?:\/\/(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+

If you need a capturing group:

(http[s]?:\/\/(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+)
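As a minimal sketch of how the two patterns behave with re.findall, the snippet below runs both against a made-up sentence. The sample text and the variable names are assumptions for illustration and do not come from the original question. Because the capturing variant wraps the whole pattern in a single group, re.findall should return the same strings either way.

```python
import re

# Patterns copied from the answer above; raw strings keep the backslashes intact.
URL_RE = r"http[s]?:\/\/(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
URL_RE_GROUP = "(" + URL_RE + ")"  # same pattern wrapped in one capturing group

# Hypothetical sample text, chosen for illustration only.
text = "Docs live at https://example.com/docs?page=2 and http://example.org/a%20b today."

# With no capturing group, findall returns the full matches.
urls = re.findall(URL_RE, text)
# With exactly one capturing group, findall returns that group's text.
urls_from_group = re.findall(URL_RE_GROUP, text)

print(urls)
print(urls == urls_from_group)
```

One caveat: a URL followed directly by a comma or a full stop would have that character included in the match, since both characters are in the accepted set.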


http matches the characters http literally (case sensitive)
[s]? matches a single character present in the list
  Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
  s the literal character s (case sensitive)
: matches the character : literally
\/ matches the character / literally
\/ matches the character / literally
(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+ Non-capturing group
  Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
  1st Alternative: [a-zA-Z]
    [a-zA-Z] matches a single character present in the list below
      a-z a single character in the range between a and z (case sensitive)
      A-Z a single character in the range between A and Z (case sensitive)
  2nd Alternative: [0-9]
    [0-9] matches a single character present in the list below
      0-9 a single character in the range between 0 and 9
  3rd Alternative: [$-_@.&+]
    [$-_@.&+] matches a single character present in the list below
      $-_ a single character in the range between $ and _
      @.&+ a single character in the list @.&+ literally (case sensitive)
  4th Alternative: [!*\(\),]
    [!*\(\),] matches a single character present in the list below
      !* a single character in the list !* literally
      \( matches the character ( literally
      \) matches the character ) literally
      , the literal character ,
  5th Alternative: (?:%[0-9a-fA-F][0-9a-fA-F])
    (?:%[0-9a-fA-F][0-9a-fA-F]) Non-capturing group
      % matches the character % literally
      [0-9a-fA-F] matches a single character present in the list below
        0-9 a single character in the range between 0 and 9
        a-f a single character in the range between a and f (case sensitive)
        A-F a single character in the range between A and F (case sensitive)
      [0-9a-fA-F] matches a single character present in the list below
        0-9 a single character in the range between 0 and 9
        a-f a single character in the range between a and f (case sensitive)
        A-F a single character in the range between A and F (case sensitive)
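The range $-_ in the third alternative is worth checking by hand, because it spans every ASCII code point from 0x24 to 0x5F rather than three individual symbols. The following sketch (the variable names are illustrative) collects the printable ASCII characters that this one character class accepts.

```python
import re

# The third alternative from the pattern, on its own.
char_class = re.compile(r"[$-_@.&+]")

# Test every printable ASCII character (0x20 to 0x7E) against the class.
accepted = "".join(
    chr(code) for code in range(0x20, 0x7F) if char_class.fullmatch(chr(code))
)

print(accepted)
print(len(accepted))
```

The result includes the digits, the uppercase letters, and punctuation such as %, /, :, ? and =, which is why the pattern matches paths and query strings even though none of those characters is listed explicitly.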

@yole ((https?:\/\/)([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\.-]*)*\/?), some of the regular expressions are here.

OP: please tell us exactly what happens when you try to match with that regular expression. "Doesn't work" gives us nothing to go on.

How about this one? [long pattern, not recoverable; it used x{00a1}-style character ranges]

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 376-377: truncated \xXX escape

Thanks for expanding the answer, but now I've spotted a possible bug. You didn't mean to select the range from $ to _ in the regular expression, did you? Presumably you just wanted to match any one of $, - and _. Unfortunately, because the range also matches the % symbol, this completely breaks the percent-encoding logic.
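To illustrate the point made in this last comment, here is a hedged sketch comparing the answer's pattern with one possible tightening. TIGHTENED is a hypothetical rewrite, not a pattern proposed in the thread: it places - at the end of the class so that it is literal, and it lists explicitly a few URL delimiters (/ : ; = ? ') that the original only matched through the accidental range. The sample text is also made up.

```python
import re

# Pattern from the answer: the range $-_ also matches '%'.
ORIGINAL = r"http[s]?:\/\/(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"

# Assumed rewrite: no accidental range, so '%' is only accepted when it is
# followed by two hexadecimal digits.
TIGHTENED = r"https?://(?:[a-zA-Z0-9$_@.&+!*(),/:;=?'-]|%[0-9a-fA-F]{2})+"

# Hypothetical input: %20 is a valid percent-escape, %zz is not.
text = "see http://example.com/a%20b%zz for details"

original_hits = re.findall(ORIGINAL, text)
tightened_hits = re.findall(TIGHTENED, text)

print(original_hits)   # the invalid escape is swallowed by the range
print(tightened_hits)  # the match stops before the invalid escape
```

Whether stopping at an invalid escape is the right behaviour depends on the input being scanned; the sketch only shows that the percent-encoding alternative has no effect while % is inside the range.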