Unable to debug a Python regular expression

python regex

I am trying to debug the following Python regular expression:

<meta name="Author" content=".*(?P<uid>([a-zA-Z]*))@abc\.com.*

I am using the following string as an example:

<meta name="Author" content="qwerty(qwerty@abc.com)#comments=release candidate for AA 1.1">

Could you clarify why the following code does not find the group uid?

import re

regex = re.compile(r'<meta name="Author" content=".*(?P<uid>([a-zA-Z]*))@abc\.com')
a = '<meta name="Author" content="qwerty(qwerty@abc.com)#comments=release candidate for AA 1.1">'
q = regex.search(a)
if q:
    print(q.group('uid'))

I even drew a DFA, but I still cannot understand why the group is not found.

All you need is:

import re

regex = re.compile(r'(?P<uid>([a-zA-Z]*))@abc\.com')
a = '<meta name="Author" content="qwerty(qwerty@abc.com)#comments=release candidate for AA 1.1">'
q = regex.search(a)
if q:
    print(q.group('uid'))

Output: qwerty

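As @Błotosmętek explains, your pattern does not work because of the greediness of .*.

The problem is caused by the greediness of the .* pattern. In content=".*(?P<uid>[a-zA-Z]*)@abc\.com, everything before @abc\.com is matched by .*, leaving only an empty string for your group to match. Because [a-zA-Z]* is allowed to match zero characters, the engine never has to make .* give those letters back. Peter Prescott's solution above is reasonable, but if you insist on using the longer regexp, use: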

r'<meta name="Author" content=".*\((?P<uid>[a-zA-Z]*)@abc\.com'

This way, .* stops matching at the opening parenthesis.
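The difference between the two long patterns can be checked directly. The following is a minimal sketch, not part of the original answers: the variable names (greedy, anchored, and so on) are made up for illustration, and the redundant inner group from the question is dropped for readability. It suggests that the original pattern does match, but its uid group is an empty string sitting right before the @.

```python
import re

a = '<meta name="Author" content="qwerty(qwerty@abc.com)#comments=release candidate for AA 1.1">'

# Pattern from the question: the greedy .* swallows the letters before "@abc.com".
greedy = re.compile(r'<meta name="Author" content=".*(?P<uid>[a-zA-Z]*)@abc\.com')
# Pattern from the answer: the literal \( forces .* to stop at the parenthesis.
anchored = re.compile(r'<meta name="Author" content=".*\((?P<uid>[a-zA-Z]*)@abc\.com')

greedy_match = greedy.search(a)
anchored_match = anchored.search(a)

# Both searches succeed; only the content of the uid group differs.
print(repr(greedy_match.group('uid')))    # ''
print(greedy_match.span('uid'))           # start == end: a zero-width group
print(repr(anchored_match.group('uid')))  # 'qwerty'
```

Printing with repr() makes the empty string visible. A plain print(q.group('uid')) only emits a blank line, which is easy to mistake for "group not found".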

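Are you not getting the expected value in group uid? Simply remove the .* subpatterns: they match all the text on the line before and after the string you need to extract.

I see, the solution proposed by Peter Prescott does work. But why doesn't re.findall find this group?

@GrigoriyVolkov - if it works, please:

Why would re.findall fix the group's greediness? Empty matches are still matches. findall only means that the search does not stop after a match is found. Non-empty matches can now start just after a previous empty match.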
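To illustrate the point about re.findall, here is a small sketch under the same assumptions as the thread's example string. The names greedy_result, short_result and no_match_result are hypothetical, and the patterns use a single group so that findall returns plain strings rather than tuples. It shows that findall with the greedy pattern still reports one match whose group is empty, which is different from finding nothing at all.

```python
import re

a = '<meta name="Author" content="qwerty(qwerty@abc.com)#comments=release candidate for AA 1.1">'

greedy = r'<meta name="Author" content=".*(?P<uid>[a-zA-Z]*)@abc\.com'
short = r'(?P<uid>[a-zA-Z]*)@abc\.com'

# With exactly one group in the pattern, findall returns a list of that group's strings.
greedy_result = re.findall(greedy, a)    # one match, but its group is empty
short_result = re.findall(short, a)      # one match with the expected value
no_match_result = re.findall(short, 'no address here')  # no match at all

print(greedy_result)    # ['']
print(short_result)     # ['qwerty']
print(no_match_result)  # []
```

A result of [''] means the pattern matched and the group captured nothing, while [] means the pattern never matched.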