Python regular expressions with greedy groups


Here is the content of the input file:

 sb.txt
 JOHN:ENGINEER:35
These are the patterns I use to evaluate the file:

import re

with open(r'C:\Users\dhiwakarr\PycharmProjects\BasicConcepts\sb.txt', 'r') as finp:
    for line in finp:
        # Quantifier outside the parentheses: a one-character group, repeated.
        biodata1 = re.search(r'([\w\W])+?:([\w\W])+?:([\w\W])+?', line)
        # Quantifier inside the parentheses: the group spans a run of characters.
        biodata2 = re.search(r'([\w\W]+?):([\w\W]+?):([\w\W]+?)', line)
        print('line is ' + line)
        print(r're.search(r([\w\W])+?:([\w\W])+?:([\w\W])+? ' + biodata1.group(1) + ' ' + biodata1.group(2) + ' ' + biodata1.group(3))
        print(r're.search(r([\w\W]+?):([\w\W]+?):([\w\W]+?) ' + biodata2.group(1) + ' ' + biodata2.group(2) + ' ' + biodata2.group(3))
This is the output I get:

line is JOHN:ENGINEER:35
re.search(r([\w\W])+?:([\w\W])+?:([\w\W])+? N R 3
re.search(r([\w\W]+?):([\w\W]+?):([\w\W]+?) JOHN ENGINEER 3
I have a few questions about this output:

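  • Why does the first search pattern match the last characters of JOHN and ENGINEER, but the first character of 35? I expected the non-greedy "?" to stop as soon as it found the first character of JOHN and of ENGINEER.

  • Can someone help me understand how the position of "+?" affects the two statements?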


  • The difference between biodata1 and biodata2 is the position of the parentheses.

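    biodata1:

    ([\w\W])+?:([\w\W])+?:([\w\W])+?

    Explanation:

    The quantifier is outside the parentheses, so each group holds exactly one character and that one-character group is repeated. A repeated group keeps only its last repetition, so group(1) is the single character immediately before the first ":" (N), and likewise group(2) is the character before the second ":" (R).
    Nothing follows group(3), so there is no ending criterion for it: the lazy +? stops after one repetition and it matches only the first character after the second ":", which is 3.

    biodata2:

    ([\w\W]+?):([\w\W]+?):([\w\W]+?)

    Explanation:

    The quantifier is inside the parentheses, so group(1) captures all the word and non-word characters before the first ":" (there must be at least one), and likewise group(2) captures everything between the two colons.
    group(3) could match all the word and non-word characters after the second ":", but because +? is lazy and nothing follows it, it stops after the first character, 3.

    +?:

    This matches one or more characters that match the preceding expression, taking as few as the rest of the pattern allows.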
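The difference described above can be checked directly. The following is a minimal sketch that uses the sample line from the question as a literal string instead of reading sb.txt; the variable names `outside`, `inside` and `greedy` are only illustrative, and the greedy variant is added here for comparison and is not part of the original question.

```python
import re

line = "JOHN:ENGINEER:35"

# Quantifier outside the parentheses: the group is one character wide and is
# repeated, so after the match it holds only the last repetition.
outside = re.search(r'([\w\W])+?:([\w\W])+?:([\w\W])+?', line)

# Quantifier inside the parentheses: the group spans the whole run.
inside = re.search(r'([\w\W]+?):([\w\W]+?):([\w\W]+?)', line)

# Greedy quantifiers (no "?"): the last group now runs to the end of the line.
greedy = re.search(r'([\w\W]+):([\w\W]+):([\w\W]+)', line)

print(outside.groups())  # ('N', 'R', '3')
print(inside.groups())   # ('JOHN', 'ENGINEER', '3')
print(greedy.groups())   # ('JOHN', 'ENGINEER', '35')

# Both lazy patterns consume the same text; only what the groups keep differs.
print(outside.group(0))  # JOHN:ENGINEER:3
```

In both lazy patterns the first two groups are forced to extend up to the next ":", because the rest of the pattern cannot match otherwise. Only the last group is free to stop after a single character.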
    

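    What value should each group contain?

    I am not looking for any specific value in the groups. I am just trying to understand why the last characters of JOHN and ENGINEER, i.e. "N" and "R", are matched. I expected the non-greedy quantifier to stop matching as soon as it found the first character in the first pattern, biodata1. Also, why does biodata2 match all the characters?

    So you are trying to match the first character of each alphanumeric word?

    Yes, that is what I expected, but I am confused about why the last characters are matched.

    Thanks @Vignesh Kalai for explaining group(1) and group(2) in biodata1, but wouldn't +? stop as soon as it finds one character, "J", before ":" and then move on to the next group in biodata1?

    No, +? means it must match at least one character and it may match more than one. In biodata1 each group can hold only one character, and you gave that character an end limit, namely ":", so the group ends up holding the one character just before ":".

    How can I modify it so that it matches only one character at the start, like the "J" in "JOHN" and the "E" in "ENGINEER"?

    @DhiwakarRavikumar I am not that proficient with regular expressions; you may have to ask others, or post another question under the regex tag.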
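The last comment was left unanswered in the thread. One possible approach, offered as a sketch and not as the answerer's solution, is to capture a single character and then consume the rest of each field outside the group. It assumes every field is non-empty and that ":" never appears inside a field; the names `first_chars` and `initials` are illustrative.

```python
import re

line = "JOHN:ENGINEER:35"

# Capture one non-colon character, then consume the remainder of the field
# outside the parentheses so that it is matched but not kept.
first_chars = re.search(r'([^:])[^:]*:([^:])[^:]*:([^:])', line)
print(first_chars.groups())  # ('J', 'E', '3')

# The same result without a regular expression: split on ":" and take the
# first character of each field (assumes no field is empty).
initials = [field[0] for field in line.split(':')]
print(initials)  # ['J', 'E', '3']
```

Using `[^:]` instead of `[\w\W]` keeps each group from running across a colon, so no lazy quantifier is needed.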