Python regular expression behaves unexpectedly


Script:

import re

matches = ['hello', 'hey', 'hi', 'hiya']

def check_match(string):
    for item in matches:
        if re.search(item, string):
            print('Match found: ' + string)
        else:
            print('Match not found: ' + string)

check_match('hey')
check_match('hello there')
check_match('this should not match')
check_match('oh, hiya')

Output:

Match not found: hey
Match found: hey
Match not found: hey
Match not found: hey
Match found: hello there
Match not found: hello there
Match not found: hello there
Match not found: hello there
Match not found: this should not match
Match not found: this should not match
Match found: this should not match
Match not found: this should not match
Match not found: oh, hiya
Match not found: oh, hiya
Match found: oh, hiya
Match found: oh, hiya

There is a lot here that I don't understand. For starters, each string is searched four times in this output; some come back as a found match twice, some three times. I'm not sure what mistake in my code causes this. Could someone take a look and spot what is wrong?

The expected output is:

Match found: hey
Match found: hello there
Match not found: this should not match
Match found: oh, hiya

There are 4 searches and 4 lines of output for each string because you loop over the list, searching for and printing something for every element in it.

This is not incorrect behavior; it is a misunderstanding of what re.search(...) does. It looks for the pattern anywhere in the string, so hi is found inside this.

See the comments after each line of output:

Match not found: hey                    # because 'hello' is not in 'hey'
Match found: hey                        # because 'hey' is in 'hey'
Match not found: hey                    # because 'hi' is not in 'hey'
Match not found: hey                    # because 'hiya' is not in 'hey'

Match found: hello there                # because 'hello' is in 'hello there'
Match not found: hello there            # because 'hey' is not in 'hello there'
Match not found: hello there            # because 'hi' is not in 'hello there'
Match not found: hello there            # because 'hiya' is not in 'hello there'

Match not found: this should not match  # because 'hello' is not in 'this should not match'
Match not found: this should not match  # because 'hey' is not in 'this should not match'
Match found: this should not match      # because 'hi' is in 'this should not match'
Match not found: this should not match  # because 'hiya' is not in 'this should not match'

Match not found: oh, hiya               # because 'hello' is not in 'oh, hiya'
Match not found: oh, hiya               # because 'hey' is not in 'oh, hiya'
Match found: oh, hiya                   # because 'hi' is in 'oh, hiya'
Match found: oh, hiya                   # because 'hiya' is in 'oh, hiya'

If you don't want the pattern hi to match the input oh, hiya, wrap the pattern in word boundaries:

\bhi\b

This makes it match only occurrences of hi that are not surrounded by other word characters (well-hiya-there will not match the pattern \bhi\b, but well-hi-there will).
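
To make the effect of the word boundary concrete, here is a small sketch that runs both the plain pattern hi and the bounded pattern \bhi\b over a few sample strings. The names plain, bounded, samples and results are made up for this illustration and do not come from the question.

```python
import re

# Unanchored pattern: finds "hi" anywhere, even inside another word.
plain = re.compile(r'hi')
# Word-boundary pattern: "hi" must stand alone as a whole word.
bounded = re.compile(r'\bhi\b')

samples = ['this should not match', 'oh, hiya',
           'well-hiya-there', 'well-hi-there']

# Map each sample to (plain pattern matched?, bounded pattern matched?).
results = {s: (bool(plain.search(s)), bool(bounded.search(s)))
           for s in samples}

for s, (plain_hit, bounded_hit) in results.items():
    print('%-22s plain=%s bounded=%s' % (s, plain_hit, bounded_hit))
```

Only well-hi-there satisfies the bounded pattern, because the hyphens on either side of hi are not word characters.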


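The for loop checks the string against each pattern and prints "found" or "not found" for every one of them. What you actually want is to check whether any pattern matches and then print a single "found" or "not found". I don't really know Python, so the syntax may not be quite right:

def check_match(string):
    # relies on re and matches from the script in the question
    found = False  # becomes True as soon as any pattern matches
    for item in matches:
        if re.search(item, string):
            found = True
            break  # one hit is enough
    if found:
        print('Match found: ' + string)
    else:
        print('Match not found: ' + string)

Alternatively, try a more concise version that also flags multiple matches.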
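
One possible concise version is sketched below. It is an assumption about what such code could look like, not any answerer's original: find_matches is a hypothetical helper name, and adding the \b word boundaries and re.escape is a choice made here so that hi no longer matches inside this or hiya. The boundary trick assumes every pattern starts and ends with a word character.

```python
import re

matches = ['hello', 'hey', 'hi', 'hiya']

def find_matches(string, patterns=matches):
    """Return the patterns that occur in string as whole words."""
    # re.escape guards against patterns containing regex metacharacters.
    return [p for p in patterns
            if re.search(r'\b' + re.escape(p) + r'\b', string)]

def check_match(string):
    # Print one line per input string, listing every pattern that matched.
    found = find_matches(string)
    if found:
        print('Match found: %s (%s)' % (string, ', '.join(found)))
    else:
        print('Match not found: ' + string)
    return found

for s in ['hey', 'hello there', 'this should not match', 'oh, hiya']:
    check_match(s)
```

With the four inputs from the question this prints one line per string, and the list in parentheses shows which patterns were responsible for the match.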




Which regular expression are you matching against?