Python: compiling patterns into a list

python regex

I am currently porting some code from R to Python. The file I load for the grep step is formatted like this:

match                                               id
(chief\s+marketing\s+officer)|(\bc\.?m\.?o\.?\b)    3
(chief\s+technology\s+officer)|(\bc\.?t\.?o\.?\b)   4
(chief\s+information\s+officer)|(\bc\.?i\.?o\.?\b)  5
(\bdirector\b)                                      11

I load this into a pandas DataFrame and precompile the patterns, but I am running into a problem:

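import re

def compilePatterns():
    # `levels` is the pandas DataFrame loaded from the file above
    matches = levels['match']
    patterns = []
    for match in matches:
        # r'' + match adds nothing to a runtime string, so compile it directly
        pat = re.compile(match)
        patterns.append(pat)
    return patterns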

Now, using my extract function:

def extract(title):
    # `patterns` is the list returned by compilePatterns()
    title = title.lower()
    print(title)
    for index, pattern in enumerate(patterns):
        match = pattern.match(title)
        if match:
            return levels.iloc[index]['id']
    return None

If I call extract('director'), it works fine and I get 11, but if I call extract('pet director'), it returns nothing. So the director is never picked up.

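I am not sure whether the problem lies in how the patterns are compiled, since they have parentheses everywhere, or whether this is even the right approach.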

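pattern.match only returns a match at the beginning of the string. Since \bdirector\b does not occur at the beginning of the string 'Pet director', pattern.match('Pet director') returns None.

What you need is pattern.search (or re.search(pattern, ...)), which returns a match anywhere in the string.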
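The difference can be checked in isolation. The following is a minimal sketch that uses only the last pattern from the table in the question; the variable names `pattern` and `found` are chosen here for illustration.

```python
import re

# Same pattern as the last row of the table in the question.
pattern = re.compile(r'(\bdirector\b)')

# match() only tries position 0, so a title that merely contains the word fails.
print(pattern.match('director'))        # a match object
print(pattern.match('pet director'))    # None

# search() scans the whole string and finds the word wherever it occurs.
found = pattern.search('pet director')
print(found.group(0), found.span())     # director (4, 12)
```

The parentheses in the patterns are ordinary capture groups and do not affect whether match or search succeeds.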

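Thank you very much! Yes, I now realize that match only checks the beginning of the string. It looks like search has solved my problem. Thanks again.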
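For reference, the fix might look like this end to end. This is only a sketch under some assumptions: it replaces the pandas DataFrame with a plain list of (pattern, id) pairs so that it runs on its own, and the names `LEVELS` and `COMPILED` are hypothetical, not taken from the original code.

```python
import re

# Hypothetical stand-in for the file in the question: (pattern, id) pairs.
LEVELS = [
    (r'(chief\s+marketing\s+officer)|(\bc\.?m\.?o\.?\b)', 3),
    (r'(chief\s+technology\s+officer)|(\bc\.?t\.?o\.?\b)', 4),
    (r'(chief\s+information\s+officer)|(\bc\.?i\.?o\.?\b)', 5),
    (r'(\bdirector\b)', 11),
]

# Compile once, keeping each compiled pattern next to its id.
COMPILED = [(re.compile(pat), level_id) for pat, level_id in LEVELS]

def extract(title):
    """Return the id of the first pattern found anywhere in title, else None."""
    title = title.lower()
    for pattern, level_id in COMPILED:
        # search() instead of match(): the keyword may appear mid-title.
        if pattern.search(title):
            return level_id
    return None

print(extract('Pet Director'))             # 11
print(extract('Chief Marketing Officer'))  # 3
print(extract('Acting C.T.O.'))            # 4
print(extract('intern'))                   # None
```

Storing each pattern together with its id avoids looking the id up by position afterwards, which is one possible design; the index-based lookup in the question works just as well once match is replaced by search.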