Searching a directory for a string in Python

python regex


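I am trying to search a directory in Python for a given string pattern, and I then want to collect the matches into an array.

At first, I tried using grep: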

regex = " dojo.require(..*) "
bashCommand = "grep"+" --only-matching -r -h"+regex+baseDir
process = subprocess.Popen(bashCommand.split(), stdout=subprocess.PIPE)
dirStr = process.communicate()[0]

But I realized that I need to support strings that span multiple lines, such as:

dojo.require(
"abc");

So grep is not an option.

What else could I use to achieve this? Thanks in advance.

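You can use `pcregrep`. For this to work, the regular expression has to be adapted to match across multiple lines.

You could also build something yourself using:

- `os.walk` to visit all files recursively
- `re.search` to search for the right expression

There is an example of this approach.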
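A minimal sketch of the `os.walk` plus `re.search` idea, offered as one possible shape rather than a definitive implementation. The function name `find_matching_files`, the UTF-8 decoding with `errors="replace"`, and the choice to skip unreadable files are assumptions made for illustration and do not come from the answer above.

```python
import os
import re


def find_matching_files(pattern, base_dir):
    """Return the paths of files under base_dir whose content matches pattern."""
    # re.DOTALL lets "." match newlines, so one match may span several lines.
    compiled = re.compile(pattern, re.DOTALL)
    matching = []
    # os.walk yields (directory, subdirectories, files) for the whole tree.
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Assumption: files are text; undecodable bytes are replaced.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    content = fh.read()
            except OSError:
                # Unreadable file (permissions, broken link): skip it.
                continue
            if compiled.search(content):
                matching.append(path)
    return sorted(matching)
```

Calling `find_matching_files(r"dojo\.require\(.*?\)", base_dir)` would list every file in the tree that contains such a call, even when the call is split over several lines. Each file is read into memory in full, which is fine for source trees but may not suit very large files.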

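Instead of calling grep, you can implement this in pure Python using a combination of `os.listdir` and the `re` module. The `re.DOTALL` flag makes `.` match newlines as well, which allows a match to span multiple lines. For example: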

import os
import re

def grep(regex, base_dir):
    # re.DOTALL lets "." match newlines, so a match can span several lines
    compiled_regex = re.compile(regex, re.DOTALL)
    matches = []
    for filename in os.listdir(base_dir):
        full_filename = os.path.join(base_dir, filename)
        # Skip subdirectories and anything else that is not a regular file
        if not os.path.isfile(full_filename):
            continue
        with open(full_filename) as fh:
            content = fh.read()
        if compiled_regex.search(content):
            matches.append(full_filename)
    return matches

print(grep(" dojo.require(..*) ", "."))

I really like this approach. One thing: how can I get the actual matches themselves, rather than the file names, and put them in a list? I see that search() returns a MatchObject... can I expand() it to get the actual match?

Use MatchObject.group(0) to get the whole matched string, or MatchObject.group(1) to get the first matching group (that is, everything inside the first pair of parentheses in the regular expression). However, I think your regular expression is not escaped properly and will not do what you expect. Try changing it to "dojo\.require\((.*?)\)"; you should then be able to use MatchObject.group(1) to access everything inside the parentheses.
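A minimal sketch of what the comment suggests: collecting the matched text instead of the file names. The names `collect_matches` and `DOJO_REQUIRE`, the use of `finditer` to pick up every occurrence in a file (rather than only the first, as `search` would), the whitespace stripping, and the UTF-8 decoding are all assumptions made for illustration.

```python
import os
import re

# Escaped literal parentheses around one lazy capturing group.
DOJO_REQUIRE = re.compile(r"dojo\.require\((.*?)\)", re.DOTALL)


def collect_matches(base_dir, compiled=DOJO_REQUIRE):
    """Return the text captured by group 1 for every match in base_dir's files."""
    found = []
    # Sorted so the result order does not depend on the file system.
    for filename in sorted(os.listdir(base_dir)):
        path = os.path.join(base_dir, filename)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8", errors="replace") as fh:
            content = fh.read()
        # finditer yields one match object per non-overlapping match.
        for match in compiled.finditer(content):
            # group(1) is the text between the literal parentheses;
            # strip() drops the newline left by a call split over two lines.
            found.append(match.group(1).strip())
    return found
```

The lazy `.*?` stops at the first closing parenthesis, so two calls in the same file produce two separate entries rather than one long match. Use `match.group(0)` instead of `match.group(1)` if the whole `dojo.require(...)` text is wanted.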