Python: extracting specific text from a log file using regular expressions

python regex

I have the following log file:

2020-06-30 12:44:06,608 DEBUG [main] [apitests.ApiTest] Reading of Excel File Started
2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] The Keyword's Entered : Asus Laptop
2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] No of Keywords Entered = 1
2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] Response Code from API : 200
2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] Time Taken : 1959 milliseconds
2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] The Result Obtained from API is : {"keywords": {"Asus Laptop": ["Premium grade"]}}
2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] --------------------------------------------------------------------------------------
2020-06-30 12:44:12,136 DEBUG [main] [apitests.ApiTest] The Keyword's Entered : Intext Hardrive
2020-06-30 12:44:12,136 DEBUG [main] [apitests.ApiTest] No of Keywords Entered = 1
2020-06-30 12:44:12,136 DEBUG [main] [apitests.ApiTest] Response Code from API : 200
2020-06-30 12:44:12,136 DEBUG [main] [apitests.ApiTest] Time Taken : 243 milliseconds
2020-06-30 12:44:12,136 DEBUG [main] [apitests.ApiTest] The Result Obtained from API is : {"keywords": {"Intext Hardrive": ["Medium grade"]}}
2020-06-30 12:44:12,136 DEBUG [main] [apitests.ApiTest] --------------------------------------------------------------------------------------

My goal is to extract the words ["Premium grade"], ["Medium grade"], and so on. Basically, I want the values of the key-value pairs.

I wrote the following code:

import re
with open('quality.log', 'r') as text_file:
    text_file=text_file.read()  
    for line in text_file :  
        matches=re.findall(r"\[(.*?)\]", line)[0]
with open('qualitygrade.txt', 'w') as out:
    out.write('\n'.join(matches))

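The goal of

re.findall(r"\[(.*?)\]", line)[0]

is just to extract "Premium grade", "Medium grade", and so on.

I'm not sure what I'm doing wrong. My output text file is empty.

Any help would be appreciated.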

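This for loop overwrites matches on every line:

for line in text_file :  
    matches=re.findall(r"\[(.*?)\]", line)[0]

You can either
(a) write to the output file as soon as you find a match, or
(b) store the matches in a separate variable.
(b) would look something like this:

import re

matches = []

with open('quality.log', 'r') as text_file:
    # Iterate over the file object itself so that each item is a whole line;
    # looping over the string returned by read() would yield single characters.
    for line in text_file:
        matches += re.findall(r"\[.*?\]", line)

with open('qualitygrade.txt', 'w') as out:
    out.write('\n'.join(matches))

You will also need to fix your regex, because the one you are currently using will also capture some other tokens in the log.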
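To illustrate both points (accumulating matches across lines, and the broad pattern picking up extra tokens), here is a minimal sketch. It assumes two sample lines copied from the question's log and uses io.StringIO as a stand-in for the opened file; the helper name collect is hypothetical and not part of the original code.

```python
import io
import re

# Two sample lines from the question's log, standing in for quality.log.
SAMPLE_LOG = '''2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] Response Code from API : 200
2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] The Result Obtained from API is : {"keywords": {"Asus Laptop": ["Premium grade"]}}
'''


def collect(pattern, lines):
    """Accumulate every match of `pattern` across all lines (hypothetical helper)."""
    matches = []
    for line in lines:  # a file-like object yields one whole line per iteration
        matches.extend(re.findall(pattern, line))
    return matches


# The question's pattern: anything between square brackets.
broad = collect(r"\[(.*?)\]", io.StringIO(SAMPLE_LOG))
# A narrower pattern: only bracketed values wrapped in double quotes.
narrow = collect(r'\["(.*?)"\]', io.StringIO(SAMPLE_LOG))

print(broad)
print(narrow)
```

On these sample lines the broad pattern also returns main and apitests.ApiTest from every line, while the narrower one returns only the quoted value.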

You don't need the for loop, because you are reading the whole file at once.
Your code could be:

import re

with open('quality.log', 'r') as text_file:
    text_file=text_file.read()
    matches = re.findall(r'\["(.*?)"]', text_file)

If you want the values between the double quotes, add the quotes to the pattern:
\["(.*?)"]

Output
Premium grade
Medium grade
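
As a rough, self-contained check of this approach, the sketch below uses an inline string in place of the contents of quality.log (an assumption made so the snippet runs without the file). It also shows why looping over the result of read() does not behave like looping over lines.

```python
import re

# Sample text standing in for the string returned by text_file.read().
text = (
    '2020-06-30 12:44:11,853 DEBUG [main] [apitests.ApiTest] '
    'The Result Obtained from API is : '
    '{"keywords": {"Asus Laptop": ["Premium grade"]}}\n'
    '2020-06-30 12:44:12,136 DEBUG [main] [apitests.ApiTest] '
    'The Result Obtained from API is : '
    '{"keywords": {"Intext Hardrive": ["Medium grade"]}}\n'
)

# Looping over a str yields single characters, not lines.
first_items = [item for item in text][:4]
print(first_items)

# A single findall over the whole text collects every double-quoted
# value that sits directly inside square brackets.
grades = re.findall(r'\["(.*?)"]', text)
output = '\n'.join(grades)
print(output)
```

The joined string in output is what would then be written to qualitygrade.txt with out.write(output), as in the question's second with block.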


Your regex is wrong.
Thank you @The fourth bird. Your solution is simple and works well.