Python: using a regular expression to find text and return a list
I am trying to use Python's regular expression module (re) to create a list from a text file (.txt). Part of the text looks like this:
146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622\n197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554
How can I use a regex to turn this text into a list of the following form:
{
"host_name": "146.204.224.152",
"name": "feest6811",
"time": "21/Jun/2019:15:45:24 -0700",
"method": "POST /incentivize HTTP/1.1"
},
..
..
..
I tried the following pattern, because I saw an example that used it:
pattern="(?P<host_name>.*)(\ -\ )(?P<name>\w*)"
for item in re.finditer(pattern,'Text_data',re.VERBOSE):
    print(item.groupdict())
Any suggestions for a regex for this text would be appreciated.
Use
(?m)^(?P<host_name>[\d.]+) - (?P<name>\w+) \[(?P<time>[^][]+)] "(?P<method>[^"]+)"
Explanation
--------------------------------------------------------------------------------
  ^                        the beginning of a line (because of (?m))
--------------------------------------------------------------------------------
  (?P<host_name>           group and capture to \k<host_name>:
--------------------------------------------------------------------------------
    [\d.]+                   any character of: digits (0-9), '.' (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<host_name>
--------------------------------------------------------------------------------
   -                       ' - '
--------------------------------------------------------------------------------
  (?P<name>                group and capture to \k<name>:
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<name>
--------------------------------------------------------------------------------
                           ' '
--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  (?P<time>                group and capture to \k<time>:
--------------------------------------------------------------------------------
    [^][]+                   any character except: ']', '[' (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<time>
--------------------------------------------------------------------------------
  ] "                      '] "'
--------------------------------------------------------------------------------
  (?P<method>              group and capture to \k<method>:
--------------------------------------------------------------------------------
    [^"]+                    any character except: '"' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<method>
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
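For completeness, here is a minimal sketch of how the pattern above might be applied with re.finditer to get a list of dictionaries. The names LOG_PATTERN, parse_log and sample are only illustrative, and the sample text is reconstructed from the two lines in the question, so its exact spacing is an assumption.

```python
import re

# Pattern from the answer; (?m) lets ^ match at the start of every line.
LOG_PATTERN = re.compile(
    r'(?m)^(?P<host_name>[\d.]+) - (?P<name>\w+) '
    r'\[(?P<time>[^][]+)] "(?P<method>[^"]+)"'
)


def parse_log(text):
    """Return one dict of named groups per line that matches LOG_PATTERN."""
    return [match.groupdict() for match in LOG_PATTERN.finditer(text)]


# Sample lines reconstructed from the question (spacing is assumed).
sample = (
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] '
    '"POST /incentivize HTTP/1.1" 302 4622\n'
    '197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] '
    '"DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554\n'
)

records = parse_log(sample)
for record in records:
    print(record)
```

Note that the file contents, not a file name or a placeholder string, must be passed as the text to search. Lines whose user field is not made of word characters (for example a bare -) would not match \w+ and would be skipped.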
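It would be better to write a parser for this rather than use a regex, since this looks like a web log with a proper structure.
When you say "list format," can you give an example? Do you want only the keys or only the values of the dictionary example, or both?
@gmdev Sorry for the wrong wording. What I meant is that I want to return dictionaries from the string.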