Python: find text with a regular expression and return a list


I am trying to use Python's regular expression module, re, to create a list from a text file (.txt). Part of the text looks like this:

146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622\n197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554

How can I use a regex to turn the text into a list in this format:

{
"host_name": "146.204.224.152", 
"name": "feest6811", 
"time": "21/Jun/2019:15:45:24 -0700", 
"method": "POST /incentivize HTTP/1.1"
},
..
..
..
I tried the following pattern, because I saw an example that used it:

pattern="(?P<host_name>.*)(\ -\ )(?P<name>\w*)"

for item in re.finditer(pattern,'Text_data',re.VERBOSE):
    print(item.groupdict())
Any suggestions for a regex that handles this text would be appreciated.

Use

(?m)^(?P<host_name>[\d.]+) - (?P<name>\w+) \[(?P<time>[^][]+)] "(?P<method>[^"]+)"

Explanation

--------------------------------------------------------------------------------
  ^                        the beginning of a line ((?m) lets ^ match
                           after every newline, not only at the
                           start of the string)
--------------------------------------------------------------------------------
  (?P<host_name>           group and capture to \k<host_name>:
--------------------------------------------------------------------------------
    [\d.]+                   any character of: digits (0-9), '.' (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<host_name>
--------------------------------------------------------------------------------
   -                       ' - '
--------------------------------------------------------------------------------
  (?P<name>                group and capture to \k<name>:
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<name>
--------------------------------------------------------------------------------
                           ' '
--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  (?P<time>                group and capture to \k<time>:
--------------------------------------------------------------------------------
    [^][]+                   any character except: ']', '[' (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<time>
--------------------------------------------------------------------------------
  ] "                      '] "'
--------------------------------------------------------------------------------
  (?P<method>              group and capture to \k<method>:
--------------------------------------------------------------------------------
    [^"]+                    any character except: '"' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<method>
--------------------------------------------------------------------------------
  "                        '"'
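
To show how the pattern fits together with re.finditer and groupdict(), here is a minimal sketch. The names text_data, LOG_PATTERN and parse_log are made up for illustration, and the two sample lines are reconstructed from the question on the assumption that the fields are separated by single spaces; in practice the text would be read from the .txt file.

```python
import re

# Sample lines reconstructed from the question (assumed spacing);
# replace with the contents of the real .txt file.
text_data = (
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] '
    '"POST /incentivize HTTP/1.1" 302 4622\n'
    '197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] '
    '"DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554\n'
)

# The pattern from the answer: one named group per field.
LOG_PATTERN = re.compile(
    r'(?m)^(?P<host_name>[\d.]+) - (?P<name>\w+) '
    r'\[(?P<time>[^][]+)] "(?P<method>[^"]+)"'
)


def parse_log(text):
    """Return one dict per matching log line (hypothetical helper)."""
    return [match.groupdict() for match in LOG_PATTERN.finditer(text)]


records = parse_log(text_data)
for record in records:
    print(record)
```

Note that the text itself is passed to finditer, not the string 'Text_data', and that re.VERBOSE is not used here: under re.VERBOSE the unescaped spaces in this pattern would be ignored and the lines would no longer match.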

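It would be better to write a parser for this than to use a regex, since this looks like a web log with a proper structure.

When you say "list format," can you give an example? Do you want only the keys or the values from the dictionary example, or both?

@gmdev Sorry for the wrong wording. What I meant is that I want to return dictionaries from the string.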