Parsing a string into a dictionary with a regular expression in Python


python regex parsing


I need to parse the following string into a dictionary in Python:

2014:02:02-12:24:17 nametest ulogd[4834]: id="xxxx" severity="xxxx" sys="xxxx" sub="xxxx" name="xxxx aaaa" action="xxxx" fwrule="xxxx" outitf="xxxx" srcmac="xxxx" srcip="xxxx" dstip="xxxx" proto="x" length="xxxx" tos="xxxx" prec="xxxx" ttl="xx" srcport="xxxx" dstport="xxxx" tcpflags="xxxx"

I can't simply use split(" ") on spaces because a field such as name="xxxx aaaa" can itself contain a space.
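To make the problem concrete, here is a minimal sketch. The shortened sample string is an assumption made for illustration, not the full log line.

```python
# Shortened, hypothetical sample of the log line above.
sample = 'id="xxxx" name="xxxx aaaa" action="xxxx"'

# Splitting on a single space also cuts the quoted name value in two.
parts = sample.split(" ")
print(parts)
# ['id="xxxx"', 'name="xxxx', 'aaaa"', 'action="xxxx"']
```

Three fields come back as four pieces, so the pieces can no longer be split reliably on the equals sign.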

As a first step, I extracted only the values with the following regex:

re.findall('"([^"]*)"', line)

But now I need the result as a dictionary, for example:

line['id']=1111

What regex would do that? Any ideas?

You can use re.findall to find the key/value pairs (here s holds the log line):

>>> import re
>>> groups = re.findall(r'(\w+)="(.*?)"', s)
>>> line = dict(groups)
>>>
>>> from pprint import pprint
>>> pprint(line)
{'action': 'xxxx',
 'dstip': 'xxxx',
 'dstport': 'xxxx',
 'fwrule': 'xxxx',
 'id': 'xxxx',
 'length': 'xxxx',
 'name': 'xxxx aaaa',
 'outitf': 'xxxx',
 'prec': 'xxxx',
 'proto': 'x',
 'severity': 'xxxx',
 'srcip': 'xxxx',
 'srcmac': 'xxxx',
 'srcport': 'xxxx',
 'sub': 'xxxx',
 'sys': 'xxxx',
 'tcpflags': 'xxxx',
 'tos': 'xxxx',
 'ttl': 'xx'}
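The same idea can be wrapped in a small helper. This is only a sketch: the names PAIR_RE and parse_log_line, the shortened sample line, and its values are hypothetical and chosen for illustration.

```python
import re

# Assumed pattern: a word-character key, an equals sign, a double-quoted value.
PAIR_RE = re.compile(r'(\w+)="(.*?)"')


def parse_log_line(text):
    """Return the key="value" pairs found in text as a dict of strings."""
    return dict(PAIR_RE.findall(text))


# Hypothetical, shortened log line used only for this illustration.
s = '2014:02:02-12:24:17 ulogd[4834]: id="2001" name="xxxx aaaa" ttl="64"'
line = parse_log_line(s)

print(line["id"])        # 2001 (still a string)
print(line["name"])      # xxxx aaaa
print(int(line["ttl"]))  # 64, converted explicitly to an integer
```

The timestamp and process prefix are skipped because they contain no key="value" pair. All values come back as strings, so a numeric field such as id needs an explicit int() call if an integer is wanted.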

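(\w+)="(.*?)" matches one or more word characters (the \w+ part: letters, digits, or underscores), followed by =", followed by any characters except a newline (.*?, non-greedy, so it stops at the first closing quote), followed by ". The parentheses define capturing groups.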
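What the capturing groups return, and why the non-greedy quantifier matters, can be seen in a short sketch. The shortened input string is an assumption for illustration.

```python
import re

text = 'id="xxxx" name="xxxx aaaa"'  # hypothetical shortened input

# Two capturing groups: findall returns one (key, value) tuple per match.
lazy = re.findall(r'(\w+)="(.*?)"', text)
print(lazy)
# [('id', 'xxxx'), ('name', 'xxxx aaaa')]

# Greedy .* runs to the last quote on the line, so the pairs merge.
greedy = re.findall(r'(\w+)="(.*)"', text)
print(greedy)
# [('id', 'xxxx" name="xxxx aaaa')]
```

Because findall yields two-element tuples, the result can be passed straight to dict().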