How to parse multi-line block text with Python and regex if the content of the blocks varies?

python regex

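I have a configuration file that I need to parse. Because of the regex groupings in Python, my idea is to put the results into a dictionary at a later stage.

The problem I'm facing is that the lines are not exactly the same in every block of text. My regex so far works for the block with the most lines, but of course it only matches that single block. How do I do a multi-line match when some "set" lines are present only in some of the blocks?

Do I need to break up the regex and use if, elif, true/false statements to work through this? That doesn't seem Pythonic.
I'm pretty sure I will have to split up my big regex and work through it sequentially: if a line matches, then ..., else skip to the next regex match line.
I was thinking about putting each block from "edit" to "next" into a list element and parsing the elements separately. Or can I do the whole thing in one go?

I have some ideas, but I'd like to do this in a Pythonic way.

As always, any help is much appreciated. Thank you.

Text, where the blocks to match run from "edit" to "next". Not every block contains the same "set" statements: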

edit "port11"
    set vdom "ACME_Prod"
    set vlanforward enable
    set type physical
    set device-identification enable
    set snmp-index 26
next
edit "port21"
    set vdom "ACME_Prod"
    set vlanforward enable
    set type physical
    set snmp-index 27
next
edit "port28"
    set vdom "ACME_Prod"
    set vlanforward enable
    set type physical
    set snmp-index 28
next
edit "port29"
    set vdom "ACME_Prod"
    set ip 174.244.244.244 255.255.255.224
    set allowaccess ping
    set vlanforward enable
    set type physical
    set alias "Internet-IRISnet"
    set snmp-index 29
next
edit "port20"
    set vdom "root"
    set ip 192.168.1.1 255.255.255.0
    set allowaccess ping https ssh snmp fgfm
    set vlanforward enable
    set type physical
    set snmp-index 39
next
edit "port25"
    set vdom "root"
    set allowaccess fgfm
    set vlanforward enable
    set type physical
    set snmp-index 40
next

Code snippet:

import re, pprint
file = "interfaces_2016_10_12.conf"

try:
    fileopen = open(file, 'r')
    output = open('output.txt', 'w+')
except IOError:
    exit("Input file does not exist, exiting script.")

#read whole config in 1 go instead of iterating line by line
text = fileopen.read()

# my verbose regex, verbose so it is more readable !

pattern = r'''^                 # use r for multiline usage
\s+edit\s\"(.*)\"\n           # group(1) match int name
\s+set\svdom\s\"(.*)\"\n      # group(2) match vdom name
\s+set\sip\s(.*)\n            # group(3) match interface ip
\s+set\sallowaccess\s(.*)\n   # group(4) match allowaccess
\s+set\svlanforward\s(.*)\n   # group(5) match vlanforward
\s+set\stype\s(.*)\n          # group(6) match type
\s+set\salias\s\"(.*)\"\n     # group(7) match alias
\s+set\ssnmp-index\s\d{1,3}\n # match snmp-index but we don't need it
\s+next$'''                   # match end of config block

regexp = re.compile(pattern, re.VERBOSE | re.MULTILINE)

# For multiline regex matching use finditer():
for match in regexp.finditer(text):
    for z in range(1, 8):     # print groups 1..7 of every match
        print(match.group(z))

fileopen.close()  #always close file
output.close() #always close file
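
One way to keep a regex-based approach while tolerating blocks with different "set" lines is to stop listing every setting in one big pattern. The sketch below is only an illustration under assumptions: it matches each block generically from "edit" to "next", then applies a second, small pattern to the lines inside the block. The names BLOCK_RE, SET_RE, parse_blocks and SAMPLE are made up for this example, SAMPLE is a shortened copy of the text above, and values are kept as raw strings.

```python
import re

# Shortened sample in the same shape as the config text above.
SAMPLE = '''\
edit "port11"
    set vdom "ACME_Prod"
    set type physical
    set snmp-index 26
next
edit "port29"
    set vdom "ACME_Prod"
    set ip 174.244.244.244 255.255.255.224
    set alias "Internet-IRISnet"
    set snmp-index 29
next
'''

# Outer pattern: one match per block, whatever "set" lines it contains.
BLOCK_RE = re.compile(
    r'^[ \t]*edit[ \t]+"([^"]*)"[ \t]*\n'   # group 1: block name
    r'(.*?)'                                # group 2: block body (non-greedy)
    r'^[ \t]*next[ \t]*$',                  # closing line of the block
    re.MULTILINE | re.DOTALL,
)

# Inner pattern: one match per "set <key> <value>" line inside a body.
SET_RE = re.compile(r'^[ \t]*set[ \t]+(\S+)[ \t]+(.*?)[ \t]*$', re.MULTILINE)


def parse_blocks(text):
    """Return {block name: {setting name: raw value string}}."""
    blocks = {}
    for name, body in BLOCK_RE.findall(text):
        # Strip surrounding double quotes from each value, keep the rest as-is.
        blocks[name] = {key: value.strip('"') for key, value in SET_RE.findall(body)}
    return blocks


result = parse_blocks(SAMPLE)
print(result["port29"].get("ip"))   # present only in some blocks
print(result["port11"].get("ip"))   # missing here, so .get() returns None
```

Because the inner pattern does not care which settings exist or in what order, a block with five "set" lines and a block with seven are handled the same way.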

Why use regex at all? This looks like a very simple structure to parse:

data = {}
with open(file, 'r') as fileopen:
    for line in fileopen:
        words = line.strip().split()
        if not words:           # Skip blank lines, which would otherwise raise IndexError
            continue
        if words[0] == 'edit':  # Create a new block
            curr = data.setdefault(words[1].strip('"'), {})
        elif words[0] == 'set': # Write config to block
            curr[words[1]] = words[2].strip('"') if len(words) == 3 else words[2:]
print(data)

Output:

{'port11': {'device-identification': 'enable',
  'snmp-index': '26',
  'type': 'physical',
  'vdom': 'ACME_Prod',
  'vlanforward': 'enable'},
 'port20': {'allowaccess': ['ping', 'https', 'ssh', 'snmp', 'fgfm'],
  'ip': ['192.168.1.1', '255.255.255.0'],
  'snmp-index': '39',
  'type': 'physical',
  'vdom': 'root',
  'vlanforward': 'enable'},
  ...
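
Since not every block has the same "set" lines, reading a key such as 'ip' directly raises KeyError for some interfaces. Also, in the output above a one-word value is stored as a string while a multi-word value is stored as a list. A minimal sketch of reading the dictionary defensively follows. The helpers as_list and summarize are hypothetical names introduced for this example, and data is a hand-copied excerpt of the output above with only two blocks.

```python
# Hand-copied excerpt of the dictionary shown above (two blocks only).
data = {
    'port11': {'device-identification': 'enable', 'snmp-index': '26',
               'type': 'physical', 'vdom': 'ACME_Prod', 'vlanforward': 'enable'},
    'port20': {'allowaccess': ['ping', 'https', 'ssh', 'snmp', 'fgfm'],
               'ip': ['192.168.1.1', '255.255.255.0'], 'snmp-index': '39',
               'type': 'physical', 'vdom': 'root', 'vlanforward': 'enable'},
}


def as_list(value):
    """Normalize a setting: missing -> [], single string -> [string], list -> list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def summarize(blocks):
    """Build one line per interface, using '-' where a setting is absent."""
    lines = []
    for name, settings in sorted(blocks.items()):
        ip = as_list(settings.get('ip'))                # .get() avoids KeyError
        access = as_list(settings.get('allowaccess'))
        lines.append('%s ip=%s access=%s'
                     % (name, ' '.join(ip) or '-', ','.join(access) or '-'))
    return lines


for line in summarize(data):
    print(line)
```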

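Very Pythonic indeed. Thank you, sir! I should probably have mentioned that I need to go through 8000 lines of text that do not always consist of the same kind of block. However, if I create a text file that contains only those specific multi-line blocks, the script works fine. So I probably need a very small regex to match the blocks, and then parse them line by line using your structure, right?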
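
A small regex to pre-select the blocks is one option for a file that contains other content. Another, sketched here as an assumption and not tested against the real 8000-line file, is to keep the line-by-line loop and track whether the current line is inside an "edit" ... "next" block, ignoring everything else. The function name parse_edit_blocks and the lines outside the blocks in SAMPLE are invented for illustration. The sketch does not distinguish between different sections of a larger file, so identically named blocks from different sections would be merged.

```python
def parse_edit_blocks(lines):
    """Collect 'set' lines that appear between an 'edit' line and its 'next' line."""
    data = {}
    curr = None                          # None means "not inside a block"
    for line in lines:
        words = line.split()
        if not words:                    # blank line
            continue
        if words[0] == 'edit' and len(words) > 1:
            curr = data.setdefault(words[1].strip('"'), {})
        elif words[0] == 'next':
            curr = None                  # block closed, ignore lines until next 'edit'
        elif words[0] == 'set' and curr is not None and len(words) > 2:
            # Same storage rule as the answer: one word -> str, several -> list.
            curr[words[1]] = words[2].strip('"') if len(words) == 3 else words[2:]
    return data


# Invented sample: the first two lines and the 'end' line stand in for
# unrelated content surrounding the blocks.
SAMPLE = '''\
some unrelated header line
set hostname "fw01"

edit "port25"
    set vdom "root"
    set allowaccess fgfm
next
end
edit "port20"
    set ip 192.168.1.1 255.255.255.0
next
'''

result = parse_edit_blocks(SAMPLE.splitlines())
print(result)
```

The stray "set hostname" line is skipped because no block is open at that point, and blank lines no longer raise an IndexError.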