Python: I want my parser to return a list of strings, but it returns an empty list

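I have a parser that reads in a long octet string, and I want it to print out smaller strings based on the parsing details. It reads in a hex string like the one shown below.

The format of the string is as follows:

01046574683001000004677265300000000266010000

The format of each interface contained in the hex is as follows:

version:length_of_name:name:op_status:priority:reserved_byte

01:04:65746830:01:00:00 == 1:4:eth0:1:0:0 (when converted from hex)

^ This is one segment of the string, and it represents eth0 (I inserted the : to make it easier to read). At the moment, however, my code returns an empty list and I don't know why. Can anyone help me?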

def octetChop(long_hexstring, from_ssh_):
    startpoint_of_interface_def=0
    # As of 14/8/13 , the network operator has not been implemented
    network_operator_implemented=False
    version_has_been_read = False
    position_of_interface=0
    chopped_octet_list = []

#This while loop moves through the string of the interface, based on the full length of the container
    try:
        while startpoint_of_interface_def < len(long_hexstring):

            if version_has_been_read == True:
                pass
            else:
                if startpoint_of_interface_def == 0:
                    startpoint_of_interface_def = startpoint_of_interface_def + 2
                    version_has_been_read = True

            endpoint_of_interface_def = startpoint_of_interface_def+2
            length_of_interface_name = long_hexstring[startpoint_of_interface_def:endpoint_of_interface_def]
            length_of_interface_name_in_bytes = int(length_of_interface_name) * 2 #multiply by 2 because its calculating bytes

            end_of_interface_name_point = endpoint_of_interface_def + length_of_interface_name_in_bytes
            hex_name = long_hexstring[endpoint_of_interface_def:end_of_interface_name_point]
            text_name = hex_name.decode("hex")

            print "the text_name is " + text_name

            operational_status_hex = long_hexstring[end_of_interface_name_point:end_of_interface_name_point+2]

            startpoint_of_priority = end_of_interface_name_point+2
            priority_hex = long_hexstring[startpoint_of_priority:startpoint_of_priority+2]

            #Skip the reserved byte
            network_operator_length_startpoint = startpoint_of_priority+4

            single_interface_string = long_hexstring[startpoint_of_interface_def:startpoint_of_priority+4]
            print single_interface_string + " is chopped from the octet string"# - keep for possible debugging

            startpoint_of_interface_def = startpoint_of_priority+4

            if network_operator_implemented == True:
                network_operator_length = long_hexstring[network_operator_length_startpoint:network_operator_length_startpoint+2]
                network_operator_length = int(network_operator_length) * 2
                network_operator_start_point = network_operator_length_startpoint+2
                network_operator_end_point = network_operator_start_point + network_operator_length
                network_operator = long_hexstring[network_operator_start_point:network_operator_end_point]
                #
                single_interface_string = long_hexstring[startpoint_of_interface_def:network_operator_end_point]

                #set the next startpoint if there is one
                startpoint_of_interface_def = network_operator_end_point+1
            else:
                self.network_operator = None

            print single_interface_string + " is chopped from the octet string"# - keep for possible debugging

            #This is where each individual interface is stored, in a list for comparison.
            chopped_octet_list.append(single_interface_string)
    finally:

        return chopped_octet_list

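I hope I understood you correctly. You have a hex string that contains several interface definitions. Within each interface definition, the second octet gives the length of the interface name.

Say the string contains the interfaces eth0 and eth01 and looks like this (length 4 for eth0, length 5 for eth01):

01046574683001000001056574683031010000

Then you can split it like this: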

def splitIt (s):
    tokens = []
    while s:
        length = int (s [2:4], 16) * 2 + 10 #name length * 2 + 10 digits for rest
        tokens.append (s [:length] )
        s = s [length:]
    return tokens

This yields:

['010465746830010000', '01056574683031010000']
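One thing to be aware of: slicing past the end of a string does not fail in Python, so splitIt quietly returns a shortened last token if the input is truncated. The variant below is only a sketch of how that case could be detected. The name split_records and the error message are made up for illustration, and it assumes the same layout as above (every record starts with a version octet and has 10 hex digits besides the name).

```python
def split_records(hex_string):
    """Split a hex string into per-interface records, rejecting truncated input."""
    records = []
    pos = 0
    while pos < len(hex_string):
        # The second octet of each record holds the name length in bytes.
        name_len = int(hex_string[pos + 2:pos + 4], 16)
        # Two hex digits per name byte, plus 10 digits for the fixed fields.
        end = pos + name_len * 2 + 10
        if end > len(hex_string):
            raise ValueError("truncated record at offset %d" % pos)
        records.append(hex_string[pos:end])
        pos = end
    return records


print(split_records("01046574683001000001056574683031010000"))
```

Walking an index through the string instead of re-slicing it also avoids copying the remainder on every iteration, which only matters for very long inputs.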

To add to Hyperboreus's answer, here is a simple way to parse the interface strings once they have been split:

def parse(s):
    version = int(s[:2], 16)
    name_len = int(s[2:4], 16)
    name_end = 4 + name_len * 2
    name = s[4:name_end].decode('hex')
    op_status = int(s[name_end:name_end+2], 16)
    priority = int(s[name_end+2:name_end+4], 16)
    reserved = s[name_end+4:name_end+6]
    return version, name_len, name, op_status, priority, reserved

Here is the output:

>>> parse('010465746830010000')
(1, 4, 'eth0', 1, 0, '00')
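Note that s.decode('hex') only exists in Python 2. If you are on Python 3, something along these lines should be roughly equivalent. It is a sketch, not a drop-in replacement: parse_record and the Interface tuple are names invented here, the name is assumed to be ASCII, and the reserved byte is returned as an integer rather than as a hex string.

```python
from collections import namedtuple

# Hypothetical container for one parsed interface record.
Interface = namedtuple("Interface", "version name op_status priority reserved")


def parse_record(record_hex):
    """Parse one interface record given as a hex string (Python 3)."""
    raw = bytes.fromhex(record_hex)  # e.g. '0104...' -> b'\x01\x04...'
    version, name_len = raw[0], raw[1]
    name_end = 2 + name_len
    name = raw[2:name_end].decode("ascii")  # assumption: names are ASCII
    # Indexing or unpacking a bytes object yields integers in Python 3.
    op_status, priority, reserved = raw[name_end:name_end + 3]
    return Interface(version, name, op_status, priority, reserved)


print(parse_record("010465746830010000"))
```

Working on bytes means every offset is counted in octets, so the "multiply by 2" bookkeeping from the hex-string version goes away.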

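The reason your code returns an empty list is this line:

    else:
        self.network_operator = None

self is not defined, so you get a NameError exception. This means the try block jumps straight to the finally clause without executing this part:

    chopped_octet_list.append(single_interface_string)

Because the finally clause contains a return, the exception is also silently discarded, so you never see a traceback. As a result, the list stays empty. In any case, the code is overly complicated for a task like this; I would go with one of the other answers.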
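To see the effect in isolation, here is a stripped-down sketch (the function names and the undefined_name placeholder are invented for the demonstration). A return inside finally discards whatever exception is in flight, so the caller just gets the partially filled list.

```python
def swallow():
    """Mimics the question's structure: try/finally with a return in finally."""
    results = []
    try:
        results.append("first")
        undefined_name.attribute = None  # NameError: undefined_name does not exist
        results.append("second")         # never reached
    finally:
        return results                   # discards the pending NameError


def propagate():
    """Same body without try/finally: the NameError reaches the caller."""
    results = []
    results.append("first")
    undefined_name.attribute = None
    return results


print(swallow())
try:
    propagate()
except NameError as exc:
    print("NameError:", exc)
```

Dropping the try/finally (or moving the return out of the finally clause) would have surfaced the NameError immediately and pointed at the offending line.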

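See whether the following helps. Call the parse function below, pass it the hex string stream, then iterate over the result to get the card information (I hope I understood you correctly :). Each item that parse yields is a tuple of the required information.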

>>> def getbytes(hs):
    """Returns a generator of bytes from a hex string"""
    return (int(hs[i:i+2],16) for i in range(0,len(hs)-1,2))

>>> def get_single_card_info(g):
    """Fetches a single card info from a byte generator"""
    v = g.next()
    l = g.next()
    name = "".join(chr(x) for x in map(lambda y: y.next(),[g]*l))
    return (str(v),name,g.next(),g.next(),g.next())

>>> def parse(hs):
    """Parses a hex string stream and returns a generator of card infos"""
    bs = getbytes(hs)
    while True:
        yield get_single_card_info(bs)


>>> c = 1
>>> for card in parse("01046574683001000001056574683031010000"):
    print "Card:{0} -> Version:{1}, Id:{2}, Op_stat:{3}, priority:{4}, reserved:{5} bytes".format(c,*card)
    c = c + 1


Card:1 -> Version:1, Id:eth0, Op_stat:1, priority:0, reserved:0 bytes
Card:2 -> Version:1, Id:eth01, Op_stat:1, priority:0, reserved:0 bytes
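The session above is Python 2. It ends the loop by letting the StopIteration from g.next() escape from inside the generator, which Python 3.7 and later turn into a RuntimeError (PEP 479). If you need the same streaming approach on Python 3, a sketch could look like this; iter_cards is a made-up name, names are assumed to be ASCII, and the version is yielded as an integer instead of a string.

```python
from itertools import islice


def iter_cards(hex_string):
    """Yield (version, name, op_status, priority, reserved) per record."""
    data = iter(bytes.fromhex(hex_string))
    while True:
        header = bytes(islice(data, 2))      # version + name length
        if not header:
            return                           # clean end of input
        if len(header) < 2:
            raise ValueError("truncated record header")
        version, name_len = header
        name = bytes(islice(data, name_len))
        tail = bytes(islice(data, 3))        # op_status, priority, reserved
        if len(name) < name_len or len(tail) < 3:
            raise ValueError("truncated record")
        op_status, priority, reserved = tail
        yield version, name.decode("ascii"), op_status, priority, reserved


for card in iter_cards("01046574683001000001056574683031010000"):
    print(card)
```

islice simply returns fewer items when the iterator runs dry, so the end of input can be checked explicitly instead of relying on an exception.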

Pyparsing includes a built-in expression for parsing counted arrays of elements, which handles your "name" field nicely. Here is the whole parser:

from pyparsing import Word,hexnums,countedArray

# read in 2 hex digits, convert to integer at parse time
octet = Word(hexnums,exact=2).setParseAction(lambda t:int(t[0],16))

# read in a counted array of octets, convert to string
nameExpr = countedArray(octet, intExpr=octet)
nameExpr.setParseAction(lambda t: ''.join(map(chr,t[0])))

# define record expression, with named results
recordExpr = (octet('version') + nameExpr('name') + octet('op_status') +
              octet('priority'))  # + octet('reserved')

Parsing your sample:

sample = "01046574683001000004677265300000000266010000"
for rec in recordExpr.searchString(sample):
    print rec.dump()

Gives:

[1, 'eth0', 1, 0]
- name: eth0
- op_status: 1
- priority: 0
- version: 1
[0, 'gre0', 0, 0]
- name: gre0
- op_status: 0
- priority: 0
- version: 0
[0, 'f\x01', 0, 0]
- name: f
- op_status: 0
- priority: 0
- version: 0

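The dump() method shows the results names that you can use to access the individually parsed bits, such as rec.name or rec.version.

(I commented out the reserved byte, otherwise the second entry would not parse correctly. Also, the third entry contains a name with a \x01 byte.)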

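Your code seems overly complicated. Why is there a try block when no exception is caught? Why is there a from_ssh_ parameter that is never used? Why have a while loop with a boolean that checks state inside the loop, when this parsing task doesn't seem to need iteration? You could compute all the indices at once. And why does the input string contain a bunch of extra digits that don't show up in the output string?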
