Parsing XML data into a dict, but unsure of the most Pythonic way

python xml xpath

Suppose I have some XML data about an online product, with multiple prices:

<Response>
    <TotalOffers>6</TotalOffers>
    <LowPrices>
        <LowPrice condition="new">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>15.50</Amount>
        </LowPrice>
        <LowPrice condition="used">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>22.86</Amount>
        </LowPrice>
    </LowPrices>
</Response>

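My end goal is to turn this into a dict like the following:

response = {
    'total_offers': 6,
    'low_prices': [
        {'condition': "new", 'currency': "USD", 'amount': 15.50},
        {'condition': "used", 'currency': "USD", 'amount': 22.86},
    ]
}

With the lxml library this is quite easy. I just specify the XPath for each value I want and handle the exceptions raised when expected data is missing. For example, to get the TotalOffers value (6), I would do the following:

from lxml import etree

# convert the XML text (the document above, held in xml_text) to an etree object
tree_obj = etree.fromstring(xml_text)
# use xpath to find the values that I want in this tree object
matched_els = tree_obj.xpath('//TotalOffers')
# xpath matches are returned as a list
# since there could be more than one match, grab only the first one
# (this raises IndexError if nothing matched)
first_match_el = matched_els[0]
# extract the text and print it to the console
print(first_match_el.text)
# prints: 6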
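
The "handle missing data" part of this approach might look like the minimal sketch below. It is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra dependencies (lxml elements also provide find), and the helper name first_text and its default parameter are hypothetical, not part of either library.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """
<Response>
    <TotalOffers>6</TotalOffers>
    <LowPrices>
        <LowPrice condition="new">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>15.50</Amount>
        </LowPrice>
    </LowPrices>
</Response>
"""


def first_text(root, path, default=None):
    """Return the stripped text of the first element matching path, or default.

    Hypothetical helper: avoids the IndexError that matched_els[0] raises
    when nothing matches.
    """
    el = root.find(path)
    if el is None or el.text is None:
        return default
    return el.text.strip()


root = ET.fromstring(XML_TEXT)
print(first_text(root, ".//TotalOffers"))            # 6
print(repr(first_text(root, ".//Missing")))          # None
print(repr(first_text(root, ".//Missing", "n/a")))   # 'n/a'
```

Returning a default instead of raising keeps the calling code flat, at the cost of hiding genuinely malformed responses; whether that trade-off is acceptable depends on how strict the parsing needs to be.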

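Now my thinking is that I could write a function like

get_text(tree_obj, xpath_to_value)

but if I also want this function to convert the value to its appropriate type (e.g. string, float, or int), should I add a parameter that specifies the type, like so?

get_text(tree_obj, xpath_to_value, type='float')

Because if I do, my next step in creating the dict would be something like this: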

low_prices = []
# iterate over each <LowPrice> element, not the <LowPrices> container,
# so that the relative paths below resolve against a single price
low_prices_els = tree_obj.xpath('//LowPrices/LowPrice')
for el in low_prices_els:
    low_prices.append(
        {
            'condition': get_text(el, './@condition', type='str'),
            'currency': get_text(el, './CurrencyCode', type='str'),
            'amount': get_text(el, './Amount', type='float')
        }
    )

response = {
    'total_offers': get_text(tree_obj, '//TotalOffers', type='int'),
    'low_prices': low_prices
}
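
For reference, here is a minimal sketch of what such a helper might look like. It is an assumption-laden illustration, not a settled design: it uses the standard library's xml.etree.ElementTree in place of lxml (so attributes are read through a separate, hypothetical attr parameter rather than an './@condition' path), and it takes a callable convert parameter instead of a type name string, which avoids shadowing the built-in type. The parameter names convert, attr, and default are made up for this example.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """
<Response>
    <TotalOffers>6</TotalOffers>
    <LowPrices>
        <LowPrice condition="new">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>15.50</Amount>
        </LowPrice>
        <LowPrice condition="used">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>22.86</Amount>
        </LowPrice>
    </LowPrices>
</Response>
"""


def get_text(el, path=None, attr=None, convert=str, default=None):
    """Find a value under el and convert it, returning default on any miss.

    path    -- ElementTree path to a child element (None means el itself)
    attr    -- attribute name to read instead of the element text
    convert -- callable applied to the raw string, e.g. int or float
    """
    target = el if path is None else el.find(path)
    if target is None:
        return default
    raw = target.get(attr) if attr else target.text
    if raw is None:
        return default
    try:
        return convert(raw.strip())
    except ValueError:
        # The text was present but not convertible (e.g. int("abc")).
        return default


root = ET.fromstring(XML_TEXT)
response = {
    "total_offers": get_text(root, "TotalOffers", convert=int),
    "low_prices": [
        {
            "condition": get_text(lp, attr="condition"),
            "currency": get_text(lp, "CurrencyCode"),
            "amount": get_text(lp, "Amount", convert=float),
        }
        for lp in root.findall("LowPrices/LowPrice")
    ],
}
print(response)
```

Passing int or float directly means any one-argument callable works as a converter (decimal.Decimal may suit prices better than float), and no lookup table from type names to functions is needed.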

Is this the best way to accomplish what I want to do? I feel like I'm creating problems for my future self.

I think the tool you need is an XML-to-JSON converter, which turns an XML document into JSON. You can try one here:

http://codebeautify.org/xmltojson
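
To give a rough idea of what such a conversion does, here is a small sketch of a generic XML-to-dict converter in Python, dumped as JSON. It is not the code behind the linked tool: the function name element_to_dict and the conventions it uses ("@" prefixes for attributes, "#text" for text next to attributes, lists for repeated tags) are assumptions chosen for this example, and it relies on the standard library's xml.etree.ElementTree.

```python
import json
import xml.etree.ElementTree as ET

XML_TEXT = """
<Response>
    <TotalOffers>6</TotalOffers>
    <LowPrices>
        <LowPrice condition="new">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>15.50</Amount>
        </LowPrice>
        <LowPrice condition="used">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>22.86</Amount>
        </LowPrice>
    </LowPrices>
</Response>
"""


def element_to_dict(el):
    """Recursively convert an element into plain Python data."""
    children = list(el)
    # A leaf without attributes becomes just its text.
    if not children and not el.attrib:
        return (el.text or "").strip()
    # Attributes are stored under '@name' keys.
    node = {"@" + name: value for name, value in el.attrib.items()}
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: collect the values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    # A leaf that has attributes keeps its text under '#text'.
    if not children and el.text and el.text.strip():
        node["#text"] = el.text.strip()
    return node


root = ET.fromstring(XML_TEXT)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=2))
```

Note that a generic conversion like this leaves every value as a string ("6", "15.50"), so the type conversion the question asks about still has to happen somewhere afterwards.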
