Parsing XML data into a dict, but not sure of the most Pythonic way
Suppose I have some XML data about an online product, with several prices:
<Response>
  <TotalOffers>6</TotalOffers>
  <LowPrices>
    <LowPrice condition="new">
      <CurrencyCode>USD</CurrencyCode>
      <Amount>15.50</Amount>
    </LowPrice>
    <LowPrice condition="used">
      <CurrencyCode>USD</CurrencyCode>
      <Amount>22.86</Amount>
    </LowPrice>
  </LowPrices>
</Response>
What I want to end up with is a dict like this:

response = {
    'total_offers': 6,
    'low_prices': [
        {'condition': "new", 'currency': "USD", 'amount': 15.50},
        {'condition': "used", 'currency': "USD", 'amount': 22.86},
    ]
}

With the lxml library this is fairly simple. I just specify the XPath for each value and then handle the exception raised when expected data is missing. For example, to get the TotalOffers value (6), I would do the following:

from lxml import etree

# convert xml to etree object (xml_text holds the XML shown above)
tree_obj = etree.fromstring(xml_text)
# use xpath to find values that I want in this tree object
matched_els = tree_obj.xpath('//TotalOffers')
# xpath matches are returned as a list
# since there could be more than one match grab only the first one
first_match_el = matched_els[0]
# extract the text and print to console
print(first_match_el.text)
# prints: 6
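Now my idea is to write a function like get_text(tree_obj, xpath_to_value). But if I also want this function to convert the value to its proper type (e.g. string, float, or int), should it take a parameter that specifies the type, like get_text(tree_obj, xpath_to_value, type='float')?
If I do that, my next step in building the dict would look something like this: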
# get_text is the proposed helper described above (not written yet)
low_prices = []
# each <LowPrice> element carries one condition/currency/amount set
low_price_els = tree_obj.xpath('//LowPrices/LowPrice')
for el in low_price_els:
    low_prices.append(
        {
            'condition': get_text(el, './@condition', type='str'),
            'currency': get_text(el, './CurrencyCode', type='str'),
            'amount': get_text(el, './Amount', type='float')
        }
    )

response = {
    'total_offers': get_text(tree_obj, '//TotalOffers', type='int'),
    'low_prices': low_prices
}
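
For illustration, here is a minimal sketch of what such a get_text helper could look like. It is not lxml code: to stay free of third-party dependencies it uses the standard library's xml.etree.ElementTree, whose find() supports only a subset of XPath (no ./@condition), so the attribute is read with .get() instead. Passing a callable such as convert=int in place of a type string, and the default parameter, are assumptions made for this sketch and not part of the proposal above.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<Response>
  <TotalOffers>6</TotalOffers>
  <LowPrices>
    <LowPrice condition="new">
      <CurrencyCode>USD</CurrencyCode>
      <Amount>15.50</Amount>
    </LowPrice>
    <LowPrice condition="used">
      <CurrencyCode>USD</CurrencyCode>
      <Amount>22.86</Amount>
    </LowPrice>
  </LowPrices>
</Response>"""


def get_text(node, path, convert=str, default=None):
    """Return convert(text) of the first element matching path, else default.

    `convert` and `default` are hypothetical parameters for this sketch.
    """
    found = node.find(path)
    if found is None or found.text is None:
        return default  # element missing or empty
    try:
        return convert(found.text.strip())
    except ValueError:
        return default  # text is present but cannot be converted


root = ET.fromstring(XML_TEXT)  # root is the <Response> element
response = {
    "total_offers": get_text(root, "TotalOffers", convert=int),
    "low_prices": [
        {
            # ElementTree reads attributes with .get(), not with XPath
            "condition": el.get("condition"),
            "currency": get_text(el, "CurrencyCode"),
            "amount": get_text(el, "Amount", convert=float),
        }
        for el in root.findall("LowPrices/LowPrice")
    ],
}
print(response)
```

Taking a callable rather than a string like 'float' avoids a lookup table from names to types and avoids shadowing the built-in type, and the same shape should carry over to lxml by swapping node.find(path) for the first item of node.xpath(path).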
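Is this the best way to accomplish what I want? I feel like I'm setting myself up for problems down the road.

I think the tool you need is an XML-to-JSON converter, which turns an XML document into JSON. You can try one here:
http://codebeautify.org/xmltojson
Output: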
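
To get a similar result without a website, a small recursive function can do the same kind of conversion locally. The following is only a sketch using the standard library. The function name element_to_dict and its conventions (attributes stored under "@name" keys, repeated tags collected into lists, mixed text under "#text") are assumptions for illustration and need not match what codebeautify.org emits. Note that all values stay strings, so numeric conversion would still be a separate step.

```python
import json
import xml.etree.ElementTree as ET

XML_TEXT = """<Response>
  <TotalOffers>6</TotalOffers>
  <LowPrices>
    <LowPrice condition="new">
      <CurrencyCode>USD</CurrencyCode>
      <Amount>15.50</Amount>
    </LowPrice>
    <LowPrice condition="used">
      <CurrencyCode>USD</CurrencyCode>
      <Amount>22.86</Amount>
    </LowPrice>
  </LowPrices>
</Response>"""


def element_to_dict(el):
    """Recursively convert an Element into nested dicts, lists and strings."""
    # attributes become "@name" keys (a convention chosen for this sketch)
    node = {"@" + name: value for name, value in el.attrib.items()}
    for child in el:
        value = element_to_dict(child)
        if child.tag in node:
            # a repeated tag is turned into a list of values
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (el.text or "").strip()
    if not node:
        return text  # leaf element: just its text
    if text:
        node["#text"] = text  # text alongside attributes or children
    return node


root = ET.fromstring(XML_TEXT)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=2))
```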