Getting data from broken XML in Python
I want to get data from XML, but its structure seems to be broken. I have an example URL:
https://b2b.snapoutdoor.pl/rest/V1/extendvariantstocart/73478
This is XML containing product data:
import requests
import json
from xml.etree import ElementTree
from pprint import pprint
response = requests.get(
"https://b2b.snapoutdoor.pl/rest/V1/extendvariantstocart/86559",
headers={"Accept": "application/xml"},
)
node = ElementTree.fromstring(response.content)
data = json.loads(node.text)
This returns a dict with four keys:
{'jsonChildsConfig': '{"70259":{"id":"70259","name":"Ski Ultra Merino E - '
'black\\/orange","sku":"610306139887","availableQty":6,"regularPrice":69.2367,"finalPrice":69.2367,"promo":false,"discount":0,"bestDiscount":false,"addToCartUrl":"https:\\/\\/b2b.snapoutdoor.pl\\/checkout\\/cart\\/add\\/uenc\\/aHR0cHM6Ly9iMmIuc25hcG91dGRvb3IucGwvcmVzdC9WMS9leHRlbmR2YXJpYW50c3RvY2FydC84NjU1OQ%2C%2C\\/product\\/86559\\/","formKey":"7OWS6VbWucoSg2zg","superAttributes":"36-39 '
'","salable":true},"70260":{"id":"70260","name":"Ski '
'Ultra Merino E - '
'black\\/orange","sku":"610306139894","availableQty":7,"regularPrice":69.2367,"finalPrice":69.2367,"promo":false,"discount":0,"bestDiscount":false,"addToCartUrl":"https:\\/\\/b2b.snapoutdoor.pl\\/checkout\\/cart\\/add\\/uenc\\/aHR0cHM6Ly9iMmIuc25hcG91dGRvb3IucGwvcmVzdC9WMS9leHRlbmR2YXJpYW50c3RvY2FydC84NjU1OQ%2C%2C\\/product\\/86559\\/","formKey":"7OWS6VbWucoSg2zg","superAttributes":"40-43 '
'","salable":true},"70261":{"id":"70261","name":"Ski '
'Ultra Merino E - '
'black\\/orange","sku":"610306139900","availableQty":6,"regularPrice":69.2367,"finalPrice":69.2367,"promo":false,"discount":0,"bestDiscount":false,"addToCartUrl":"https:\\/\\/b2b.snapoutdoor.pl\\/checkout\\/cart\\/add\\/uenc\\/aHR0cHM6Ly9iMmIuc25hcG91dGRvb3IucGwvcmVzdC9WMS9leHRlbmR2YXJpYW50c3RvY2FydC84NjU1OQ%2C%2C\\/product\\/86559\\/","formKey":"7OWS6VbWucoSg2zg","superAttributes":"44-47 '
'","salable":true},"99060":{"id":"99060","name":"Ski '
'Ultra Merino E - '
'black\\/orange","sku":"610306139917","availableQty":3,"regularPrice":69.24,"finalPrice":69.24,"promo":false,"discount":0,"bestDiscount":false,"addToCartUrl":"https:\\/\\/b2b.snapoutdoor.pl\\/checkout\\/cart\\/add\\/uenc\\/aHR0cHM6Ly9iMmIuc25hcG91dGRvb3IucGwvcmVzdC9WMS9leHRlbmR2YXJpYW50c3RvY2FydC84NjU1OQ%2C%2C\\/product\\/86559\\/","formKey":"7OWS6VbWucoSg2zg","superAttributes":"48+ '
'","salable":true}}',
'jsonConfig': 'some data',
'jsonDefaultPlaceholder': 'https://b2b.snapoutdoor.pl/pub/media/catalog/product/placeholder/',
'jsonSwatchConfig': 'some data'
}
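I'm interested in the value of jsonChildsConfig, but when I try to access the keys inside it I get TypeError: string indices must be integers, because the value of jsonChildsConfig is a string.
I want to get all the SKUs and stock values from sku and availableQty, but since the value is a string I can't access
data['jsonChildsConfig']['70259']['sku']
or
data['jsonChildsConfig']['70259']['availableQty']
I also tried converting this string to JSON with json.loads(), but without success.
Can you help me?

Using json.loads to convert the value of data['jsonChildsConfig'] to a dict should work: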
>>> childConfigDetails = json.loads(data['jsonChildsConfig'])
>>> childConfigDetails['70259']['sku']
'610306139887'
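To collect every SKU together with its stock quantity, as the question asks, one option is to loop over the decoded dict. This is a minimal sketch, assuming data has the shape shown in the question; the sample dict is a shortened stand-in for the real response, and extract_stock is a hypothetical helper name, not part of any library.

```python
import json

# Shortened stand-in for the dict from the question (assumption: the real
# response has the same shape, with the inner value JSON-encoded as a string).
data = {
    "jsonChildsConfig": json.dumps({
        "70259": {"id": "70259", "sku": "610306139887", "availableQty": 6},
        "70260": {"id": "70260", "sku": "610306139894", "availableQty": 7},
    })
}

def extract_stock(data):
    """Return {sku: availableQty} from the double-encoded jsonChildsConfig."""
    # Second decode: the value is a JSON string, so turn it into a dict first.
    children = json.loads(data["jsonChildsConfig"])
    return {child["sku"]: child["availableQty"] for child in children.values()}

stock = extract_stock(data)
print(stock)
```

The extra json.loads call is needed because the server returns JSON whose values are themselves JSON-encoded strings, so a single decode only unwraps the outer layer.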
To fix the whole dictionary, apply json.loads to every value except 'jsonDefaultPlaceholder', which is not valid JSON:
del data['jsonDefaultPlaceholder']
new_data = {k: json.loads(v) for k, v in data.items() if v}
new_data['jsonChildsConfig']['70259']['sku']
#output: '610306139887'
Or, if you also want the numeric keys converted to integers (starting again from the original data):
del data['jsonDefaultPlaceholder']
new_data2 = {k: {(int(key) if key.isdigit() else key): val for key,val in json.loads(v).items()} for k, v in data.items() if v}
new_data2['jsonChildsConfig'][70259]['sku']
# output: '610306139887'
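If you would rather not delete keys by hand, a variation is to try decoding every value and keep the ones that are not JSON unchanged. This is only a sketch under that assumption; decode_values is a hypothetical helper, and the sample dict is a shortened stand-in for the data in the question.

```python
import json

def decode_values(data):
    """Decode each JSON-encoded string value; leave other values unchanged."""
    decoded = {}
    for key, value in data.items():
        try:
            decoded[key] = json.loads(value)
        except (TypeError, ValueError):
            # Not valid JSON (e.g. a plain URL) or not a string: keep as-is.
            decoded[key] = value
    return decoded

# Shortened stand-in for the dict from the question.
sample = {
    "jsonChildsConfig": '{"70259": {"sku": "610306139887", "availableQty": 6}}',
    "jsonDefaultPlaceholder": "https://b2b.snapoutdoor.pl/pub/media/catalog/product/placeholder/",
}

result = decode_values(sample)
print(result["jsonChildsConfig"]["70259"]["sku"])
```

One caveat: a plain string that happens to be valid JSON, such as "123", would be decoded as well, so this only fits when that is acceptable.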
json.loads(d['jsonChildsConfig']) worked for me on the example above (where d is the dict from the question).