Parsing nested JSON with Python: TypeError: list indices must be integers, not str
I have nested data that I want to load from JSON into a Pandas DataFrame, but my JSON is nested and the attempt raises an error. Here is the data:
{"data":[{"date":"2018-08-20T00:00:00","values":[{"account":"account_1","device":"device_1","deviceModel":"testdev","id":"id_1","Events":[{"EventCategory":"Scan","EventCategoryData":[{"name":"scanname","info":[{"type":"any","count":8.0}]},{"name":"scanname","info":[{"type":"any","count":1.0}]}],"scancount":2.0},{"EventCategory":"Web","EventCategoryData":[{"name":"web_Scan","info":[{"type":"Web","count":2.0}]},{"name":"web scan 2","info":[{"type":"Web 2","count":0.0}]},{"name":"web 3 ","info":[{"type":"Web 3","count":2.0}]}]},{"EventCategory":"WWW","EventCategoryData":[{"name":"any","info":[{"type":"wifi","count":2.0}]}],"scancount":4.0},{"EventCategory":"Others","EventCategoryData":[{"name":"anything","info":[{"previousversion":"default","updatedversion":"default"}]}]}]}]},{"date":"2018-08-22T00:00:00","values":[{"account":"account_1","device":"device_1","deviceModel":"testdev","id":"id_2","Events":[{"EventCategory":"Scan2","EventCategoryData":[{"name":"scan name","info":[{"type":"scan 2","count":2}]},{"name":"update","info":[{"type":"scan","count":1},{"type":"WWW","count":1}]}],"scancount":1},{"EventCategory":"Web","EventCategoryData":[{"name":"web1","info":[{"type":"WWW","count":1}]},{"name":"Wifi","info":[{"type":"Web Sites","count":1}]},{"name":"web2","info":[{"type":"scan","count":1}]}]}]}]}],"status":"success"}
I tried to normalize it:
normalize_data = json_normalize(data['data'],['values'], record_path ='EventCategory' ,errors='ignore')
TypeError: json_normalize() got multiple values for argument 'record_path'
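I want to build a DataFrame with all keys as columns and the values as rows. Any help would be appreciated.

Use json_normalize(). This cannot be done in a fully generic way, but you can use the record_path and meta parameters to indicate how you want the JSON to be processed.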
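To make the roles of record_path and meta concrete, here is a minimal sketch on a much smaller payload. The `sample` data is hypothetical and only imitates the shape of the question's JSON; it assumes pandas 1.0 or later, where the function is available as `pd.json_normalize`.

```python
import pandas as pd

# Hypothetical miniature payload that imitates the nesting in the question.
sample = [
    {
        "id": "id_1",
        "Events": [
            {"EventCategory": "Scan", "info": [{"type": "any", "count": 8.0}]},
            {"EventCategory": "Web", "info": [{"type": "Web", "count": 2.0}]},
        ],
    }
]

# record_path names the nested lists whose items become rows;
# meta names parent fields that are copied onto each of those rows.
rows = pd.json_normalize(
    sample,
    record_path=["Events", "info"],
    meta=["id", ["Events", "EventCategory"]],
)
print(rows)
```

Each dictionary inside an `info` list becomes one row, and the `id` and `Events.EventCategory` columns repeat the parent values for that row.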
from pandas import json_normalize  # pandas >= 1.0; older releases: from pandas.io.json import json_normalize
data ={"data":[{"date":"2018-08-20T00:00:00","values":[{"account":"account_1","device":"device_1","deviceModel":"testdev","id":"id_1","Events":[{"EventCategory":"Scan","EventCategoryData":[{"name":"scanname","info":[{"type":"any","count":8.0}]},{"name":"scanname","info":[{"type":"any","count":1.0}]}],"scancount":2.0},{"EventCategory":"Web","EventCategoryData":[{"name":"web_Scan","info":[{"type":"Web","count":2.0}]},{"name":"web scan 2","info":[{"type":"Web 2","count":0.0}]},{"name":"web 3 ","info":[{"type":"Web 3","count":2.0}]}]},{"EventCategory":"WWW","EventCategoryData":[{"name":"any","info":[{"type":"wifi","count":2.0}]}],"scancount":4.0},{"EventCategory":"Others","EventCategoryData":[{"name":"anything","info":[{"previousversion":"default","updatedversion":"default"}]}]}]}]},{"date":"2018-08-22T00:00:00","values":[{"account":"account_1","device":"device_1","deviceModel":"testdev","id":"id_2","Events":[{"EventCategory":"Scan2","EventCategoryData":[{"name":"scan name","info":[{"type":"scan 2","count":2}]},{"name":"update","info":[{"type":"scan","count":1},{"type":"WWW","count":1}]}],"scancount":1},{"EventCategory":"Web","EventCategoryData":[{"name":"web1","info":[{"type":"WWW","count":1}]},{"name":"Wifi","info":[{"type":"Web Sites","count":1}]},{"name":"web2","info":[{"type":"scan","count":1}]}]}]}]}],"status":"success"}
# Merge the 'values' lists of every entry in data['data'] into a single flat list
flat_list = [item for sublist in data['data'] for item in sublist['values']]
result = json_normalize(
    flat_list,
    record_path=['Events', 'EventCategoryData', 'info'],
    meta=['account', 'device', 'deviceModel', 'id',
          ['Events', 'EventCategory'],
          # 'name' is taken from each EventCategoryData item, although the
          # column is labelled Events.EventCategory.name
          ['Events', 'EventCategory', 'name']],
)
print(result)
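The flattening step matters because data['data'] is a list, not a dictionary. Indexing a list with a string key is the usual cause of the TypeError quoted in the title. The sketch below uses a hypothetical miniature `payload` to show both the failure and the list comprehension that avoids it.

```python
# Hypothetical miniature payload; only the nesting matters here.
payload = {
    "data": [
        {"date": "2018-08-20T00:00:00", "values": [{"id": "id_1"}]},
        {"date": "2018-08-22T00:00:00", "values": [{"id": "id_2"}]},
    ]
}

try:
    payload["data"]["values"]  # payload["data"] is a list, so a str index fails
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Iterate over the outer list first, then over each inner "values" list.
flat = [item for day in payload["data"] for item in day["values"]]
print(flat)
```

The exact wording of the message varies between Python versions, but it always names list indices and the offending str type.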
Update:
# Flatten data['data'] into a single list and copy each entry's date onto its 'values' records
flat_list = []
for sublist in data['data']:
    for item in sublist['values']:
        # Copy the record so the original data is left unmodified
        flat_list.append({**item, 'date': sublist['date']})
result = json_normalize(
    flat_list,
    record_path=['Events', 'EventCategoryData', 'info'],
    meta=['account', 'device', 'deviceModel', 'id', 'date',
          ['Events', 'EventCategory'],
          ['Events', 'EventCategory', 'name']],
)
print(result)
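The error says that your call to json_normalize passes more than one value for record_path, but the line you pasted does not appear to have more than one argument named record_path. Did you copy the right line?
TypeError Traceback (most recent call last)
1 normalize_data = json_normalize(data['data'], ['values'], record_path='Events', errors='ignore')
TypeError: json_normalize() got multiple values for argument 'record_path'
Your traceback does not display across multiple lines in a comment. Can you edit your question and add it there?
Hey, thanks for the suggestion, Bharatk. So if record_path and meta are the only options, how can I tell the records apart? I would like to add 'date', since the date is the only thing that distinguishes them.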
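On the question raised in the comments: the pasted line does supply record_path twice, once positionally and once by keyword, because record_path is the second positional parameter of json_normalize. The sketch below reproduces the effect with a hypothetical stand-in function, `normalize`, which copies only a simplified subset of the real parameter list and does no actual normalization.

```python
def normalize(data, record_path=None, meta=None, errors="raise"):
    """Hypothetical stand-in with a simplified json_normalize-like signature."""
    return record_path


try:
    # ["values"] is the second positional argument, so it binds to record_path;
    # the keyword record_path="Events" then tries to bind it a second time.
    normalize([], ["values"], record_path="Events", errors="ignore")
except TypeError as exc:
    message = str(exc)
    print(message)

# Passing each argument exactly once, by keyword, avoids the clash.
chosen = normalize([], record_path=["values"], errors="ignore")
print(chosen)
```

With the real function, the same fix applies: give the full path to the records once, as a single record_path list.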
Output of the updated code:
    count  previousversion  type       updatedversion  ...  id    date                 Events.EventCategory  Events.EventCategory.name
0   8.0    NaN              any        NaN             ...  id_1  2018-08-20T00:00:00  Scan                  scanname
1   1.0    NaN              any        NaN             ...  id_1  2018-08-20T00:00:00  Scan                  scanname
2   2.0    NaN              Web        NaN             ...  id_1  2018-08-20T00:00:00  Web                   web_Scan
3   0.0    NaN              Web 2      NaN             ...  id_1  2018-08-20T00:00:00  Web                   web scan 2
4   2.0    NaN              Web 3      NaN             ...  id_1  2018-08-20T00:00:00  Web                   web 3
5   2.0    NaN              wifi       NaN             ...  id_1  2018-08-20T00:00:00  WWW                   any
6   NaN    default          NaN        default         ...  id_1  2018-08-20T00:00:00  Others                anything
7   2.0    NaN              scan 2     NaN             ...  id_2  2018-08-22T00:00:00  Scan2                 scan name
8   1.0    NaN              scan       NaN             ...  id_2  2018-08-22T00:00:00  Scan2                 update
9   1.0    NaN              WWW        NaN             ...  id_2  2018-08-22T00:00:00  Scan2                 update
10  1.0    NaN              WWW        NaN             ...  id_2  2018-08-22T00:00:00  Web                   web1
11  1.0    NaN              Web Sites  NaN             ...  id_2  2018-08-22T00:00:00  Web                   Wifi
12  1.0    NaN              scan       NaN             ...  id_2  2018-08-22T00:00:00  Web                   web2
[13 rows x 11 columns]
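For comparison, rows like these can also be built by walking the nesting by hand, which shows what record_path and meta are doing. This is only a sketch for this particular JSON shape: `iter_rows` is a hypothetical helper, the `sample` is a trimmed copy of the second date in the question's data, and the output keys are chosen for illustration.

```python
def iter_rows(payload):
    """Yield one flat dict per innermost 'info' entry (hypothetical helper)."""
    for day in payload["data"]:
        for value in day["values"]:
            for event in value["Events"]:
                for category_data in event["EventCategoryData"]:
                    for info in category_data["info"]:
                        # Parent fields play the role of meta;
                        # the info dict itself is the record.
                        yield {
                            "date": day["date"],
                            "id": value["id"],
                            "EventCategory": event["EventCategory"],
                            "name": category_data["name"],
                            **info,
                        }


# Trimmed copy of one date from the question's data.
sample = {
    "data": [
        {
            "date": "2018-08-22T00:00:00",
            "values": [
                {
                    "id": "id_2",
                    "Events": [
                        {
                            "EventCategory": "Scan2",
                            "EventCategoryData": [
                                {"name": "scan name",
                                 "info": [{"type": "scan 2", "count": 2}]},
                                {"name": "update",
                                 "info": [{"type": "scan", "count": 1},
                                          {"type": "WWW", "count": 1}]},
                            ],
                        }
                    ],
                }
            ],
        }
    ]
}

rows = list(iter_rows(sample))
for row in rows:
    print(row)
```

The resulting list of dictionaries can be passed to pandas.DataFrame if a DataFrame is needed. The explicit loops are more verbose than json_normalize, but they make it easy to attach fields from any level, such as the date.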