Python: using json_normalize to get meta values from multiple levels
Suppose this is my JSON:
ds = [{
    "name": "groupa",
    "subGroups": [{
        "subGroup": 1,
        "people": [{
            "firstname": "Tony",
        },
        {
            "firstname": "Brian"
        }
        ]
    }]
},
{
    "name": "groupb",
    "subGroups": [{
        "subGroup": 1,
        "people": [{
            "firstname": "Tony",
        },
        {
            "firstname": "Brian"
        }
        ]
    }]
}
]
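I create the DataFrame like this:
from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize
df = json_normalize(ds, record_path=['subGroups', 'people'], meta=['name'])
This gives me: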
  firstname    name
0      Tony  groupa
1     Brian  groupa
2      Tony  groupb
3     Brian  groupb
However, I also want to include the subGroup column.
I tried:
df = json_normalize(ds, record_path=['subGroups', 'people'], meta=['name', 'subGroup'])
But this gives:
KeyError: 'subGroup'
Any ideas?
Try this:
json_normalize(
ds,
record_path=['subGroups', 'people'],
meta=[
'name',
['subGroups', 'subGroup'] # each meta field needs its own path
],
errors='ignore'
)
  firstname    name subGroups.subGroup
0      Tony  groupa                  1
1     Brian  groupa                  1
2      Tony  groupb                  1
3     Brian  groupb                  1
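For reference, here is a self-contained sketch of the same call that can be run as-is. It assumes pandas 1.0 or later, where the function is available as pd.json_normalize. The rename step and the shorter column name are additions for illustration only and are not part of the answer above.

```python
import pandas as pd

# Same data as in the question, written compactly.
people = [{"firstname": "Tony"}, {"firstname": "Brian"}]
ds = [
    {"name": "groupa", "subGroups": [{"subGroup": 1, "people": people}]},
    {"name": "groupb", "subGroups": [{"subGroup": 1, "people": people}]},
]

df = pd.json_normalize(
    ds,
    record_path=["subGroups", "people"],
    meta=[
        "name",                       # found on the top-level record
        ["subGroups", "subGroup"],    # full path to a key one level down
    ],
)

# Optional (illustrative): shorten the dotted column name produced from the path.
df = df.rename(columns={"subGroups.subGroup": "subGroup"})
print(df)
```

The nested meta entry is written as a list because it describes a path from the top-level record, in the same way record_path does.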
df = json_normalize(ds, record_path=['subGroups', 'people'], meta=['name', ['subGroups', 'subGroup']])
This pulls name out as its own column.
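One way to see why the original attempt raised KeyError: a plain string in meta is looked up on the top-level record, and subGroup only exists one level down, inside each entry of subGroups. The sketch below checks that directly and then builds the same rows by hand with a list comprehension. It only illustrates the nesting and is not how pandas implements json_normalize internally; the variable names are chosen for the example.

```python
# Same data as in the question, written compactly.
people = [{"firstname": "Tony"}, {"firstname": "Brian"}]
ds = [
    {"name": "groupa", "subGroups": [{"subGroup": 1, "people": people}]},
    {"name": "groupb", "subGroups": [{"subGroup": 1, "people": people}]},
]

# 'subGroup' is not a key of the top-level record, only of its subGroups entries.
top_level_has_key = "subGroup" in ds[0]
nested_has_key = "subGroup" in ds[0]["subGroups"][0]
print(top_level_has_key, nested_has_key)

# Walk the three levels explicitly: group -> subGroup -> person.
rows = [
    {
        "firstname": person["firstname"],
        "name": group["name"],
        "subGroup": sub["subGroup"],
    }
    for group in ds
    for sub in group["subGroups"]
    for person in sub["people"]
]
print(rows)
```

Passing rows to pandas.DataFrame would give the same three columns, which can be handy when the nesting is too irregular for a single json_normalize call.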