Python 3.x: Building a table from nested values with json_normalize
I'm having trouble using json_normalize when the record path points to a column that holds a nested dict, which in turn contains a list. See the example below. Given the following:
import pandas as pd

list_of_dict = [
{
'SCHOOL_NAME': 'SCHOOL_A',
'STUDENTS': [
{
'STUDENT_NAME': 'JOHN',
'STUDENT_ID': '1'
},
{
'STUDENT_NAME': 'JANE',
'STUDENT_ID': '2'
},
]
},
{
'SCHOOL_NAME': 'SCHOOL_B',
'STUDENTS': [
{
'STUDENT_NAME': 'HENRY',
'STUDENT_ID': '1'
},
{
'STUDENT_NAME': 'MARK',
'STUDENT_ID': '2'
},
]
}]
I can flatten it with
pd.json_normalize(data=list_of_dict, record_path='STUDENTS', meta=['SCHOOL_NAME'])[['SCHOOL_NAME', 'STUDENT_ID', 'STUDENT_NAME']]
to get the following:
  SCHOOL_NAME STUDENT_ID STUDENT_NAME
0    SCHOOL_A          1         JOHN
1    SCHOOL_A          2         JANE
2    SCHOOL_B          1        HENRY
3    SCHOOL_B          2         MARK
How can I get a similar output format if the list of dicts is structured as follows?
**Note the added STUDENT_LIST level**
list_of_dict = [
{
'SCHOOL_NAME': 'SCHOOL_A',
'STUDENT_LIST':{
'STUDENTS': [
{
'STUDENT_NAME': 'JOHN',
'STUDENT_ID': '1'
},
{
'STUDENT_NAME': 'JANE',
'STUDENT_ID': '2'
},
]
}
},
{
'SCHOOL_NAME': 'SCHOOL_B',
'STUDENT_LIST': {
'STUDENTS': [
{
'STUDENT_NAME': 'HENRY',
'STUDENT_ID': '1'
},
{
'STUDENT_NAME': 'MARK',
'STUDENT_ID': '2'
},
]
}
}]
Use a list comprehension with dict unpacking and pop:
# Pop the `STUDENT_LIST` key and merge its contents into the parent dict, so each
# record has the same shape as before (note: `pop` mutates `list_of_dict` in place)
In [680]: a = [{**x, **x.pop('STUDENT_LIST')} for x in list_of_dict]
# Now use `json_normalize`
In [684]: pd.json_normalize(a, record_path='STUDENTS', meta=['SCHOOL_NAME'])
Out[684]:
  STUDENT_NAME STUDENT_ID SCHOOL_NAME
0         JOHN          1    SCHOOL_A
1         JANE          2    SCHOOL_A
2        HENRY          1    SCHOOL_B
3         MARK          2    SCHOOL_B
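
As a possible alternative that leaves the input untouched, json_normalize also accepts a list for record_path, which lets it walk through the STUDENT_LIST dict to reach STUDENTS. The sketch below assumes pandas 1.0 or later (where pd.json_normalize is available at the top level) and reuses the data from the question in compact form. The variable name df is only illustrative.

```python
import pandas as pd

# Same nested shape as in the question: STUDENTS sits inside a STUDENT_LIST dict.
list_of_dict = [
    {
        'SCHOOL_NAME': 'SCHOOL_A',
        'STUDENT_LIST': {
            'STUDENTS': [
                {'STUDENT_NAME': 'JOHN', 'STUDENT_ID': '1'},
                {'STUDENT_NAME': 'JANE', 'STUDENT_ID': '2'},
            ]
        },
    },
    {
        'SCHOOL_NAME': 'SCHOOL_B',
        'STUDENT_LIST': {
            'STUDENTS': [
                {'STUDENT_NAME': 'HENRY', 'STUDENT_ID': '1'},
                {'STUDENT_NAME': 'MARK', 'STUDENT_ID': '2'},
            ]
        },
    },
]

# Passing record_path as a list walks STUDENT_LIST -> STUDENTS for each school.
# Nothing is popped, so list_of_dict keeps its original structure.
df = pd.json_normalize(
    list_of_dict,
    record_path=['STUDENT_LIST', 'STUDENTS'],
    meta=['SCHOOL_NAME'],
)[['SCHOOL_NAME', 'STUDENT_ID', 'STUDENT_NAME']]

print(df)
```

This avoids the side effect of pop, which matters if list_of_dict is needed again later in its nested form.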