Python: loop over a JSON object and store the results in a DataFrame
I have a JSON object that looks like this:
data = {'A': {'code': 'Ok',
'tracepoints': [None,
None,
{'alternatives_count': 0,
'location': [-122.419189, 37.753805],
'distance': 28.078003,
'hint': '5Qg7hUqpFQA2AAAAOgAAAAwAAAAPAAAAiVMWQq2VIEIAuABB7FgoQTYAAAA6AAAADAAAAA8AAAD4RAAACwi0-M0TQALvB7T4yRRAAgEAXwX5Wu6N',
'name': '23rd Street',
'matchings_index': 0,
'waypoint_index': 0},
{'alternatives_count': 0,
'location': [-122.417866, 37.75389],
'distance': 26.825184,
'hint': 'K8w6BRinFYAdAAAACwAAAA0AAAAAAAAAIxmmQTSs6kCiuRFBAAAAAB0AAAALAAAADQAAAAAAAAD4RAAANg20-CIUQAJNDbT4MRNAAgIAnxD5Wu6N',
'name': '23rd Street',
'matchings_index': 0,
'waypoint_index': 1},
{'alternatives_count': 0,
'location': [-122.416896, 37.75395],
'distance': 16.583412,
'hint': 'Jcw6BSzMOoUqAAAAQwAAABAAAAANAAAA0i_uQb3SOEKKPC9BG1EaQSoAAABDAAAAEAAAAA0AAAD4RAAAABG0-F4UQALyELT48xRAAgEAnxD5Wu6N',
'name': '23rd Street',
'matchings_index': 0,
'waypoint_index': 2},
{'alternatives_count': 7,
'location': [-122.415502, 37.754028],
'distance': 10.013916,
'hint': 'Jsw6hbN6kQBmAAAACAAAABAAAAANAAAAQOKOQg89nkCKPC9BEMcOQWYAAAAIAAAAEAAAAA0AAAD4RAAAcha0-KwUQAJ6FrT4UhRAAgEAbwX5Wu6N',
'name': '23rd Street',
'matchings_index': 0,
'waypoint_index': 3}],
'matchings': [{'duration': 50.6,
'distance': 325.2,
'weight': 50.6,
'geometry': 'y{h_gAh~znhF}@k[OmFMoFcAea@IeD[uMAYKsDMsDAe@}@u_@g@aTMwFMwFwAqq@',
'confidence': 0.374625,
'weight_name': 'routability',
'legs': [{'steps': [],
'weight': 18.8,
'distance': 116.7,
'annotation': {'nodes': [1974590926,
4763953263,
65359046,
4763953265,
5443374298,
2007343352]},
'summary': '',
'duration': 18.8},
{'steps': [],
'weight': 12.2,
'distance': 85.6,
'annotation': {'nodes': [5443374298,
2007343352,
4763953266,
65359043,
4763953269,
2007343354,
4763953270]},
'summary': '',
'duration': 12.2},
{'steps': [],
'weight': 19.6,
'distance': 122.9,
'annotation': {'nodes': [2007343354,
4763953270,
65334199,
4763953274,
2007343347]},
'summary': '',
'duration': 19.6}]}]},
'B': {'code': 'Ok',
'tracepoints': [{'alternatives_count': 0,
'location': [-122.387971, 37.727587],
'distance': 11.53267,
'hint': 'xHWRAEJ2kYALAAAArQAAAA4AAAAsAAAAnpH1QDVG8EJWgBdBa2v0QQsAAACtAAAADgAAACwAAAD4RAAA_YG0-GOtPwJKgrT4t60_AgIA3wf5Wu6N',
'name': 'Underwood Avenue',
'matchings_index': 0,
'waypoint_index': 0},
{'alternatives_count': 0,
'location': [-122.388563, 37.727175],
'distance': 13.565054,
'hint': 'w3WRgBuxOgVPAAAACAAAABMAAAASAAAA7ONaQo4CrUDv7U1BJdFAQU8AAAAIAAAAEwAAABIAAAD4RAAArX-0-MerPwIsgLT4gqs_AgIAbw35Wu6N',
'name': 'Jennings Street',
'matchings_index': 0,
'waypoint_index': 1},
{'alternatives_count': 1,
'location': [-122.388478, 37.725984],
'distance': 9.601917,
'hint': 't3WRABexOoWcAAAAbAAAABEAAAALAAAAdujYQqu4lUJXHD1B9-ruQJwAAABsAAAAEQAAAAsAAAD4RAAAAoC0-CCnPwJCgLT4Zqc_AgIAHxP5Wu6N',
'name': 'Wallace Avenue',
'matchings_index': 0,
'waypoint_index': 2}],
'matchings': [{'duration': 50,
'distance': 270.4,
'weight': 50,
'geometry': 'euu}fAd_~lhFoAlCMTuAvCvC|Bh@`@hXbUnAdADBhDzCzClCXVzZnW\\X~CnC~@qBLWnWej@',
'confidence': 1e-06,
'weight_name': 'routability',
'legs': [{'steps': [],
'weight': 17.8,
'distance': 84.8,
'annotation': {'nodes': [5443147626,
6360865540,
6360865536,
65307580,
6360865535,
6360865539,
6360865531]},
'summary': '',
'duration': 17.8},
{'steps': [],
'weight': 32.2,
'distance': 185.6,
'annotation': {'nodes': [6360865539,
6360865531,
6360865525,
65343521,
6360865527,
6360865529,
6360865523,
6360865520,
65321110,
6360865519,
6360865522,
6376329343]},
'summary': '',
'duration': 32.2}]}]},
'C': {'code': 'Ok',
'tracepoints': [None,
None,
{'alternatives_count': 0,
'location': [-122.443682, 37.713254],
'distance': 6.968076,
'hint': 'QXo6hUR6OgUAAAAANQAAAAAAAAAkAAAAAAAAAOCMMUEAAAAA_Z1yQQAAAAAbAAAAAAAAACQAAAD4RAAAXqiz-GZ1PwKiqLP4hnU_AgAAzxL5Wu6N',
'name': '',
'matchings_index': 0,
'waypoint_index': 0},
{'alternatives_count': 0,
'location': [-122.442428, 37.714335],
'distance': 16.488956,
'hint': 'E3o6BVRukYAJAAAAIgAAAGgAAAAUAAAA2RnSQL_5uUEPjI9CBTlaQQkAAAAiAAAAaAAAABQAAAD4RAAARK2z-J95PwKTrLP4b3k_AgEAXxX5Wu6N',
'name': 'Allison Street',
'matchings_index': 0,
'waypoint_index': 1},
{'alternatives_count': 1,
'location': [-122.441751, 37.712761],
'distance': 17.311636,
'hint': 'Fno6hRl6OgWZAAAANwAAAAAAAAAKAAAAH4vUQgKXFkIAAAAAXtbYQJkAAAA3AAAAAAAAAAoAAAD4RAAA6a-z-HlzPwKjsLP4q3M_AgAAHwr5Wu6N',
'name': 'Allison Street',
'matchings_index': 0,
'waypoint_index': 2}],
'matchings': [{'duration': 64.1,
'distance': 420.1,
'weight': 66.7,
'geometry': 'kuy|fAbyjphFcBxEmE`FqJkKiBqBuP}Qgc@ie@eAiAcB}ArA_Eb@mAjKkDnBo@fe@mOrw@kW',
'confidence': 7.3e-05,
'weight_name': 'routability',
'legs': [{'steps': [],
'weight': 40.1,
'distance': 235.2,
'annotation': {'nodes': [5440513673,
5440513674,
5440513675,
65363070,
1229920760,
65307726,
6906452420,
1229920717,
65361047,
1229920749,
554163599,
3978809925]},
'summary': '',
'duration': 37.5},
{'steps': [],
'weight': 26.6,
'distance': 184.9,
'annotation': {'nodes': [554163599, 3978809925, 65345518, 8256268328]},
'summary': '',
'duration': 26.6}]}]}}
I want to extract the values under the nodes key for each of the users A, B and C, and store them in a DataFrame together with the corresponding user, like this:
value user
1974590926 A
4763953263 A
65359046 A
4763953265 A
5443374298 A
2007343352 A
5443374298 A
2007343352 A
4763953266 A
65359043 A
4763953269 A
2007343354 A
4763953270 A
2007343354 A
4763953270 A
65334199 A
4763953274 A
2007343347 A
5443147626 B
6360865540 B
6360865536 B
65307580 B
6360865535 B
6360865539 B
6360865531 B
6360865539 B
6360865531 B
6360865525 B
65343521 B
6360865527 B
6360865529 B
6360865523 B
6360865520 B
65321110 B
6360865519 B
6360865522 B
6376329343 B
5440513673 C
5440513674 C
5440513675 C
65363070 C
1229920760 C
65307726 C
6906452420 C
1229920717 C
65361047 C
1229920749 C
554163599 C
3978809925 C
554163599 C
3978809925 C
65345518 C
8256268328 C
With the code below I was able to extract the nodes belonging to user C and store them in a DataFrame. However, I am struggling to add the user column and to include the other nodes with their corresponding users. Any ideas?
import pandas as pd

values_df = pd.DataFrame({'value': {}})
for leg in output['C']['matchings'][0]['legs']:
    result = leg['annotation']['nodes']
    values_temp = pd.DataFrame(result, columns=['value'])
    values_df = values_df.append(values_temp, ignore_index=True)
values_df.value = values_df.value.astype(int)
values_df
value
0 5440513673
1 5440513674
2 5440513675
3 65363070
4 1229920760
5 65307726
6 6906452420
7 1229920717
8 65361047
9 1229920749
10 554163599
11 3978809925
12 554163599
13 3978809925
14 65345518
15 8256268328
You can use `pd.json_normalize` with `record_path`, then add the user:

dfs = []
for user in output.keys():
    df = pd.json_normalize(output, record_path=[user, 'matchings', 'legs', 'annotation', 'nodes'])
    df['user'] = user
    dfs.append(df)
nodes_df = pd.concat(dfs).rename(columns={0: 'node'})

          node user
    1974590926    A
    4763953263    A
      65359046    A
           ...  ...
    3978809925    C
      65345518    C
    8256268328    C

If some users are missing matchings, you can check whether 'matchings' is in output[user]:

dfs = []
for user in output.keys():
    if 'matchings' in output[user]:
        df = pd.json_normalize(output, record_path=[user, 'matchings', 'legs', 'annotation', 'nodes'])
        df['user'] = user
        dfs.append(df)
nodes_df = pd.concat(dfs).rename(columns={0: 'node'})

If the output keys are tuples like ('2018-02-03', 'A') and you iterate over them as trip, you need to access the date and the user as trip[0] and trip[1]:

dfs = []
for trip in output.keys():
    if 'matchings' in output[trip]:
        df = pd.json_normalize(output, record_path=[trip, 'matchings', 'legs', 'annotation', 'nodes'])
        df['date'] = trip[0]
        df['user'] = trip[1]
        dfs.append(df)
nodes_df = pd.concat(dfs).rename(columns={0: 'node'})
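To sanity-check the record_path pattern end to end, here is a minimal, self-contained sketch on a toy dict shaped like the response above (the node ids are made up):

```python
import pandas as pd

# Toy stand-in for the map-matching response; node ids are made up.
output = {
    'A': {'matchings': [{'legs': [
        {'annotation': {'nodes': [1, 2, 3]}},
        {'annotation': {'nodes': [3, 4]}},
    ]}]},
    'B': {'matchings': [{'legs': [
        {'annotation': {'nodes': [10, 11]}},
    ]}]},
}

dfs = []
for user in output.keys():
    # record_path descends user key -> matchings -> legs -> annotation -> nodes;
    # the scalar node values land in a column named 0.
    df = pd.json_normalize(output, record_path=[user, 'matchings', 'legs', 'annotation', 'nodes'])
    df['user'] = user
    dfs.append(df)

nodes_df = pd.concat(dfs, ignore_index=True).rename(columns={0: 'node'})
print(nodes_df)
```

Each leg contributes its own rows, so a node shared by two consecutive legs appears twice, matching the desired output in the question.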
We want to collect all the node values inside [legs]. The simplest approach, using plain for loops:

nodes = []
user = []
for i in output.keys():
    for j in output[i]['matchings'][0]['legs']:
        for k in j['annotation']['nodes']:
            nodes.append(k)
            user.append(i)
d = {'nodes': nodes, 'user': user}
df = pd.DataFrame(data=d)
print(df)
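The same triple loop can be packaged as a function that builds plain lists and constructs the DataFrame once at the end; a minimal sketch on toy data (made-up node ids, hypothetical function name):

```python
import pandas as pd

def nodes_table(output):
    # Flatten matchings -> legs -> annotation -> nodes, tagging each node
    # with the top-level key it came from.
    nodes, users = [], []
    for key, payload in output.items():
        for leg in payload['matchings'][0]['legs']:
            leg_nodes = leg['annotation']['nodes']
            nodes.extend(leg_nodes)
            users.extend([key] * len(leg_nodes))
    return pd.DataFrame({'nodes': nodes, 'user': users})

# Toy data shaped like the response above:
toy = {'A': {'matchings': [{'legs': [{'annotation': {'nodes': [1, 2]}}]}]},
       'B': {'matchings': [{'legs': [{'annotation': {'nodes': [3]}}]}]}}
df = nodes_table(toy)
print(df)
```

Appending to lists and building the frame once is cheaper than appending to a DataFrame row by row.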
Before reassembling the data in a DataFrame, you can use the jmespath module to extract it; since the iteration happens inside the library, it should speed things up a bit. The gist of jmespath: use a dot when accessing a key, and [] when the data is inside a list:

# pip install jmespath
import jmespath
from itertools import chain
query = {letter: jmespath.compile(f"{letter}.matchings[].legs[].annotation.nodes")
         for letter in ("A", "B", "C")}
result = {letter: pd.DataFrame(chain.from_iterable(expression.search(output)),
                               columns=['node'])
          for letter, expression in query.items()}
result = pd.concat(result).droplevel(-1).rename_axis(index='user').reset_index()
result.head(15)
user node
0 A 1974590926
1 A 4763953263
2 A 65359046
3 A 4763953265
4 A 5443374298
5 A 2007343352
6 A 5443374298
7 A 2007343352
8 A 4763953266
9 A 65359043
10 A 4763953269
11 A 2007343354
12 A 4763953270
13 A 2007343354
14 A 4763953270
Comments:

What is the output of your current code? — I've updated my question with the output of my current code. — Is output your JSON object? And you had set user = 'C'? — output is indeed my JSON object. The values A, B and C in the JSON object are the user keys.

I noticed that when a user, say C, has no 'matchings', for example 'C': {'message': 'Could not find a matching segment for any coordinate', 'code': 'NoSegment'}, the code above throws a KeyError: 'matchings'. How can I prevent that? — @sampeterson It seems pandas cannot handle this case directly. I updated the answer with a workaround that tests whether 'matchings' exists in output[user]. — One last question: if my JSON object has a date key in addition to the user keys, like {('2018-02-03', 'A'): {'code': 'Ok', 'tracepoints': [None…, how can I add the date as an extra date column? I hoped I could run dfs = []; for trip in output.keys(): if 'matchings' in output[trip]: df = pd.json_normalize(output, record_path=[date, user, 'matchings', 'legs', 'annotation', 'nodes']); df['date'] = date; df['user'] = user; dfs.append(df); nodes_df = pd.concat(dfs).rename(columns={0: 'node'}), but it gives me an error. — @sampeterson Check the update.
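For the tuple-keyed case raised in the comments, a plain loop sidesteps json_normalize entirely; a minimal sketch, assuming hypothetical (date, user) keys and made-up node ids, that also skips entries without 'matchings':

```python
import pandas as pd

# Hypothetical output keyed by (date, user); 'B' has no match result.
output = {
    ('2018-02-03', 'A'): {'code': 'Ok',
                          'matchings': [{'legs': [{'annotation': {'nodes': [1, 2]}}]}]},
    ('2018-02-03', 'B'): {'code': 'NoSegment',
                          'message': 'Could not find a matching segment'},
}

rows = []
for (date, user), payload in output.items():
    if 'matchings' not in payload:  # skip users without a map match
        continue
    for leg in payload['matchings'][0]['legs']:
        for node in leg['annotation']['nodes']:
            rows.append({'node': node, 'date': date, 'user': user})

nodes_df = pd.DataFrame(rows)
print(nodes_df)
```

Unpacking each key directly as (date, user) avoids the trip[0]/trip[1] indexing, and the 'matchings' guard prevents the KeyError for users with no matched segments.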