Python: looping over a JSON object and storing the results in a DataFrame

I have a JSON object that looks like this:

data = {'A': {'code': 'Ok',
  'tracepoints': [None,
   None,
   {'alternatives_count': 0,
    'location': [-122.419189, 37.753805],
    'distance': 28.078003,
    'hint': '5Qg7hUqpFQA2AAAAOgAAAAwAAAAPAAAAiVMWQq2VIEIAuABB7FgoQTYAAAA6AAAADAAAAA8AAAD4RAAACwi0-M0TQALvB7T4yRRAAgEAXwX5Wu6N',
    'name': '23rd Street',
    'matchings_index': 0,
    'waypoint_index': 0},
   {'alternatives_count': 0,
    'location': [-122.417866, 37.75389],
    'distance': 26.825184,
    'hint': 'K8w6BRinFYAdAAAACwAAAA0AAAAAAAAAIxmmQTSs6kCiuRFBAAAAAB0AAAALAAAADQAAAAAAAAD4RAAANg20-CIUQAJNDbT4MRNAAgIAnxD5Wu6N',
    'name': '23rd Street',
    'matchings_index': 0,
    'waypoint_index': 1},
   {'alternatives_count': 0,
    'location': [-122.416896, 37.75395],
    'distance': 16.583412,
    'hint': 'Jcw6BSzMOoUqAAAAQwAAABAAAAANAAAA0i_uQb3SOEKKPC9BG1EaQSoAAABDAAAAEAAAAA0AAAD4RAAAABG0-F4UQALyELT48xRAAgEAnxD5Wu6N',
    'name': '23rd Street',
    'matchings_index': 0,
    'waypoint_index': 2},
   {'alternatives_count': 7,
    'location': [-122.415502, 37.754028],
    'distance': 10.013916,
    'hint': 'Jsw6hbN6kQBmAAAACAAAABAAAAANAAAAQOKOQg89nkCKPC9BEMcOQWYAAAAIAAAAEAAAAA0AAAD4RAAAcha0-KwUQAJ6FrT4UhRAAgEAbwX5Wu6N',
    'name': '23rd Street',
    'matchings_index': 0,
    'waypoint_index': 3}],
  'matchings': [{'duration': 50.6,
    'distance': 325.2,
    'weight': 50.6,
    'geometry': 'y{h_gAh~znhF}@k[OmFMoFcAea@IeD[uMAYKsDMsDAe@}@u_@g@aTMwFMwFwAqq@',
    'confidence': 0.374625,
    'weight_name': 'routability',
    'legs': [{'steps': [],
      'weight': 18.8,
      'distance': 116.7,
      'annotation': {'nodes': [1974590926,
        4763953263,
        65359046,
        4763953265,
        5443374298,
        2007343352]},
      'summary': '',
      'duration': 18.8},
     {'steps': [],
      'weight': 12.2,
      'distance': 85.6,
      'annotation': {'nodes': [5443374298,
        2007343352,
        4763953266,
        65359043,
        4763953269,
        2007343354,
        4763953270]},
      'summary': '',
      'duration': 12.2},
     {'steps': [],
      'weight': 19.6,
      'distance': 122.9,
      'annotation': {'nodes': [2007343354,
        4763953270,
        65334199,
        4763953274,
        2007343347]},
      'summary': '',
      'duration': 19.6}]}]},
 'B': {'code': 'Ok',
  'tracepoints': [{'alternatives_count': 0,
    'location': [-122.387971, 37.727587],
    'distance': 11.53267,
    'hint': 'xHWRAEJ2kYALAAAArQAAAA4AAAAsAAAAnpH1QDVG8EJWgBdBa2v0QQsAAACtAAAADgAAACwAAAD4RAAA_YG0-GOtPwJKgrT4t60_AgIA3wf5Wu6N',
    'name': 'Underwood Avenue',
    'matchings_index': 0,
    'waypoint_index': 0},
   {'alternatives_count': 0,
    'location': [-122.388563, 37.727175],
    'distance': 13.565054,
    'hint': 'w3WRgBuxOgVPAAAACAAAABMAAAASAAAA7ONaQo4CrUDv7U1BJdFAQU8AAAAIAAAAEwAAABIAAAD4RAAArX-0-MerPwIsgLT4gqs_AgIAbw35Wu6N',
    'name': 'Jennings Street',
    'matchings_index': 0,
    'waypoint_index': 1},
   {'alternatives_count': 1,
    'location': [-122.388478, 37.725984],
    'distance': 9.601917,
    'hint': 't3WRABexOoWcAAAAbAAAABEAAAALAAAAdujYQqu4lUJXHD1B9-ruQJwAAABsAAAAEQAAAAsAAAD4RAAAAoC0-CCnPwJCgLT4Zqc_AgIAHxP5Wu6N',
    'name': 'Wallace Avenue',
    'matchings_index': 0,
    'waypoint_index': 2}],
  'matchings': [{'duration': 50,
    'distance': 270.4,
    'weight': 50,
    'geometry': 'euu}fAd_~lhFoAlCMTuAvCvC|Bh@`@hXbUnAdADBhDzCzClCXVzZnW\\X~CnC~@qBLWnWej@',
    'confidence': 1e-06,
    'weight_name': 'routability',
    'legs': [{'steps': [],
      'weight': 17.8,
      'distance': 84.8,
      'annotation': {'nodes': [5443147626,
        6360865540,
        6360865536,
        65307580,
        6360865535,
        6360865539,
        6360865531]},
      'summary': '',
      'duration': 17.8},
     {'steps': [],
      'weight': 32.2,
      'distance': 185.6,
      'annotation': {'nodes': [6360865539,
        6360865531,
        6360865525,
        65343521,
        6360865527,
        6360865529,
        6360865523,
        6360865520,
        65321110,
        6360865519,
        6360865522,
        6376329343]},
      'summary': '',
      'duration': 32.2}]}]},
 'C': {'code': 'Ok',
  'tracepoints': [None,
   None,
   {'alternatives_count': 0,
    'location': [-122.443682, 37.713254],
    'distance': 6.968076,
    'hint': 'QXo6hUR6OgUAAAAANQAAAAAAAAAkAAAAAAAAAOCMMUEAAAAA_Z1yQQAAAAAbAAAAAAAAACQAAAD4RAAAXqiz-GZ1PwKiqLP4hnU_AgAAzxL5Wu6N',
    'name': '',
    'matchings_index': 0,
    'waypoint_index': 0},
   {'alternatives_count': 0,
    'location': [-122.442428, 37.714335],
    'distance': 16.488956,
    'hint': 'E3o6BVRukYAJAAAAIgAAAGgAAAAUAAAA2RnSQL_5uUEPjI9CBTlaQQkAAAAiAAAAaAAAABQAAAD4RAAARK2z-J95PwKTrLP4b3k_AgEAXxX5Wu6N',
    'name': 'Allison Street',
    'matchings_index': 0,
    'waypoint_index': 1},
   {'alternatives_count': 1,
    'location': [-122.441751, 37.712761],
    'distance': 17.311636,
    'hint': 'Fno6hRl6OgWZAAAANwAAAAAAAAAKAAAAH4vUQgKXFkIAAAAAXtbYQJkAAAA3AAAAAAAAAAoAAAD4RAAA6a-z-HlzPwKjsLP4q3M_AgAAHwr5Wu6N',
    'name': 'Allison Street',
    'matchings_index': 0,
    'waypoint_index': 2}],
  'matchings': [{'duration': 64.1,
    'distance': 420.1,
    'weight': 66.7,
    'geometry': 'kuy|fAbyjphFcBxEmE`FqJkKiBqBuP}Qgc@ie@eAiAcB}ArA_Eb@mAjKkDnBo@fe@mOrw@kW',
    'confidence': 7.3e-05,
    'weight_name': 'routability',
    'legs': [{'steps': [],
      'weight': 40.1,
      'distance': 235.2,
      'annotation': {'nodes': [5440513673,
        5440513674,
        5440513675,
        65363070,
        1229920760,
        65307726,
        6906452420,
        1229920717,
        65361047,
        1229920749,
        554163599,
        3978809925]},
      'summary': '',
      'duration': 37.5},
     {'steps': [],
      'weight': 26.6,
      'distance': 184.9,
      'annotation': {'nodes': [554163599, 3978809925, 65345518, 8256268328]},
      'summary': '',
      'duration': 26.6}]}]}}
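I want to extract the values under the key nodes for each of the users A, B and C, and store those values in a DataFrame together with the corresponding user, as shown below: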

    value        user
    1974590926  A
    4763953263  A
    65359046    A
    4763953265  A
    5443374298  A
    2007343352  A
    5443374298  A
    2007343352  A
    4763953266  A
    65359043    A
    4763953269  A
    2007343354  A
    4763953270  A
    2007343354  A
    4763953270  A
    65334199    A
    4763953274  A
    2007343347  A
    5443147626  B
    6360865540  B
    6360865536  B
    65307580    B
    6360865535  B
    6360865539  B
    6360865531  B
    6360865539  B
    6360865531  B
    6360865525  B
    65343521    B
    6360865527  B
    6360865529  B
    6360865523  B
    6360865520  B
    65321110    B
    6360865519  B
    6360865522  B
    6376329343  B
    5440513673  C
    5440513674  C
    5440513675  C
    65363070    C
    1229920760  C
    65307726    C
    6906452420  C
    1229920717  C
    65361047    C
    1229920749  C
    554163599   C
    3978809925  C
    554163599   C
    3978809925  C
    65345518    C
    8256268328  C
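With the code below I was able to extract the nodes belonging to user C and store them in a DataFrame. However, I am struggling to add the user column, and to add the nodes of the other users together with their corresponding user. Any ideas?

import pandas as pd

# `output` is the JSON object shown above (assigned to `data` there)
user = 'C'
frames = []

for leg in output[user]['matchings'][0]['legs']:
    # one small DataFrame per leg, holding that leg's node IDs
    frames.append(pd.DataFrame(leg['annotation']['nodes'], columns=['value']))

# DataFrame.append was removed in pandas 2.0, so the pieces are concatenated instead
values_df = pd.concat(frames, ignore_index=True)
values_df['value'] = values_df['value'].astype(int)
values_df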

    value
0   5440513673
1   5440513674
2   5440513675
3   65363070
4   1229920760
5   65307726
6   6906452420
7   1229920717
8   65361047
9   1229920749
10  554163599
11  3978809925
12  554163599
13  3978809925
14  65345518
15  8256268328
You can use pd.json_normalize with record_path, and then add the user:

dfs = []
for user in output.keys():
    df = pd.json_normalize(output, record_path=[user, 'matchings', 'legs', 'annotation', 'nodes'])
    df['user'] = user
    dfs.append(df)

nodes_df = pd.concat(dfs).rename(columns={0: 'node'})

    node        user
    1974590926  A
    4763953263  A
    65359046    A
    ...         ...
    3978809925  C
    65345518    C
    8256268328  C

If some users have no matchings, you can check whether 'matchings' is in output[user]:

dfs = []
for user in output.keys():
    if 'matchings' in output[user]:
        df = pd.json_normalize(output, record_path=[user, 'matchings', 'legs', 'annotation', 'nodes'])
        df['user'] = user
        dfs.append(df)

nodes_df = pd.concat(dfs).rename(columns={0: 'node'})

If the output keys are tuples like ('2018-02-03', 'A') and you iterate over them as trip, you need to access the date and the user as trip[0] and trip[1]:

dfs = []
for trip in output.keys():
    if 'matchings' in output[trip]:
        df = pd.json_normalize(output, record_path=[trip, 'matchings', 'legs', 'annotation', 'nodes'])
        df['date'] = trip[0]
        df['user'] = trip[1]
        dfs.append(df)

nodes_df = pd.concat(dfs).rename(columns={0: 'node'})
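As a quick check of the 'matchings' guard described in this answer, here is a minimal sketch run against a small made-up dictionary. The name toy_output, the node IDs and the shape of the entry for user C are assumptions for illustration only; they are not taken from the real data.

```python
import pandas as pd

# Toy stand-in for `output`: hypothetical node IDs, and user C has no 'matchings'.
toy_output = {
    'A': {'code': 'Ok',
          'matchings': [{'legs': [{'annotation': {'nodes': [1, 2]}},
                                  {'annotation': {'nodes': [2, 3]}}]}]},
    'B': {'code': 'Ok',
          'matchings': [{'legs': [{'annotation': {'nodes': [7, 8, 9]}}]}]},
    'C': {'code': 'NoSegment', 'message': 'no match'},
}

dfs = []
for user, payload in toy_output.items():
    if 'matchings' not in payload:
        continue  # skip users that would otherwise raise KeyError: 'matchings'
    df = pd.json_normalize(
        toy_output,
        record_path=[user, 'matchings', 'legs', 'annotation', 'nodes'],
    )
    df['user'] = user
    dfs.append(df)

# ignore_index=True gives one continuous 0..n-1 index across all users
nodes_df = pd.concat(dfs, ignore_index=True).rename(columns={0: 'node'})
print(nodes_df)
```

User C is silently dropped here; if you would rather keep track of such users, you could collect their keys in a separate list inside the `if` branch.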
We want to collect all the node values inside ['legs'].

The simplest way, if you want to use just for loops:


nodes = []
user = []

for i in output.keys():
    for j in output[i]['matchings'][0]['legs']:
        for k in j['annotation']['nodes']:
            # append to the two lists defined above
            nodes.append(k)
            user.append(i)

d = {'nodes': nodes, 'user': user}

df = pd.DataFrame(data=d)
print(df)
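The same nested loops can also be written as a single comprehension. The sketch below is a variation, not the answerer's code: it walks every entry in 'matchings' rather than only the first, and uses dict.get so that users without a 'matchings' key are skipped. The toy_output dictionary and its node IDs are made up for illustration.

```python
import pandas as pd

# Toy stand-in for `output` with hypothetical node IDs; user C has no 'matchings'.
toy_output = {
    'A': {'code': 'Ok',
          'matchings': [{'legs': [{'annotation': {'nodes': [1, 2]}},
                                  {'annotation': {'nodes': [2, 3]}}]}]},
    'B': {'code': 'Ok',
          'matchings': [{'legs': [{'annotation': {'nodes': [7, 8, 9]}}]}]},
    'C': {'code': 'NoSegment', 'message': 'no match'},
}

# One (node, user) tuple per node; .get(..., []) yields nothing for user C.
rows = [
    (node, user)
    for user, payload in toy_output.items()
    for matching in payload.get('matchings', [])
    for leg in matching['legs']
    for node in leg['annotation']['nodes']
]

df = pd.DataFrame(rows, columns=['node', 'user'])
print(df)
```

Building the list of tuples first and creating the DataFrame once tends to be simpler than growing a DataFrame inside the loop.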

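You can use the jmespath module to extract the data before recombining it in a DataFrame; since the iteration happens within the dictionary, it should be somewhat faster:

A summary of jmespath: use a dot to access a key, and use [] to access data that sits inside a list:

# pip install jmespath
import jmespath
import pandas as pd
from itertools import chain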

query ={letter: jmespath.compile(f"{letter}.matchings[].legs[].annotation.nodes")
        for letter in ("A", "B", "C")}

result = {letter: pd.DataFrame(chain.from_iterable(expression.search(output)),
                               columns=['node']) 
          for letter, expression in query.items()}

result = pd.concat(result).droplevel(-1).rename_axis(index='user').reset_index()

result.head(15)
 
   user        node
0     A  1974590926
1     A  4763953263
2     A    65359046
3     A  4763953265
4     A  5443374298
5     A  2007343352
6     A  5443374298
7     A  2007343352
8     A  4763953266
9     A    65359043
10    A  4763953269
11    A  2007343354
12    A  4763953270
13    A  2007343354
14    A  4763953270

Comments:

What is the output of your current code?

I have updated my question with the output of the current code.

Is output your JSON object? Have you set user = 'C'?

output is indeed my JSON object. The values A, B and C in the JSON object are the user keys.

I noticed that when a user, say C, has no 'matchings', for example 'C': {'message': 'Could not find a matching segment for any coordinate.', 'code': 'NoSegment'}, the code above throws KeyError: 'matchings'. How can I prevent that?

@sampeterson It seems pandas cannot handle this case directly. I updated the answer with a workaround that tests whether 'matchings' is present in output[user].

One last question: if my JSON object has a date key in addition to the user key, like this: {('2018-02-03', 'A'): {'code': 'Ok', 'tracepoints': [None…, how can I add the date as an extra column date? I was hoping I could run dfs = [] for trip in output.keys(): if 'matchings' in output[trip]: df = pd.json_normalize(output, record_path=[date, user, 'matchings', 'legs', 'annotation', 'nodes']) df['date'] = date df['user'] = user dfs.append(df) nodes_df = pd.concat(dfs).rename(columns={0: 'node'}), but it gives me an error.

@sampeterson Check the update.