Python pandas: combining or merging DataFrames with nested arrays for JSON output

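I start with a standard DataFrame and build several subset DataFrames of summarized data, each holding a nested array. I then need to combine those subset DataFrames so that they produce the expected JSON output. (Most of the code is adapted from MaxU's answer.)

The first few rows of my standard DataFrame (I can post all 58 rows of this example if needed): df

I then run the following Python: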

 import pandas as pd

 # df is the standard DataFrame described above.
 # Count distinct IDs per PRI_DEP and category, one column per category value.
 PAFF_df = pd.DataFrame(df.groupby(['PRI_DEP','PRI_AFF'])['ID'].nunique().unstack().reset_index().fillna(0))
 LOA_df = pd.DataFrame(df.groupby(['PRI_DEP','LOA'])['ID'].nunique().unstack().reset_index().fillna(0))
 ST_df = pd.DataFrame(df.groupby(['PRI_DEP','STATE'])['ID'].nunique().unstack().reset_index().fillna(0))

 # Nest each PRI_DEP's counts as a list of dicts.
 # 'records' is the full spelling of the 'r' shorthand, which newer pandas rejects.
 Nested_PAFF_df = (PAFF_df.groupby(['PRI_DEP'], as_index=True)
      .apply(lambda x: x[['A','E','F','L','M','T']].to_dict('records'))
      .reset_index()
      .rename(columns={0:'Primary_Affiliation'}))

 Nested_LOA_df = (LOA_df.groupby(['PRI_DEP'], as_index=True)
      .apply(lambda x: x[['Basic','Blue','Bronze','Invalid','UFM']].to_dict('records'))
      .reset_index()
      .rename(columns={0:'LOA'}))

 Nested_ST_df = (ST_df.groupby(['PRI_DEP'], as_index=True)
      .apply(lambda x: x[['A','E']].to_dict('records'))
      .reset_index()
      .rename(columns={0:'STATE'}))
This gives me the appropriately nested JSON for each frame when I call .to_json(orient='records'):

Primary Affiliation JSON:

[{"PRI_DEP":" ","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"14700000","Primary_Affiliation":[{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}]},{"PRI_DEP":"95011000","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"Null","Primary_Affiliation":[{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"ST010000","Primary_Affiliation":[{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}]}] 
LOA JSON:

[{"PRI_DEP":" ","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}]},{"PRI_DEP":"14700000","LOA":[{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}]},{"PRI_DEP":"95011000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"Null","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"ST010000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}]}] 
STATE JSON:

[{"PRI_DEP":" ","STATE":[{"A":2.0,"E":0.0}]},{"PRI_DEP":"14700000","STATE":[{"A":23.0,"E":1.0}]},{"PRI_DEP":"95011000","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"Null","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"ST010000","STATE":[{"A":2.0,"E":0.0}]}] 
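Since the full 58-row frame is not shown, here is a minimal runnable sketch of the count-and-nest step for STATE only. The five rows are made up for illustration; only the column names (ID, PRI_DEP, STATE) come from the post. It selects the value columns before calling apply, which differs slightly from the code above but should give the same nesting.

```python
import json
import pandas as pd

# Hypothetical rows; the real data is not shown in the post.
df = pd.DataFrame({
    "ID":      [1, 2, 3, 4, 5],
    "PRI_DEP": ["14700000", "14700000", "14700000", "ST010000", "ST010000"],
    "STATE":   ["A", "A", "E", "A", "A"],
})

# Count distinct IDs per (PRI_DEP, STATE), then pivot STATE values into columns.
ST_df = (df.groupby(["PRI_DEP", "STATE"])["ID"]
           .nunique()
           .unstack()
           .fillna(0)
           .reset_index())

# Nest the counts of each PRI_DEP as a one-element list of dicts.
Nested_ST_df = (ST_df.groupby("PRI_DEP")[["A", "E"]]
                     .apply(lambda x: x.to_dict("records"))
                     .reset_index()
                     .rename(columns={0: "STATE"}))

parsed = json.loads(Nested_ST_df.to_json(orient="records"))
print(parsed)
```

Each PRI_DEP has exactly one row after unstack(), so every nested list holds a single dict, which matches the shape of the JSON listings above.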
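Now I want to represent all of these in a single JSON, keyed by PRI_DEP.

So the desired JSON looks like this (updated for readability; see the listing further down the page):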


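I have been trying different ways of combining the DataFrames, and I think I found the answer.

After the Python code in my original post (which sets up the nested groups), I did the following: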

Group_frames = [Nested_PAFF_df.set_index('PRI_DEP'), Nested_LOA_df.set_index('PRI_DEP'), Nested_ST_df.set_index('PRI_DEP')]
result = pd.concat(Group_frames, axis=1).reset_index()
print(result.to_json(orient='records'))
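To try the concat step in isolation, the sketch below uses two small hand-built frames shaped like Nested_LOA_df and Nested_ST_df. The values and the lowercase variable names are assumptions for illustration, not the poster's data.

```python
import json
import pandas as pd

# Hypothetical nested frames, shaped like Nested_LOA_df and Nested_ST_df.
nested_loa = pd.DataFrame({
    "PRI_DEP": ["14700000", "ST010000"],
    "LOA": [[{"Blue": 14.0, "UFM": 5.0}], [{"Blue": 0.0, "UFM": 0.0}]],
})
nested_st = pd.DataFrame({
    "PRI_DEP": ["14700000", "ST010000"],
    "STATE": [[{"A": 23.0, "E": 1.0}], [{"A": 2.0, "E": 0.0}]],
})

# Index every frame by PRI_DEP so that concat aligns rows on that key.
frames = [f.set_index("PRI_DEP") for f in (nested_loa, nested_st)]
result = pd.concat(frames, axis=1).reset_index()

records = json.loads(result.to_json(orient="records"))
print(json.dumps(records, indent=2))
```

Because pd.concat defaults to join='outer', a PRI_DEP that is absent from one of the frames would still appear in the result, with a missing value in that frame's column.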

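It looks like your desired JSON got cut off. Could you update it?
I deliberately included only the first record, but I will update it with the remaining records. There are only a few more.

The desired JSON: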
[{"PRI_DEP":" ",
    "Primary_Affiliation":
        [{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}],
    "LOA": 
        [{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}],
    "STATE":
        [{"A":2.0,"E":0.0}]},
 {"PRI_DEP":"14700000",
    "Primary_Affiliation": 
        [{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}],
    "LOA": 
        [{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}],
    "STATE":
        [{"A":23.0,"E":1.0}]}, 
 {"PRI_DEP":"95011000",
    "Primary_Affiliation":
        [{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}],
    "LOA":
        [{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
    "STATE":
        [{"A":1.0,"E":0.0}]},
 {"PRI_DEP":"Null",
    "Primary_Affiliation": 
        [{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}],
    "LOA":
        [{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
    "STATE":
        [{"A":1.0,"E":0.0}]},
 {"PRI_DEP":"ST010000",
    "Primary_Affiliation":
        [{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}],
    "LOA":
        [{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}],
    "STATE":
        [{"A":2.0,"E":0.0}]}]
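The three near-identical count-and-nest blocks could also be folded into one helper. The sketch below is one possible way to do that, not the poster's method: nested_counts is a hypothetical name, the four rows are made up, and only the column names come from the post.

```python
import json
import pandas as pd

def nested_counts(df, col, label, key="PRI_DEP", id_col="ID"):
    """Count distinct ids per (key, col) and wrap each key's counts in a one-item list."""
    counts = df.groupby([key, col])[id_col].nunique().unstack().fillna(0)
    nested = [[rec] for rec in counts.to_dict("records")]
    return pd.Series(nested, index=counts.index, name=label)

# Hypothetical rows; the real data is not shown in the post.
df = pd.DataFrame({
    "ID":      [1, 2, 3, 4],
    "PRI_DEP": ["14700000", "14700000", "ST010000", "ST010000"],
    "PRI_AFF": ["T", "E", "A", "T"],
    "LOA":     ["Blue", "UFM", "Basic", "Invalid"],
    "STATE":   ["A", "E", "A", "A"],
})

parts = [
    nested_counts(df, "PRI_AFF", "Primary_Affiliation"),
    nested_counts(df, "LOA", "LOA"),
    nested_counts(df, "STATE", "STATE"),
]

# Each part is already indexed by PRI_DEP, so concat aligns them directly.
result = pd.concat(parts, axis=1).reset_index()
records = json.loads(result.to_json(orient="records"))
print(json.dumps(records, indent=2))
```

The helper keeps the index on PRI_DEP instead of resetting it, so the set_index calls before pd.concat are no longer needed.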