Python: converting a DataFrame to nested JSON after groupby and melt
I have a DataFrame as follows:
PayeeID TransactionID Res_1 Res_2
1001 aa1001234 OK OK
1001 aa1001235 OK NOT OK
1002 aa1002567 NOT OK NOT OK
1002 aa1002568 NOT OK OK
Now I want to convert it to a nested JSON string like this:
[{"PayeeID":1001,{"Trxn_ID":"aa1001234","Final_Status":{"Status_I":"OK","Status_II":"NOT OK"},"Trxn_ID":"aa1001338","Final_Status":{"Status_I":"NOT OK","Status_II":"OK"}}},
{"PayeeID":1002,{"Trxn_ID":"aa1002455","Final_Status":{"Status_I":"NOT OK","Status_II":"NOT OK"},"Trxn_ID":"aa1002766","Final_Status":{"Status_I":"OK","Status_II":"OK"}}}]
That is, for each PayeeID there should be nested records.
My approach is as follows:
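# Note: the column names used here (Trxn_ID, Status_I, Status_II) differ from
# the sample table above (TransactionID, Res_1, Res_2).
df_m = df.melt(id_vars=['PayeeID', 'Trxn_ID'], value_vars=['Status_I', 'Status_II'],
               var_name='Status_Type', value_name='Final')
# Build one {Status_Type: Final} dict per (PayeeID, Trxn_ID) pair
df_g = (df_m.groupby(['PayeeID', 'Trxn_ID'])
            .apply(lambda x: dict(zip(x['Status_Type'], x['Final'])))
            .reset_index()
            .rename(columns={0: 'Final_Status'}))
j = df_g.to_json(orient='records')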
But when I look at j, I get the following result:
[{"PayeeID":1001,"Trxn_ID":"aa1001234","Final_Status":{"Status_I":"OK","Status_II":"NOT OK"}},
{"PayeeID":1001,"Trxn_ID":"aa1001338","Final_Status":{"Status_I":"NOT OK","Status_II":"OK"}},
{"PayeeID":1002,"Trxn_ID":"aa1002455","Final_Status":{"Status_I":"NOT OK","Status_II":"NOT OK"}},
{"PayeeID":1002,"Trxn_ID":"aa1002766","Final_Status":{"Status_I":"OK","Status_II":"OK"}}]
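What am I missing here?
pandas does not know the data format you want. You need to build that structure in the DataFrame first and then output it to JSON. The following gives one entry per payee: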
import pandas as pd

df = pd.DataFrame([
    [1001, "aa1001234", "OK", "OK"],
    [1001, "aa1001235", "OK", "NOT OK"],
    [1002, "aa1002567", "NOT OK", "NOT OK"],
    [1002, "aa1002568", "NOT OK", "OK"]], columns=["PayeeID", "TransactionID", "Res_1", "Res_2"])
# Select several columns with a list (double brackets); newer pandas versions
# reject the tuple-style selection ["A", "B"] on a groupby object.
dfg = df.groupby("PayeeID")[["TransactionID", "Res_1", "Res_2"]].aggregate(lambda x: tuple(x))
dfg.reset_index().to_json(orient='records')
[{
"PayeeID": 1001,
"TransactionID": ["aa1001234", "aa1001235"],
"Res_1": ["OK", "OK"],
"Res_2": ["OK", "NOT OK"]
}, {
"PayeeID": 1002,
"TransactionID": ["aa1002567", "aa1002568"],
"Res_1": ["NOT OK", "NOT OK"],
"Res_2": ["NOT OK", "OK"]
}]
A possibly better structure would be:
df['tx'] = df.apply(lambda x: {x['TransactionID']: {'Res_1':x['Res_1'], 'Res_2':x['Res_2']}}, axis=1)
dfg = df.groupby("PayeeID")["tx"].aggregate(lambda x: tuple(x))
dfg.reset_index().to_json(orient="records")
[{
"PayeeID": 1001,
"tx": [{
"aa1001234": {
"Res_1": "OK",
"Res_2": "OK"
}
}, {
"aa1001235": {
"Res_1": "OK",
"Res_2": "NOT OK"
}
}]
}, {
"PayeeID": 1002,
"tx": [{
"aa1002567": {
"Res_1": "NOT OK",
"Res_2": "NOT OK"
}
}, {
"aa1002568": {
"Res_1": "NOT OK",
"Res_2": "OK"
}
}]
}]
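If the goal is something closer to the layout asked for in the question (one entry per PayeeID with its transactions nested underneath), one option is to build plain Python dicts per group and serialise them with the standard json module. This is only a sketch under assumptions: the helper name nest_by_payee and the key "Transactions" are illustrative and do not come from the question, and "Final_Status" is borrowed from the question's attempt. The column names follow the sample table.

```python
import json
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    [
        [1001, "aa1001234", "OK", "OK"],
        [1001, "aa1001235", "OK", "NOT OK"],
        [1002, "aa1002567", "NOT OK", "NOT OK"],
        [1002, "aa1002568", "NOT OK", "OK"],
    ],
    columns=["PayeeID", "TransactionID", "Res_1", "Res_2"],
)


def nest_by_payee(frame):
    """Return one dict per PayeeID with its transactions nested in a list.

    Hypothetical helper; "Transactions" is an assumed key name.
    """
    nested = []
    for payee_id, group in frame.groupby("PayeeID", sort=True):
        transactions = [
            {
                "TransactionID": row.TransactionID,
                "Final_Status": {"Res_1": row.Res_1, "Res_2": row.Res_2},
            }
            for row in group.itertuples(index=False)
        ]
        # int() converts the NumPy integer so that json.dumps can serialise it.
        nested.append({"PayeeID": int(payee_id), "Transactions": transactions})
    return nested


nested = nest_by_payee(df)
j = json.dumps(nested)
print(j)
```

Unlike the layout requested in the question, every nested object here sits under a key, so the result is valid JSON. Grouping on PayeeID alone (rather than on PayeeID and the transaction ID together) is what yields a single entry per payee.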
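The JSON you are looking for is invalid: how can there be an object after the PayeeID? That object has no key.
Actually, I am trying to get a single entry per PayeeID.
You need to group the values first. See my answer.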
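The point about the missing key can be checked with Python's standard json module. The snippet below is a small sketch: the two strings are shortened stand-ins for the structure in the question, the helper is_valid_json is hypothetical, and "Transactions" is an assumed key name used only for illustration.

```python
import json

# Shortened version of the requested structure: an object follows
# "PayeeID": 1001 without a key of its own.
requested = '[{"PayeeID": 1001, {"Trxn_ID": "aa1001234"}}]'

# The same data with the nested part placed under a key.
keyed = '[{"PayeeID": 1001, "Transactions": [{"Trxn_ID": "aa1001234"}]}]'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(requested))
print(is_valid_json(keyed))
```

The first string is rejected by the parser because every member of a JSON object must be a key/value pair; the second one parses once the nested part has a key.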