Python: How to create a nested JSON file from a DataFrame using group by?
I'm having some trouble creating a proper JSON format from a pandas DataFrame. My DataFrame looks like this (sorry for the CSV format): As you can see, the first four columns hold repeated values, so I want to split all the columns into two groups to get this JSON file:
[
{
"report":
{
"first_date":201901,
"second_date": 201902,
"id":05555,
"type": 111
},
"features": [
{
"codename":01111,
"description": 1,
"price":200.00
},
{
"codename":023111,
"description": 44,
"price":120.00
},
{
"codename":14113,
"description": 23,
"price":84.00
}
]
}
]
So far I have tried grouping the last three columns, adding them to a dictionary, and renaming:
import json

cols = ["codename", "description", "price"]
rep = (df.groupby(["first_date", "second_date", "id", "type"])[cols]
         # spell out "records": the 'r' shorthand is not accepted by recent pandas versions
         .apply(lambda x: x.to_dict("records"))
         .reset_index(name="features")
         .to_json(orient="records"))
output = json.dumps(json.loads(rep), indent=4)
I get this as output:
[
{
"first_date":201901,
"second_date": 201902,
"id":05555,
"type": 111,
"features": [
{
"codename":01111,
"description": 1,
"price":200.00
},
{
"codename":023111,
"description": 44,
"price":120.00
},
{
"codename":14113,
"description": 23,
"price":84.00
}
]
}
]
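Can anyone guide me on renaming and grouping the first set of columns? Or does anyone know another way to approach this? I would like to do it this way because I have to repeat the same process with more column groups, and from what I have found this seems simpler than building nested for loops.
Any advice would certainly help! I have searched a lot, but this is my first time working with this kind of output. Thanks in advance.

See if this works for you: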
#get rid of whitespaces if any
df.columns = df.columns.str.strip()
#split into two sections
fixed = df.columns[:4]
varying = df.columns[4:]
#create dicts for both fixed and varying
features = df[varying].to_dict('records')
report = df[fixed].drop_duplicates().to_dict('records')[0]
#combine both into a dict, wrapped in a list:
fin = [{"report":report,"features":features}]
print(fin)
[{'report': {'first_date': 201901,
'second_date': 201902,
'id': 5555,
'type': 111},
'features': [{'codename': 1111, 'description': 1, 'price': 200.0},
{'codename': 23111, 'description': 44, 'price': 120.0},
{'codename': 14113, 'description': 23, 'price': 84.0}]}]
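The snippet above builds a single report from the whole frame. If the frame holds several distinct reports, one possible extension is to loop over a groupby and build one nested dict per group. The following is only a sketch under assumptions: the sample data and the helper name `nest_by_report` are made up for illustration, and the column names are taken from the JSON shown in the question.

```python
import json
import pandas as pd

# Hypothetical sample data shaped like the frame described in the question,
# with a second report added to show the grouping.
df = pd.DataFrame({
    "first_date": [201901, 201901, 201901, 201903],
    "second_date": [201902, 201902, 201902, 201904],
    "id": [5555, 5555, 5555, 7777],
    "type": [111, 111, 111, 222],
    "codename": [1111, 23111, 14113, 999],
    "description": [1, 44, 23, 5],
    "price": [200.0, 120.0, 84.0, 10.0],
})

report_cols = ["first_date", "second_date", "id", "type"]
feature_cols = ["codename", "description", "price"]


def nest_by_report(frame, report_cols, feature_cols):
    """Return one {"report": ..., "features": [...]} dict per report group."""
    nested = []
    # sort=False keeps the groups in order of first appearance
    for _, group in frame.groupby(report_cols, sort=False):
        nested.append({
            # the report columns are constant within a group, so the first row is enough
            "report": group[report_cols].head(1).to_dict("records")[0],
            "features": group[feature_cols].to_dict("records"),
        })
    return nested


result = nest_by_report(df, report_cols, feature_cols)
output = json.dumps(result, indent=4)
print(output)
```

To add more column groups, the same pattern can be repeated with another key and another column list inside the loop. Note that codes such as 05555 lose their leading zero when the column is stored as integers; keeping those columns as strings would preserve it.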