Python: How do I create a nested JSON file from a DataFrame using groupby?


I'm having some trouble producing the right JSON format from a pandas DataFrame. My DataFrame looks like this (sorry for the CSV format):

As you can see, the first four columns contain repeated values, so I want to split all the columns into two groups to get this JSON file:

[
    {
     "report": 
           {
              "first_date":201901,
              "second_date": 201902,
              "id":05555,
              "type": 111   
            },
     "features": [
           {
              "codename":01111,
              "description": 1,
              "price":200.00
            },
           {
              "codename":023111,
              "description": 44,
              "price":120.00
            },
           {
              "codename":14113,
              "description": 23,
              "price":84.00
            }

       ]
    }
 ]
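So far I have tried grouping by the first four columns, collecting the last three into a list of dictionaries, and naming the result: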

import json

cols = ["codename", "description", "price"]
rep = (df.groupby(["first_date", "second_date", "id", "type"])[cols]
       .apply(lambda x: x.to_dict("records"))  # one list of dicts per group
       .reset_index(name="features")
       .to_json(orient="records"))
output = json.dumps(json.loads(rep), indent=4)
I get this as output:

[
    {
      "first_date":201901,
      "second_date": 201902,
      "id":05555,
      "type": 111,

      "features": [
           {
              "codename":01111,
              "description": 1,
              "price":200.00
            },
           {
              "codename":023111,
              "description": 44,
              "price":120.00
            },
           {
              "codename":14113,
              "description": 23,
              "price":84.00
            }

       ]
    }
 ]
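Can anyone show me how to group the first set of columns and nest them under a new key? Or does anyone know another way to approach this? I would like to do it this way because I have to repeat the same process with more column groups, and from what I have found, this seems simpler than building the structure with several nested for loops.

Any suggestion would help! I have searched a lot, but this is my first time working with this kind of output. Thanks in advance.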

See if this works for you:

# strip stray whitespace from the column names, if any
df.columns = df.columns.str.strip()
# split the columns into two sections
fixed = df.columns[:4]
varying = df.columns[4:]
# build the feature records and the single report record
features = df[varying].to_dict('records')
report = df[fixed].drop_duplicates().to_dict('records')[0]
# combine both into a dict wrapped in a list
fin = [{"report": report, "features": features}]
print(fin)
[{'report': {'first_date': 201901,
             'second_date': 201902,
             'id': 5555,
             'type': 111},
 'features': [{'codename': 1111, 'description': 1, 'price': 200.0},
              {'codename': 23111, 'description': 44, 'price': 120.0},
              {'codename': 14113, 'description': 23, 'price': 84.0}]}]
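
The snippet above takes only the first distinct report, so it assumes the whole DataFrame belongs to a single report. Since the question mentions repeating the process for more groups, here is a minimal sketch of one way to extend the same idea with `groupby`. The sample data is hypothetical: the first three rows are reconstructed from the output shown above, and the fourth row is made up to give a second group. The names `report_cols`, `feature_cols`, and `result` are illustrative as well.

```python
import json

import pandas as pd

# Hypothetical sample data: three rows taken from the output above,
# plus one made-up row that forms a second report group.
df = pd.DataFrame({
    "first_date": [201901, 201901, 201901, 201903],
    "second_date": [201902, 201902, 201902, 201904],
    "id": [5555, 5555, 5555, 7777],
    "type": [111, 111, 111, 222],
    "codename": [1111, 23111, 14113, 999],
    "description": [1, 44, 23, 5],
    "price": [200.0, 120.0, 84.0, 10.0],
})

report_cols = ["first_date", "second_date", "id", "type"]
feature_cols = ["codename", "description", "price"]

result = []
# sort=False keeps the groups in order of first appearance
for _, group in df.groupby(report_cols, sort=False):
    result.append({
        # the report columns are constant within a group, so one row is enough
        "report": group[report_cols].head(1).to_dict("records")[0],
        "features": group[feature_cols].to_dict("records"),
    })

# default= is only a guard in case a NumPy scalar slips through
output = json.dumps(result, indent=4, default=lambda o: o.item())
print(output)
```

Each element of `result` has the same `report`/`features` shape as the desired JSON, with one element per distinct combination of the report columns.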

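
One detail worth noting: the desired JSON shows values such as `05555` and `01111`, while the printed result shows `5555` and `1111`. JSON numbers cannot carry leading zeros, so if those zeros matter, the values have to be kept as strings. The sketch below assumes the data comes from a CSV; the CSV text is hypothetical, since the original DataFrame is not shown in the question.

```python
import io
import json

import pandas as pd

# Hypothetical CSV text standing in for the data that is not shown.
csv_text = """first_date,second_date,id,type,codename,description,price
201901,201902,05555,111,01111,1,200.00
201901,201902,05555,111,023111,44,120.00
201901,201902,05555,111,14113,23,84.00
"""

# Read the identifier-like columns as strings so leading zeros survive.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": str, "codename": str})

report_cols = ["first_date", "second_date", "id", "type"]
feature_cols = ["codename", "description", "price"]

nested = [{
    "report": df[report_cols].drop_duplicates().to_dict("records")[0],
    "features": df[feature_cols].to_dict("records"),
}]

# default= is only a guard in case a NumPy scalar slips through
output = json.dumps(nested, indent=4, default=lambda o: o.item())
print(output)
```

In the resulting JSON, `id` and `codename` appear as quoted strings (`"05555"`, `"01111"`). To write the result to a file rather than print it, `json.dump(nested, f, indent=4)` with an open file handle `f` does the same serialization.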