Python: how to group/merge a DataFrame whose columns hold various data types


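I have a DataFrame whose columns hold different data types (lists, dictionaries, lists of dictionaries, strings, and so on).

I want to merge the two Jon Snow rows into one, combining all the other fields, so that the result looks like this: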

name                          category                                       description                                      connection

Jon Snow    ['House Targaryen','House Stark','Nights Watch'] Jon Snow, born ...... his army to Daenerys Targaryen.   ['Rhaena Targaryen',...,'Bran Stark']
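Working with the lists of dictionaries may be a bit tricky. Because this is a toy example with only two rows, it would be easy to take it apart and combine the categories of the two rows by hand, but I don't think that is realistic for my actual dataset.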

I also considered using
df.groupby('name').aggregate({'category': func1, 'description': func2, 'connection': func3})
but I am not sure whether there are built-in functions that fit my needs.
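
A minimal sketch of the per-column aggregate idea, assuming every cell is populated. The helpers merge_lists and join_text are hypothetical stand-ins for func1, func2 and func3, the descriptions are short placeholder strings, and joining descriptions with a space is an assumption about the desired output:

```python
import pandas as pd

def merge_lists(series):
    # Concatenate all the lists in one group into a single flat list.
    return [item for cell in series for item in cell]

def join_text(series):
    # Join the strings of one group with a space (assumed separator).
    return " ".join(series)

df = pd.DataFrame({
    "name": ["Jon Snow", "Jon Snow", "Some name"],
    "category": [
        [{"id": 1, "name": "House Targaryen"}],
        [{"id": 2, "name": "House Stark"}, {"id": 3, "name": "Nights Watch"}],
        [{"id": 4, "name": "Some house"}],
    ],
    # Placeholder descriptions, shortened for the sketch.
    "description": ["first description", "second description", "some desc"],
    "connection": [["Rhaena Targaryen"], ["Robb Stark", "Bran Stark"], ["connection 1"]],
})

result = df.groupby("name").agg({
    # Flatten the lists of dicts, then keep only each dict's "name" value.
    "category": lambda s: [d["name"] for d in merge_lists(s)],
    "description": join_text,
    "connection": merge_lists,
}).reset_index()

print(result)
```

Each column gets its own function, so lists can be concatenated while text is joined in whatever way suits the data.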


Thanks, y'all, for the help.

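Looking at your data, you can first do a simple groupby followed by sum, then use a list comprehension to extract the names from the category column: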

import pandas as pd

df = pd.DataFrame([{'category': [{'id': 1, 'name':'House Targaryen'}],
                    'name': 'Jon Snow',
                    'description':'Jon Snow, born Aegon Targaryen, is the son of Lyanna Stark and Rhaegar Targaryen, the late Prince of Dragonstone',
                    'connection':['Rhaena Targaryen', 'Aegon Targaryen']},
                   {'category': [{'id': 2, 'name': 'House Stark'},{'id': 3, 'name': 'Nights Watch'}],
                    'name': 'Jon Snow',
                    'description': 'After successfully capturing a wight and presenting it to the Lannisters as proof that the Army of the Dead are real, '
                                   'Jon pledges himself and his army to Daenerys Targaryen.',
                    'connection':['Robb Stark', 'Sansa Stark', 'Arya Stark', 'Bran Stark']},
                   {"category":[{"id":4,"name":"Some house"}],
                    "name": "Some name",
                    "description": "some desc",
                    "connection":["connection 1"]}])

result = df.groupby("name").sum()
result["category"] = [[item.get("name") for item in i] for i in result["category"]]
result.reset_index(inplace=True)

print (result)

#
            name                                      category                                        description                                         connection
0   Jon Snow  [House Targaryen, House Stark, Nights Watch]  Jon Snow, born Aegon Targaryen, is the son of ...  [Rhaena Targaryen, Aegon Targaryen, Robb Stark...
1  Some name                                  [Some house]                                          some desc                                     [connection 1]

You can use groupby().apply(): perform whatever transformation you need on each group and return a df. Something like
df.groupby("group_col").apply(func)
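
As a rough illustration of that suggestion, the sketch below passes a hypothetical function merge_group to apply. It receives the sub-DataFrame for one name and returns a single merged row as a Series; the sample rows and the explicit column selection before apply are assumptions, not part of the original comment:

```python
import pandas as pd

def merge_group(group):
    # `group` holds all rows that share one name; build one merged row from it.
    return pd.Series({
        "category": [d["name"] for cell in group["category"] for d in cell],
        "connection": [c for cell in group["connection"] for c in cell],
    })

df = pd.DataFrame({
    "name": ["Jon Snow", "Jon Snow", "Some name"],
    "category": [
        [{"id": 1, "name": "House Targaryen"}],
        [{"id": 2, "name": "House Stark"}, {"id": 3, "name": "Nights Watch"}],
        [{"id": 4, "name": "Some house"}],
    ],
    "connection": [["Rhaena Targaryen"], ["Robb Stark", "Bran Stark"], ["connection 1"]],
})

# Select the data columns explicitly so the function never touches the grouping key.
result = (
    df.groupby("name")[["category", "connection"]]
      .apply(merge_group)
      .reset_index()
)

print(result)
```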
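I think it solves the example case, but I realized that my df has missing values in almost every column, and sum() does not seem to know how to handle them.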
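
One possible way around the missing values, sketched under the assumption that a missing cell is NaN or None and should simply be skipped. The helpers concat_lists and join_strings are hypothetical names, and the sample rows with gaps are invented for illustration from the toy data above:

```python
import pandas as pd

def concat_lists(series):
    # Keep only cells that really are lists, so NaN/None cells are ignored.
    return [item for cell in series if isinstance(cell, list) for item in cell]

def join_strings(series):
    # Keep only real strings; joining with a space is an assumption.
    return " ".join(cell for cell in series if isinstance(cell, str))

nan = float("nan")
df = pd.DataFrame([
    {"name": "Jon Snow", "category": [{"id": 1, "name": "House Targaryen"}],
     "description": nan, "connection": ["Rhaena Targaryen", "Aegon Targaryen"]},
    {"name": "Jon Snow", "category": nan,
     "description": "Jon pledges himself and his army to Daenerys Targaryen.",
     "connection": ["Robb Stark"]},
    {"name": "Some name", "category": [{"id": 4, "name": "Some house"}],
     "description": "some desc", "connection": nan},
])

result = df.groupby("name").agg({
    "category": concat_lists,
    "description": join_strings,
    "connection": concat_lists,
})
# Same list comprehension as in the answer: keep only each category's name.
result["category"] = [[d.get("name") for d in cell] for cell in result["category"]]
result = result.reset_index()

print(result)
```

A group whose cells are all missing ends up with an empty list or an empty string rather than raising an error, which may or may not be what the real dataset needs.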