Python: aggregating object-type columns in a pandas DataFrame

python python-3.x pandas

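Goal

I have a DataFrame with float and object columns.
I want to group the DataFrame by the "name" column:
```
groupped = df.groupby(["name"])
```
Then I aggregate all the other columns.
Some columns hold float values, which I sum.
But I also have object-type columns, and for those the goal is to keep just one value per group, for example the first one, since the values within a group are all the same. I tried using min for this, but it does not work, and I could not find any other aggregation function suited to object types.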

Error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-102-3594b7bd0c31> in <module>
      9          'number_one' : ['sum'],
     10          'type' : ['min'],
---> 11          'number_two' : ['sum'],
     12 })
     13 
...
TypeError: '<=' not supported between instances of 'str' and 'float'
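
The message suggests that the `type` column mixes strings with missing values, which pandas stores as float `NaN`. This is an assumption, since the data itself is not shown. A minimal sketch in plain Python, with hypothetical sample values, of why such a mix cannot be ordered:

```python
# Assumption: the object column mixes strings with float NaN (missing values).
values = ["car", float("nan"), "bike"]  # hypothetical sample values

try:
    min(values)  # a str and a float cannot be compared with < or <=
    failed = False
except TypeError as exc:
    failed = True
    print(f"TypeError: {exc}")

# Ordering works once the missing values are filtered out.
strings_only = [v for v in values if isinstance(v, str)]
smallest = min(strings_only)
```

If that is the cause here, an aggregation that does not need to order the values, such as taking the first one, sidesteps the comparison entirely.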

The first idea is to use `first` for the object columns:

aggregated = groupped.agg({ 
         'name' : ['first'],
         'id' : ['first'],
         'date' : ['first'],
         'number_one' : ['sum'],
         'type' : ['first'],
         'number_two' : ['sum'],
})

If you want to avoid a MultiIndex in the columns, remove the `[]`:

aggregated = groupped.agg({ 
         'name' : 'first',
         'id' : 'first',
         'date' : 'first',
         'number_one' : 'sum',
         'type' : 'first',
         'number_two' : 'sum',
})
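
To make the difference between the two forms concrete, here is a small self-contained sketch. The sample data is hypothetical; only the column names follow the question. The grouping column is left out of the dictionary here, since it already becomes the index of the result.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "id": ["x1", "x1", "y2"],
    "type": ["car", "car", "bike"],
    "number_one": [1.0, 2.0, 3.0],
    "number_two": [10.0, 20.0, 30.0],
})

groupped = df.groupby("name")

# List values produce two-level columns such as ('number_one', 'sum').
nested = groupped.agg({"type": ["first"], "number_one": ["sum"]})

# Plain string values keep a single column level.
flat = groupped.agg({
    "id": "first",
    "type": "first",
    "number_one": "sum",
    "number_two": "sum",
})

print(nested.columns.tolist())
print(flat)
```

Note that `first` in a groupby skips missing values by default, so it returns the first non-null entry of each group.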

A more general solution is to aggregate the numeric columns with `sum` and, for the other columns, take the first value in a lambda function:

import numpy as np

# Sum numeric columns; for every other column keep the first value of the group
f = lambda x: x.sum() if np.issubdtype(x.dtype, np.number) else x.iat[0]
aggregated = groupped.agg(f)
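
A variant of the same idea, sketched with hypothetical sample data. It swaps in `pandas.api.types.is_numeric_dtype` for the dtype check, which also accepts pandas extension dtypes, and it skips missing values before taking the first entry, whereas `x.iat[0]` returns the first entry even if it is `NaN`. The helper name `sum_or_first` is made up for this illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample data with a missing value in the object column.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "type": [np.nan, "car", "bike"],
    "number_one": [1.0, 2.0, 3.0],
})

def sum_or_first(col):
    """Sum numeric columns; otherwise return the first non-missing value."""
    if is_numeric_dtype(col.dtype):
        return col.sum()
    non_null = col.dropna()
    return non_null.iat[0] if len(non_null) else np.nan

# The function is called once per column and per group.
aggregated = df.groupby("name").agg(sum_or_first)
print(aggregated)
```

A named function is used instead of a lambda only to leave room for the missing-value handling; the aggregation call itself is unchanged.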

Perfect, and in 30 seconds. Thank you very much.