Python: Is there a way to create a custom function in an aggregate function?
I want to apply a custom function to a DataFrame as part of an aggregation. For example, given this DataFrame:
   index City  Age
0      1    A   50
1      2    A   24
2      3    B   65
3      4    A   40
4      5    B   68
5      6    B   48
The function I want to apply:
def count_people_above_60(age):
    # I don't know whether age can be passed as a Series or a list
    # so that I can perform an operation on it later
    return count_people_above_60
I'm hoping to do something like this:
df.groupby(['City']).agg({"Age": ["mean", count_people_above_60]})
Expected output:
City   Mean  People_Above_60
A     38.00                0
B     60.33                2
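If performance matters, create a new column filled with the comparison result cast to integers, so that the count can be obtained by aggregating with sum: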
df = (df.assign(new=df['Age'].gt(60).astype(int))
        .groupby(['City'])
        .agg(Mean=("Age", "mean"), People_Above_60=('new', "sum")))
print(df)
           Mean  People_Above_60
City
A     38.000000                0
B     60.333333                2
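For a version that runs on its own, here is a minimal sketch that rebuilds the sample data from the question. The `index` column is left out because it plays no part in the aggregation, and the helper column name `above_60` and the variable `result` are arbitrary choices, not names from the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (the "index" column is omitted)
df = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

# The boolean comparison is cast to int, so summing it counts the rows above 60
result = (
    df.assign(above_60=df["Age"].gt(60).astype(int))
      .groupby("City")
      .agg(Mean=("Age", "mean"), People_Above_60=("above_60", "sum"))
)
print(result)
```

Assigning to `result` instead of overwriting `df` keeps the original data available for further experiments.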
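Your own solution also works once the function compares the values and returns their sum, but it tends to be slower when there are many groups or the DataFrame is large: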
def count_people_above_60(age):
    return (age > 60).sum()

df = (df.groupby(['City'])
        .agg(Mean=("Age", "mean"),
             People_Above_60=('Age', count_people_above_60)))
print(df)
           Mean  People_Above_60
City
A     38.000000                0
B     60.333333                2
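On the question of whether `age` arrives as a Series or a list: as far as I know, `agg` calls a custom function once per group and passes that group's column as a pandas Series. The sketch below checks this on the sample data. The list `received_series` is a hypothetical helper added only to record what the function receives, and is not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (the "index" column is omitted)
df = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

received_series = []  # hypothetical helper: records whether each call got a Series


def count_people_above_60(age):
    # Record the kind of object pandas hands to the function for this group
    received_series.append(isinstance(age, pd.Series))
    # Comparing a Series gives booleans; their sum is the number of True values
    return (age > 60).sum()


result = df.groupby("City").agg(
    Mean=("Age", "mean"),
    People_Above_60=("Age", count_people_above_60),
)
print(result)
print(all(received_series))
```

Because the function runs in Python once per group, this route is the convenient one for arbitrary logic, while the helper-column approach stays vectorized.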