Python: custom aggregation function


python pandas


I have a pandas DataFrame, and the following command works on it:

house.groupby(['place_name'])['index_nsa'].agg(['first','last'])

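It gives me what I want. Now I want to create a custom aggregate value that gives the percentage change between the first and the last value.

I get errors when computing with these values, so I assume I have to convert them to numbers: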

house.groupby(['place_name'])['index_nsa'].agg({"change in %":[(int('last')-int('first')/int('first')]})

Unfortunately, all I get is a syntax error at the last bracket, and I can't seem to find the mistake.

Does anyone see where I went wrong?

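The strings 'first' and 'last' are names of built-in aggregations, not values, so int('last') cannot be evaluated; the syntax error itself comes from an opening parenthesis that is never closed. You need to define a callback and pass it to agg. You can do this with a lambda function: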

house.groupby(['place_name'])['index_nsa'].agg([
    ("change in %", lambda x: (x.iloc[-1] - x.iloc[0]) / x.iloc[0])])

Looking closely at the .agg call: to rename the output column, you must pass a list of tuples of the form [(new_name, agg_func), ...].
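To try the tuple-list form end to end, here is a minimal sketch on made-up data. The house frame below is a hypothetical stand-in for the asker's data, and its values are invented for illustration. Note that the callback returns a fraction (0.25 means 25%); multiply by 100 if the column should hold actual percentages.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's `house` frame.
house = pd.DataFrame({
    "place_name": ["A", "A", "A", "B", "B"],
    "index_nsa": [100.0, 110.0, 125.0, 80.0, 60.0],
})

# Each tuple is (output column name, aggregation callable).
# The callable receives one group's values as a Series.
result = house.groupby(["place_name"])["index_nsa"].agg([
    ("change in %", lambda x: (x.iloc[-1] - x.iloc[0]) / x.iloc[0]),
])

print(result)
```

The result is a DataFrame indexed by place_name with a single column named "change in %".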

If you want to avoid the lambda, at the cost of some verbosity, you can use a named function:

def first_last_pct(ser):
    first, last = ser.iloc[0], ser.iloc[-1]
    return (last - first) / first

house.groupby(['place_name'])['index_nsa'].agg([("change in %", first_last_pct)])
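Another option, not from the answer above, is to keep the original ['first', 'last'] aggregation and do the arithmetic on its result afterwards. The sketch below assumes the same hypothetical toy data as a stand-in for house. One difference to be aware of: the built-in 'first' and 'last' aggregations skip missing values, whereas iloc[0] and iloc[-1] take the first and last rows as they are, so the two approaches can disagree when a group starts or ends with NaN.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's `house` frame.
house = pd.DataFrame({
    "place_name": ["A", "A", "A", "B", "B"],
    "index_nsa": [100.0, 110.0, 125.0, 80.0, 60.0],
})

# Step 1: the aggregation that already worked in the question.
first_last = house.groupby(["place_name"])["index_nsa"].agg(["first", "last"])

# Step 2: column arithmetic on the aggregated frame (a fraction, not x100).
first_last["change in %"] = (
    (first_last["last"] - first_last["first"]) / first_last["first"]
)

print(first_last)
```

This keeps the first and last values visible next to the computed change, which can help when checking the numbers.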