Pandas: rolling average by group/sub-group


I am trying to compute a rolling average after grouping by several columns. My dataset looks like this:

category, sub_category,value
fruit, apple, 10
fruit, apple, 2
fruit, apple, 5
fruit, apple, 1
fruit, banana, 3
fruit, orange, 5
fruit, orange, 5
fruit, orange, 3
fruit, orange, 8
Expected output:

category, sub_category,value, rolling_average
fruit, apple, 10, 10
fruit, apple, 2, 6
fruit, apple, 5, 5.66
fruit, apple, 1, 2.66
fruit, banana, 3, 3
fruit, orange, 5, 5
fruit, orange, 5, 5
fruit, orange, 3, 4.33
fruit, orange, 8, 5.33
I am able to compute a rolling average without any groups, but I am not sure how to do it per group within the same DataFrame.

I believe you need to compute it per group, using groupby with expanding:

df['expanding_average'] = (df.groupby(['category', 'sub_category'])['value']
                             .expanding()
                             .mean()
                             .reset_index(level=[0,1], drop=True))
print (df)
  category sub_category  value  expanding_average
0    fruit        apple     10          10.000000
1    fruit        apple      2           6.000000
2    fruit        apple      5           5.666667
3    fruit        apple      1           4.500000
4    fruit       banana      3           3.000000
5    fruit       orange      5           5.000000
6    fruit       orange      5           5.000000
7    fruit       orange      3           4.333333
8    fruit       orange      8           5.250000
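
The expanding result above differs from the expected output in the last row of each longer group (4.5 instead of 2.66 for apple), because an expanding window averages every row seen so far in the group, while a rolling window averages only the last N rows. A minimal sketch of the difference, using the apple values from the question in a standalone Series (the variable names are illustrative only):

```python
import pandas as pd

# The four "apple" values from the question.
s = pd.Series([10, 2, 5, 1])

# Expanding window: mean of all values from the start up to the current row.
expanding_mean = s.expanding().mean()

# Rolling window of 3: mean of at most the last 3 values;
# min_periods=1 lets the first rows use fewer than 3 values.
rolling_mean = s.rolling(3, min_periods=1).mean()

print(pd.DataFrame({"value": s,
                    "expanding": expanding_mean,
                    "rolling_3": rolling_mean}))
```

The two columns agree for the first three rows and diverge only once the group has more than 3 rows.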
Solution for a rolling average with N=3:

df['rolling_average'] = (df.groupby(['category', 'sub_category'])['value']
                           .rolling(3, min_periods=1)
                           .mean()
                           .reset_index(level=[0,1], drop=True))
print (df)

  category sub_category  value  rolling_average
0    fruit        apple     10        10.000000
1    fruit        apple      2         6.000000
2    fruit        apple      5         5.666667
3    fruit        apple      1         2.666667
4    fruit       banana      3         3.000000
5    fruit       orange      5         5.000000
6    fruit       orange      5         5.000000
7    fruit       orange      3         4.333333
8    fruit       orange      8         5.333333
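
For anyone who wants to reproduce this end to end, here is a self-contained sketch that rebuilds the sample data from the question. Note that it swaps in `transform` with a lambda instead of the `reset_index` approach shown above; this is an alternative, not the answer's original method. `transform` returns a result aligned with the original index, so no index clean-up is needed.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "category": ["fruit"] * 9,
    "sub_category": ["apple"] * 4 + ["banana"] + ["orange"] * 4,
    "value": [10, 2, 5, 1, 3, 5, 5, 3, 8],
})

# Rolling mean over the last 3 rows within each (category, sub_category) group.
# min_periods=1 gives a value even when fewer than 3 rows are available.
df["rolling_average"] = (
    df.groupby(["category", "sub_category"])["value"]
      .transform(lambda s: s.rolling(3, min_periods=1).mean())
)

print(df)
```

On large frames the lambda may be slower than the `groupby(...).rolling(...)` form, so treat this mainly as a readability trade-off.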

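Does this answer your question?

May I ask one more favor? Now that we can get rolling_average, I would like to know whether the 3 values used to compute rolling_average can be stored in 3 separate columns. How can I modify the above to get those values? Thanks.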
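
One possible approach to this follow-up, offered as a sketch rather than a confirmed answer from the thread: a per-group `shift` exposes the previous rows of each group as new columns. The column names `value_lag1` and `value_lag2` are hypothetical, chosen here for illustration. Together with `value` they hold the (up to) 3 values in each window, with NaN where the group has no earlier row.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "category": ["fruit"] * 9,
    "sub_category": ["apple"] * 4 + ["banana"] + ["orange"] * 4,
    "value": [10, 2, 5, 1, 3, 5, 5, 3, 8],
})

grouped = df.groupby(["category", "sub_category"])["value"]

# Previous value and the one before it, taken within each group only.
# Rows without enough history in their group get NaN.
df["value_lag1"] = grouped.shift(1)   # hypothetical column name
df["value_lag2"] = grouped.shift(2)   # hypothetical column name

# The row-wise mean skips NaN by default, which matches
# rolling(3, min_periods=1).mean() for this data.
window_cols = ["value", "value_lag1", "value_lag2"]
df["rolling_average"] = df[window_cols].mean(axis=1)

print(df)
```

Because the lag columns are computed per group, the first apple row never picks up values from another sub-category.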