Pandas: rolling average by group/sub-group
I am trying to compute a rolling average after grouping by a couple of columns. My dataset looks like this:
category, sub_category,value
fruit, apple, 10
fruit, apple, 2
fruit, apple, 5
fruit, apple, 1
fruit, banana, 3
fruit, orange, 5
fruit, orange, 5
fruit, orange, 3
fruit, orange, 8
Expected output:
category, sub_category,value, rolling_average
fruit, apple, 10, 10
fruit, apple, 2, 6
fruit, apple, 5, 5.66
fruit, apple, 1, 2.66
fruit, banana, 3, 3
fruit, orange, 5, 5
fruit, orange, 5, 5
fruit, orange, 3, 4.33
fruit, orange, 8, 5.33
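I can compute a rolling average without any grouping, but I am not sure how to do it per group within the same DataFrame.

I believe you need groupby: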
df['expanding_average'] = (df.groupby(['category', 'sub_category'])['value']
                             .expanding()
                             .mean()
                             .reset_index(level=[0,1], drop=True))
print(df)
  category sub_category  value  expanding_average
0    fruit        apple     10          10.000000
1    fruit        apple      2           6.000000
2    fruit        apple      5           5.666667
3    fruit        apple      1           4.500000
4    fruit       banana      3           3.000000
5    fruit       orange      5           5.000000
6    fruit       orange      5           5.000000
7    fruit       orange      3           4.333333
8    fruit       orange      8           5.250000
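The expanding mean averages every earlier row in the group, which is why row 3 shows 4.5 rather than the 2.66 in the expected output. The following is a minimal sketch that checks this by hand, assuming the sample data from the question; the names `manual` and `expanding` are only for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "category": ["fruit"] * 9,
    "sub_category": ["apple"] * 4 + ["banana"] + ["orange"] * 4,
    "value": [10, 2, 5, 1, 3, 5, 5, 3, 8],
})

g = df.groupby(["category", "sub_category"])["value"]

# Expanding mean by hand: running sum divided by running row count per group.
manual = g.cumsum() / (g.cumcount() + 1)

# The same quantity through the expanding window API.
expanding = g.expanding().mean().reset_index(level=[0, 1], drop=True)

print(pd.DataFrame({"manual": manual, "expanding": expanding}))
```

For row 3 the expanding mean is (10 + 2 + 5 + 1) / 4 = 4.5, while a window of 3 would use only 2, 5 and 1.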
Solution for a rolling average with a window of N=3, which matches the expected output:
df['rolling_average'] = (df.groupby(['category', 'sub_category'])['value']
                           .rolling(3, min_periods=1)
                           .mean()
                           .reset_index(level=[0,1], drop=True))
print(df)
  category sub_category  value  rolling_average
0    fruit        apple     10        10.000000
1    fruit        apple      2         6.000000
2    fruit        apple      5         5.666667
3    fruit        apple      1         2.666667
4    fruit       banana      3         3.000000
5    fruit       orange      5         5.000000
6    fruit       orange      5         5.000000
7    fruit       orange      3         4.333333
8    fruit       orange      8         5.333333
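For anyone who wants to reproduce this from scratch, here is a self-contained sketch built on the sample data from the question. It uses `transform` instead of `reset_index`, which is an alternative spelling rather than the approach shown above; it keeps the original row index, so the result can be assigned straight back to the DataFrame.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "category": ["fruit"] * 9,
    "sub_category": ["apple"] * 4 + ["banana"] + ["orange"] * 4,
    "value": [10, 2, 5, 1, 3, 5, 5, 3, 8],
})

# Rolling mean over the last 3 rows within each (category, sub_category) group.
# min_periods=1 lets the first rows of a group average whatever is available.
df["rolling_average"] = (
    df.groupby(["category", "sub_category"])["value"]
      .transform(lambda s: s.rolling(3, min_periods=1).mean())
)

print(df)
```

Because `transform` returns a result aligned to the input rows, this version does not depend on the rows already being ordered by group.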
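Does this answer your question?

May I ask one more favor? Now that we can get rolling_average, I would like to know whether the 3 values used to compute rolling_average can be stored in 3 separate columns. How can I modify the above to get these values? Thanks.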
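One possible way to approach the follow-up, offered as a sketch rather than a confirmed answer from the thread: shift the values within each group to expose the current value and the two before it. The column names `value_t`, `value_t_minus_1`, `value_t_minus_2` and `window_mean` are hypothetical and chosen only for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "category": ["fruit"] * 9,
    "sub_category": ["apple"] * 4 + ["banana"] + ["orange"] * 4,
    "value": [10, 2, 5, 1, 3, 5, 5, 3, 8],
})

g = df.groupby(["category", "sub_category"])["value"]

# The three values inside a window of 3: the current row and the two
# preceding rows of the same group (NaN where the group has no earlier row).
df["value_t"] = df["value"]
df["value_t_minus_1"] = g.shift(1)
df["value_t_minus_2"] = g.shift(2)

# Row-wise mean skips NaN by default, so it behaves like
# rolling(3, min_periods=1).mean() computed per group.
window_cols = ["value_t", "value_t_minus_1", "value_t_minus_2"]
df["window_mean"] = df[window_cols].mean(axis=1)

print(df)
```

Shifting inside the groupby keeps values from one sub_category from leaking into the window of the next one.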