Pandas-使用groupby创建另一列的最后N个值之和的新列
我有这个df:Pandas-使用groupby创建另一列的最后N个值之和的新列,pandas,Pandas,我有这个df: round_id team opponent home_dummy GC GP P 0 1.0 Flamengo Atlético-MG 1.0 1.0 0.0 0 1 4.0 Flamengo Grêmio 1.0 1.0 1.0 1 2 5.0 Flamengo
round_id team opponent home_dummy GC GP P
0 1.0 Flamengo Atlético-MG 1.0 1.0 0.0 0
1 4.0 Flamengo Grêmio 1.0 1.0 1.0 1
2 5.0 Flamengo Botafogo 1.0 1.0 1.0 1
3 6.0 Flamengo Santos 0.0 0.0 1.0 3
4 7.0 Flamengo Bahia 0.0 3.0 5.0 3
5 8.0 Flamengo Fortaleza 1.0 1.0 2.0 3
6 9.0 Flamengo Fluminense 0.0 1.0 2.0 3
7 10.0 Flamengo Ceará 0.0 2.0 0.0 0
8 3.0 Flamengo Coritiba 0.0 0.0 1.0 3
9 11.0 Flamengo Goiás 1.0 1.0 2.0 3
10 13.0 Flamengo Athlético-PR 1.0 1.0 3.0 3
11 14.0 Flamengo Sport 1.0 0.0 3.0 3
12 15.0 Flamengo Vasco 0.0 1.0 2.0 3
13 16.0 Flamengo Bragantino 1.0 1.0 1.0 1
14 17.0 Flamengo Corinthians 0.0 1.0 5.0 3
15 18.0 Flamengo Internacional 0.0 2.0 2.0 1
16 19.0 Flamengo São Paulo 1.0 4.0 1.0 0
17 12.0 Flamengo Palmeiras 0.0 1.0 1.0 1
18 2.0 Flamengo Atlético-GO 0.0 3.0 0.0 0
19 20.0 Flamengo Atlético-MG 0.0 4.0 0.0 0
我给它添加了一个新列,它用另一个的最后N个值的和创建了一个新列,如下所示:
df['Last_5'] = df.P.rolling(5,min_periods=1).sum().shift().fillna(0)
这给了我:
round_id team opponent home_dummy GC GP P last_5
0 1.0 Flamengo Atlético-MG 1.0 1.0 0.0 0 0
1 4.0 Flamengo Grêmio 1.0 1.0 1.0 1 0
2 5.0 Flamengo Botafogo 1.0 1.0 1.0 1 1
3 6.0 Flamengo Santos 0.0 0.0 1.0 3 2
4 7.0 Flamengo Bahia 0.0 3.0 5.0 3 5
5 8.0 Flamengo Fortaleza 1.0 1.0 2.0 3 8
6 9.0 Flamengo Fluminense 0.0 1.0 2.0 3 11
7 10.0 Flamengo Ceará 0.0 2.0 0.0 0 13
8 3.0 Flamengo Coritiba 0.0 0.0 1.0 3 12
9 11.0 Flamengo Goiás 1.0 1.0 2.0 3 12
10 13.0 Flamengo Athlético-PR 1.0 1.0 3.0 3 12
11 14.0 Flamengo Sport 1.0 0.0 3.0 3 12
12 15.0 Flamengo Vasco 0.0 1.0 2.0 3 12
13 16.0 Flamengo Bragantino 1.0 1.0 1.0 1 15
14 17.0 Flamengo Corinthians 0.0 1.0 5.0 3 13
15 18.0 Flamengo Internacional 0.0 2.0 2.0 1 11
16 19.0 Flamengo São Paulo 1.0 4.0 1.0 0 8
17 12.0 Flamengo Palmeiras 0.0 1.0 1.0 1 8
18 2.0 Flamengo Atlético-GO 0.0 3.0 0.0 0 6
19 20.0 Flamengo Atlético-MG 0.0 4.0 0.0 0 5
但假设我有许多团队在同一个数据框架中:
round_id team opponent home_dummy GC GP /
0 1.0 Flamengo Atlético-MG 1.0 1.0 0.0
1 4.0 Flamengo Grêmio 1.0 1.0 1.0
2 5.0 Flamengo Botafogo 1.0 1.0 1.0
3 6.0 Flamengo Santos 0.0 0.0 1.0
4 7.0 Flamengo Bahia 0.0 3.0 5.0
.. ... ... ... ... ... ...
395 15.0 Atlético-GO Bragantino 1.0 1.0 2.0
396 16.0 Atlético-GO Santos 0.0 0.0 1.0
397 17.0 Atlético-GO Athlético-PR 1.0 1.0 1.0
398 9.0 Atlético-GO Vasco 0.0 1.0 2.0
399 20.0 Atlético-GO Corinthians 1.0 1.0 1.0
如何对每个团队应用相同的计算并获得相同的结果,而不使团队之间的最后N行重叠?添加
groupby
df['Last_5'] = df.groupby('team').P.apply(lambda x : x.rolling(5,min_periods=1).sum().shift().fillna(0))
似乎这是一个类似的问题,可以提供一个答案。