Python cumulative sum with groupby and a condition
I have this DataFrame:
In [1]: import pandas as pd
In [2]: data = pd.DataFrame({'ID': ['A', 'A', 'A', 'A', 'B', 'B', 'B'], 'Tag': ['X', '', 'X', '', 'X', '',''], 'Counts': [1,3,5,2,3,2,1]})
In [3]: data
Out[3]:
  ID Tag  Counts
0  A   X       1
1  A           3
2  A   X       5
3  A           2
4  B   X       3
5  B           2
6  B           1
I want to create a new column holding the cumulative sum of Counts grouped by ID, but the sum should restart whenever Tag is 'X'.
In [6]: data['before'] = data.groupby(['ID']).Counts.cumsum()
In [7]: data['after'] = [1,4,5,7,3,5,6]
In [8]: data
Out[8]:
  ID Tag  Counts  before  after
0  A   X       1       1      1
1  A           3       4      4
2  A   X       5       9      5
3  A           2      11      7
4  B   X       3       3      3
5  B           2       5      5
6  B           1       6      6
I want to get the 'after' column.

You can use .eq('X').cumsum() to label the blocks of rows that each start with an X, and then pass that label to groupby together with 'ID':
data['after'] = data.groupby(['ID',data.Tag.eq('X').cumsum()])['Counts'].cumsum()
Output:
  ID Tag  Counts  after
0  A   X       1      1
1  A           3      4
2  A   X       5      5
3  A           2      7
4  B   X       3      3
5  B           2      5
6  B           1      6
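
To see why this works, it may help to look at the intermediate grouping key on its own. The sketch below rebuilds the same DataFrame and prints the key before using it. The name `block` is only an illustrative label and does not appear in the original answer.

```python
import pandas as pd

data = pd.DataFrame({
    'ID': ['A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'Tag': ['X', '', 'X', '', 'X', '', ''],
    'Counts': [1, 3, 5, 2, 3, 2, 1],
})

# Each 'X' bumps the running counter by one, so consecutive rows share
# the same block number until the next 'X' appears.
# 'block' is a hypothetical name chosen for this illustration.
block = data['Tag'].eq('X').cumsum().rename('block')
print(block.tolist())          # [1, 1, 2, 2, 3, 3, 3]

# Grouping by ID and block restarts the cumulative sum at every 'X'
# and at every change of ID.
data['after'] = data.groupby(['ID', block])['Counts'].cumsum()
print(data['after'].tolist())  # [1, 4, 5, 7, 3, 5, 6]
```

Keeping 'ID' in the groupby matters when a new ID does not begin with an X row: without it, the sum would carry over from the previous ID's last block.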