Python: return a running count of values in a dataset
I am trying to return a running count based on two columns in a df. For the df below, I want to determine the count based on the 'Event' column and the 'Who' column.
import pandas as pd
import numpy as np
d = ({
'Event' : ['A','B','E','','C','B','B','B','B','E','C','D'],
'Space' : ['X1','X1','X2','','X3','X3','X3','X4','X3','X2','X2','X1'],
'Who' : ['Home','Home','Even','Out','Home','Away','Home','Away','Home','Even','Away','Home']
})
d = pd.DataFrame(data = d)
I tried the following:
df = d.groupby(['Event', 'Who'])['Space'].count().reset_index(name="count")
This produces:
  Event   Who  count
0         Out      1
1     A  Home      1
2     B  Away      2
3     B  Home      3
4     C  Away      1
5     C  Home      1
6     D  Home      1
7     E  Even      2
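But I want a running count rather than a total count.
Can df = d.groupby(['Event', 'Who'])['Space'].count().reset_index(name="count") be modified to filter on other constraints, or does it have to be a mask function or something similar?
So my expected output is: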
A_Away A_Home B_Away B_Home C_Away C_Home D_Away D_Home Event Space Who
0 1 A X1 Home
1 B X1 Home
2 E X2 Even
3 Out
4 1 C X3 Home
5 1 B X3 Away
6 1 B X3 Home
7 B X4 Away
8 2 B X3 Home
9 2 E X2 Even
10 1 C X2 Away
11 1 D X1 Home
So the count is added row by row, rather than being a total count for the whole dataset.
Here are the steps needed to get the result:
groupby and cumcount
unstack
pd.concat
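To make the first of those steps concrete, here is a minimal sketch comparing a per-group total with a running count on the question's data. The names `total` and `running` are illustrative only and do not appear in the original post; the frame is rebuilt here as `df` so the snippet runs on its own.

```python
import pandas as pd

# Same data as in the question, rebuilt so the example is self-contained
df = pd.DataFrame({
    'Event': ['A', 'B', 'E', '', 'C', 'B', 'B', 'B', 'B', 'E', 'C', 'D'],
    'Space': ['X1', 'X1', 'X2', '', 'X3', 'X3', 'X3', 'X4', 'X3', 'X2', 'X2', 'X1'],
    'Who': ['Home', 'Home', 'Even', 'Out', 'Home', 'Away', 'Home', 'Away',
            'Home', 'Even', 'Away', 'Home'],
})

# Total count per (Event, Who) pair, repeated on every row of that pair
total = df.groupby(['Event', 'Who'])['Space'].transform('count')

# Running count: cumcount() numbers rows within each pair from 0,
# so add(1) makes the first occurrence 1, the second 2, and so on
running = df.groupby(['Event', 'Who']).cumcount().add(1)

print(df.assign(total=total, running=running))
```

For the three ('B', 'Home') rows, `total` is 3 on each row while `running` goes 1, 2, 3, which is the difference the question is asking about.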
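Not the downvoter, but this is a well-written question. I guess the downvote is because the question is a bit too localized.
Thanks @Call Centre Executive! This is what I was looking for.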
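# `df` here is the original DataFrame from the question (built there as `d`)
# move 'Who' and 'Event' into the index, keeping the row number as the first level
v = df.set_index(['Who', 'Event'], append=True)['Space']
# overwrite the values of `v` with the 1-based cumulative count per (Event, Who)
v[:] = df.groupby(['Event', 'Who']).cumcount().add(1)
# reshape `v`: one column per (Who, Event) pair, blank where there is no count
v = v.unstack([1, 2], fill_value='')
# flatten the column headers to 'Event_Who'
v.columns = v.columns.map('{0[1]}_{0[0]}'.format)
# drop the 'Out' column and concatenate with the original frame
# (axis is passed by keyword; recent pandas versions reject the positional form)
pd.concat([v.loc[:, ~v.columns.str.contains('Out')], df], axis=1)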
    A_Home  B_Home  E_Even  C_Home  B_Away  C_Away  D_Home  Event  Space   Who
 0       1                                                      A     X1  Home
 1               1                                              B     X1  Home
 2                       1                                      E     X2  Even
 3                                                                         Out
 4                               1                              C     X3  Home
 5                                       1                      B     X3  Away
 6               2                                              B     X3  Home
 7                                       2                      B     X4  Away
 8               3                                              B     X3  Home
 9                       2                                      E     X2  Even
10                                               1              C     X2  Away
11                                                       1      D     X1  Home
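As an alternative sketch (not the answerer's method), the same wide layout can be built with `DataFrame.pivot` instead of overwriting a Series and unstacking it. The names `key`, `n`, `wide` and `out` are hypothetical, and this version leaves empty cells as missing values rather than empty strings, so the count columns stay numeric.

```python
import pandas as pd

# Same data as in the question, rebuilt so the example is self-contained
df = pd.DataFrame({
    'Event': ['A', 'B', 'E', '', 'C', 'B', 'B', 'B', 'B', 'E', 'C', 'D'],
    'Space': ['X1', 'X1', 'X2', '', 'X3', 'X3', 'X3', 'X4', 'X3', 'X2', 'X2', 'X1'],
    'Who': ['Home', 'Home', 'Even', 'Out', 'Home', 'Away', 'Home', 'Away',
            'Home', 'Even', 'Away', 'Home'],
})

# 1-based running count within each (Event, Who) pair
running = df.groupby(['Event', 'Who']).cumcount().add(1)

# Column label for each row, e.g. 'B_Home'; the blank-Event row becomes '_Out'
key = df['Event'] + '_' + df['Who']

# One column per label; each row has a value only in its own label's column
wide = pd.DataFrame({'key': key, 'n': running}).pivot(columns='key', values='n')

# Drop the column produced by the blank-Event row and use a nullable
# integer dtype so counts print as 1, 2, 3 instead of 1.0, 2.0, 3.0
wide = wide.drop(columns='_Out').astype('Int64')

# Put the count columns next to the original data
out = pd.concat([wide, df], axis=1)
print(out)
```

The columns come out in sorted order (A_Home, B_Away, B_Home, ...) rather than in order of first appearance, and only combinations that actually occur in the data get a column.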