Python: How to group distinct rows and add the per-category count to a new column in pandas?


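1. Drop duplicates on ['Match_Date','Games'], i.e. number of Football (F) matches on 2019-07-10 + number of Cricket (C) matches on 2019-07-10 = 5 (F_or_C)

number of Football (F) matches on 2019-07-11 + number of Cricket (C) matches on 2019-07-11 = 6 (F_or_C), and so on.

2. Add columns F and C. The data is shown first, followed by the final data I am after: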

Game_ID Games     Match_Date    Total_Games_Each_Day     F_or_C
1       Football  2019-07-10    5                        2
2       Cricket   2019-07-10    5                        3
3       Cricket   2019-07-10    5                        3
4       Football  2019-07-10    5                        2
5       Cricket   2019-07-10    5                        3
6       Football  2019-07-11    6                        4
7       Cricket   2019-07-11    6                        2
8       Cricket   2019-07-11    6                        2
9       Football  2019-07-11    6                        4
10      Football  2019-07-11    6                        4
11      Football  2019-07-11    6                        4
12      Football  2019-07-16    6                        6
13      Football  2019-07-16    6                        6
14      Football  2019-07-16    6                        6
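
The two count columns in the table above can be derived from Games and Match_Date. What follows is a minimal sketch, not part of the original question: the frame name `games` is hypothetical, and `groupby(...).transform("count")` is one possible way to do it. It uses only the rows for 2019-07-10 and 2019-07-11, whose row counts match the stated totals (only three rows appear for 2019-07-16 against a stated total of 6).

```python
import pandas as pd

# Hypothetical frame with only the raw columns for the first two dates.
games = pd.DataFrame({
    "Game_ID": range(1, 12),
    "Games": ["Football", "Cricket", "Cricket", "Football", "Cricket",
              "Football", "Cricket", "Cricket", "Football", "Football",
              "Football"],
    "Match_Date": pd.to_datetime(["2019-07-10"] * 5 + ["2019-07-11"] * 6),
})

# transform("count") returns one value per original row, so the result
# lines up with the frame and can be assigned as a new column.
games["Total_Games_Each_Day"] = (
    games.groupby("Match_Date")["Game_ID"].transform("count")
)
games["F_or_C"] = (
    games.groupby(["Match_Date", "Games"])["Game_ID"].transform("count")
)

print(games)
```

A plain `.count()` on the same groups would return one row per group instead, which is why it cannot be assigned back as a column directly.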

Desired final data:

Game_ID Games     Match_Date    Total_Games_Each_Day     F_or_C  F     C
1       Football  2019-07-10    5                        2       2     Null
2       Cricket   2019-07-10    5                        3       Null  3
3       Football  2019-07-11    6                        4       4     Null
4       Cricket   2019-07-11    6                        2       Null  2
5       Football  2019-07-16    6                        6       6     Null

Null in column F means no football was played that day, and Null in column C means no cricket was played that day.

What have you tried so far?

game.groupby(['Match_Date','Games'])['game ID'].count() does not work. Using drop_duplicates also causes problems.

IIUC, you need to use crosstab:

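from io import StringIO

import pandas as pd

# Read your dataframe; `your_data` is the table from the question as a string.
# parse_dates makes Match_Date a datetime column.
df = pd.read_csv(StringIO(your_data), sep=r'\s+', parse_dates=['Match_Date'])

# Keep the first row of each (Match_Date, Games) pair.
s = df.drop_duplicates(subset=['Match_Date', 'Games'])

# Spread F_or_C into one column per game initial (C, F) and join on the index.
new_df = s.join(pd.crosstab(s.index, s.Games.str[0], s["F_or_C"], aggfunc="first"))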

    Game_ID     Games Match_Date  Total_Games_Each_Day  F_or_C    C    F
0         1  Football 2019-07-10                     5       2  NaN  2.0
1         2   Cricket 2019-07-10                     5       3  3.0  NaN
5         6  Football 2019-07-11                     6       4  NaN  4.0
6         7   Cricket 2019-07-11                     6       2  2.0  NaN
11       12  Football 2019-07-16                     6       6  NaN  6.0
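
As a possible alternative to crosstab, the same F and C columns can be built with `Series.where`. This is a minimal sketch under stated assumptions: `your_data` holds the question's table pasted as text, and using `where()` is a substitution for illustration, not the method of the answer above.

```python
from io import StringIO

import pandas as pd

# The table from the question, pasted as whitespace-separated text.
your_data = """\
Game_ID Games     Match_Date    Total_Games_Each_Day     F_or_C
1       Football  2019-07-10    5                        2
2       Cricket   2019-07-10    5                        3
3       Cricket   2019-07-10    5                        3
4       Football  2019-07-10    5                        2
5       Cricket   2019-07-10    5                        3
6       Football  2019-07-11    6                        4
7       Cricket   2019-07-11    6                        2
8       Cricket   2019-07-11    6                        2
9       Football  2019-07-11    6                        4
10      Football  2019-07-11    6                        4
11      Football  2019-07-11    6                        4
12      Football  2019-07-16    6                        6
13      Football  2019-07-16    6                        6
14      Football  2019-07-16    6                        6
"""

df = pd.read_csv(StringIO(your_data), sep=r"\s+", parse_dates=["Match_Date"])

# One row per (Match_Date, Games) pair; the first occurrence is kept.
s = df.drop_duplicates(subset=["Match_Date", "Games"])

# where() keeps F_or_C on rows that satisfy the condition and puts NaN elsewhere.
new_df = s.assign(
    F=s["F_or_C"].where(s["Games"] == "Football"),
    C=s["F_or_C"].where(s["Games"] == "Cricket"),
)

print(new_df)
```

Here the new columns come out in the order F, C, as in the desired table, and they hold floats because NaN marks the missing entries.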