Python: groupby and sum of multiple columns that share the same values
I am working with a pandas DataFrame and have the following data:
import numpy as np
import pandas as pd

data = pd.DataFrame()
data['HomeTeam'] = ['A', 'B', 'C', 'D', 'E']
data['AwayTeam'] = ['E', 'D', 'A', 'B', 'C']
data['HomePoint'] = [1, 3, 0, 1, 3]
data['AwayPoint'] = [1, 0, 3, 1, 0]
data['Match'] = data['HomeTeam'].astype(str) + ' Vs ' + data['AwayTeam'].astype(str)
# I want to duplicate the matches
Nsims = 2
# Call NumPy directly; the pd.np alias is deprecated
data_Dub = pd.DataFrame(np.tile(data, (Nsims, 1)))
data_Dub.columns = data.columns
# Then I will assign the stage of the match
data_Dub['SimStage'] = data_Dub.groupby('Match').cumcount()
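A side note on the duplication step: np.tile converts the mixed-type DataFrame into an object array, so the point columns lose their integer dtype. A minimal sketch of an alternative, assuming the goal is simply to stack Nsims copies of the frame, uses pd.concat instead. The helper name duplicate_matches is hypothetical and not part of the original post.

```python
import pandas as pd

def duplicate_matches(data: pd.DataFrame, n_sims: int) -> pd.DataFrame:
    """Stack n_sims copies of data and number each copy of a match (hypothetical helper)."""
    # pd.concat keeps the original column dtypes, unlike np.tile
    dup = pd.concat([data] * n_sims, ignore_index=True)
    # 0 for the first occurrence of a match, 1 for the second, and so on
    dup['SimStage'] = dup.groupby('Match').cumcount()
    return dup

data = pd.DataFrame({
    'HomeTeam': ['A', 'B', 'C', 'D', 'E'],
    'AwayTeam': ['E', 'D', 'A', 'B', 'C'],
    'HomePoint': [1, 3, 0, 1, 3],
    'AwayPoint': [1, 0, 3, 1, 0],
})
data['Match'] = data['HomeTeam'] + ' Vs ' + data['AwayTeam']
data_Dub = duplicate_matches(data, 2)
```

With this version HomePoint and AwayPoint stay integer columns, which keeps later sums numeric.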
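What I want to do is add up the home points and away points earned by each team and save the result to a new DataFrame.
In the new DataFrame, HomePoint and AwayPoint should be added together for the same team, so with the 5 teams in my DataFrame I expect one total per team.
Can anyone suggest how to do this?
I used the following code, but it does not work: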
Point = data_Dub.groupby(['SimStage','HomeTeam','AwayTeam'])['HomePoint','AwayPoint'].sum()
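Thanks.

You can aggregate sum separately for HomeTeam and AwayTeam, combine the two results with add, then use sort_index and reset_index to turn the MultiIndex levels into columns, and finally rename the column and, if necessary, change the column order: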
a = data_Dub.groupby(['AwayTeam', 'SimStage'])['AwayPoint'].sum()
b = data_Dub.groupby(['HomeTeam', 'SimStage'])['HomePoint'].sum()
s = a.add(b).rename('Point')
df = s.sort_index(level=[1, 0]).reset_index().rename(columns={'AwayTeam':'Team'})
df = df[['Team','Point','SimStage']]
print(df)
  Team  Point  SimStage
0    A      4         0
1    B      4         0
2    C      0         0
3    D      1         0
4    E      4         0
5    A      4         1
6    B      4         1
7    C      0         1
8    D      1         1
9    E      4         1
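The add step above relies on every team appearing both as a home side and as an away side in each stage; where the two indexes do not match, the unmatched entries come out as NaN. A sketch of a variant that avoids this, assuming the same column names as in the question: stack the home and away results into one long table with Team and Point columns and group once. The function name team_points is hypothetical.

```python
import pandas as pd

def team_points(matches: pd.DataFrame) -> pd.DataFrame:
    """Total points per team and SimStage, counting home and away games (hypothetical helper)."""
    cols = ['SimStage', 'Team', 'Point']
    # One row per (stage, team, points) for each side of every match
    home = matches.rename(columns={'HomeTeam': 'Team', 'HomePoint': 'Point'})[cols]
    away = matches.rename(columns={'AwayTeam': 'Team', 'AwayPoint': 'Point'})[cols]
    long = pd.concat([home, away], ignore_index=True)
    # A single groupby now covers teams that only played at home or only away
    total = long.groupby(['SimStage', 'Team'], as_index=False)['Point'].sum()
    return total[['Team', 'Point', 'SimStage']]

data = pd.DataFrame({
    'HomeTeam': ['A', 'B', 'C', 'D', 'E'],
    'AwayTeam': ['E', 'D', 'A', 'B', 'C'],
    'HomePoint': [1, 3, 0, 1, 3],
    'AwayPoint': [1, 0, 3, 1, 0],
})
# Duplicate the matches twice and number each copy, as in the question
data_Dub = pd.concat([data] * 2, ignore_index=True)
data_Dub['SimStage'] = data_Dub.groupby(['HomeTeam', 'AwayTeam']).cumcount()

df = team_points(data_Dub)
print(df)
```

For the sample data this should give the same totals as the two-groupby approach, since every team plays exactly one home and one away match per stage.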