Python: merge DataFrames with different numbers of rows and create a new column with the sum of values
I have this df1:
import pandas as pd

df1 = pd.DataFrame({'Player': ['Zico', 'Leonidas', 'Didi'],
                    'Team': ['Flamengo', 'Flamengo', 'Botafogo'],
                    'Position': ['MID', 'DEF', 'MID'],
                    'Games_Away': [4, 4, 4]})
And another one, df2, which has a different number of rows. All players from the first df are present in it:
df2 = pd.DataFrame({'Player': ['Zico', 'Leonidas', 'Didi', 'Gerson', 'Pele'],
                    'Team': ['Flamengo', 'Flamengo', 'Botafogo', 'Botafogo', 'Santos'],
                    'Position': ['MID', 'DEF', 'MID', 'MID', 'FWD'],
                    'Games_Home': [3, 4, 3, 1, 1]})
How can I merge these two dfs so that I get a new column "Total_Games" that adds up correctly, like this:
Player Team Position Games_Home Games_Away Total_Games
0 Zico Flamengo MID 3 4 7
1 Leonidas Flamengo DEF 4 4 8
2 Didi Botafogo MID 3 4 7
3 Gerson Botafogo MID 1 0 1
4 Pele Santos FWD 1 0 1
I tried:
df_merge = df1.merge(df2, on="Player", how = 'inner')
df_merge['Total_Games']= df1['Games_Away'] + df2['Games_Home']
But this gives me:
Player Team_x Position_x Games_Away Team_y Position_y Games_Home Total_Games
0 Zico Flamengo MID 4 Flamengo MID 3 7.0
1 Leonidas Flamengo DEF 4 Flamengo DEF 4 8.0
2 Didi Botafogo MID 4 Botafogo MID 3 7.0
Problems:
- The players that exist only in df2 are not added
- "Position" and "Team" are duplicated
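Considering that 1 + NaN must be 1, what is the best way to join the dfs, keep all players, not duplicate columns, and correctly sum "Games_Away" + "Games_Home"?

Do an outer merge on all the shared columns, then fill the missing values with 0: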
df=df1.merge(df2,on=['Player','Team','Position'],how='outer').fillna(0)
df['Game_total']=df.Games_Away+df.Games_Home
df
Out[241]:
Player Team Position Games_Away Games_Home Game_total
0 Zico Flamengo MID 4.0 3 7.0
1 Leonidas Flamengo DEF 4.0 4 8.0
2 Didi Botafogo MID 4.0 3 7.0
3 Gerson Botafogo MID 0.0 1 1.0
4 Pele Santos FWD 0.0 1 1.0
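
Two things went wrong in the original attempt: `how='inner'` drops players that appear in only one frame, and `df1['Games_Away'] + df2['Games_Home']` adds the columns of the two original frames (aligned on their own indexes) instead of the columns of the merged frame. A minimal sketch of the same outer-merge idea follows, assuming you want the column named "Total_Games" as in the question and the counts kept as integers rather than floats. The names `keys`, `game_cols`, and `total` are only illustrative. Row order after an outer merge can differ between pandas versions, so the sketch does not rely on it.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Zico', 'Leonidas', 'Didi'],
                    'Team': ['Flamengo', 'Flamengo', 'Botafogo'],
                    'Position': ['MID', 'DEF', 'MID'],
                    'Games_Away': [4, 4, 4]})
df2 = pd.DataFrame({'Player': ['Zico', 'Leonidas', 'Didi', 'Gerson', 'Pele'],
                    'Team': ['Flamengo', 'Flamengo', 'Botafogo', 'Botafogo', 'Santos'],
                    'Position': ['MID', 'DEF', 'MID', 'MID', 'FWD'],
                    'Games_Home': [3, 4, 3, 1, 1]})

# Merging on every shared column avoids the Team_x / Team_y duplicates.
keys = ['Player', 'Team', 'Position']
game_cols = ['Games_Home', 'Games_Away']

# how='outer' keeps players that appear in only one of the frames.
total = df2.merge(df1, on=keys, how='outer')

# Missing counts become 0 (so 1 + NaN counts as 1), then cast back to int,
# because the NaN introduced by the merge turns the column into floats.
total = (total.fillna({col: 0 for col in game_cols})
              .astype({col: 'int64' for col in game_cols}))

# Sum the columns of the merged frame, not of the original frames.
total['Total_Games'] = total['Games_Home'] + total['Games_Away']
print(total)
```

Filling only the two count columns, rather than calling `fillna(0)` on the whole frame, avoids writing 0 into text columns such as "Team" if those ever contain missing values.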