Python: using apply() on a DataFrame with other DataFrame columns as input
I have a DataFrame of football match results, and I am trying to create a new column at the end of the DataFrame that shows which team won. I am trying to use df.apply to do this. Here is what I have so far:
def match_winner(winner, home_team, away_team, home_goals, away_goals):
    if home_goals > away_goals:
        winner = home_team
    elif home_goals < away_goals:
        winner = away_team
    else:
        winner = "None"

match['Winning Team'] = ""
match['Winning Team'].apply(match_winner, args=[match['Home Team'], match['Away Team'], match['home_team_goal'], match['away_team_goal']])
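I need to pass multiple columns as input arguments in order to work out who won, but I am not sure whether .apply lets you pass a Series as an argument. Is it possible to do this with .apply? If so, a solution would be very helpful, but if anyone knows a better alternative, that would be useful too.

Avoid DataFrame.apply, which usually runs as a row-by-row loop, and consider nested conditional logic with np.where on the columns instead: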
import numpy as np

# Home win -> home team; away win -> away team; otherwise (a draw) -> NaN
match['Winning Team'] = np.where(match['home_team_goal'] > match['away_team_goal'],
                                 match['Home Team'],
                                 np.where(match['home_team_goal'] < match['away_team_goal'],
                                          match['Away Team'],
                                          np.nan
                                          )
                                 )
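For completeness, the question of whether .apply can take several columns as input can be answered with a row-wise call. The following is a minimal sketch, assuming a small made-up `match` frame with the question's column names; the single-argument `match_winner(row)` signature is an illustrative rewrite, not the asker's original function. It is expected to be slower than the vectorized np.where version on large frames.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
match = pd.DataFrame({
    "Home Team": ["a", "b", "c"],
    "Away Team": ["g", "h", "i"],
    "home_team_goal": [2, 0, 1],
    "away_team_goal": [1, 3, 1],
})

def match_winner(row):
    """Return the winning team's name for one match row, or None for a draw."""
    if row["home_team_goal"] > row["away_team_goal"]:
        return row["Home Team"]
    if row["home_team_goal"] < row["away_team_goal"]:
        return row["Away Team"]
    return None

# axis=1 hands each row to the function as a Series, so every column is reachable by name.
match["Winning Team"] = match.apply(match_winner, axis=1)
print(match)
```

The key differences from the attempt in the question are that apply is called on the whole DataFrame with axis=1 rather than on a single empty column, and that the function returns its result instead of assigning to a local variable.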
Use np.select rather than apply, because apply loops under the hood, which is not a good idea here:
import numpy as np
import pandas as pd

match = pd.DataFrame({
    'Home Team': list('abcdef'),
    'Away Team': list('ghijkl'),
    'home_team_goal': [14, 5, 4, 5, 5, 4],
    'away_team_goal': [7, 8, 2, 4, 8, 4],
})

# Boolean masks for a home win and an away win; rows matching neither are draws
m1 = match.home_team_goal > match.away_team_goal
m2 = match.home_team_goal < match.away_team_goal
match['winner'] = np.select([m1, m2], [match['Home Team'], match['Away Team']], default=None)
print(match)
  Home Team Away Team  home_team_goal  away_team_goal winner
0         a         g              14               7      a
1         b         h               5               8      h
2         c         i               4               2      c
3         d         j               5               4      d
4         e         k               5               8      k
5         f         l               4               4   None
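If an explicit label for draws is preferred over None, the same np.select call can be wrapped in a small helper. This is only a sketch: the function name `add_winner` and the `draw_label` parameter are hypothetical, and the sample frame is made up using the answer's column names.

```python
import numpy as np
import pandas as pd

def add_winner(df, draw_label="Draw"):
    """Return a copy of df with a 'winner' column built by np.select."""
    home_win = df["home_team_goal"] > df["away_team_goal"]
    away_win = df["home_team_goal"] < df["away_team_goal"]
    out = df.copy()
    # Conditions are checked in order; rows matching neither get draw_label.
    out["winner"] = np.select(
        [home_win, away_win],
        [df["Home Team"], df["Away Team"]],
        default=draw_label,
    )
    return out

# Hypothetical sample data.
sample = pd.DataFrame({
    "Home Team": ["a", "b", "c"],
    "Away Team": ["g", "h", "i"],
    "home_team_goal": [14, 5, 4],
    "away_team_goal": [7, 8, 4],
})
result = add_winner(sample)
print(result)
```

Adding a further outcome would only require appending one more mask and one more choice to the two lists, which is where np.select tends to read better than deeply nested np.where calls.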
Here is another approach.
Create the DataFrame with the teams:
import pandas as pd

results_di = {
    "Home Team": [
        "KV Mechelen",
        "KSV Cercle Brugge",
        "RSC Anderlecht",
        "KV Mechelen",
        "SV Zulte-Waregem",
        "FC Zürich",
        "FC St. Gallen",
        "FC Vaduz",
        "Grasshopper Club Zürich",
        "BSC Young Boys",
    ],
    "Away Team": [
        "KRC Genk",
        "Club Brugge KV",
        "SV Zulte-Waregem",
        "RSC Anderlecht",
        "KSV Roeselare",
        "FC Thun",
        "FC Thun",
        "FC Luzern",
        "FC Sion",
        "FC Basel",
    ],
    "home_team_goal": [2, 1, 2, 2, 0, 3, 1, 1, 2, 4],
    "away_team_goal": [1, 3, 0, 1, 0, 3, 0, 2, 0, 3],
}
df = pd.DataFrame(results_di)

# Start with 'none' everywhere, then overwrite the rows that have a winner
df['winner'] = 'none'
home_win = df['home_team_goal'] - df["away_team_goal"] > 0
away_win = df["away_team_goal"] - df['home_team_goal'] > 0
df.loc[home_win, "winner"] = df.loc[home_win, 'Home Team']
df.loc[away_win, "winner"] = df.loc[away_win, 'Away Team']
print(df)
                 Home Team         Away Team  home_team_goal  away_team_goal  \
0              KV Mechelen          KRC Genk               2               1
1        KSV Cercle Brugge    Club Brugge KV               1               3
2           RSC Anderlecht  SV Zulte-Waregem               2               0
3              KV Mechelen    RSC Anderlecht               2               1
4         SV Zulte-Waregem     KSV Roeselare               0               0
5                FC Zürich           FC Thun               3               3
6            FC St. Gallen           FC Thun               1               0
7                 FC Vaduz         FC Luzern               1               2
8  Grasshopper Club Zürich           FC Sion               2               0
9           BSC Young Boys          FC Basel               4               3

                    winner
0              KV Mechelen
1           Club Brugge KV
2           RSC Anderlecht
3              KV Mechelen
4                     none
5                     none
6            FC St. Gallen
7                FC Luzern
8  Grasshopper Club Zürich
9           BSC Young Boys
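As a quick sanity check, the mask-and-assign approach can be compared with a nested np.where on a few rows. This is a sketch under assumptions: it uses three of the matches listed above, and the names `loc_winner` and `where_winner` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Three rows taken from the sample data above: a home win, an away win, and a draw.
df = pd.DataFrame({
    "Home Team": ["KV Mechelen", "FC Vaduz", "FC Zürich"],
    "Away Team": ["KRC Genk", "FC Luzern", "FC Thun"],
    "home_team_goal": [2, 1, 3],
    "away_team_goal": [1, 2, 3],
})

home_win = df["home_team_goal"] > df["away_team_goal"]
away_win = df["home_team_goal"] < df["away_team_goal"]

# Approach 1: start from a default label, then overwrite rows selected by each mask.
loc_winner = pd.Series("none", index=df.index, dtype=object)
loc_winner.loc[home_win] = df.loc[home_win, "Home Team"]
loc_winner.loc[away_win] = df.loc[away_win, "Away Team"]

# Approach 2: nested np.where with the same default label for draws.
where_winner = np.where(
    home_win,
    df["Home Team"],
    np.where(away_win, df["Away Team"], "none"),
)

print(loc_winner.tolist() == where_winner.tolist())
```

Both approaches rely on the two masks being mutually exclusive, so the order in which the rows are filled in does not matter.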
@jezrael... All roads lead to Rome, right? I am used to set-based nested logic, similar to other languages such as R's ifelse and SQL's CASE. np.select is interesting!
I still remember the first time I came across this function; it was really good to learn. I have also used double and triple np.where, but that is slightly less readable.
Check your answer; I think in the second filter you meant match['Away Team'].