Sum of different variables per unique game id in R
I have the following data frame, containing the game ID, the player, the type of action (e.g. pass or dribble), and whether the action resulted in success or failure:
df1 <- data.frame(
game_id = c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2"),
player = c("X", "X", "X", "Y", "Y", "Z", "Z", "X", "Y", "Z", "Y", "Y", "Y", "X", "X", "Z",
"Z", "X", "Z", "X"),
type = c("pass", "pass", "pass", "pass", "pass", "dribble", "dribble", 'tackle', "pass",
"pass", "dribble", "pass", "dribble", "pass", "pass", "pass", "pass", "dribble",
"pass", "pass"),
result = c("success", "success", "fail", "success", "success", "success", "fail", "success",
"fail", "success", "success", "fail", "fail", "success", "success", "fail", "fail",
"success", "success", "success")
)
df1
# game_id player type result
#1 1 X pass success
#2 1 X pass success
#3 1 X pass fail
#4 1 Y pass success
#5 1 Y pass success
#6 1 Z dribble success
#7 1 Z dribble fail
#8 1 X tackle success
#9 1 Y pass fail
#10 1 Z pass success
#11 2 Y dribble success
#12 2 Y pass fail
#13 2 Y dribble fail
#14 2 X pass success
#15 2 X pass success
#16 2 Z pass fail
#17 2 Z pass fail
#18 2 X dribble success
#19 2 Z pass success
#20 2 X pass success
A first attempt is:
library(dplyr)  # provides %>%, group_by() and mutate()
df1 %>%
group_by(game_id, player) %>%
mutate(
pass_per_player = cumsum(type=="pass"),
success_pass_player = cumsum(result=="success" & type=="pass"),
success_rate_player = success_pass_player / pass_per_player)
# A tibble: 20 x 7
# Groups: game_id, player [6]
game_id player type result pass_per_player success_pass_player success_rate_player
<chr> <chr> <chr> <chr> <int> <int> <dbl>
1 1 X pass success 1 1 1
2 1 X pass success 2 2 1
3 1 X pass fail 3 2 0.667
4 1 Y pass success 1 1 1
5 1 Y pass success 2 2 1
6 1 Z dribble success 0 0 NaN
7 1 Z dribble fail 0 0 NaN
8 1 X tackle success 3 2 0.667
9 1 Y pass fail 3 2 0.667
10 1 Z pass success 1 1 1
11 2 Y dribble success 0 0 NaN
12 2 Y pass fail 1 0 0
13 2 Y dribble fail 1 0 0
14 2 X pass success 1 1 1
15 2 X pass success 2 2 1
16 2 Z pass fail 1 0 0
17 2 Z pass fail 2 0 0
18 2 X dribble success 2 2 1
19 2 Z pass success 3 1 0.333
20 2 X pass success 3 3 1
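As a side check of what the grouped cumsum() computes, the sketch below compares a running count over the whole data frame with a running count inside group_by(game_id, player). The data frame toy and the column pass_overall are hypothetical names introduced only for this illustration; they are not part of the original question.

```r
library(dplyr)

# Hypothetical mini data set: two players alternating within one game
toy <- data.frame(
  game_id = c("1", "1", "1", "1"),
  player  = c("X", "Y", "X", "Y"),
  type    = c("pass", "pass", "pass", "dribble")
)

toy_counts <- toy %>%
  # Ungrouped: counts every pass in the data, regardless of player
  mutate(pass_overall = cumsum(type == "pass")) %>%
  group_by(game_id, player) %>%
  # Grouped: the running count restarts for each game/player pair
  mutate(pass_per_player = cumsum(type == "pass")) %>%
  ungroup()

toy_counts
```

In this toy case pass_overall keeps increasing across players, while pass_per_player only increases on rows where that particular player passes.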
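However, I believe the first cumsum does not really work, because it increments on every pass rather than on each player's passes. Unfortunately, you get the same result as in the expected output, while the code above gives me a different output. I have now tried the following, which works well: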
df1 %>%
group_by(game_id, player) %>%
mutate(
pass_per_player = cumsum(type=="pass"),
success_pass_player = cumsum(result=="success" & type=="pass"),
success_rate_player = success_pass_player / pass_per_player)
# A tibble: 20 x 7
# Groups: game_id, player [6]
game_id player type result pass_per_player success_pass_player success_rate_player
<chr> <chr> <chr> <chr> <int> <int> <dbl>
1 1 X pass success 1 1 1
2 1 X pass success 2 2 1
3 1 X pass fail 3 2 0.667
4 1 Y pass success 1 1 1
5 1 Y pass success 2 2 1
6 1 Z dribble success 0 0 NaN
7 1 Z dribble fail 0 0 NaN
8 1 X tackle success 3 2 0.667
9 1 Y pass fail 3 2 0.667
10 1 Z pass success 1 1 1
11 2 Y dribble success 0 0 NaN
12 2 Y pass fail 1 0 0
13 2 Y dribble fail 1 0 0
14 2 X pass success 1 1 1
15 2 X pass success 2 2 1
16 2 Z pass fail 1 0 0
17 2 Z pass fail 2 0 0
18 2 X dribble success 2 2 1
19 2 Z pass success 3 1 0.333
20 2 X pass success 3 3 1
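Rows where a player has not made a pass yet give NaN (0 / 0); these can be set to 0:
# Note: this only works if the mutate() result above was assigned back to df1
df1$success_rate_player[is.nan(df1$success_rate_player)] <- 0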
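If one total per player and game is wanted instead of running counts, a summarise() call is one possible alternative. This is only a sketch: the data frame actions is a small hypothetical sample in the same shape as df1, the column names passes, success_passes and success_rate are made up for the example, and treating a player without passes as having a rate of 0 is an assumed convention.

```r
library(dplyr)

# Small hypothetical sample in the same shape as df1
actions <- data.frame(
  game_id = c("1", "1", "1", "1", "2", "2"),
  player  = c("X", "X", "Z", "Z", "X", "Y"),
  type    = c("pass", "pass", "dribble", "dribble", "pass", "pass"),
  result  = c("success", "fail", "success", "fail", "success", "fail")
)

totals <- actions %>%
  group_by(game_id, player) %>%
  summarise(
    passes         = sum(type == "pass"),                        # all passes
    success_passes = sum(type == "pass" & result == "success"),  # successful passes
    .groups = "drop"
  ) %>%
  # 0 / 0 would give NaN; report 0 for players without any pass (assumed convention)
  mutate(success_rate = if_else(passes > 0, success_passes / passes, 0))

totals
```

This yields one row per game_id and player, so the NaN replacement happens in the same pipeline rather than as a separate step.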