R: Using an apply function on the rows of a grouped data frame (tags: r, dataframe, dplyr, apply)

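I am trying to use the apply function on the rows of a grouped data frame to check whether the group contains other rows that match a condition which depends on each row. I can get this working for a single group, but not for all groups.

For example, without grouping: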

library(dplyr)

id <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
station <- c(1, 2, 3, 3, 2, 2, 1, 1, 3, 2, 2)
timeslot <- c(13, 14, 20, 21, 24, 23, 8, 9, 10, 15, 16)

df <- data.frame(id, station, timeslot)

s <- 2

df <- 
  df %>% 
  filter(id == 1) %>% 
  arrange(id, timeslot) %>% 
  mutate(match = ifelse(station == s, apply(., 1, function(x) (any(as.numeric(x[3] + 1) == .$timeslot))), FALSE))

  id station timeslot match
1  1       1       13 FALSE
2  1       2       14 FALSE
3  1       3       20 FALSE
4  1       3       21 FALSE
5  1       2       23  TRUE
6  1       2       24 FALSE

My sincere apologies if I have misunderstood the question. This is what I understood from it:
 df$match = apply(df, 1, function(line) any(df$id == line[1] & 
                                            df$station == line[2] &
                                            df$timeslot == line[3] + 1))

The result is:
   id station timeslot match
1   1       1       13 FALSE
2   1       2       14 FALSE
3   1       3       20  TRUE
4   1       3       21 FALSE
5   1       2       24 FALSE
6   1       2       23  TRUE
7   2       1        8  TRUE
8   2       1        9 FALSE
9   2       3       10 FALSE
10  2       2       15  TRUE
11  2       2       16 FALSE

This seems to work:
df %>% 
  group_by(id) %>% 
  arrange(id, timeslot) %>% 
  mutate(match = station == s & ((timeslot + 1) %in% timeslot))
# # A tibble: 11 x 4
# # Groups:   id [2]
#       id station timeslot match
#    <dbl>   <dbl>    <dbl> <lgl>
#  1     1       1       13 FALSE
#  2     1       2       14 FALSE
#  3     1       3       20 FALSE
#  4     1       3       21 FALSE
#  5     1       2       23 TRUE 
#  6     1       2       24 FALSE
#  7     2       1        8 FALSE
#  8     2       1        9 FALSE
#  9     2       3       10 FALSE
# 10     2       2       15 TRUE 
# 11     2       2       16 FALSE
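
The grouped answer works because, inside mutate() on a grouped data frame, timeslot refers only to the current group's values, so %in% is evaluated per id. As a rough cross-check, the same per-group test can be sketched in base R with ave(). This is only an illustrative equivalent, not part of the original answer; the has_next helper column name is made up for the example, and the rows are left in their original order rather than sorted.

```r
# Sample data from the question
df <- data.frame(
  id       = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  station  = c(1, 2, 3, 3, 2, 2, 1, 1, 3, 2, 2),
  timeslot = c(13, 14, 20, 21, 24, 23, 8, 9, 10, 15, 16)
)
s <- 2

# For each id, flag rows whose timeslot + 1 also occurs within that id.
# ave() writes the logical result back into a numeric vector (0/1),
# so it is converted back with as.logical().
has_next <- as.logical(
  ave(df$timeslot, df$id, FUN = function(t) (t + 1) %in% t)
)

# A row matches when it is at station s and the next timeslot exists in its group
df$match <- df$station == s & has_next
print(df)
```

On the sample data this flags the same two rows as the dplyr version: id 1 at timeslot 23 and id 2 at timeslot 15.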

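Thanks, I've added this in

I think OP is looking for line[2] == 2 & any(df$timeslot == line[3] + 1), but with an added wrinkle: any(df$timeslot == line[3] + 1) only needs to be evaluated within the same id. I think you could make your approach work with line[2] == 2 & any(df$timeslot[df$id == line[1]] == line[3] + 1), but I haven't tested it.

@Gregor Thomas Thanks for the explanation. Since OP is happy with your solution, I won't test it either, but I'll upvote your solution.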
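
The commenter's suggested fix to the apply() approach was left untested in the thread. A minimal sketch of it on the question's sample data might look like the following. It assumes the station value is taken from s rather than the literal 2, and it keeps the rows in their original order; both choices are assumptions made for illustration.

```r
# Sample data from the question
df <- data.frame(
  id       = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  station  = c(1, 2, 3, 3, 2, 2, 1, 1, 3, 2, 2),
  timeslot = c(13, 14, 20, 21, 24, 23, 8, 9, 10, 15, 16)
)
s <- 2

# apply() passes each row as a numeric vector:
# line[1] is id, line[2] is station, line[3] is timeslot.
# The timeslot lookup is restricted to rows with the same id.
df$match <- unname(apply(df, 1, function(line) {
  line[2] == s & any(df$timeslot[df$id == line[1]] == line[3] + 1)
}))
print(df)
```

With this data the flagged rows are id 1 at timeslot 23 and id 2 at timeslot 15, which agrees with the group_by() answer. The row-wise version rescans the data frame for every row, so the grouped %in% form is likely the better choice on larger data.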