dplyr lag across groups

I'm trying to do something like a lag, but across groups rather than within groups. Sample data:

df <- data.frame(flag = c("A", "B", "A", "B", "B", "B", "A", "B", "B", "A", "B"),
                 var = c("AB123","AC124", "AD125", "AE126",
                          "AF127", "AG128", "AF129",
                          "AG130","AH131",
                          "AHI132", "AJ133")
)
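For each row with flag = B, the goal is to create lagvar from the var value of the most recent preceding row with flag = A.

This shows the desired output: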

df1 <- data.frame(flag = c("A", "B", "A", "B", "B", "B", "A", "B", "B", "A", "B"),
                 var = c("AB123","AC124", "AD125", "AE126",
                          "AF127", "AG128", "AF129",
                          "AG130","AH131",
                          "AHI132", "AJ133"),
                 lagvar = c("","AB123","","AD125","AD125","AD125","","AF129","AF129","","AHI132")
)
A dplyr solution is preferred, but I'm not picky.

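Edit: I found a solution using the zoo package, but I'm interested if anyone has a better idea. df$lagvar

Here you go. I used NA instead of blanks, but you can adjust as needed: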

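library(dplyr)

# Copy var on the A rows, carry it forward, then blank out the A rows again.
# na.rm = FALSE keeps the vector length intact if the data starts with a B row.
df %>% mutate(lagvar = ifelse(flag == "A", as.character(var), NA),
              lagvar = zoo::na.locf(lagvar, na.rm = FALSE),
              lagvar = ifelse(flag == "A", NA, lagvar))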
#    flag    var lagvar
# 1     A  AB123   <NA>
# 2     B  AC124  AB123
# 3     A  AD125   <NA>
# 4     B  AE126  AD125
# 5     B  AF127  AD125
# 6     B  AG128  AD125
# 7     A  AF129   <NA>
# 8     B  AG130  AF129
# 9     B  AH131  AF129
# 10    A AHI132   <NA>
# 11    B  AJ133 AHI132
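
For readers who would rather not depend on zoo, the same carry-forward idea can be sketched with `tidyr::fill()`. This is only an illustration, not part of the original answer: it assumes tidyr is installed, that `var` is a character column (hence `stringsAsFactors = FALSE`), and `res` is a name introduced here.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  flag = c("A", "B", "A", "B", "B", "B", "A", "B", "B", "A", "B"),
  var  = c("AB123", "AC124", "AD125", "AE126", "AF127", "AG128", "AF129",
           "AG130", "AH131", "AHI132", "AJ133"),
  stringsAsFactors = FALSE  # assumption: var is character, not factor
)

res <- df %>%
  # keep var only on the A rows
  mutate(lagvar = if_else(flag == "A", var, NA_character_)) %>%
  # carry the last A value down onto the following rows
  fill(lagvar, .direction = "down") %>%
  # the A rows themselves should not receive a lagged value
  mutate(lagvar = if_else(flag == "A", NA_character_, lagvar))
```

Any B rows that appear before the first A row simply stay NA, since there is nothing to carry forward yet.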
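My solution is a bit more complicated. The idea is to find the position of the A that each B should be assigned to, then join with a table that contains only the rows with flag A.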

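df %>%
  # pos = index of the most recent A row at or before each row
  mutate(pos=cumsum(flag == "A")) %>%
  left_join(
    # lookup table: one row per A, numbered in order of appearance
    df %>%
      filter(flag == "A") %>%
      mutate(pos=1:n()) %>%
      select(pos, lagvar=var),
    by="pos") %>%
  # A rows get an empty string; B rows keep the joined value
  mutate(lagvar=ifelse(flag == "A", "", as.character(lagvar)))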

#    flag    var pos lagvar
# 1     A  AB123   1       
# 2     B  AC124   1  AB123
# 3     A  AD125   2       
# 4     B  AE126   2  AD125
# 5     B  AF127   2  AD125
# 6     B  AG128   2  AD125
# 7     A  AF129   3       
# 8     B  AG130   3  AF129
# 9     B  AH131   3  AF129
# 10    A AHI132   4       
# 11    B  AJ133   4 AHI132
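
The same position idea can also be written without a join, by indexing directly into the vector of A values. The following is a minimal base R sketch under a few assumptions: `var` is a character column, and the helper names `pos` and `a_vals` are introduced here for illustration only.

```r
df <- data.frame(
  flag = c("A", "B", "A", "B", "B", "B", "A", "B", "B", "A", "B"),
  var  = c("AB123", "AC124", "AD125", "AE126", "AF127", "AG128", "AF129",
           "AG130", "AH131", "AHI132", "AJ133"),
  stringsAsFactors = FALSE  # assumption: var is character, not factor
)

# Number of A rows seen so far; 0 means no A row has appeared yet
pos <- cumsum(df$flag == "A")

# var values of the A rows, with a leading NA that serves position 0
a_vals <- c(NA_character_, df$var[df$flag == "A"])

# A rows get an empty string, B rows get the var of their preceding A row
df$lagvar <- ifelse(df$flag == "A", "", a_vals[pos + 1])
```

The leading NA in `a_vals` is a design choice: it keeps the lookup well defined for B rows that come before the first A row, which would otherwise index position 0.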

This is perfect! Thanks, Gregor.