R: generating a loop to match IDs


I have two data frames, each containing an identifier column:

df1 <- data.frame(ID = c(20001, 20001, 20003, 20003, 20003, 20003))
df2 <- data.frame(ID = c(20001, 20001, 20003, 20003, 20003, 20005),
                  Type = c('N1', 'N2', 'N3', 'N4', 'N5', 'N6'))
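For each repeated ID in df1, I want to pull in the "next" Type value from df2, essentially cycling through them in a loop. Ideally, I would get the following output: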

ID     Add
20001  N1
20001  N2
20003  N3
20003  N4
20003  N5
20003  N3

I think it may need lapply and a user-defined function.

Is this what you are looking for?

library(dplyr)
df1 %>% group_by(ID) %>% 
        mutate(c = rep(df2$Type[df2$ID == unique(ID)], length.out = n()))

#     ID      c
#1 20001     N1
#2 20001     N2
#3 20003     N3
#4 20003     N4
#5 20003     N5
#6 20003     N3
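
The answer above relies on how rep() behaves when length.out is given: the input vector is recycled from the start (or cut short) until it has exactly that many elements. A minimal sketch of just that step, using the three Type values for ID 20003 (the variable names here are made up for illustration):

```r
# Type values available in df2 for ID 20003
types <- c("N3", "N4", "N5")

# df1 has four rows with ID 20003, so the three values are recycled to length 4
recycled <- rep(types, length.out = 4)
print(recycled)

# If df1 had fewer rows than df2 for an ID, the vector would be cut short instead
truncated <- rep(types, length.out = 2)
print(truncated)
```

This is why the fourth 20003 row wraps around to N3 again.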



I went with the dplyr solution because dplyr was already loaded for earlier code blocks. Very nice use of mutate, thanks for your help.

# For efficiency, a data.table alternative:

library(data.table)
setDT(df2)
setDT(df1)[, x := rep(df2$Type[df2$ID == ID], length.out = .N), by = .(ID)]
# I was also looking for a base R solution that does not involve merge().
# So far my best bet is sapply(), though this may not use it very efficiently.

unlist(sapply(unique(df1$ID), function(x) rep(df2$Type[df2$ID == x],
                                              length.out = sum(x==df1$ID))))
# [1] N1 N2 N3 N4 N5 N3
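
One caveat with the sapply() approach above: it returns values grouped by unique(df1$ID), so it only lines up with the rows of df1 when equal IDs sit next to each other. Below is a rough base R sketch that keeps the original row order, assuming the IDs may be interleaved. The reordered df1 and the helper names (types, k, pool) are assumptions made up for this example, not part of the original question.

```r
# Hypothetical df1 with interleaved IDs, to show that row order is preserved
df1 <- data.frame(ID = c(20001, 20003, 20001, 20003, 20003, 20003))
df2 <- data.frame(ID = c(20001, 20001, 20003, 20003, 20003, 20005),
                  Type = c('N1', 'N2', 'N3', 'N4', 'N5', 'N6'))

# Type values per ID, as a named list keyed by the ID as a string
types <- split(as.character(df2$Type), df2$ID)

# k[i] = how many times df1$ID[i] has occurred up to and including row i
k <- ave(seq_along(df1$ID), df1$ID, FUN = seq_along)

# For each row, take the k-th Type for that ID, wrapping around when exhausted
df1$Add <- vapply(seq_len(nrow(df1)), function(i) {
  pool <- types[[as.character(df1$ID[i])]]
  if (is.null(pool)) return(NA_character_)  # ID not present in df2
  pool[(k[i] - 1) %% length(pool) + 1]
}, character(1))

print(df1)
```

The modulo index plays the same role as rep(..., length.out = ...) in the other answers; it is simply applied row by row instead of per group.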