R data frames: matching codes against multiple columns
I have two data frames that I want to join. The first one is:
V1 <- c("AB1", "AB2", "AB3" ,"AB4" ,"AB5" ,"AB6" ,"AB7","AB6","AB9" ,"AB10")
df1 <- data.frame(V1)
I have the code below, but I can't get it to work for all three columns from df2 at once:
df1$res[match(df2$V5,df1$V1, nomatch=0)] <- df2$V6[match(df2$V5,df1$V1, nomatch = 0)]
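If you remove distinct from the code, you get all the rows of df1 instead of only the distinct rows.
Where is the df2 dataset? Why is the value for AB2 equal to 2? It is also associated with the value 1. Why is the Yes/no column 0 for AB6?
Why do AB3 and AB6 not match when they exist in df2?
Please have a look now.
Why is there no match for AB3? It exists in both df1 and df2. Why does only the second AB6 row have a match in the desired output? Your desired output does not match the description above.
@AntosiosK Fixed now.
@AntosiosK Thanks. One last question: if the columns in df1 and df2 have different names, what would the code be? Say df$V1 is "aaa".
You have to use the names you actually have instead of V1 and V8. Wherever you see V1 in my code, replace it with aaa, and do the same for V8.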
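As an illustration of the renaming advice in the last comment, here is a minimal sketch of the same kind of pipeline when the column in df1 is called aaa instead of V1. The name aaa comes from the comment; the variable name result and loading dplyr and tidyr separately are assumptions made for this example. Row order can differ between dplyr versions, so no printed output is shown.

```r
library(dplyr)
library(tidyr)

# df1 with its column named "aaa" instead of "V1" (name taken from the comment)
df1 <- data.frame(
  aaa = c("AB1", "AB2", "AB3", "AB4", "AB5", "AB6", "AB7", "AB6", "AB9", "AB10"),
  stringsAsFactors = FALSE
)

df2 <- data.frame(
  V5 = c("AB1", "", "", "", "AB3", "AB4", "AB5", "AB6"),
  V6 = c("AB", "AB2", "", "AB", "", "AB", "", "AB"),
  V7 = c("AB", "AB", "AB", "", "AB", "", "AB", "AB"),
  V8 = c(1, 2, 2, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

result <- df2 %>%
  gather(v, aaa, -V8) %>%                      # stack V5, V6, V7 into one column named "aaa"
  select(-v) %>%                               # drop the column that recorded the source name
  right_join(df1, by = "aaa") %>%              # keep every code that appears in df1
  mutate(YesNo = ifelse(is.na(V8), 0, 1)) %>%  # 1 if a match was found, 0 otherwise
  distinct() %>%                               # collapse the duplicated AB6 row
  select(aaa, V8, YesNo)
```

The only changes from a V1-based pipeline are the value-column name passed to gather, the by argument of right_join, and the final select.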
Desired output:
df1$V1   res (df$V8)   Yes/no
AB1      1             1
AB2      2             1
AB3      3             1
AB4      4             1
AB5      5             1
AB6      6             1
AB7                    0
AB6                    0
AB9                    0
AB10                   0
V1 <- c("AB1", "AB2", "AB3", "AB4", "AB5", "AB6", "AB7", "AB6", "AB9", "AB10")
df1 <- data.frame(V1, stringsAsFactors = FALSE)
V5 <- c("AB1", "", "", "", "AB3", "AB4", "AB5", "AB6")
V6 <- c("AB", "AB2", "", "AB", "", "AB", "", "AB")
V7 <- c("AB", "AB", "AB", "", "AB", "", "AB", "AB")
V8 <- c(1, 2, 2, 2, 3, 4, 5, 6)
df2 <- data.frame(V5, V6, V7, V8, stringsAsFactors = FALSE)
library(tidyverse)
df2 %>%
  gather(v, V1, -V8) %>%                       # reshape: stack V5, V6, V7 into one column V1
  select(-v) %>%                               # remove unnecessary variable
  right_join(df1, by = "V1") %>%               # join df1, keeping all of its codes
  mutate(YesNo = ifelse(is.na(V8), 0, 1)) %>%  # create Yes/No variable
  distinct() %>%                               # select distinct rows
  select(V1, V8, YesNo)                        # arrange columns
# V1 V8 YesNo
# 1 AB1 1 1
# 2 AB2 2 1
# 3 AB3 3 1
# 4 AB4 4 1
# 5 AB5 5 1
# 6 AB6 6 1
# 7 AB7 NA 0
# 8 AB9 NA 0
# 9 AB10 NA 0
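For comparison, here is a minimal base R sketch that stays closer to the match() attempt in the question. It is an alternative under stated assumptions, not the answer's method: the helper data frame lookup and the column names res and YesNo are chosen for this example. It stacks the three code columns of df2 into one lookup vector and takes the first match for each row of df1.

```r
V1 <- c("AB1", "AB2", "AB3", "AB4", "AB5", "AB6", "AB7", "AB6", "AB9", "AB10")
df1 <- data.frame(V1, stringsAsFactors = FALSE)

df2 <- data.frame(
  V5 = c("AB1", "", "", "", "AB3", "AB4", "AB5", "AB6"),
  V6 = c("AB", "AB2", "", "AB", "", "AB", "", "AB"),
  V7 = c("AB", "AB", "AB", "", "AB", "", "AB", "AB"),
  V8 = c(1, 2, 2, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Stack V5, V6 and V7 column by column, repeating V8 once per stacked column
# so that every code stays paired with the V8 value from its own row.
lookup <- data.frame(
  code = unlist(df2[c("V5", "V6", "V7")], use.names = FALSE),
  V8 = rep(df2$V8, times = 3),
  stringsAsFactors = FALSE
)

idx <- match(df1$V1, lookup$code)     # position of the first match, NA if none
df1$res <- lookup$V8[idx]             # matched V8 value, NA where nothing matched
df1$YesNo <- as.integer(!is.na(idx))  # 1 if matched, 0 otherwise
```

Unlike the pipeline with distinct(), this keeps all ten rows of df1, so both AB6 rows receive res = 6 and YesNo = 1. If a code appeared in several rows of df2 with different V8 values, only the first one in stacking order would be used.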