R: How can I merge data frames based on *partial* row overlap?
I want to merge data frames in R so that only those observations whose rows partially correspond across the data frames are kept.
I have two data frames (these are toy data frames; the real ones have hundreds of columns):
Expected result:
V1 V2 V3
rabbit 001 M
squirrel 001 M
rabbit 004 M
squirrel 004 M
rabbit 001 B
squirrel 001 B
rabbit 004 B
squirrel 004 B
merge and dplyr::inner_join do not seem to be the right functions for this. What is?
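# Stack both data frames, count how often each (V1, V2) key occurs in the
# stacked result, and keep the rows whose key occurs more than once
rbind(d1, d2)[ave(1:(nrow(d1) + nrow(d2)),
                  Reduce(paste, rbind(d1, d2)[c("V1", "V2")]),
                  FUN = length) > 1, ]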
# V1 V2 V3
#1 rabbit 1 M
#2 squirrel 1 M
#4 rabbit 4 M
#5 squirrel 4 M
#7 rabbit 1 B
#8 squirrel 1 B
#10 rabbit 4 B
#11 squirrel 4 B
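To make the one-liner above easier to follow, here is a minimal step-by-step sketch of a closely related idea in base R. It assumes the data frames are named d1 and d2 as in the dput output in this thread and that V1 and V2 are the matching columns; the names key_cols, key1, key2, and result are hypothetical. Instead of counting keys with ave, it tests key membership with %in%, which is a swapped-in variant and not the original method.

```r
# Toy data, rebuilt from the dput output in this thread
d1 <- data.frame(
  V1 = c("rabbit", "squirrel", "cow", "rabbit", "squirrel", "skunk"),
  V2 = c(1L, 1L, 1L, 4L, 4L, 4L),
  V3 = "M",
  stringsAsFactors = FALSE
)
d2 <- data.frame(
  V1 = c("rabbit", "squirrel", "skunk", "rabbit", "squirrel", "skunk"),
  V2 = c(1L, 1L, 1L, 4L, 4L, 8L),
  V3 = "B",
  stringsAsFactors = FALSE
)

# Columns that must agree for two rows to count as "partially matching"
# (an assumption for this sketch)
key_cols <- c("V1", "V2")

# Build one string key per row from the matching columns
key1 <- do.call(paste, c(d1[key_cols], sep = "\t"))
key2 <- do.call(paste, c(d2[key_cols], sep = "\t"))

# Keep the rows of each data frame whose key also occurs in the other one,
# then stack the two subsets
result <- rbind(d1[key1 %in% key2, ], d2[key2 %in% key1, ])
rownames(result) <- NULL
result
```

One possible reason to prefer the membership test: a key that is repeated inside a single data frame, but absent from the other, is not kept, whereas a plain count over the stacked rows would keep it.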
Data: see the dput output of d1 and d2 further below.
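The approach above is probably more efficient, but if you would rather think about the problem in terms of joins, you can do it with three dplyr join operations: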
library(dplyr)
# df1 and df2 are the two input data frames (named d1 and d2 in the dput output below)
# Perform an inner_join with just the columns that you want to match
match_rows <- inner_join(df1[, 1:2], df2[, 1:2], by = c("V1", "V2"))
match_rows
#         V1 V2
# 1   rabbit  1
# 2 squirrel  1
# 3   rabbit  4
# 4 squirrel  4
# Then left_join that with each data frame to get the matching rows from each,
# and bind them together as rows
bind_rows(left_join(match_rows, df1, by = c("V1", "V2")),
          left_join(match_rows, df2, by = c("V1", "V2")))
#         V1 V2 V3
# 1   rabbit  1  M
# 2 squirrel  1  M
# 3   rabbit  4  M
# 4 squirrel  4  M
# 5   rabbit  1  B
# 6 squirrel  1  B
# 7   rabbit  4  B
# 8 squirrel  4  B
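A shorter variant of the same join-based idea, offered as a sketch and not as part of the original answer: dplyr's semi_join() keeps the rows of its first argument that have a match in the second, without adding any columns, so the intermediate table of matching keys is not needed. The code assumes df1 and df2 hold the toy data from this thread and that V1 and V2 are the key columns; keys and result are hypothetical names.

```r
library(dplyr)

# Toy data, rebuilt from the dput output in this thread
df1 <- data.frame(
  V1 = c("rabbit", "squirrel", "cow", "rabbit", "squirrel", "skunk"),
  V2 = c(1L, 1L, 1L, 4L, 4L, 4L),
  V3 = "M",
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = c("rabbit", "squirrel", "skunk", "rabbit", "squirrel", "skunk"),
  V2 = c(1L, 1L, 1L, 4L, 4L, 8L),
  V3 = "B",
  stringsAsFactors = FALSE
)

# Key columns that define a "partial" match (an assumption for this sketch)
keys <- c("V1", "V2")

# Keep rows of df1 that have a key match in df2, and vice versa,
# then stack the two filtered data frames
result <- bind_rows(
  semi_join(df1, df2, by = keys),
  semi_join(df2, df1, by = keys)
)
result
```

Because semi_join() only filters, each data frame keeps all of its own columns, which may be convenient when the real data has hundreds of them.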
Thank you! This answer is easy to interpret, and I actually prefer dplyr. It works well.
# dput(d1)
d1 <- structure(list(V1 = c("rabbit", "squirrel", "cow", "rabbit",
"squirrel", "skunk"), V2 = c(1L, 1L, 1L, 4L, 4L, 4L), V3 = c("M",
"M", "M", "M", "M", "M")), row.names = c(NA, 6L), class = "data.frame")
# dput(d2)
d2 <- structure(list(V1 = c("rabbit", "squirrel", "skunk", "rabbit",
"squirrel", "skunk"), V2 = c(1L, 1L, 1L, 4L, 4L, 8L), V3 = c("B",
"B", "B", "B", "B", "B")), row.names = 7:12, class = "data.frame")