R: merge 2 data frames by a 1:2 column ratio

r merge

I have two data frames:

df1=data.frame(w=c(10,'a','a',14,''),data='other stuff')
df2=data.frame(c=10:14,n=letters[1:5],data='stuff')
> df1;df2
   w        data
1 10 other stuff
2  a other stuff
3  a other stuff
4 14 other stuff
5    other stuff
   c n  data
1 10 a stuff
2 11 b stuff
3 12 c stuff
4 13 d stuff
5 14 e stuff

I want to build a final df that looks like this (made by hand):

I tried:

merge(df1,df2,by.x='w',by.y='c|n')

It did not work, and I don't know how to approach this problem. Note that df1 and df2 are each 1000 by 48 in size.

We can reshape df2 so that its c and n values form a single key column that can be matched against df1, and then use merge:

# dummy data, with the data columns updated so that rows are distinguishable
df1 = data.frame(w = c(10,'a','a',14,''), data = paste('otherStuff', 1:5))
df2 = data.frame(c = 10:14, n = letters[1:5], data = paste('stuff', 1:5))

df1;df2

#    w         data
# 1 10 otherStuff 1
# 2  a otherStuff 2
# 3  a otherStuff 3
# 4 14 otherStuff 4
# 5    otherStuff 5

#    c n    data
# 1 10 a stuff 1
# 2 11 b stuff 2
# 3 12 c stuff 3
# 4 13 d stuff 4
# 5 14 e stuff 5


library(tidyr)  # gather() comes from tidyr; dplyr is not needed here

merge(df1,
      gather(df2, key = "Group", value = "w", -data),
      by = "w", all.x = TRUE)


#    w       data.x  data.y Group
# 1    otherStuff 5    <NA>  <NA>
# 2 10 otherStuff 1 stuff 1     c
# 3 14 otherStuff 4 stuff 5     c
# 4  a otherStuff 2 stuff 1     n
# 5  a otherStuff 3 stuff 1     n
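
The same reshape-then-merge idea can be sketched with pivot_longer(), since gather() has been superseded in tidyr 1.0.0 and later. This is only a sketch under a few assumptions: the names long and result are made up for illustration, and df2$c is converted to character first so that it can share one key column with df2$n.

```r
library(tidyr)

df1 <- data.frame(w = c(10, "a", "a", 14, ""),
                  data = paste("otherStuff", 1:5),
                  stringsAsFactors = FALSE)
df2 <- data.frame(c = 10:14, n = letters[1:5],
                  data = paste("stuff", 1:5),
                  stringsAsFactors = FALSE)

# c is integer and n is character; give them a common type
# before stacking them into a single key column
df2$c <- as.character(df2$c)

# stack c and n into one key column "w"; "Group" records the source column
long <- pivot_longer(df2, cols = -data, names_to = "Group", values_to = "w")

# left join: keep every row of df1, matched or not
result <- merge(df1, long, by = "w", all.x = TRUE)
result
```

The Group column shows whether a match came from c or from n, which can help when checking the result on the full 1000 by 48 data.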

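What do you want the column names of the final data frame to be? Each of the individual data frames seems to have a column with the same name. The matching is done on column w, and its values are inconsistent: some are numeric and some are character.

I want to merge all the data in one go.

Could you change the values in the data columns to 1, 2, 3, ... and so on, so that we can understand the expected output?

So you want df1 returned with a new column appended that holds the df2$data value wherever df2$c or df2$n matches df1$w? You need to be careful with types. The way you define df1 may turn df1$w into a factor. See the stringsAsFactors argument in ?data.frame for more information.

df1$w contains values from either the df2$c or the df2$n column. For processing purposes, I want the data columns of df1 and df2 in the same data frame.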
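
The type concern raised in the comments can be checked directly, and the lookup itself can also be done in base R with match(). The following is a rough sketch, not the answer's method. The names idx, idx_n and data2 are hypothetical, and it assumes a value in df1$w should be looked up in df2$c first and in df2$n only if that fails. Since R 4.0.0 the default is stringsAsFactors = FALSE; it is set explicitly here so the behaviour does not depend on the R version.

```r
df1 <- data.frame(w = c(10, "a", "a", 14, ""),
                  data = paste("otherStuff", 1:5),
                  stringsAsFactors = FALSE)
df2 <- data.frame(c = 10:14, n = letters[1:5],
                  data = paste("stuff", 1:5),
                  stringsAsFactors = FALSE)

# mixing numbers and strings in c() yields a character vector
class(df1$w)

# look up each w in df2$c (compared as character), then fall back to df2$n
idx   <- match(df1$w, as.character(df2$c))
idx_n <- match(df1$w, df2$n)
idx[is.na(idx)] <- idx_n[is.na(idx)]

# rows with no match in either column get NA
df1$data2 <- df2$data[idx]
df1
```

Unlike merge(), this keeps df1 in its original row order and adds exactly one column.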
