R: Subset data frame A using data frame B, but keep the information from a related column of B


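I have two data frames, say A and B, and want to subset A by matching on the information in a column. So far so good; this much I already know how to do. I know about match and %in%. What I need is a bit more complicated, and I just can't wrap my head around it: I want to carry along the information from a column (or columns) of B.

For example: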

words <- c("Hello","Experts","Please","Help","Me","Out","With","This","Problem","!")
letterV <- toupper(letters[1:20])
numbersV <- 1:20
combV <- paste("Test_",letters[1:10],sep = "")
rainbowV <- rainbow(20)

testDF.A <- as.data.frame(cbind(letterV,cbind(numbersV,cbind(combV,rainbowV))),stringsAsFactors = F)
testDF.B <- as.data.frame(cbind(numbersV[1:10],cbind(letterV[1:10],cbind(combV,words))),stringsAsFactors = F)
testDF.B:

    V1  V2  combV   words
1   1   A   Test_a  Hello
2   2   B   Test_b  Experts
3   3   C   Test_c  Please
4   4   D   Test_d  Help
5   5   E   Test_e  Me
6   6   F   Test_f  Out
7   7   G   Test_g  With
8   8   H   Test_h  This
9   9   I   Test_i  Problem
10  10  J   Test_j  !
Suppose I want to compare/match A[,3] against B[,3], but keep the information in B[,4].

Expected result:

    keptCol letterV numbersV    combV   rainbowV
1   Hello   A       1           Test_a  #FF0000FF
2   Experts B       2           Test_b  #FF4D00FF
3   Please  C       3           Test_c  #FF9900FF
4   Help    D       4           Test_d  #FFE500FF
5   Me      E       5           Test_e  #CCFF00FF
6   Out     F       6           Test_f  #80FF00FF
7   With    G       7           Test_g  #33FF00FF
8   This    H       8           Test_h  #00FF19FF
9   Problem I       9           Test_i  #00FF66FF
10  !       J       10          Test_j  #00FFB2FF
11  Hello   K       11          Test_a  #00FFFFFF
12  Experts L       12          Test_b  #00B3FFFF
13  Please  M       13          Test_c  #0066FFFF
14  Help    N       14          Test_d  #001AFFFF
15  Me      O       15          Test_e  #3300FFFF
16  Out     P       16          Test_f  #7F00FFFF
17  With    Q       17          Test_g  #CC00FFFF
18  This    R       18          Test_h  #FF00E6FF
19  Problem S       19          Test_i  #FF0099FF
20  !       T       20          Test_j  #FF004DFF
To make things harder, I actually also need to handle multiple hits:

Suppose I want to compare/match A[,2] against B[,1], but keep the information in B[,2].

small.a <- as.data.frame(cbind(letters[1:6],cbind(c("Test_a","Test_a","Test_b","Test_a","Test_c","Test_z"),c("a1","b1","c1","d1","e1","f1"))),stringsAsFactors = F)
small.b <- as.data.frame(cbind(c("Test_a","Test_b","Test_d","Test_e","Test_b"),c("Thank","You","Very","Much","!")),stringsAsFactors = F)
small.b:

    V1      V2
1   Test_a  Thank
2   Test_b  You
3   Test_d  Very
4   Test_e  Much
5   Test_b  !
Expected result:

    keptCol V1  V2          V3
1   Thank   a   Test_a      a1
2   Thank   b   Test_a      b1
3   You     c   Test_b      c1
4   !       c   Test_b      c1
5   Thank   d   Test_a      d1
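Another issue might arise if I want to keep the information from several columns.

I hope I have provided enough information for you to understand the problem.

You could try: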

library(dplyr)
inner_join(small.a, small.b, by=c('V2'='V1'))
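For the first example in the question, where every key in testDF.B occurs only once, a plain match() lookup may already be enough. The following is a minimal base-R sketch under that assumption; the names idx and res are hypothetical, and the data frames are rebuilt here (with the nested cbind calls flattened) only so the block runs on its own.

```r
# Rebuild the question's first example; keys in testDF.B$combV are unique
words    <- c("Hello","Experts","Please","Help","Me","Out","With","This","Problem","!")
letterV  <- toupper(letters[1:20])
numbersV <- 1:20
combV    <- paste("Test_", letters[1:10], sep = "")
rainbowV <- rainbow(20)

testDF.A <- as.data.frame(cbind(letterV, numbersV, combV, rainbowV),
                          stringsAsFactors = FALSE)
testDF.B <- as.data.frame(cbind(numbersV[1:10], letterV[1:10], combV, words),
                          stringsAsFactors = FALSE)

# For each key in A, match() gives the position of its FIRST occurrence in B
# (NA when the key is absent from B)
idx <- match(testDF.A$combV, testDF.B$combV)

# Look up B's 'words' column at those positions and place it in front of A
res <- data.frame(keptCol = testDF.B$words[idx], testDF.A,
                  stringsAsFactors = FALSE)

# Keep only the rows of A that found a partner in B (all 20 rows here)
res <- res[!is.na(idx), ]
```

Because match() reports only the first hit, this sketch would silently drop the second "Test_b" row of small.b in the multiple-hit example, which is why a join is the better fit there.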
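Or use merge:

merge(small.a, small.b, by.x='V2', by.y='V1')

That's it! Thank you very much. There seems to be a problem with using variable names in the "by" argument: something like "nameA=namedframea[colNum]" is not accepted here. Do you happen to know a workaround?

You may need to change the column names beforehand, e.g. names(dataset)[number].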
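On the follow-up about supplying the key columns from variables rather than literal names, one possible approach is sketched below. It assumes the small.a and small.b data from the question (rebuilt with data.frame() so the block is self-contained); keyA, keyB, res and res2 are hypothetical names. merge() accepts the names as ordinary character strings, while dplyr's by expects a named character vector, which setNames() can build from variables.

```r
small.a <- data.frame(V1 = letters[1:6],
                      V2 = c("Test_a","Test_a","Test_b","Test_a","Test_c","Test_z"),
                      V3 = c("a1","b1","c1","d1","e1","f1"),
                      stringsAsFactors = FALSE)
small.b <- data.frame(V1 = c("Test_a","Test_b","Test_d","Test_e","Test_b"),
                      V2 = c("Thank","You","Very","Much","!"),
                      stringsAsFactors = FALSE)

# Key column names looked up by position and stored in variables
keyA <- names(small.a)[2]   # "V2"
keyB <- names(small.b)[1]   # "V1"

# Rename the column to keep so its name does not coincide with one in small.a
names(small.b)[2] <- "keptCol"

# Base R: by.x / by.y take the names as plain character values
res <- merge(small.a, small.b, by.x = keyA, by.y = keyB)

# dplyr: `by` wants c(nameInA = "nameInB"); setNames(keyB, keyA) builds that
if (requireNamespace("dplyr", quietly = TRUE)) {
  res2 <- dplyr::inner_join(small.a, small.b, by = setNames(keyB, keyA))
}
```

Both calls carry over every non-key column of small.b, so keeping several columns should only require that small.b contains the key plus the columns of interest before joining.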