R: Merging data frames of different sizes by column

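I am trying to merge two data frames in R that share one column. This should not be a problem, but I ran into a strange error when using merge. (This is test data; the tables were imported from CSV files with read.csv2.)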

Data frame 1:

match                              similar
aa.alcoholics_anonymous.n.01       mission.n.01
aa.associate_in_arts.n.01          associate_in_nursing.n.01
abbreviation.abbreviation.n.01     word.n.01  
abbreviation.abbreviation.n.02     truncation.n.03
aberration.aberrance.n.01          varix.n.01
aberration.aberration.n.02         dissociative_disorder.n.01
aberration.aberration.n.03         tyndall_effect.n.01 
abnormality.abnormality.n.01       varix.n.01
abnormality.abnormality.n.02       imbecility.n.01
abnormality.abnormality.n.03       unusualness.n.01

Data frame 2:

match                            wordnet_number
aa.alcoholics_anonymous.n.01     2
aa.associate_in_arts.n.01        3
abbreviation.abbreviation.n.01   1
aberration.aberrance.n.01        1
aberration.aberration.n.02       2
aberration.aberration.n.03       3
abnormality.abnormality.n.01     1

The expected result should look like this:

match                            similar                   wordnet_number
aa.alcoholics_anonymous.n.01     mission.n.01               2
aa.associate_in_arts.n.01        associate_in_nursing.n.01  3
abbreviation.abbreviation.n.01   word.n.01                  1
abbreviation.abbreviation.n.02   truncation.n.03            NA
aberration.aberrance.n.01        varix.n.01                 1
aberration.aberration.n.02       dissociative_disorder.n.01 2
aberration.aberration.n.03       tyndall_effect.n.01        3
abnormality.abnormality.n.01     varix.n.01                 1
abnormality.abnormality.n.02     imbecility.n.01            NA
abnormality.abnormality.n.03     unusualness.n.01           NA

Normally, merging by the "match" column should just work (I have done similar things before), but for some reason I keep getting this result:

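According to help(merge), the incomparables argument is intended for merging data frames on a single column, so incomparables = NA can be used here. It seems to work: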

merge(test1, test2, by = "match", all = TRUE, incomparables = NA)
#                             match                    similar wordnet_number
# 1    aa.alcoholics_anonymous.n.01               mission.n.01              2
# 2       aa.associate_in_arts.n.01  associate_in_nursing.n.01              3
# 3  abbreviation.abbreviation.n.01                  word.n.01              1
# 4  abbreviation.abbreviation.n.02            truncation.n.03             NA
# 5       aberration.aberrance.n.01                 varix.n.01              1
# 6      aberration.aberration.n.02 dissociative_disorder.n.01              2
# 7      aberration.aberration.n.03        tyndall_effect.n.01              3
# 8    abnormality.abnormality.n.01                 varix.n.01              1
# 9    abnormality.abnormality.n.02            imbecility.n.01             NA
# 10   abnormality.abnormality.n.03           unusualness.n.01             NA
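
To make the merge above reproducible without the original CSV files, here is a minimal sketch that rebuilds the two tables by hand. The names test1 and test2 come from the answer; building them with data.frame() instead of read.csv2 is an assumption for illustration. It uses all.x = TRUE (a left join on test1), which matches the expected result in the question because every key in test2 also appears in test1.

```r
# Rebuild the example tables by hand (assumption: the originals came from read.csv2).
test1 <- data.frame(
  match = c("aa.alcoholics_anonymous.n.01", "aa.associate_in_arts.n.01",
            "abbreviation.abbreviation.n.01", "abbreviation.abbreviation.n.02",
            "aberration.aberrance.n.01", "aberration.aberration.n.02",
            "aberration.aberration.n.03", "abnormality.abnormality.n.01",
            "abnormality.abnormality.n.02", "abnormality.abnormality.n.03"),
  similar = c("mission.n.01", "associate_in_nursing.n.01", "word.n.01",
              "truncation.n.03", "varix.n.01", "dissociative_disorder.n.01",
              "tyndall_effect.n.01", "varix.n.01", "imbecility.n.01",
              "unusualness.n.01"),
  stringsAsFactors = FALSE
)
test2 <- data.frame(
  match = c("aa.alcoholics_anonymous.n.01", "aa.associate_in_arts.n.01",
            "abbreviation.abbreviation.n.01", "aberration.aberrance.n.01",
            "aberration.aberration.n.02", "aberration.aberration.n.03",
            "abnormality.abnormality.n.01"),
  wordnet_number = c(2L, 3L, 1L, 1L, 2L, 3L, 1L),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of test1; keys missing from test2 get NA.
merged <- merge(test1, test2, by = "match", all.x = TRUE)
print(merged)  # 10 rows, with NA in wordnet_number for the 3 unmatched keys
```

If this clean version merges correctly while the imported data does not, the problem most likely lies in the imported key values rather than in merge itself.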

Use print on those "match" columns. I bet one of them has trailing whitespace. Or post the output of dput(dataframe1) and dput(dataframe2) rather than the print.data.frame output.
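
The trailing-whitespace suspicion can be checked with a small sketch like the one below. The data frames a and b are hypothetical toy data, not the asker's tables; the point is only to show how a stray space in a key breaks the match and how trimws() from base R repairs it.

```r
# Hypothetical toy data: the first key in `a` carries a trailing space.
a <- data.frame(match = c("x.n.01 ", "y.n.01"), similar = c("p", "q"),
                stringsAsFactors = FALSE)
b <- data.frame(match = c("x.n.01", "y.n.01"), wordnet_number = c(1L, 2L),
                stringsAsFactors = FALSE)

# Printing a character vector quotes each value, which exposes the stray space.
print(a$match)

# "x.n.01 " and "x.n.01" are different strings, so they do not match:
# the full merge returns 3 rows, two of them padded with NA.
bad <- merge(a, b, by = "match", all = TRUE)

# Strip leading and trailing whitespace from both keys, then merge again.
a$match <- trimws(a$match)
b$match <- trimws(b$match)
good <- merge(a, b, by = "match", all = TRUE)  # 2 rows, no NA
print(good)
```

Since the data was read with read.csv2, passing strip.white = TRUE at import time may be another way to avoid the problem, though whether it applies depends on how the fields are quoted in the file.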