Filter matching cases from different datasets in R

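I have several time-series datasets, e.g. GDP growth, foreign direct investment and education, covering 1960-2000. The (unique) identifying variable in all datasets is country_name. Some countries are present in one dataset but missing from another. I want to select (filter) only the countries that are present in all datasets. How can I do this in R? The data.frames look like this: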

FDI
country       2001        2002       2003        2004       2005
   A    -0.4769080 -0.89159864 -0.2140591 -0.93326470 -0.1726757
   B    -0.1246048  0.09929738  1.0522747  0.08724465 -0.9064532
   C     1.9592917  1.06080273  0.5316807 -0.94478259 -1.1342767
   E    -1.0585177  0.58981906  0.5210434 -0.81212231  0.7862898

GDP growth
country       2001        2002       2003       2004        2005
   A    0.06323898  0.08537586  0.8982821 -1.3635704  0.45569153
   B    1.19848687  1.41307212  0.3358561 -0.8368255  0.22987821
   D    1.13491209 -0.98472341  0.7545730 -0.3595143  0.07172593
   E    0.83561289  0.51227238 -0.1377516  1.8841489 -0.94319505

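I need to select the matching cases A, B and E and put everything in long format, preferably using reshape2. The output should look like this (C and D are excluded because they are not in both datasets):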

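We first create an index variable holding the countries common to both data frames (with intersect), subset each data frame to those countries, and convert each to long format with melt from the reshape2 package. Finally, we merge the two long data frames by their first two columns (country and variable).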


library(reshape2)
ind <- intersect(FDI$country, GDP_Growth$country)
d1 <- melt(FDI[FDI$country %in% ind,], id.vars = 'country')
d2 <- melt(GDP_Growth[GDP_Growth$country %in% ind,], id.vars = 'country')
new.df <- merge(d1, d2, by = c('country', 'variable'))
head(new.df)
#    country variable    value.x     value.y
#1       A    X2001 -0.4769080  0.06323898
#2       A    X2002 -0.8915986  0.08537586
#3       A    X2003 -0.2140591  0.89828210
#4       A    X2004 -0.9332647 -1.36357040
#5       A    X2005 -0.1726757  0.45569153
#6       B    X2001 -0.1246048  1.19848687
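
Since the question mentions more than two datasets (GDP growth, FDI and education), the same idea can be extended to any number of data frames. The following is a minimal base-R sketch, assuming every data frame has a country column; the list name datasets and the two-year subset of the example values are illustrative choices, not part of the original answer.

```r
# Example data taken from the question (only the first two years, for brevity).
FDI <- data.frame(
  country = c("A", "B", "C", "E"),
  X2001 = c(-0.4769080, -0.1246048, 1.9592917, -1.0585177),
  X2002 = c(-0.89159864, 0.09929738, 1.06080273, 0.58981906),
  stringsAsFactors = FALSE
)
GDP_Growth <- data.frame(
  country = c("A", "B", "D", "E"),
  X2001 = c(0.06323898, 1.19848687, 1.13491209, 0.83561289),
  X2002 = c(0.08537586, 1.41307212, -0.98472341, 0.51227238),
  stringsAsFactors = FALSE
)

# Hypothetical named list; further data frames (e.g. education) can be added here.
datasets <- list(FDI = FDI, GDP_Growth = GDP_Growth)

# Countries that occur in every data frame: fold intersect() over the country columns.
common <- Reduce(intersect, lapply(datasets, function(d) d$country))

# Keep only the rows for the common countries in each data frame.
filtered <- lapply(datasets, function(d) d[d$country %in% common, ])

print(common)
```

Each element of filtered can then be passed to melt and merged as in the answer above.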

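You need to include a reproducible example.
Added an example, @nRussell.
What is your expected output?
Added what the output should look like.
I don't understand why this question was downvoted.
@MoazzemHossen, I think the downvote was because you showed no effort to solve your problem. People sometimes get annoyed by that. Needless to say, I'm not the downvoter.
Thanks! I'm fairly new to R and didn't know about the intersect function. I can figure it out now.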