Matching sites across different data frames in R

r dataframe


I have several data frames (d1, d2 and d3) that all have the same layout. They contain many more species columns, which I omit here.

What I want to do is extract only the sites (Reef) that appear in every year, and put the result into a single data frame like this:

    Year  Region  Reef     Depth  Transect  Pharia pyramidatus
    2000  LP      ISLOTES      5         1                0.20
    2000  LP      NORTE        5         1                0.10
    2000  LP      NORTE       20         1                0.00
    2010  LP      ISLOTES      5         1                0.20
    2010  LP      NORTE        5         1                0.10
    2010  LP      NORTE       20         1                0.00
    2016  LP      ISLOTES      5         1                0.20
    2016  LP      NORTE       20         1                0.00

Thank you very much for your help.

A dplyr solution:

    library(dplyr)

    # Stack the three data frames, then keep only the reefs
    # that were sampled in all three years
    rbind(df1, df2, df3) %>%
      group_by(Reef) %>%
      filter(n_distinct(Year) == 3)

Result:

    # A tibble: 8 x 6
    # Groups:   Reef [2]
       Year Region    Reef Depth Transect Pharia_pyramidatus
      <int> <fctr>  <fctr> <int>    <int>              <dbl>
    1  2000     LP ISLOTES     5        1                0.2
    2  2000     LP   NORTE     5        1                0.1
    3  2000     LP   NORTE    20        1                0.0
    4  2010     LP ISLOTES     5        1                0.2
    5  2010     LP   NORTE     5        1                0.1
    6  2010     LP   NORTE    20        1                0.0
    7  2016     LP ISLOTES     5        1                0.2
    8  2016     LP   NORTE    20        1                0.0

A more general version, in case you don't want to hard-code the number of years in the data:

    rbind(df1, df2, df3) %>%
      # total number of distinct years, computed before grouping
      mutate(Year_distinct = n_distinct(Year)) %>%
      group_by(Reef) %>%
      # keep reefs whose own number of years equals the total
      filter(n_distinct(Year) == Year_distinct) %>%
      # drop the helper column from the output
      select(-Year_distinct)

Data:

    df1 = read.table(text = "Year Region Reef Depth Transect Pharia_pyramidatus
    2000 LP BALLENA 5 1 0.03
    2000 LP ISLOTES 5 1 0.20
    2000 LP NORTE 5 1 0.10
    2000 LP NORTE 20 1 0.00", header = TRUE)

    df2 = read.table(text = "Year Region Reef Depth Transect Pharia_pyramidatus
    2010 LP PLAYA 5 1 0.03
    2010 LP ISLOTES 5 1 0.20
    2010 LP NORTE 5 1 0.10
    2010 LP NORTE 20 1 0.00", header = TRUE)

    df3 = read.table(text = "Year Region Reef Depth Transect Pharia_pyramidatus
    2016 LP BALLENA 5 1 0.03
    2016 LP ISLOTES 5 1 0.20
    2016 LP SUR 5 1 0.10
    2016 LP NORTE 20 1 0.00", header = TRUE)

By "site", do you mean Reef? Also, NORTE does not exist in d3, so why do you include it in the final df?

See the solution with the updated data.

Thanks for your reply, it solved the problem. I have one question: how does n_distinct(Year) == 3 work? Is it because there are three data frames?

@FabioFavoritto I added an explanation to my answer. I also added a more general version in case you don't want to hard-code the number of years in the data frame!

Thanks for the quick and helpful reply! I noticed that the general version adds an extra column, "Year_distinct", to the output. It is not a problem, just a heads-up for anyone who uses this code and does not spot the new column ;)

@FabioFavoritto If you don't want the Year_distinct column in the final dataset, you can just add %>% select(-Year_distinct) :)
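
To see why a condition like n_distinct(Year) == 3 keeps only some reefs, it may help to count the distinct years per reef directly. The following is a minimal base R sketch of that idea, not the answerer's own code; it reuses the sample data from the answer, and the names all_data, years_per_reef, n_years and common_reefs are made up for illustration.

```r
# Sample data, copied from the answer
df1 <- read.table(text = "Year Region Reef Depth Transect Pharia_pyramidatus
2000 LP BALLENA 5 1 0.03
2000 LP ISLOTES 5 1 0.20
2000 LP NORTE 5 1 0.10
2000 LP NORTE 20 1 0.00", header = TRUE)

df2 <- read.table(text = "Year Region Reef Depth Transect Pharia_pyramidatus
2010 LP PLAYA 5 1 0.03
2010 LP ISLOTES 5 1 0.20
2010 LP NORTE 5 1 0.10
2010 LP NORTE 20 1 0.00", header = TRUE)

df3 <- read.table(text = "Year Region Reef Depth Transect Pharia_pyramidatus
2016 LP BALLENA 5 1 0.03
2016 LP ISLOTES 5 1 0.20
2016 LP SUR 5 1 0.10
2016 LP NORTE 20 1 0.00", header = TRUE)

# Stack all years into one data frame
all_data <- rbind(df1, df2, df3)

# For each reef, count in how many distinct years it appears
years_per_reef <- tapply(all_data$Year, all_data$Reef,
                         function(y) length(unique(y)))
print(years_per_reef)

# Total number of distinct years in the combined data
# (this replaces the hard-coded 3)
n_years <- length(unique(all_data$Year))

# Reefs that appear in every year
common_reefs <- names(years_per_reef)[years_per_reef == n_years]

# Keep only the rows belonging to those reefs
result <- all_data[all_data$Reef %in% common_reefs, ]
print(result)
```

With the sample data, BALLENA appears in two years, PLAYA and SUR in one year each, and ISLOTES and NORTE in all three, so only the last two pass the test. The count is per reef and over years, so the 3 refers to the number of distinct years in the combined data rather than to the number of data frames as such; the two happen to coincide here because each data frame holds exactly one year.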