R: How to create a subset of groups with complete consecutive months


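I am trying to create a subset in R that keeps only the groups whose months are complete and consecutive.

For example, given data like the following: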

structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2), Month = c(3, 
4, 7, 8, 1, 2, 3, 4)), class = "data.frame", row.names = c(NA, 
-8L), codepage = 65001L)

As a table, this looks like:

╔═══════╦═══════╗
║ Group ║ Month ║
╠═══════╬═══════╣
║ 1     ║ 3     ║
╠═══════╬═══════╣
║ 1     ║ 4     ║
╠═══════╬═══════╣
║ 1     ║ 7     ║
╠═══════╬═══════╣
║ 1     ║ 8     ║
╠═══════╬═══════╣
║ 2     ║ 1     ║
╠═══════╬═══════╣
║ 2     ║ 2     ║
╠═══════╬═══════╣
║ 2     ║ 3     ║
╠═══════╬═══════╣
║ 2     ║ 4     ║
╚═══════╩═══════╝

I want group 1 to be dropped because there is a "break" in its consecutive months (months 5 and 6 are missing).

One dplyr option could be:

df %>%
 group_by(Group) %>%
 filter(all(diff(Month) == 1))

  Group Month
  <dbl> <dbl>
1     2     1
2     2     2
3     2     3
4     2     4


A base R solution can use ave, i.e.

df[!!with(df, ave(Month, Group, FUN = function(i)all(diff(i) == 1))),]

#  Group Month
#5     2     1
#6     2     2
#7     2     3
#8     2     4
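
To make the one-liner above easier to follow, here is a small sketch that splits it into steps. The intermediate names flag and keep are introduced only for illustration. Because Month is numeric, ave() stores the per-group TRUE/FALSE result as 1/0, and the double negation turns it back into a logical index.

```r
df <- data.frame(Group = c(1, 1, 1, 1, 2, 2, 2, 2),
                 Month = c(3, 4, 7, 8, 1, 2, 3, 4))

# ave() applies FUN to Month within each Group and repeats the
# single result across all rows of that group (stored as numeric 0/1)
flag <- with(df, ave(Month, Group, FUN = function(i) all(diff(i) == 1)))

# !! (double negation) converts 0/1 into FALSE/TRUE for row indexing
keep <- !!flag

# Rows of groups whose months all step by exactly 1
res <- df[keep, ]
```

as.logical(flag) would give the same index as !!flag and may read more clearly.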

Here is another option using subset + ave:

> subset(df,as.logical(ave(Month,Group, FUN = function(x) all(diff(x)==1))))
  Group Month
5     2     1
6     2     2
7     2     3
8     2     4
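
The checks so far rely on Month already being sorted within each group. If that is not guaranteed, one possible variation is to sort inside the check. This is a sketch under that assumption; df_shuffled is a hypothetical reordered copy of the example data, used only to show that row order no longer matters.

```r
df <- data.frame(Group = c(1, 1, 1, 1, 2, 2, 2, 2),
                 Month = c(3, 4, 7, 8, 1, 2, 3, 4))

# Hypothetical shuffled copy of the example data
df_shuffled <- df[c(8, 1, 6, 3, 5, 2, 7, 4), ]

# sort() inside FUN makes the consecutive check independent of row order
res <- subset(df_shuffled,
              as.logical(ave(Month, Group,
                             FUN = function(x) all(diff(sort(x)) == 1))))
```

The rows of group 2 are kept in whatever order they appear in df_shuffled; only the check itself sorts the months.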

Comparing the number of observations in each group with the number of differences equal to one also works:
library(tidyverse)
#Code
df %>% group_by(Group) %>%
  mutate(Diff=c(1,diff(Month)),
         Value=n()==sum(Diff==1)) %>%
  filter(Value) %>% ungroup() %>% select(-c(Value,Diff))

Output:
# A tibble: 4 x 2
  Group Month
  <dbl> <dbl>
1     2     1
2     2     2
3     2     3
4     2     4
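
To see why this works, it may help to look at the helper columns before they are filtered and dropped. The sketch below assumes dplyr is installed and keeps Diff and Value in a data frame named steps (a name chosen only for illustration).

```r
library(dplyr)

df <- data.frame(Group = c(1, 1, 1, 1, 2, 2, 2, 2),
                 Month = c(3, 4, 7, 8, 1, 2, 3, 4))

steps <- df %>%
  group_by(Group) %>%
  mutate(
    # diff() returns one value fewer than the group size, so the
    # first row of each group is padded with a placeholder 1
    Diff = c(1, diff(Month)),
    # TRUE only when every row in the group has a step of exactly 1
    Value = n() == sum(Diff == 1)
  ) %>%
  ungroup()
```

In group 1 the jump from month 4 to month 7 gives a Diff of 3, so only three of its four rows have Diff equal to 1 and Value is FALSE for the whole group.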


Data used:
#Data
df <- structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2), Month = c(3, 
4, 7, 8, 1, 2, 3, 4)), class = "data.frame", row.names = c(NA, 
-8L), codepage = 65001L)

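You took the longest route :P. My edit was about filtering on a logical column. You don't need to redo the logical operation with == TRUE.
@Sotos That's because you are a pro programmer :) Thank you for editing my answer :)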