Remove groups in R that have no non-consecutive NA values
I have the following data frame:
group <- c(2,2,2,2,4,4,4,4,5,5,5,5)
D <- c(NA,2,NA,NA,NA,2,3,NA,NA,NA,1,1)
df <- data.frame(group, D)
df
group D
1 2 NA
2 2 2
3 2 NA
4 2 NA
5 4 NA
6 4 2
7 4 3
8 4 NA
9 5 NA
10 5 NA
11 5 1
12 5 1
Any ideas? :)
How about using the difference between the indices of the NA values within each group:
library(dplyr)
df %>% group_by(group) %>% filter(any(diff(which(is.na(D))) > 1))
## A tibble: 8 x 2
## Groups: group [2]
# group D
# <dbl> <dbl>
#1 2. NA
#2 2. 2.
#3 2. NA
#4 2. NA
#5 4. NA
#6 4. 2.
#7 4. 3.
#8 4. NA
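To see why that filter keeps groups 2 and 4 and drops group 5, here is a rough base R sketch of the same test applied group by group. The names `na_pos` and `has_gap` are only illustrative and are not part of the answer above.

```r
# Same example data as in the question
group <- c(2, 2, 2, 2, 4, 4, 4, 4, 5, 5, 5, 5)
D <- c(NA, 2, NA, NA, NA, 2, 3, NA, NA, NA, 1, 1)

# Positions of the NA values inside each group
na_pos <- lapply(split(D, group), function(x) which(is.na(x)))
na_pos[["2"]]  # 1 3 4
na_pos[["4"]]  # 1 4
na_pos[["5"]]  # 1 2

# A difference larger than 1 between neighbouring positions means
# at least two NAs are separated by a non-NA value
has_gap <- sapply(na_pos, function(i) any(diff(i) > 1))
has_gap  # TRUE for groups 2 and 4, FALSE for group 5
```

Group 5 has its NAs at positions 1 and 2 only, so every difference is 1 and the group is filtered out.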
I'm not sure this catches every potential edge case, but it seems to work for the given example.
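A few of those edge cases can be checked directly. The sketch below wraps the condition in a helper, `has_nonconsecutive_na`, which is a made-up name for illustration, and runs it on small hand-made vectors rather than on the data from the question.

```r
# Hypothetical helper: the same condition used inside filter()
has_nonconsecutive_na <- function(x) any(diff(which(is.na(x))) > 1)

has_nonconsecutive_na(c(1, 2, 3))       # no NA at all: FALSE
has_nonconsecutive_na(c(1, NA, 3))      # a single NA: FALSE
has_nonconsecutive_na(c(NA, NA, NA))    # one consecutive run: FALSE
has_nonconsecutive_na(c(NA, 1, NA))     # NAs separated by a value: TRUE
has_nonconsecutive_na(c(NA, NA, 1, NA)) # a run plus a separate NA: TRUE
```

With zero or one NA, `diff()` returns an empty vector and `any()` of an empty vector is `FALSE`, so such groups are dropped as well. Whether that is the desired behaviour depends on how "non-consecutive" is meant to be read for those groups.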