How can I drop groups in dplyr when the frequency of NAs in the data is above a threshold?


How can I go from

# A tibble: 6 x 2
  group_var psbl_NAs
  <chr>        <dbl>
1 a                1
2 a               NA
3 a               NA
4 b                1
5 b                1
6 b               NA

We can group by `group_var`, then `mutate`, and then `filter`:

d %>%
    group_by(group_var) %>%
    # calculate % of NA values by group
    mutate(pct_na = mean(is.na(psbl_NAs))) %>%
    # only keep where % of NA values < 0.5
    filter(pct_na < 0.5) %>%
    select(-pct_na) # remove % NA column

#  group_var psbl_NAs
#  <chr>        <dbl>
# 1 b                1
# 2 b                1
# 3 b               NA
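
The helper column is not strictly required: on a grouped tibble, `filter()` evaluates a summary expression once per group, so the whole group is kept or dropped together. A minimal sketch, assuming `d` is the six-row tibble from the question (rebuilt here so the snippet runs on its own; `res` is just an illustrative name):

```r
library(dplyr)

# Rebuild the example data from the question
d <- tibble(
  group_var = c(rep("a", 3), rep("b", 3)),
  psbl_NAs  = c(1, NA, NA, 1, 1, NA)
)

# The condition is a single TRUE/FALSE per group, recycled to every row
# of that group, so groups with an NA share of 0.5 or more are dropped
res <- d %>%
  group_by(group_var) %>%
  filter(mean(is.na(psbl_NAs)) < 0.5) %>%
  ungroup()
```

The trailing `ungroup()` is optional; it only avoids carrying the grouping into later steps.
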
tibble(
  group_var = c(rep("a",3), rep("b",3)),
  psbl_NAs  = c(1, NA, NA, 1, 1, NA)
) %>% 
group_by(group_var) %>%
??????
To see why this works, look at the intermediate result before filtering:

d %>%
    group_by(group_var) %>%
    # calculate % of NA values by group
    mutate(pct_na = mean(is.na(psbl_NAs)))

#   group_var psbl_NAs pct_na
#   <chr>        <dbl>  <dbl>
# 1 a                1  0.667
# 2 a               NA  0.667
# 3 a               NA  0.667
# 4 b                1  0.333
# 5 b                1  0.333
# 6 b               NA  0.333

A base R alternative uses `ave()` to compute the same per-group NA share:

d[with(d, ave(psbl_NAs, group_var, FUN = function(x) mean(is.na(x)))) < 0.5,]
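
If the threshold needs to vary, the dplyr approach can be wrapped in a small function. This is only a sketch: `drop_na_heavy_groups` and its argument names are hypothetical, not part of dplyr, and the `{{ }}` syntax assumes a reasonably recent dplyr (1.0 or later).

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): keep only groups whose share of
# NAs in `value` is strictly below `threshold`
drop_na_heavy_groups <- function(data, group, value, threshold = 0.5) {
  data %>%
    group_by({{ group }}) %>%
    filter(mean(is.na({{ value }})) < threshold) %>%
    ungroup()
}

# Example data from the question
d <- tibble(
  group_var = c(rep("a", 3), rep("b", 3)),
  psbl_NAs  = c(1, NA, NA, 1, 1, NA)
)

kept_default <- drop_na_heavy_groups(d, group_var, psbl_NAs)
kept_strict  <- drop_na_heavy_groups(d, group_var, psbl_NAs, threshold = 0.3)
```

With the default threshold only group `b` (one NA in three rows) survives; with `threshold = 0.3` both groups are dropped, because the comparison is strict and one third is not below 0.3.
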