R: selecting groups that have more than one distinct value per group

I have data like the following:

ID  category class
1   a        m  
1   a        s
1   b        s
2   a        m
3   b        s
4   c        s
5   d        s

I want to subset the data so that it includes only those IDs that have more than one (>1) distinct category.

My expected output:

ID  category class
1   a        m
1   a        s
1   b        s

Is there a way to do this?

I tried:

library(dplyr)
df %>% 
  group_by(ID) %>%
  filter(n_distinct(category, class) > 1)

But it gave me an error:

# Error: expecting a single value
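For comparison, here is a minimal dplyr sketch that counts distinct values of category only, which is what the expected output asks for. It assumes a reasonably recent dplyr and rebuilds the sample data under the name df used in the attempt above; the name kept is introduced here purely for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ID       = c(1, 1, 1, 2, 3, 4, 5),
  category = c("a", "a", "b", "a", "b", "c", "d"),
  class    = c("m", "s", "s", "m", "s", "s", "s")
)

# Keep all rows of every ID that has more than one distinct category
kept <- df %>%
  group_by(ID) %>%
  filter(n_distinct(category) > 1) %>%
  ungroup()

print(kept)
```

Only the three rows of ID 1 should remain, since every other ID has a single category.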

Using data.table:

library(data.table) #see: https://github.com/Rdatatable/data.table/wiki for more
setDT(data) #convert to native 'data.table' type by reference
data[ , if(uniqueN(category) > 1) .SD, by = ID]

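uniqueN is data.table's (fast) native wrapper for length(unique()), and .SD is just the data.table itself, restricted to the rows of the current group (in more general cases it can represent a subset of the columns, e.g. when the .SDcols argument is used). So, basically, the middle statement (j, the column selection argument) says to return all columns and rows associated with each ID that has at least two distinct values of category.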
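To see the pieces described above in action, the following self-contained sketch rebuilds the question's sample data and runs the same filter. The object names data and result are chosen here for illustration; the intermediate per-ID count is only printed to show what uniqueN returns for each group.

```r
library(data.table)

# Sample data from the question
data <- data.table(
  ID       = c(1, 1, 1, 2, 3, 4, 5),
  category = c("a", "a", "b", "a", "b", "c", "d"),
  class    = c("m", "s", "s", "m", "s", "s", "s")
)

# Number of distinct category values within each ID
print(data[, .(n_category = uniqueN(category)), by = ID])

# Return .SD (the group's rows) only when the group has more than one
# distinct category; groups that fail the test return NULL and are dropped
result <- data[, if (uniqueN(category) > 1) .SD, by = ID]

print(result)
```

With this data only ID 1 has two distinct categories (a and b), so result should contain just its three rows.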

Use the by argument to extend this to cases involving multiple columns.
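One possible reading of this remark is that it refers to the by argument of uniqueN, which selects the columns whose combinations are counted. The sketch below follows that assumption. It adds one hypothetical row (ID 2, category a, class s) that is not in the original data, so that counting combinations gives a different result from counting category alone.

```r
library(data.table)

# Sample data from the question plus one hypothetical extra row (ID 2, a, s)
data <- data.table(
  ID       = c(1, 1, 1, 2, 2, 3, 4, 5),
  category = c("a", "a", "b", "a", "a", "b", "c", "d"),
  class    = c("m", "s", "s", "m", "s", "s", "s", "s")
)

# Keep IDs with more than one distinct (category, class) combination;
# uniqueN's own by argument names the columns to count over
multi <- data[, if (uniqueN(.SD, by = c("category", "class")) > 1) .SD, by = ID]

print(multi)
```

Here ID 2 is kept as well, because it has two distinct combinations even though it has a single category.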