R: Remove all rows with missing data based on a column

I have the following example data frame in R:

Test <- data.frame("Individual"=c("John", "John", "Alice", "Alice", "Alice", "Eve", "Eve","Eve","Jack"), "ExamNumber"=c("Test1", "Test2", "Test1", "Test2", "Test3", "Test1", "Test2", "Test3",  "Test3"))
However, I would like to remove the individuals who have not taken all three tests, so that the result is:

  Individual ExamNumber
1      Alice      Test1
2      Alice      Test2
3      Alice      Test3
4        Eve      Test1
5        Eve      Test2
6        Eve      Test3

You can use ave() to group by individual and NROW to count the rows in each group:

Test[ave(1:nrow(Test), Test$Individual, FUN = NROW)==3,]
#  Individual ExamNumber
#3      Alice      Test1
#4      Alice      Test2
#5      Alice      Test3
#6        Eve      Test1
#7        Eve      Test2
#8        Eve      Test3
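
To see why this works, it can help to look at the intermediate vector that ave() returns: every row is given the size of its own group. The following is a minimal sketch, with the Test data frame from the question redefined so the snippet is self-contained; group_size is only an illustrative name, and length is used in place of NROW, which gives the same counts here.

```r
Test <- data.frame(
  Individual = c("John", "John", "Alice", "Alice", "Alice", "Eve", "Eve", "Eve", "Jack"),
  ExamNumber = c("Test1", "Test2", "Test1", "Test2", "Test3", "Test1", "Test2", "Test3", "Test3")
)

# For each row, the number of rows that belong to the same individual
group_size <- ave(seq_len(nrow(Test)), Test$Individual, FUN = length)
group_size
# [1] 2 2 3 3 3 3 3 3 1

# Keep only the rows whose group has exactly three rows
Test[group_size == 3, ]
```

Note that this counts rows, not distinct tests, so it assumes each individual appears at most once per test.
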
Here is a slightly more robust approach that uses the same idea, but with split():

Test[order(Test$Individual),][unlist(lapply(split(Test, Test$Individual), function(a)
          rep(all(unique(Test$ExamNumber) %in% a$ExamNumber), NROW(a)))),]
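
The one-liner above is fairly dense, so here is a step-by-step sketch of the same idea. It is an illustration rather than the answer's exact code: it uses vapply() and %in% for the final subset (which also keeps the original row order), and the names required, by_person, has_all and keepers are hypothetical.

```r
Test <- data.frame(
  Individual = c("John", "John", "Alice", "Alice", "Alice", "Eve", "Eve", "Eve", "Jack"),
  ExamNumber = c("Test1", "Test2", "Test1", "Test2", "Test3", "Test1", "Test2", "Test3", "Test3")
)

# Every distinct test that appears anywhere in the data
required <- unique(Test$ExamNumber)

# One vector of exam labels per individual
by_person <- split(Test$ExamNumber, Test$Individual)

# TRUE for individuals whose exams cover every required test
has_all <- vapply(by_person, function(exams) all(required %in% exams), logical(1))

# Names of the individuals to keep, then the matching rows
keepers <- names(has_all)[has_all]
Test[Test$Individual %in% keepers, ]
```

Because the check is on the set of tests rather than on the row count, a duplicated row would not let an individual with only two distinct tests slip through.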

Here is another approach, using dplyr, that checks whether all three tests are present within each group:

library(dplyr)
Test %>% 
  group_by(Individual) %>%
  filter(all(c("Test1", "Test2", "Test3") %in% ExamNumber)) %>%
  ungroup()

# A tibble: 6 × 2
  Individual ExamNumber
      <fctr>     <fctr>
1      Alice      Test1
2      Alice      Test2
3      Alice      Test3
4        Eve      Test1
5        Eve      Test2
6        Eve      Test3

Using base R:

ind_eq3 <- names( which( with( Test, by( Test, 
                                         INDICES = list(Individual), 
                                         FUN = function(x) length(unique(x$ExamNumber)) == 3) ) ) )
with(Test, Test[ Individual %in% ind_eq3, ] )

#   Individual ExamNumber
# 3      Alice      Test1
# 4      Alice      Test2
# 5      Alice      Test3
# 6        Eve      Test1
# 7        Eve      Test2
# 8        Eve      Test3

Using data.table:

library('data.table')
setDT(Test)[ , 
             j  = .SD[length( unique(ExamNumber) ) == 3, ],
             by = 'Individual']