R: getting duplicated values within a group

Tags: r, dataframe, group-by


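I am trying to get the duplicated values within groups. More specifically, I want to check for duplicated child IDs within each household; duplicated child IDs across different households are fine. For example, I have a data frame called "village":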


village

We can group by household and children, then use filter to keep only the groups that have more than one row:

library(dplyr)
village %>% 
   group_by(household, children) %>% 
   filter(n() > 1) %>%
   ungroup
Output:

# A tibble: 2 x 2
#  household children
#      <dbl> <chr>   
#1         1 A001    
#2         1 A001   
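
Because the original village data is not shown above, here is a minimal self-contained sketch of the same approach. The rows are made up purely for illustration: household 2 reuses the ID A001, which should not be flagged, since duplicates only count within a household. The second pipeline is an optional variation (not part of the answer) that uses count() to return one row per duplicated pair together with its frequency.

```r
library(dplyr)

# Hypothetical data: NOT the asker's real data frame.
village <- tibble(
  household = c(1, 1, 1, 2, 2),
  children  = c("A001", "A002", "A001", "A001", "B002")
)

# Keep every row whose (household, children) pair occurs more than once.
dups <- village %>%
  group_by(household, children) %>%
  filter(n() > 1) %>%
  ungroup()

# Variation: one row per duplicated pair, with the number of occurrences.
dup_counts <- village %>%
  count(household, children, name = "n_rows") %>%
  filter(n_rows > 1)

print(dups)
print(dup_counts)
```

Adding more columns to group_by() widens the definition of a duplicate; any columns not listed are simply carried along in the result.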

Another option, using subset + ave:

> subset(village,ave(household,household,children,FUN = length)>1)
  household children
1         1     A001
3         1     A001
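
To make the ave() step easier to follow, the sketch below splits it out on made-up data (the rows are an assumption, not the asker's data frame). ave() applies FUN within each household/children combination and returns a vector as long as the input, so every row ends up labelled with the size of its own group.

```r
# Hypothetical data: NOT the asker's real data frame.
village <- data.frame(
  household = c(1, 1, 1, 2, 2),
  children  = c("A001", "A002", "A001", "A001", "B002")
)

# The first argument only supplies a numeric vector to be overwritten;
# the remaining arguments define the groups. Each row receives the
# number of rows sharing its (household, children) pair.
group_size <- with(village, ave(household, household, children, FUN = length))

# Rows belonging to a group of size 2 or more are the duplicates.
dups <- village[group_size > 1, ]
print(dups)
```

The original row names are preserved, which is why the base R output is indexed 1 and 3 rather than 1 and 2.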

The group_by() answer is exactly what I needed, because my actual dataset has more than two columns. Thank you all!

@KarenLiu You can also subset with duplicated, i.e.
village[duplicated(village[1:2])|duplicated(village[1:2],fromLast=TRUE),]
Another option, applying duplicated to all columns of the data frame:

village[duplicated(village) | duplicated(village, fromLast = TRUE), ]
#  household children
#1         1     A001
#3         1     A001 
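
The two duplicated() calls are both needed because each one skips a different occurrence. A minimal sketch on made-up data (an assumption, not the asker's data frame) shows the two passes separately. It restricts the check to the key columns by name, which is probably the safer form when the data frame has additional columns.

```r
# Hypothetical data: NOT the asker's real data frame.
village <- data.frame(
  household = c(1, 1, 1, 2, 2),
  children  = c("A001", "A002", "A001", "A001", "B002")
)

# Only these columns decide whether two rows count as duplicates.
key <- village[c("household", "children")]

# Scanning top-down marks every repeat except the first occurrence.
first_pass <- duplicated(key)
# Scanning bottom-up marks every repeat except the last occurrence.
last_pass <- duplicated(key, fromLast = TRUE)

# The union of both passes covers all occurrences of a duplicated pair.
dups <- village[first_pass | last_pass, ]
print(dups)
```

With this data, first_pass flags only row 3 and last_pass flags only row 1, so either call on its own would return half of the duplicated pair.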