检查列表中的所有元素在R中是否相等_R_Comparison

检查列表中的所有元素在R中是否相等

检查列表中的所有元素在R中是否相等,r,comparison,R,Comparison,我有几个向量的列表。我想检查一下列表中的所有向量是否相等。有idential，它只适用于成对比较。所以我写了下面的函数，它看起来很难看。我仍然没有找到更好的解决办法。这是我的简历： test_true <- list(a=c(1,2,3),b=c(1,2,3),d=c(1,2,3)) test_false <- list(a=c(1,2,3),b=c(1,2,3),d=c(1,32,13)) compareList <- function(li){ stopifnot(l

我有几个向量的列表。我想检查一下列表中的所有向量是否相等。有

idential

，它只适用于成对比较。所以我写了下面的函数，它看起来很难看。我仍然没有找到更好的解决办法。这是我的简历：

test_true <- list(a=c(1,2,3),b=c(1,2,3),d=c(1,2,3))
test_false <- list(a=c(1,2,3),b=c(1,2,3),d=c(1,32,13))

compareList <- function(li){
  stopifnot(length(li) > 1)
  l <- length(li)
  res <- lapply(li[-1],function(X,x) identical(X,x),x=li[[1]])
  res <- all(unlist(res))
  res
}

compareList(test_true)
compareList(test_false)

test\u true怎么样
allSame <- function(x) length(unique(x)) == 1

allSame(test_true)
# [1] TRUE
allSame(test_false)
# [1] FALSE

在找到一个不匹配项后，继续进行比较是低效的。我的粗略解决方案是编写else break
，而不是else FALSE
，抛出一个错误。
这同样有效
m <- combn(length(test_true),2)

for(i in 1:ncol(m)){
    print(all(test_true[[m[,i][1]]] == test_true[[m[,i][2]]]))
    }

m为cgwtools:：approxeq
提出我的自我提升建议，它本质上做了all.equal
所做的事情，但返回一个表示相等与否的逻辑值向量
所以：取决于您是想要精确相等还是浮点表示相等。
我会：
all.identical <- function(l) all(mapply(identical, head(l, 1), tail(l, -1)))

all.identical(test_true)
# [1] TRUE
all.identical(test_false)
# [1] FALSE

all.idential总结解决方案。试验数据：
x1 <- as.list(as.data.frame(replicate(1000, 1:100)))
x2 <- as.list(as.data.frame(replicate(1000, sample(1:100, 100))))

基准：
library(microbenchmark)
microbenchmark(comp_list1(x1), comp_list2(x1), comp_list3(x1), comp_list4(x1))
#> Unit: microseconds
#>            expr      min        lq      mean   median        uq      max neval cld
#>  comp_list1(x1)  138.327  148.5955  171.9481  162.013  188.9315  269.342   100 a  
#>  comp_list2(x1) 1023.932 1125.2210 1387.6268 1255.985 1403.1885 3458.597   100  b 
#>  comp_list3(x1) 1130.275 1275.9940 1511.7916 1378.789 1550.8240 3254.292   100   c
#>  comp_list4(x1)  138.075  144.8635  169.7833  159.954  185.1515  298.282   100 a  
microbenchmark(comp_list1(x2), comp_list2(x2), comp_list3(x2), comp_list4(x2))
#> Unit: microseconds
#>            expr     min        lq      mean   median        uq      max neval cld
#>  comp_list1(x2) 139.492  140.3540  147.7695  145.380  149.6495  218.800   100  a 
#>  comp_list2(x2) 995.373 1030.4325 1179.2274 1054.711 1136.5050 3763.506   100   b
#>  comp_list3(x2) 977.805 1029.7310 1134.3650 1049.684 1086.0730 2846.592   100   b
#>  comp_list4(x2) 135.516  136.4685  150.7185  139.030  146.7170  345.985   100  a

正如我们看到的，最有效的解决方案是基于复制的和唯一的函数。
不是答案，但您可以将lappy
更改为sapply
并从函数中删除几行。函数体可以替换为all（sapply（li，idential，li[[1]]）
+1用于Reduce
方法。我想到了这一点（一直到考虑while
方法），但我的前几次尝试失败了在列表中使用unique
可能会很慢。。。请参见？unique
@JoshuaUlrich interest。我已经把这一点编辑到了答案中，效率很低。你最多只需要做n-1
比较，而你正在提议n*（n+1）/2
@Frank:Answer updated。还要注意的是，microbenchmark
允许测量最小的差异。对comp_列表4
同样快速的回答是comp_列表5
comp_list1 <- function(x) length(unique.default(x)) == 1L
comp_list2 <- function(x) all(vapply(x[-1], identical, logical(1L), x = x[[1]]))
comp_list3 <- function(x) all(vapply(x[-1], function(x2) all(x[[1]] == x2), logical(1L)))
comp_list4 <- function(x) sum(duplicated.default(x)) == length(x) - 1L

for (i in 1:4) cat(match.fun(paste0("comp_list", i))(x1), " ")
#> TRUE  TRUE  TRUE  TRUE   
for (i in 1:4) cat(match.fun(paste0("comp_list", i))(x2), " ")
#> FALSE  FALSE  FALSE  FALSE  

library(microbenchmark)
microbenchmark(comp_list1(x1), comp_list2(x1), comp_list3(x1), comp_list4(x1))
#> Unit: microseconds
#>            expr      min        lq      mean   median        uq      max neval cld
#>  comp_list1(x1)  138.327  148.5955  171.9481  162.013  188.9315  269.342   100 a  
#>  comp_list2(x1) 1023.932 1125.2210 1387.6268 1255.985 1403.1885 3458.597   100  b 
#>  comp_list3(x1) 1130.275 1275.9940 1511.7916 1378.789 1550.8240 3254.292   100   c
#>  comp_list4(x1)  138.075  144.8635  169.7833  159.954  185.1515  298.282   100 a  
microbenchmark(comp_list1(x2), comp_list2(x2), comp_list3(x2), comp_list4(x2))
#> Unit: microseconds
#>            expr     min        lq      mean   median        uq      max neval cld
#>  comp_list1(x2) 139.492  140.3540  147.7695  145.380  149.6495  218.800   100  a 
#>  comp_list2(x2) 995.373 1030.4325 1179.2274 1054.711 1136.5050 3763.506   100   b
#>  comp_list3(x2) 977.805 1029.7310 1134.3650 1049.684 1086.0730 2846.592   100   b
#>  comp_list4(x2) 135.516  136.4685  150.7185  139.030  146.7170  345.985   100  a