R: unordered combinations of all lengths


I am looking for a function that returns all unordered combinations of a vector. For example:

x <- c('red','blue','black')
uncomb(x)
[1]'red'
[2]'blue'
[3]'black'
[4]'red','blue'
[5]'blue','black'
[6]'red','black'
[7]'red','blue','black'

You can apply a sequence the length of x over the m argument of the combn() function:

x <- c("red", "blue", "black")
do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))
# [[1]]
# [1] "red"
# 
# [[2]]
# [1] "blue"
# 
# [[3]]
# [1] "black"
# 
# [[4]]
# [1] "red"  "blue"
# 
# [[5]]
# [1] "red"   "black"
# 
# [[6]]
# [1] "blue"  "black"
# 
# [[7]]
# [1] "red"   "blue"  "black"
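
The one-liner above can be wrapped in a small helper. This is a minimal sketch using base R only; the name all_combn is hypothetical (the uncomb in the question is not an existing function). For a vector of n distinct elements it should return 2^n - 1 combinations.

```r
# Hypothetical helper: all non-empty unordered combinations of x as a flat list.
all_combn <- function(x) {
  # combn(x, m, simplify = FALSE) returns a list of all size-m combinations;
  # do.call(c, ...) concatenates the per-size lists into a single list.
  do.call(c, lapply(seq_along(x), function(m) combn(x, m, simplify = FALSE)))
}

x <- c("red", "blue", "black")
res <- all_combn(x)
length(res) == 2^length(x) - 1  # one entry per non-empty subset
```

One caveat: combn() treats a single positive number x as seq_len(x), so a length-one numeric input behaves differently from a length-one character input.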

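I was redirected here from a question that was closed as a duplicate of this one. This is an old question and the answer provided by @RichScriven is very good, but I would like to give the community a few more options that are arguably more natural and more efficient (the last two).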

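We first note that the output is very similar to the power set. Calling powerSet from the rje package, we see that our output matches every element of the power set except the first, which is the empty set.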
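
To make the relationship concrete, here is a base R sketch of a power set built from bit masks. It is an illustration only, not how rje implements powerSet, and the element order may differ from that package; the name power_set is hypothetical.

```r
# Base R sketch of a power set via bit masks (illustrative, not rje's code).
power_set <- function(x) {
  n <- length(x)
  # Mask k runs over 0 .. 2^n - 1; element i is kept when bit i-1 of k is set.
  lapply(0:(2^n - 1), function(k) {
    keep <- (k %/% 2^(seq_len(n) - 1)) %% 2 == 1
    x[keep]
  })
}

ps <- power_set(c("red", "blue", "black"))
length(ps)        # 2^3 subsets, including the empty one
ps_nonempty <- ps[-1]  # mask 0 is the empty set, so dropping it leaves the combinations
```

Dropping the first element is the same trick as the [-1] used with powerSet in the benchmarks.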

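If you would rather not deal with empty elements and/or matrices, you can also use lapply to return a list (comboGeneral is from the RcppAlgos package):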

lapply(seq_along(x), comboGeneral, v = x)
[[1]]
     [,1]   
[1,] "red"  
[2,] "blue" 
[3,] "black"

[[2]]
     [,1]   [,2]   
[1,] "red"  "blue" 
[2,] "red"  "black"
[3,] "blue" "black"

[[3]]
     [,1]  [,2]   [,3]   
[1,] "red" "blue" "black"


lapply(seq_along(x), function(y) arrangements::combinations(x, y))
[[1]]
     [,1]   
[1,] "red"  
[2,] "blue" 
[3,] "black"

[[2]]
     [,1]   [,2]   
[1,] "red"  "blue" 
[2,] "red"  "black"
[3,] "blue" "black"

[[3]]
     [,1]  [,2]   [,3]   
[1,] "red" "blue" "black"
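
Now we show that the last two methods are much more efficient. (N.B. I removed do.call(c, ...) and simplify = FALSE from the answer provided by @RichScriven in order to compare the generation of similar output. For good measure, I also included rje::powerSet.)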

And the benchmark for the list-returning versions (bigX is a 20-element input vector):

microbenchmark(powSetRje = powerSet(bigX)[-1],
               powSetRich = do.call(c, lapply(seq_along(bigX), combn, x = bigX, simplify = FALSE)),
               powSetArrange = do.call(c, lapply(seq_along(bigX), function(y) arrangements::combinations(bigX, y, layout = "l"))),
               times = 15, unit = "relative")
Unit: relative
          expr      min       lq     mean   median       uq      max neval
     powSetRje 5.539967 4.785415 4.277319 4.387410 3.739593 3.543570    15
    powSetRich 4.994366 4.306784 3.863612 3.932252 3.334708 3.327467    15
 powSetArrange 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000    15

A solution with matrix output that does not use any external packages:

store <- lapply(
  seq_along(x), 
  function(i) {
    out <- combn(x, i)   # i rows, one combination per column
    # Pad every column with NA rows up to length(x), so all sizes line up.
    pad <- matrix(NA, nrow = length(x) - i, ncol = ncol(out))
    rbind(out, pad)
})
t(do.call(cbind, store))

     [,1]    [,2]    [,3]   
[1,] "red"   NA      NA     
[2,] "blue"  NA      NA     
[3,] "black" NA      NA     
[4,] "red"   "blue"  NA     
[5,] "red"   "black" NA     
[6,] "blue"  "black" NA     
[7,] "red"   "blue"  "black"
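
If the NA padding gets in the way downstream, the rows of such a matrix can be turned back into a list of plain vectors. A small sketch in base R, using a hand-built three-row matrix as a stand-in for the full result:

```r
# Stand-in for an NA-padded combinations matrix (one combination per row).
m <- rbind(c("red", NA, NA),
           c("red", "blue", NA),
           c("red", "blue", "black"))

# Strip the NA padding row by row to recover vectors of unequal length.
rows <- lapply(seq_len(nrow(m)), function(i) {
  r <- m[i, ]
  r[!is.na(r)]
})
lengths(rows)
```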

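
I wouldn't post my own answer since it is very close to Richard Scriven's. But if you want to take advantage of the gtools package, you can use combinations instead of permutations: sapply(seq_along(x), combinations, v = x, n = length(x))

Yep - unlist(lapply(seq_along(x), combn, x = x, simplify = FALSE), recursive = FALSE) for another possible output variation. Data objects of unequal length are a good fit for a list.

I agree, but the hint in the comment gets me closer to the desired output. Even though lapply(seq_along(x), combn, x = x) reads exactly as it should, the columns within the list (in my variation) are almost exactly what the OP presented as the desired output in the question. With a matrix, all the NAs seem to make it much harder to pass the result to other functions.

I completely agree @latemail - I made an edit to the first part. For some reason I prefer do.call(c, …) over unlist(…, recursive = FALSE).

Thought about it a lot - "Tomato, tomahto, let's call the whole thing off..."

How can I use this function to get all combinations for a range of lengths? For example, if my input vector is x

You can change 3L to length(x) to get a more general solution.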
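
The two flattening idioms discussed in the comments can be compared directly. A quick base R check, on the three-colour example from the question, that do.call(c, ...) and unlist(..., recursive = FALSE) produce the same flat list:

```r
x <- c("red", "blue", "black")

# One list of combinations per size (a list of lists).
nested <- lapply(seq_along(x), combn, x = x, simplify = FALSE)

a <- do.call(c, nested)                # concatenate the inner lists
b <- unlist(nested, recursive = FALSE) # flatten exactly one level

identical(a, b)
```

Which one to use is mostly a matter of taste for input like this.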
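
Setup and results for the first benchmark:

set.seed(8128)
bigX <- sort(sample(10^6, 20)) ## With this input we get 2^20 - 1 results, i.e. 1,048,575
library(rje)            ## provides powerSet
library(RcppAlgos)      ## provides comboGeneral
library(microbenchmark)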
microbenchmark(powSetRje = powerSet(bigX),
               powSetRich = lapply(seq_along(bigX), combn, x = bigX),
               powSetArrange = lapply(seq_along(bigX), function(y) arrangements::combinations(x = bigX, k = y)),
               powSetAlgos = lapply(seq_along(bigX), comboGeneral, v = bigX),
               unit = "relative")

Unit: relative
          expr        min        lq      mean   median        uq      max neval
     powSetRje 64.4252454 44.063199 16.678438 18.63110 12.082214 7.317559   100
    powSetRich 61.6766640 43.027789 16.009151 17.88944 11.406994 7.222899   100
 powSetArrange  0.9508052  1.060309  1.080341  1.02257  1.262713 1.126384   100
   powSetAlgos  1.0000000  1.000000  1.000000  1.00000  1.000000 1.000000   100
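
The comment in the setup above says that a 20-element input yields 2^20 - 1 results. A quick base R check of that count: summing the number of combinations of each size gives every non-empty subset.

```r
n <- 20
# choose(n, k) counts the size-k combinations; sum over k = 1..n.
total <- sum(choose(n, seq_len(n)))
total == 2^n - 1
```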

arrangements::combinations can also return a list directly via layout = "l":

do.call(c, lapply(seq_along(x), function(y) {
                    arrangements::combinations(x, y, layout = "l")
                  }))
[[1]]
[1] "red"

[[2]]
[1] "blue"

[[3]]
[1] "black"

[[4]]
[1] "red"  "blue"

[[5]]
[1] "red"   "black"

[[6]]
[1] "blue"  "black"

[[7]]
[1] "red"   "blue"  "black"