R: looping over datasets to find shared elements
I have the following datasets, and I want to compare the elements they contain for similarity, using a looping strategy that covers all possible combinations (i.e. "setA, setB, setC, setD"; "setA, setB, setC"; "setA, setB"; "setB, setC, setD"; "setC, setD"; etc.).
Datasets:
setA <- c("dog", "cat", "cow", "sheep", "dunkey")
setB <- c("fox", "cat", "cow", "snake")
setC <- c("dog", "cat", "cow", "sheep", "dunkey", "fox", "python")
setD <- c("dog", "cat", "lion", "sheep", "elephant", "fox")
Here is a function that gets all of the intersections:
## Build intersections, 'out' accumulates the result
intersects <- function(sets, out=NULL) {
  if (length(sets) < 2) return ( out )                          # return result
  len <- seq(length(sets))
  if (missing(out)) out <- list()                               # initialize accumulator
  for (idx in split((inds <- combn(length(sets), 2)), col(inds))) {  # 2-way combinations
    ii <- len > idx[2] & !(len %in% idx)                        # indices to keep for next intersect
    out[[(n <- paste(names(sets[idx]), collapse="."))]] <- intersect(sets[[idx[1]]], sets[[idx[2]]])
    out <- intersects(append(out[n], sets[ii]), out=out)
  }
  out
}
## Put the sets in a list
sets <- mget(paste0("set", toupper(letters[1:4])))
intersects(sets)
# $setA.setB
# [1] "cat" "cow"
#
# $setA.setB.setC
# [1] "cat" "cow"
#
# $setA.setB.setC.setD
# [1] "cat"
#
# $setA.setB.setD
# [1] "cat"
#
# $setC.setD
# [1] "dog" "cat" "sheep" "fox"
#
# $setA.setC
# [1] "dog" "cat" "cow" "sheep" "dunkey"
#
# $setA.setC.setD
# [1] "dog" "cat" "sheep"
#
# $setA.setD
# [1] "dog" "cat" "sheep"
#
# $setB.setC
# [1] "fox" "cat" "cow"
#
# $setB.setC.setD
# [1] "fox" "cat"
#
# $setB.setD
# [1] "fox" "cat"
What is the desired output?
Great! Exactly what I wanted. Thanks nongkrong, the code works for me. Much appreciated.
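As a cross-check on the recursive function in the answer, the same set of intersections can probably also be built non-recursively. The sketch below is not the answerer's method: it enumerates every combination of 2 or more sets with `combn()` and folds `intersect()` over each one with `Reduce()`. The helper name `all_intersects` is made up for illustration, and the code assumes `sets` is a named list of vectors.

```r
# The four sets from the question
setA <- c("dog", "cat", "cow", "sheep", "dunkey")
setB <- c("fox", "cat", "cow", "snake")
setC <- c("dog", "cat", "cow", "sheep", "dunkey", "fox", "python")
setD <- c("dog", "cat", "lion", "sheep", "elephant", "fox")
sets <- list(setA = setA, setB = setB, setC = setC, setD = setD)

# Hypothetical helper (not from the original answer):
# intersect every combination of k sets, for k = 2 .. length(sets)
all_intersects <- function(sets) {
  out <- list()
  if (length(sets) < 2) return(out)            # nothing to intersect
  for (k in 2:length(sets)) {
    # simplify = FALSE returns each combination as its own index vector
    for (idx in combn(length(sets), k, simplify = FALSE)) {
      key <- paste(names(sets)[idx], collapse = ".")
      out[[key]] <- Reduce(intersect, sets[idx])  # fold intersect() over the chosen sets
    }
  }
  out
}

res <- all_intersects(sets)
length(res)                    # 11 combinations: 6 pairs, 4 triples, 1 quadruple
res[["setA.setB.setC.setD"]]   # "cat"
```

The results come out grouped by combination size (all pairs first, then triples, then the four-way intersection) rather than in the order the recursive version produces, but the names and contents should match.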