Getting a consensus of multiple partitioning methods in R

My data:

data=cbind(c(1,1,2,1,1,3),c(1,1,2,1,1,1),c(2,2,1,2,1,2))
colnames(data)=paste("item",1:3)
rownames(data)=paste("method",1:6)

As output, I would like the two communities (and their members) given by a majority vote. Something like:

group1={item1, item2}, group2={item3}

You can try this, in base R:

res=apply(data,2,function(u) as.numeric(names(sort(table(u), decreasing=TRUE))[1]))

setNames(lapply(unique(res), function(u) names(res)[res==u]), unique(res))
#$`1`
#[1] "item 1" "item 2"

#$`2`
#[1] "item 3"
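
One caveat with this per-column vote: it treats the numbers as fixed labels, so it can be misled when the same partition is numbered differently by different methods. The sketch below runs the one-liner on a four-method example in which every method puts each item in its own cluster. The name `n.clusters` is only introduced here for illustration.

```r
# Four methods, three items. Every method separates the three items,
# but the label numbers differ from row to row.
data <- cbind(c(1, 3, 2, 1), c(2, 2, 3, 3), c(3, 1, 1, 2))
colnames(data) <- paste("item", 1:3)
rownames(data) <- paste("method", 1:4)

# Per-column mode, as in the one-liner above.
res <- apply(data, 2, function(u) as.numeric(names(sort(table(u), decreasing = TRUE))[1]))
groups <- setNames(lapply(unique(res), function(u) names(res)[res == u]), unique(res))

# Number of distinct clusters found by each method (3 for every row).
n.clusters <- apply(data, 1, function(r) length(unique(r)))

print(n.clusters)
print(groups)  # "item 1" and "item 3" end up in the same group
```

Here every method reports three clusters, yet the column-wise vote merges item 1 and item 3, because label 1 happens to be the most frequent value in both columns.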

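The function is passed a matrix in which each column is an item and each row is the membership vector of a partition of the items produced by one clustering method. The elements (numbers) in each row have no meaning other than indicating membership, and the same numbers are reused from one row to the next. The function returns the majority-vote partition. When there is no agreement for an item, the partition given by the first row wins. This makes it possible, for example, to order the partitions by decreasing modularity.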
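
To illustrate the point that the numbers only encode membership, here is a minimal sketch that relabels a membership vector by order of first appearance, so that two rows describing the same partition become identical. The helpers `canonical` and `same.partition` are hypothetical names and are not used by the function below.

```r
# Relabel a membership vector by order of first appearance:
# the first label seen becomes 1, the next new label becomes 2, and so on.
canonical <- function(row) match(row, unique(row))

# Two membership vectors describe the same partition when their
# canonical forms are identical.
same.partition <- function(a, b) identical(canonical(a), canonical(b))

canonical(c(2, 2, 1))                    # same partition as c(1, 1, 2)
same.partition(c(1, 1, 2), c(2, 2, 1))   # TRUE: only the labels differ
same.partition(c(1, 1, 2), c(3, 1, 2))   # FALSE: item 1 and item 2 are split
```
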

consensus.final <-
  function(data){
    output=list()
    for (i in 1:nrow(data)){
      row=as.numeric(data[i,])
      output.inner=list()
      for (j in 1:length(row)){
        group=character()
        group=c(group,colnames(data)[which(row==row[j])])
        output.inner[[j]]=group
      }
      output.inner=unique(output.inner)
      output[[i]]=output.inner
    }

    # gives the mode of the vector representing the number of groups found by each method
    consensus.n.comm=as.numeric(names(sort(table(unlist(lapply(output,length))),decreasing=TRUE))[1])

    # removes the elements of the list that do not correspond to this consensus solution
    output=output[lapply(output,length)==consensus.n.comm]

    # 1) find intersection 
    # 2) use majority vote for elements of each vector that are not part of the intersection

    group=list()

    for (i in 1:consensus.n.comm){ 
      list.intersection=list()
      for (p in 1:length(output)){
        list.intersection[[p]]=unlist(output[[p]][i])
      }

      # candidate group i
      intersection=Reduce(intersect,list.intersection)
      group[[i]]=intersection

      # we need to reinforce that group
      for (p in 1:length(list.intersection)){
        vector=setdiff(list.intersection[[p]],intersection)
        if (length(vector)>0){
          for (j in 1:length(vector)){
            counter=vector(length=length(list.intersection))
            for (k in 1:length(list.intersection)){
              counter[k]=vector[j]%in%list.intersection[[k]]
            }
            if(length(which(counter==TRUE))>=ceiling((length(counter)/2)+0.001)){
              group[[i]]=c(group[[i]],vector[j])
            }
          }
        }
      }
    }

    group=lapply(group,unique)

    # variables for which consensus has not been reached
    unclassified=setdiff(colnames(data),unlist(group))

    if (length(unclassified)>0){
      for (pp  in 1:length(unclassified)){
        temp=matrix(nrow=length(output),ncol=consensus.n.comm)
        for (i in 1:nrow(temp)){
          for (j in 1:ncol(temp)){
            temp[i,j]=unclassified[pp]%in%unlist(output[[i]][j])
          }
        }
        # use the partition of the first method when no majority exists (this allows ordering of partitions by decreasing modularity values for instance)
        index.best=which(temp[1,]==TRUE)
        group[[index.best]]=c(group[[index.best]],unclassified[pp])
      }
    }
    output=list(group=group,unclassified=unclassified)
  }

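Sorry, that doesn't work. For example, with data=cbind(c(1,1,1,1,1,3),c(1,1,1,1,1,1),c(1,1,1,2,1,2)); colnames(data)=paste("item",1:3); rownames(data)=paste("method",1:6) your method returns 3 groups, but there is clearly only one group based on the majority vote, as the consensus highlights.

Typos in the names and in the ordering.

Your second example is now handled correctly by the code. I found a new problem, though. For example, with data=cbind(c(1,3,2,1),c(2,2,3,3),c(3,1,1,2)); colnames(data)=paste("item",1:3); rownames(data)=paste("method",1:4) your command returns {item1, item3} and {item2}, when the consensus is clearly a 3-cluster solution. Remember that the numbers are not fixed group labels: they only indicate membership and are reused from one row to the next.

In case of equality (column 2), do you mean that you want two groups?

For each row (each method), the three items fall into different groups. I don't quite understand what you mean by "in case of equality (column 2)".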

data=cbind(c(1,1,2,1,1,3),c(1,1,2,1,1,1),c(2,2,1,2,1,2))
colnames(data)=paste("item",1:3)
rownames(data)=paste("method",1:6)
data
consensus.final(data)$group

[[1]]
[1] "item 1" "item 2"

[[2]]
[1] "item 3"

data=cbind(c(1,1,1,1,1,3),c(1,1,1,1,1,1),c(1,1,1,2,1,2)) 
colnames(data)=paste("item",1:3) 
rownames(data)=paste("method",1:6)
data
consensus.final(data)$group

[[1]]
[1] "item 1" "item 2" "item 3"

data=cbind(c(1,3,2,1),c(2,2,3,3),c(3,1,1,2))
colnames(data)=paste("item",1:3)
rownames(data)=paste("method",1:4)
data
consensus.final(data)$group

[[1]]
[1] "item 1"

[[2]]
[1] "item 2"

[[3]]
[1] "item 3"
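
As a rough cross-check on outputs like these, one can count how often each pair of items is placed in the same cluster, which does not depend on how the clusters are numbered. This is only a sketch of a pairwise co-membership count, not part of the function above, and `co.membership` is a hypothetical helper name.

```r
# Fraction of methods (rows) that put each pair of items (columns)
# in the same cluster. Labels are compared only within a row.
co.membership <- function(data) {
  n <- ncol(data)
  m <- matrix(0, n, n, dimnames = list(colnames(data), colnames(data)))
  for (i in seq_len(nrow(data))) {
    row <- data[i, ]
    m <- m + outer(row, row, "==")  # TRUE where two items share a label
  }
  m / nrow(data)
}

data <- cbind(c(1, 1, 2, 1, 1, 3), c(1, 1, 2, 1, 1, 1), c(2, 2, 1, 2, 1, 2))
colnames(data) <- paste("item", 1:3)
rownames(data) <- paste("method", 1:6)

cm <- co.membership(data)
majority <- cm > 0.5  # pairs grouped together by more than half of the methods
print(round(cm, 2))
print(majority)
```

For the first example, item 1 and item 2 share a cluster in five of the six methods, while item 3 joins either of them in only one, which agrees with the two groups returned above.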