How can I use WeightedCluster::wcKMedoids in R to provide clusters to heatmap or heatmap.2?

TL;DR: How can I use the WeightedCluster library (specifically the wcKMedoids() method) as input to heatmap, heatmap.2 or a similar tool, so that it supplies the clustering information?

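We are creating heatmaps in R from binary data (yes/no values, represented as 1 and 0) and need to adjust the weights of some rows for the column-based clustering. Those rows were generated by expanding multiple-choice categories into several binary yes/no rows, so they are over-represented.

I found the WeightedCluster library, which can cluster using weights.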

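The question now is how to use this library (specifically the wcKMedoids() method) as input to heatmap, heatmap.2 or something similar.

I tried the following code, which fails with an error: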

library(gplots)
library(WeightedCluster)

dataset <- "
F,T1,T2,T3,T4,T5,T6,T7,T8
A,1,1,0,1,1,1,1,1
B,1,0,1,0,1,0,1,1
C,1,1,1,1,1,1,1,0
D,1,1,1,0,1,1,1,0
E,0,1,0,0,1,0,1,0
F,0,0,1,0,0,0,0,0
G,1,1,1,0,1,1,1,1
H,1,1,0,0,0,0,0,0
I,1,0,1,0,0,1,0,0
J,1,1,1,0,0,0,0,1
K,1,0,0,0,1,1,1,1
L,1,1,1,0,1,1,1,1
M,0,1,1,1,1,1,1,1
N,1,1,1,0,1,1,1,1"
fakefile <- textConnection(dataset)

d <- read.csv(fakefile, header=T, row.names = 1)

weights <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1)

distf <- function(x) dist(x, method="binary")
wclustf <- function(x) wcKMedoids(distf(x), 
                                 k=8, 
                                 weights=weights, 
                                 npass = 1, 
                                 initialclust=NULL, 
                                 method="PAMonce", 
                                 cluster.only = FALSE, 
                                 debuglevel=0)

cluster_colors <- colorRampPalette(c("red", "green"))(256);
heatmap(as.matrix(d), 
        col=cluster_colors,
        distfun = distf,
        hclustfun = wclustf,
        keep.dendro = F,
        margins=c(10,16),
        scale="none")

Obviously, wcKMedoids is not a drop-in replacement for R's hclust, but does anyone have suggestions on how to solve this?

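Update: The little progress I have made so far suggests that I should implement a method as.dendrogram.kmedoids that produces output similar to that of hclust(dist(x)). (That output can be inspected in detail with dput: dput(hclust(dist(x))).) Ideas and suggestions are very welcome.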
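To make the update above concrete, here is a minimal sketch of the contract that heatmap places on hclustfun: it calls hclustfun(distfun(x)) and then passes the result to as.dendrogram(). The small matrix and the helper name avg_clustf are illustrative assumptions, not part of the original question.

```r
# Small binary matrix standing in for the question's data (illustrative only).
x <- matrix(c(1, 1, 0, 1,
              1, 0, 1, 0,
              0, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1),
            ncol = 4, byrow = TRUE,
            dimnames = list(LETTERS[1:5], paste0("T", 1:4)))

distf <- function(m) dist(m, method = "binary")

# The default hclustfun returns an object of class "hclust".
hc <- hclust(distf(x))
class(hc)
names(hc)  # includes merge, height and order, which together encode the tree

# Any function honouring the same contract can be plugged in,
# for example average linkage (avg_clustf is a hypothetical name).
avg_clustf <- function(d) hclust(d, method = "average")

# This is the conversion heatmap() performs on the hclustfun result.
dend <- as.dendrogram(avg_clustf(distf(x)))

heatmap(x, distfun = distf, hclustfun = avg_clustf, scale = "none")
```

A flat partition has no merge or height information, so it gives as.dendrogram() nothing to convert; that is why a partitioning result does not fit here.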

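This cannot be done. K-medoids clustering is a partitioning method, not a hierarchical one. A dendrogram only makes sense for hierarchical clustering algorithms.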
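Since a partition has no tree to draw, one possible workaround is to skip the dendrogram altogether: order the columns by cluster membership and mark the clusters with a colour bar. The sketch below uses base R only. The membership vector col_cluster is a hypothetical placeholder; in practice it might be taken from the result of wcKMedoids (assuming its clustering component holds the membership vector), which this sketch does not verify.

```r
# The data from the question, entered row by row.
x <- matrix(c(1,1,0,1,1,1,1,1,
              1,0,1,0,1,0,1,1,
              1,1,1,1,1,1,1,0,
              1,1,1,0,1,1,1,0,
              0,1,0,0,1,0,1,0,
              0,0,1,0,0,0,0,0,
              1,1,1,0,1,1,1,1,
              1,1,0,0,0,0,0,0,
              1,0,1,0,0,1,0,0,
              1,1,1,0,0,0,0,1,
              1,0,0,0,1,1,1,1,
              1,1,1,0,1,1,1,1,
              0,1,1,1,1,1,1,1,
              1,1,1,0,1,1,1,1),
            ncol = 8, byrow = TRUE,
            dimnames = list(LETTERS[1:14], paste0("T", 1:8)))

# Hypothetical cluster membership for the 8 columns (an assumption for
# illustration; replace it with the labels from your partitioning method).
col_cluster <- c(1, 1, 2, 3, 1, 2, 1, 3)

# Put columns of the same cluster next to each other.
ord <- order(col_cluster)
x_ord <- x[, ord]

# One colour per cluster, in the reordered column order.
palette_k <- c("tomato", "steelblue", "gold")
side_cols <- palette_k[as.integer(factor(col_cluster[ord]))]

# Rowv = NA and Colv = NA suppress both dendrograms and any reordering.
heatmap(x_ord, Rowv = NA, Colv = NA,
        ColSideColors = side_cols,
        col = colorRampPalette(c("red", "green"))(256),
        scale = "none")
```

The same idea applies to rows via RowSideColors if a row partition is available.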

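If a simpler solution is acceptable, just multiply the original matrix by the weights so that the weighted columns count for more. I am not 100% sure this is statistically sound, but depending on what you want to achieve, it may work.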

# Create the dataset
dataset <- matrix(
  dimnames = list(LETTERS[seq( from = 1, to = 14 )], c("T1","T2","T3","T4","T5","T6","T7","T8")),
  data = c(1,1,0,1,1,1,1,1,
           1,0,1,0,1,0,1,1,
           1,1,1,1,1,1,1,0,
           1,1,1,0,1,1,1,0,
           0,1,0,0,1,0,1,0,
           0,0,1,0,0,0,0,0,
           1,1,1,0,1,1,1,1,
           1,1,0,0,0,0,0,0,
           1,0,1,0,0,1,0,0,
           1,1,1,0,0,0,0,1,
           1,0,0,0,1,1,1,1,
           1,1,1,0,1,1,1,1,
           0,1,1,1,1,1,1,1,
           1,1,1,0,1,1,1,1),
  ncol=8,
  nrow=14,
  byrow=TRUE)  # the values above are listed row by row

# Assign weights to the different columns
col.weights <- c(2,3,1,1,1,1,1,1)

# Transform the original matrix with the weights
# you want to assign to each column.
create.weights.matrix <- function(weights, rows) {
  sapply(weights, function(x){rep(x, rows)})
}
weights.matrix <- create.weights.matrix(col.weights, nrow(dataset))
d.weighted <- weights.matrix * dataset

# Create the plot
cluster_colors <- colorRampPalette(c("red", "green"))(256);
heatmap(as.matrix(d.weighted), 
        col=cluster_colors,
        keep.dendro = F,
        margins=c(10,16),
        scale="none")
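A variation on the idea above, sketched under stated assumptions: apply the weights only when computing the distances, build the column dendrogram yourself, and pass it to heatmap through Colv so that the plot still shows the original 0/1 values. The weights in row_w are hypothetical (here three rows are assumed to stem from one multiple-choice question). Scaling row i by sqrt(row_w[i]) means the squared Euclidean distance between two columns equals the weighted count of rows in which they differ. Whether that is the right weighting for your analysis is a judgment call.

```r
# The data from the question, entered row by row.
x <- matrix(c(1,1,0,1,1,1,1,1,
              1,0,1,0,1,0,1,1,
              1,1,1,1,1,1,1,0,
              1,1,1,0,1,1,1,0,
              0,1,0,0,1,0,1,0,
              0,0,1,0,0,0,0,0,
              1,1,1,0,1,1,1,1,
              1,1,0,0,0,0,0,0,
              1,0,1,0,0,1,0,0,
              1,1,1,0,0,0,0,1,
              1,0,0,0,1,1,1,1,
              1,1,1,0,1,1,1,1,
              0,1,1,1,1,1,1,1,
              1,1,1,0,1,1,1,1),
            ncol = 8, byrow = TRUE,
            dimnames = list(LETTERS[1:14], paste0("T", 1:8)))

# Hypothetical row weights: rows A, B and C are assumed to come from a single
# multiple-choice question, so each gets one third of a full weight.
row_w <- setNames(rep(1, nrow(x)), rownames(x))
row_w[c("A", "B", "C")] <- 1 / 3

# Recycling runs down the columns, so row i is multiplied by sqrt(row_w[i]).
x_scaled <- x * sqrt(row_w)

# Cluster the columns on the weighted data only.
col_dend <- as.dendrogram(hclust(dist(t(x_scaled))))

# Plot the unweighted matrix, reusing the weighted column dendrogram.
heatmap(x, Colv = col_dend,
        col = colorRampPalette(c("red", "green"))(256),
        margins = c(10, 16),
        scale = "none")
```

The row dendrogram is still computed by heatmap's defaults on the unweighted matrix; pass a dendrogram to Rowv as well if the rows need their own treatment.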

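I'm voting to close this question because it is about how to use R without a reproducible example.

Sorry about that (I'm somewhat new to R and to how things are done here); the code example is now fully self-contained and reproducible!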