How do I plot the within-cluster sum of squares (WSS) in R?


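I have a cluster plot in R and I want to use a WSS plot to optimise the clustering with the "elbow criterion", but I don't know how to draw a WSS plot for a given clustering. Can anyone help?

Here is my data: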

Friendly<-c(0.467,0.175,0.004,0.025,0.083,0.004,0.042,0.038,0,0.008,0.008,0.05,0.096)
Polite<-c(0.117,0.55,0,0,0.054,0.017,0.017,0.017,0,0.017,0.008,0.104,0.1)
Praising<-c(0.079,0.046,0.563,0.029,0.092,0.025,0.004,0.004,0.129,0,0,0,0.029)
Joking<-c(0.125,0.017,0.054,0.383,0.108,0.054,0.013,0.008,0.092,0.013,0.05,0.017,0.067)
Sincere<-c(0.092,0.088,0.025,0.008,0.383,0.133,0.017,0.004,0,0.063,0,0,0.188)
Serious<-c(0.033,0.021,0.054,0.013,0.2,0.358,0.017,0.004,0.025,0.004,0.142,0.021,0.108)
Hostile<-c(0.029,0.004,0,0,0.013,0.033,0.371,0.363,0.075,0.038,0.025,0.004,0.046)
Rude<-c(0,0.008,0,0.008,0.017,0.075,0.325,0.313,0.004,0.092,0.063,0.008,0.088)
Blaming<-c(0.013,0,0.088,0.038,0.046,0.046,0.029,0.038,0.646,0.029,0.004,0,0.025)
Insincere<-c(0.075,0.063,0,0.013,0.096,0.017,0.021,0,0.008,0.604,0.004,0,0.1)
Commanding<-c(0,0,0,0,0,0.233,0.046,0.029,0.004,0.004,0.538,0,0.146)
Suggesting<-c(0.038,0.15,0,0,0.083,0.058,0,0,0,0.017,0.079,0.133,0.442)
Neutral<-c(0.021,0.075,0.017,0,0.033,0.042,0.017,0,0.033,0.017,0.021,0.008,0.717)

data <- data.frame(Friendly,Polite,Praising,Joking,Sincere,Serious,Hostile,Rude,Blaming,Insincere,Commanding,Suggesting,Neutral)

If I follow what you are asking, then we need a function to compute the WSS:

wss <- function(d) {
  sum(scale(d, scale = FALSE)^2)
}
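A wrapper function, wrap, applies wss() to each cluster. It takes the following arguments and inputs:

  • i
    the number of clusters to cut the data into
  • hc
    the hierarchical cluster analysis object
  • x
    the original data

wrap then cuts the dendrogram into i clusters, splits the original data according to the cluster membership given by cl, and computes the WSS for each cluster. These WSS values are summed to give the WSS for that clustering.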
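The definition of wrap does not appear in the text above. A minimal sketch that matches the description, assuming the tree is cut with cutree() and the rows are grouped with split(), might look like this (the body is a reconstruction, not necessarily the original code):

```r
# Sum of squared deviations from the column means for one group of rows
wss <- function(d) {
  sum(scale(d, scale = FALSE)^2)
}

# Assumed reconstruction of the wrapper described above
wrap <- function(i, hc, x) {
  cl <- cutree(hc, k = i)   # cluster membership when the tree is cut into i clusters
  spl <- split(x, cl)       # one data frame per cluster
  sum(sapply(spl, wss))     # total WSS: sum of the per-cluster WSS values
}
```

With i = 1 this returns the total sum of squares of the data, and with i equal to the number of rows every cluster holds a single observation, so the result is 0.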

We run all of this using sapply over the numbers of clusters 1, 2, ..., nrow(data).

Here is an example using the famous Edgar Anderson iris data set:

iris2 <- iris[, 1:4]  # drop Species column
cl <- hclust(dist(iris2), method = "ward.D")

## Takes a little while as we evaluate all implied clustering up to 150 groups
res <- sapply(seq.int(1, nrow(iris2)), wrap, hc = cl, x = iris2)
plot(seq_along(res), res, type = "b", pch = 19)

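You can speed up the main computation step either by running sapply() through an appropriate parallel alternative, or by doing the computation for fewer than nrow(data) clusters, e.g.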

res <- sapply(seq.int(1, 50), wrap, hc = cl, x = iris2) ## 1st 50 groups
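The parallel option mentioned above could be sketched with the parallel package that ships with R. This is only one possible approach. The worker count of 2, the names workers and res_par, and the body of wrap are assumptions made for illustration.

```r
library(parallel)

# Same helper as in the answer
wss <- function(d) {
  sum(scale(d, scale = FALSE)^2)
}

# Assumed wrapper: cut the tree into i clusters and sum the per-cluster WSS
wrap <- function(i, hc, x) {
  cl <- cutree(hc, k = i)
  sum(sapply(split(x, cl), wss))
}

iris2 <- iris[, 1:4]
cl <- hclust(dist(iris2), method = "ward.D")

workers <- makeCluster(2)                               # 2 workers is an arbitrary choice
clusterExport(workers, "wss", envir = environment())    # wrap() needs wss() on each worker
res_par <- parSapply(workers, seq.int(1, 50), wrap, hc = cl, x = iris2)
stopCluster(workers)                                    # always release the workers

plot(seq_along(res_par), res_par, type = "b", pch = 19)
```

For a data set as small as iris the cost of starting the workers probably outweighs the gain, so this is more likely to pay off with many observations or many candidate cluster counts.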

For the data in the question, the same steps are:

cl_data <- hclust(dist(data), method = "ward.D")  # assumes the same Ward clustering as in the iris example
res_data <- sapply(seq.int(1, nrow(data)), wrap, hc = cl_data, x = data)
plot(seq_along(res_data), res_data, type = "b", pch = 19)

With the iris result, you can take a closer look at the first 50 clusterings:

plot(seq_along(res[1:50]), res[1:50], type = "o", pch = 19)

Thanks! But why are the values on the y-axis so large when the values in my data are very small? Also, could you take a look at my other question about the WSS plot?

The values on the y-axis are determined by the scale of the variables in the data. I'll take a look at the other question.