R 秩kmeans输出_R_Cluster Analysis_K Means

R 秩kmeans输出

R 秩kmeans输出,r,cluster-analysis,k-means,R,Cluster Analysis,K Means,我试图找到一种方法来对Kmeans的输出进行排序。我看到了如下一些例子，其中一些人对集群距离内的排名感兴趣： x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2), matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2)) colnames(x) <- c("x", "y") cl <- kmeans(x, 2) x &l

我试图找到一种方法来对Kmeans的输出进行排序。我看到了如下一些例子，其中一些人对集群距离内的排名感兴趣：

       x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
       matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
       colnames(x) <- c("x", "y")
       cl <- kmeans(x, 2)
       x <- cbind(x,cl = cl$cluster)
       #Function to apply to each cluster to 
       # do the ordering
      orderCluster <- function(i,data,centers){
      #Extract cluster and center
      dt <- data[data[,3] == i,]
      ct <- centers[i,]
      #Calculate distances
     dt <- cbind(dt,dist = apply((dt[,1:2] - ct)^2,1,sum))
      #Sort
     dt[order(dt[,4]),]
       }
      do.call(rbind,lapply(sort(unique(cl$cluster)),orderCluster,data = x,centers = cl$centers))

除了前一个，我感兴趣的是根据受试者之间的距离对总体输出进行排名。知道怎么做吗？

感谢

以下代码基本上是您发布的，但出于显示目的稍微小一些，它构建了一个数据框，其中距离在每个簇内进行排序：

set.seed(144)
x <- rbind(matrix(rnorm(16, sd = 0.3), ncol = 2),
           matrix(rnorm(16, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")
cl <- kmeans(x, 2)
x <- cbind(x,cl = cl$cluster)
#Function to apply to each cluster to do the ordering
orderCluster <- function(i,data,centers){
  #Extract cluster and center
  dt <- data[data[,3] == i,]
  ct <- centers[i,]
  #Calculate distances
  dt <- cbind(dt,dist = apply((dt[,1:2] - ct)^2,1,sum))

  #Sort
  dt[order(dt[,4]),]
}
mat = do.call(rbind,lapply(sort(unique(cl$cluster)),orderCluster,data = x,centers = cl$centers))
mat
#                 x            y cl        dist
#  [1,] -0.14216517 -0.176358815  1 0.006587057
#  [2,] -0.42633630 -0.133045170  1 0.051628451
#  [3,]  0.04766625 -0.303396617  1 0.087914240
#  [4,]  0.18084317  0.293752160  1 0.131882291
#  [5,] -0.37685063  0.004506463  1 0.132548458
#  [6,] -0.49516685  0.044300123  1 0.145771677
#  [7,]  0.03892986  0.091644147  1 0.157442636
#  [8,] -0.53928391  0.072918569  1 0.284171983
#  [9,]  0.84816448  0.849422864  2 0.012406249
# [10,]  1.11092777  0.940139924  2 0.017545334
# [11,]  1.07090647  1.323721146  2 0.067496535
# [12,]  0.97112156  0.797174195  2 0.079830832
# [13,]  0.61441906  0.972293074  2 0.100053503
# [14,]  1.07548106  1.283839079  2 0.148824424
# [15,]  0.57826111  0.747899273  2 0.335880701
# [16,]  1.15112709  1.597604041  2 0.498957213

以下是通过减少距离对它们进行排序的方法：

mat[order(mat[,"dist"]),]
#                 x            y cl        dist
#  [1,] -0.14216517 -0.176358815  1 0.006587057
#  [2,]  0.84816448  0.849422864  2 0.012406249
#  [3,]  1.11092777  0.940139924  2 0.017545334
#  [4,] -0.42633630 -0.133045170  1 0.051628451
#  [5,]  1.07090647  1.323721146  2 0.067496535
#  [6,]  0.97112156  0.797174195  2 0.079830832
#  [7,]  0.04766625 -0.303396617  1 0.087914240
#  [8,]  0.61441906  0.972293074  2 0.100053503
#  [9,]  0.18084317  0.293752160  1 0.131882291
# [10,] -0.37685063  0.004506463  1 0.132548458
# [11,] -0.49516685  0.044300123  1 0.145771677
# [12,]  1.07548106  1.283839079  2 0.148824424
# [13,]  0.03892986  0.091644147  1 0.157442636
# [14,] -0.53928391  0.072918569  1 0.284171983
# [15,]  0.57826111  0.747899273  2 0.335880701
# [16,]  1.15112709  1.597604041  2 0.498957213

mat[order(mat[,"dist"], decreasing=T),]
#                 x            y cl        dist
#  [1,]  1.15112709  1.597604041  2 0.498957213
#  [2,]  0.57826111  0.747899273  2 0.335880701
#  [3,] -0.53928391  0.072918569  1 0.284171983
#  [4,]  0.03892986  0.091644147  1 0.157442636
#  [5,]  1.07548106  1.283839079  2 0.148824424
#  [6,] -0.49516685  0.044300123  1 0.145771677
#  [7,] -0.37685063  0.004506463  1 0.132548458
#  [8,]  0.18084317  0.293752160  1 0.131882291
#  [9,]  0.61441906  0.972293074  2 0.100053503
# [10,]  0.04766625 -0.303396617  1 0.087914240
# [11,]  0.97112156  0.797174195  2 0.079830832
# [12,]  1.07090647  1.323721146  2 0.067496535
# [13,] -0.42633630 -0.133045170  1 0.051628451
# [14,]  1.11092777  0.940139924  2 0.017545334
# [15,]  0.84816448  0.849422864  2 0.012406249
# [16,] -0.14216517 -0.176358815  1 0.006587057

你的意思是对整体输出进行排序？我试图根据所有受试者之间的关系同时对其进行排序-这有意义吗？你的意思是你试图根据受试者与指定集群中心的距离对所有受试者进行排序吗？是的，这就是我需要做的。。。。。谢谢