K-modes clustering in R for categorical data with NAs


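Recoding every missing value as one extra category, "NA", is flawed: by default, kmodes uses the simple-matching distance to determine the dissimilarity between two objects, so an NA in one object is counted as a match with an NA in another.

Another idea is to treat every NA as distinct. In my data, x has 8 NAs, so I can treat them as 8 different categories.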
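To make the difference concrete, here is a minimal sketch of the simple-matching dissimilarity on two made-up objects. The helper simple_matching() and the example vectors are assumptions written for illustration; they are not part of kmodes, which computes this distance internally.

```r
# Simple-matching dissimilarity: the number of attributes on which two objects differ.
# (Hypothetical helper for illustration; not part of any package.)
simple_matching <- function(a, b) sum(a != b)

# Two objects whose x value is missing, recoded with the shared label "NA".
shared_a <- c(x = "NA", y = "G")
shared_b <- c(x = "NA", y = "H")
simple_matching(shared_a, shared_b)   # 1: the two "NA" strings count as a match on x

# The same objects when each missing value gets its own label.
unique_a <- c(x = "NA1", y = "G")
unique_b <- c(x = "NA2", y = "H")
simple_matching(unique_a, unique_b)   # 2: the missing x values no longer match
```

With a shared label, two objects look more similar just because both are missing x; with unique labels, a missing value never contributes to similarity.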

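library(klaR)  # provides kmodes()
# Approach 1: recode every missing value as the single category "NA"
dat[c(1:5,9,17,20), 1] <- "NA"; dat[c(8,11), 2] <- "NA"
(cl <- kmodes(dat, modes = dat[c(6,7), ]))  # rows 6 and 7 are the initial modes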
K-modes clustering with 2 clusters of sizes 11, 9

Cluster modes:
  x    y  
1 "NA" "H"
2 "a"  "G"

Clustering vector:
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 
 1  1  1  1  1  1  2  2  1  2  1  2  2  2  1  2  1  2  2  1 

Within cluster simple-matching distance by cluster:
[1] 10  4

Available components:
[1] "cluster"    "size"       "modes"      "withindiff" "iterations" "weighted"


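# Approach 2: give every missing value its own category ("NA1", "NA2", ...)
dat[c(1:5,9,17,20), 1] <- paste0("NA", 1:8); dat[c(8,11), 2] <- paste0("NA", 1:2)
(cl <- kmodes(dat, modes = dat[c(6,7), ]))  # rows 6 and 7 are the initial modes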
K-modes clustering with 2 clusters of sizes 10, 10

Cluster modes:
  x   y  
1 "c" "H"
2 "a" "G"

Clustering vector:
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 
 1  1  2  1  1  1  1  2  2  1  1  2  2  2  1  2  1  2  2  2 

Within cluster simple-matching distance by cluster:
[1] 13  5

Available components:
[1] "cluster"    "size"       "modes"      "withindiff" "iterations" "weighted"
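The row indices above are hard-coded for this particular data set. When the missing values are real NA entries rather than known positions, the same recoding could be sketched as below. The helper label_missing() and the toy data frame are hypothetical and only illustrate the idea; the sketch also assumes that no genuine category is already called "NA1", "NA2", and so on.

```r
# Hypothetical helper: replace each real NA in a column with a unique label.
label_missing <- function(column, prefix = "NA") {
  column <- as.character(column)
  missing <- is.na(column)
  # One distinct label per missing entry: prefix followed by 1, 2, ...
  column[missing] <- paste0(prefix, seq_len(sum(missing)))
  column
}

# Made-up toy data, only to show the recoding.
toy <- data.frame(x = c("a", NA, "b", NA),
                  y = c("G", "H", NA, "G"),
                  stringsAsFactors = FALSE)
toy[] <- lapply(toy, label_missing)   # recode every column, keep the data frame shape
toy$x   # "a" "NA1" "b" "NA2"
toy$y   # "G" "H" "NA1" "G"
```

Labels can repeat across columns ("NA1" appears in both x and y here) because simple matching compares objects one attribute at a time. The recoded data can then be passed to kmodes in the same way as dat above.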