R: Error when using the 'mclust' mixture model estimation package


When I try to perform univariate mixture model estimation, the R package mclust produces the following output, followed by an error:

----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm 
----------------------------------------------------

Mclust V (univariate, unequal variance) model with 9 components:

    log.likelihood     n               df               BIC              ICL
 -25576.3596860173 53940 26.0000000000001 -51436.0056895487 -84761.892259584

Clustering table:
    1     2     3     4     5     6     7     8     9 
 3081  8923  8177  3017  8742  6568 10378  3650  1404 
Error in object[as.character(G), modelNames, drop = FALSE] : 
  incorrect number of dimensions
Reproducible example:

library(mclust)

set.seed(12345) # for reproducibility

data(diamonds, package='ggplot2')  # use built-in data
myData <- log10(diamonds$price)

# determine number of components
mc <- Mclust(myData)
print(summary(mc))
> str(mc)
List of 15
 $ call          : language Mclust(data = myData)
 $ data          : num [1:53940, 1] 2.51 2.51 2.51 2.52 2.53 ...
 $ modelName     : chr "V"
 $ n             : int 53940
 $ d             : num 1
 $ G             : int 9
 $ BIC           : num [1:9, 1:2] -64689 -57330 -54802 -52807 -52721 ...
 ..- attr(*, "dimnames")=List of 2
 .. ..$ : chr [1:9] "1" "2" "3" "4" ...
 .. ..$ : chr [1:2] "E" "V"
 ..- attr(*, "G")= num [1:9] 1 2 3 4 5 6 7 8 9
 ..- attr(*, "modelNames")= chr [1:2] "E" "V"
 ..- attr(*, "oneD")= logi TRUE
 $ bic           : num -51436
 $ loglik        : num -25576
 $ df            : num 26
 $ hypvol        : num NA
 $ parameters    :List of 4
 ..$ Vinv    : NULL
 ..$ pro     : num [1:9] 0.0606 0.1558 0.1628 0.0411 0.1655 ...
 ..$ mean    : Named num [1:9] 2.7 2.86 3.03 3.24 3.39 ...
 .. ..- attr(*, "names")= chr [1:9] "1" "2" "3" "4" ...
 ..$ variance:List of 5
 .. ..$ modelName: chr "V"
 .. ..$ d        : num 1
 .. ..$ G        : int 9
 .. ..$ sigmasq  : num [1:9] 0.004915 0.006932 0.009595 0.000833 0.008896 ...
 .. ..$ scale    : num [1:9] 0.004915 0.006932 0.009595 0.000833 0.008896 ...
 $ classification: num [1:53940] 1 1 1 1 1 1 1 1 1 1 ...
 $ uncertainty   : num [1:53940] 0.0132 0.0132 0.0134 0.015 0.0152 ...
 $ z             : num [1:53940, 1:9] 0.987 0.987 0.987 0.985 0.985 ...
 ..- attr(*, "dimnames")=List of 2
 .. ..$ : NULL
 .. ..$ : NULL
 - attr(*, "class")= chr "Mclust"

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rebmix_2.6.1       ddst_1.03          evd_2.3-0          orthopolynom_1.0-5
 [5] polynom_1.3-8      mixtools_1.0.2     segmented_0.4-0.0  boot_1.3-11       
 [9] mclust_4.3         MASS_7.3-34       

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.1      colorspace_1.2-4 digest_0.6.4     ggplot2_1.0.0   
 [5] grid_3.1.1       gtable_0.1.2     munsell_0.4.2    plyr_1.8.1      
 [9] proto_0.3-10     reshape2_1.4     scales_0.2.4     stringr_0.6.2   
[13] tools_3.1.1     

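I have figured out what the problem was. The error message was produced by a different mclust function, mclustModel(), which my code was executing right after the call to Mclust(). The error occurs because mclustModel() expects a different type of object as its second argument. When using mclustModel() (rather than Mclust()), the correct sequence of calls is as follows: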

mc <- mclustBIC(myData)
print(summary(mc, myData, parameters = TRUE))
plot(mc)

bestModel <- mclustModel(myData, mc)
print(summary(bestModel, myData))
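
The wording of the error can be reproduced in base R, without mclust. The sketch below assumes (based on the traceback line `object[as.character(G), modelNames, drop = FALSE]`) that mclustModel() indexes its second argument like a matrix with rows named by number of components and columns named by model. The objects `bic_like` and `fit_like` are hypothetical stand-ins with dummy values, not real mclust objects.

```r
# Stand-in for a BIC table: rows = number of components, columns = model names.
# The values are placeholders for illustration, not real BIC values.
bic_like <- matrix(seq_len(18), nrow = 9,
                   dimnames = list(as.character(1:9), c("E", "V")))

# Matrix-style indexing works on a matrix and returns a 1 x 1 matrix.
ok <- bic_like["9", "V", drop = FALSE]

# Stand-in for a fitted model: a plain list, which has no dimensions.
fit_like <- list(modelName = "V", G = 9L)

# The same two-index subscript on a list fails; capture the message.
msg <- tryCatch(fit_like["9", "V", drop = FALSE],
                error = function(e) conditionMessage(e))
print(msg)
```

If the assumption holds, this is why passing the result of Mclust() where the result of mclustBIC() is expected ends in a dimension error rather than a clearer type error.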

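I am surprised that this question was migrated from Cross Validated (CV), since the problem is unlikely to be related to the R language or the mclust package itself, and more likely to be related to the data or to its statistical aspects. Strangely, I did not see any comments on this from anyone on CV. Another issue, possibly worth posting as a separate question, is how to automatically determine the minimum number of mixture components that account for an arbitrarily large share (say, 95%) of the distribution. I did not see such a cutoff parameter in the mclust package.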
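
One possible way to approach the 95% question by hand is to rank the components by weight and take the smallest set whose cumulative weight reaches the target share. The following is only a sketch in base R: `min_components_for_share` is a hypothetical helper, not part of mclust, and it treats "share of the distribution" as the sum of mixing weights, which is an assumption. The counts are taken from the clustering table above and serve as a rough proxy; the estimated mixing proportions in `mc$parameters$pro` could be passed instead.

```r
# Hypothetical helper (not an mclust function): smallest number of components
# whose combined weight reaches at least `share` of the total.
min_components_for_share <- function(weights, share = 0.95) {
  stopifnot(is.numeric(weights), all(weights >= 0), share > 0, share <= 1)
  w <- weights / sum(weights)            # normalise to proportions
  ord <- order(w, decreasing = TRUE)     # heaviest components first
  cum <- cumsum(w[ord])
  k <- which(cum >= share - 1e-12)[1]    # small tolerance for rounding error
  list(k = k, components = ord[seq_len(k)], covered = cum[k])
}

# Cluster sizes from the clustering table in the question (hard assignments,
# used here as a proxy for the mixing proportions).
counts <- c(3081, 8923, 8177, 3017, 8742, 6568, 10378, 3650, 1404)
res <- min_components_for_share(counts, share = 0.95)
print(res)
```

The result lists how many components are needed, which ones they are (by index), and the share they actually cover. Whether dropping the remaining components is statistically sensible is a separate question that this sketch does not address.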