scikit-learn GaussianMixture terminates immediately after initialization
I'm using GaussianMixture to cluster feature vectors (8 dimensions). I set up GaussianMixture as:
from sklearn.mixture import GaussianMixture

gm = GaussianMixture(n_components=class_no,
                     tol=1e-6,
                     covariance_type="spherical",
                     init_params="random",
                     verbose=2, verbose_interval=1)
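I have about 10 million sample vectors, and class_no is 100. The GaussianMixture fit terminates early, right after initialization.
If I reduce the number of samples to 1 to 2 million, the fit runs normally.
What could be the cause?
Also, in the first iteration (when it does not terminate early) I see "ll change inf". Is this normal?

After more experiments, I may have found the problem. When I run the GM with 1M samples, I get the ll changes shown below for the first 30 iterations. The general trend is that the ll change starts out very small, gradually rises to a peak, and then converges toward zero. As the sample size grows, the ll change in the first few steps gets smaller and smaller. With 10M samples, the ll change at iteration 2 is already below my threshold of 1e-6. If I lower the threshold to 1e-7, the fit runs to completion just as it does with smaller sample sizes. It seems the ll change should somehow be normalized with respect to the sample size.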
Initialization 0
Iteration 1 time lapse 10.15743s ll change inf
Iteration 2 time lapse 7.76352s ll change 0.00001
Iteration 3 time lapse 7.83711s ll change 0.00006
Iteration 4 time lapse 7.83133s ll change 0.00044
Iteration 5 time lapse 8.18798s ll change 0.00317
Iteration 6 time lapse 7.78111s ll change 0.02268
Iteration 7 time lapse 7.95682s ll change 0.13413
Iteration 8 time lapse 7.87189s ll change 0.38677
Iteration 9 time lapse 7.75204s ll change 0.53651
Iteration 10 time lapse 7.73964s ll change 0.46236
Iteration 11 time lapse 7.75558s ll change 0.55855
Iteration 12 time lapse 7.75457s ll change 0.57340
Iteration 13 time lapse 7.77386s ll change 0.21811
Iteration 14 time lapse 7.75011s ll change 0.09917
Iteration 15 time lapse 7.78765s ll change 0.06162
Iteration 16 time lapse 7.81858s ll change 0.04783
Iteration 17 time lapse 7.76057s ll change 0.04079
Iteration 18 time lapse 7.73551s ll change 0.03687
Iteration 19 time lapse 7.82454s ll change 0.03416
Iteration 20 time lapse 7.78091s ll change 0.02830
Iteration 21 time lapse 7.78215s ll change 0.02189
Iteration 22 time lapse 7.75392s ll change 0.01775
Iteration 23 time lapse 7.77399s ll change 0.01523
Iteration 24 time lapse 7.73693s ll change 0.01342
Iteration 25 time lapse 7.74950s ll change 0.01205
Iteration 26 time lapse 7.73767s ll change 0.01107
Iteration 27 time lapse 7.79130s ll change 0.01021
Iteration 28 time lapse 7.76402s ll change 0.00916
Iteration 29 time lapse 7.77638s ll change 0.00799
Iteration 30 time lapse 7.76722s ll change 0.00695
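The slow start in the log above is consistent with how random initialization behaves on large samples. The sketch below assumes that init_params="random" draws uniform random responsibilities and normalizes them per sample. That is my understanding of scikit-learn's behaviour, not something stated in the question. Under that assumption, every initial component mean lands close to the global mean, and more tightly so as the sample grows, so the first EM steps barely change the likelihood. The helper initial_mean_spread and the synthetic Gaussian data are hypothetical and for illustration only.

```python
import numpy as np


def initial_mean_spread(n_samples, n_components=10, n_features=8, seed=0):
    """Spread of the component means right after a random-responsibility init.

    Assumption: this mimics init_params="random" by drawing uniform random
    responsibilities and normalizing each row to sum to 1.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))  # synthetic stand-in data

    # Random soft assignment of every sample to every component.
    resp = rng.uniform(size=(n_samples, n_components))
    resp /= resp.sum(axis=1, keepdims=True)

    # Responsibility-weighted means, as in the M-step of EM.
    nk = resp.sum(axis=0)
    means = resp.T @ X / nk[:, None]

    # Standard deviation across components, averaged over features.
    return means.std(axis=0).mean()


spread_small = initial_mean_spread(1_000)
spread_large = initial_mean_spread(100_000)
print(f"spread with   1,000 samples: {spread_small:.5f}")
print(f"spread with 100,000 samples: {spread_large:.5f}")
```

If the spread shrinks as the sample grows, the components start out nearly identical, which would explain why the early ll changes fall below a fixed tol on the 10M-sample run.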
What error do you get when it terminates? Have you managed to reproduce the problem with a smaller amount of data?
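One way to check what happened is to inspect the fitted estimator instead of looking for an error. As far as I know, a fit that stops because the ll change dropped below tol raises no exception and simply reports convergence. A minimal sketch on small synthetic data follows. The data, the three components and the helper fit_and_report are assumptions for illustration, not the original setup.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in: three well-separated blobs in 8 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 8))
X = np.vstack([c + rng.normal(size=(2000, 8)) for c in centers])


def fit_and_report(tol):
    """Fit with the given tolerance and return convergence diagnostics."""
    gm = GaussianMixture(n_components=3,
                         tol=tol,
                         covariance_type="spherical",
                         init_params="random",
                         max_iter=500,
                         random_state=0)  # same init for every tol
    gm.fit(X)
    return gm.converged_, gm.n_iter_, gm.lower_bound_


results = {}
for tol in (1e-3, 1e-6, 1e-9):
    converged, n_iter, lower_bound = fit_and_report(tol)
    results[tol] = (converged, n_iter, lower_bound)
    print(f"tol={tol:g}: converged={converged}, "
          f"n_iter={n_iter}, lower_bound={lower_bound:.4f}")
```

A very small n_iter_ together with converged_ equal to True and a poor lower_bound_ would point to a premature stop like the one described in the question. Lowering tol, or trying init_params="kmeans", are the obvious things to test next.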