scikit-learn: GaussianMixture terminates immediately after initialization

I am using GaussianMixture to cluster 8-dimensional feature vectors. I set up GaussianMixture as follows:

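from sklearn.mixture import GaussianMixture

class_no = 100  # number of mixture components (100 in my case)
gm = GaussianMixture(n_components=class_no,
                     tol=1e-6,
                     covariance_type="spherical",
                     init_params="random",
                     verbose=2, verbose_interval=1)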
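
I have about 10 million sample vectors, and the number of classes is 100. The GaussianMixture fit terminates early, right after initialization.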

If I reduce the number of samples to 1 to 2 million, the fit runs normally.

What could be the cause?


Also, in the first iteration (when it does not terminate early) I see "ll change inf". Is that normal?
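On the "ll change inf" question: as far as I can tell, the reported change is the current lower bound minus the previous one, and before the first iteration the previous value is negative infinity, so the first difference is infinite by construction. The following is only a sketch of that bookkeeping, not the library's actual code; the function name `em_ll_changes` and the lower-bound values are made up for illustration.

```python
import math

def em_ll_changes(lower_bounds):
    """Return the per-iteration 'll change' for a sequence of lower bounds.

    Assumption: the previous lower bound starts at -inf, because no
    lower bound exists before the first EM iteration.
    """
    previous = -math.inf
    changes = []
    for current in lower_bounds:
        changes.append(current - previous)  # first entry is always inf
        previous = current
    return changes

# Hypothetical lower-bound values, chosen only to illustrate the pattern.
changes = em_ll_changes([-12.5, -12.4, -12.1])
print(changes[0])  # inf
```

If that reading is right, the inf on iteration 1 is harmless and carries no information about the fit.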

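After more experiments, I may have found the problem. The log at the end of this post shows the ll change for the first 30 iterations when I run the GM on 1M samples. The general trend is that the ll change starts out very small, gradually rises to a peak, and then converges toward zero. As the sample size grows, the ll change in the first few steps becomes smaller and smaller. With 10M samples, the ll change at iteration 2 is already below my threshold of 1e-6, so the fit stops there. If I lower the threshold to 1e-7, the fit proceeds and finishes the same way it does with the smaller sample size.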

It seems the ll change should somehow be normalized with respect to the sample size.
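The effect can be checked by replaying the logged ll changes against a stopping rule of the form abs(change) < tol, which to my knowledge is the rule the fit applies. This is a sketch under that assumption; `first_stop_iteration` is a hypothetical helper, and the 1e-4 tolerance is chosen only to make the early stop visible with the 1M-sample numbers.

```python
import math

# ll changes for iterations 1-8, copied from the 1M-sample log in this post.
LL_CHANGES = [math.inf, 0.00001, 0.00006, 0.00044,
              0.00317, 0.02268, 0.13413, 0.38677]

def first_stop_iteration(changes, tol):
    """Return the 1-based iteration where abs(change) < tol first holds.

    Returns None if no change in the sequence falls below tol.
    """
    for iteration, change in enumerate(changes, start=1):
        if abs(change) < tol:
            return iteration
    return None

print(first_stop_iteration(LL_CHANGES, tol=1e-6))  # None
print(first_stop_iteration(LL_CHANGES, tol=1e-4))  # 2
```

With tol=1e-6 none of these early changes triggers a stop, but any tolerance above the iteration-2 change ends the run before the real improvement begins at iterations 6 to 12. The 10M-sample run hits the same situation with tol=1e-6, because its early changes are smaller.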

Initialization 0
  Iteration 1    time lapse 10.15743s    ll change inf
  Iteration 2    time lapse 7.76352s     ll change 0.00001
  Iteration 3    time lapse 7.83711s     ll change 0.00006
  Iteration 4    time lapse 7.83133s     ll change 0.00044
  Iteration 5    time lapse 8.18798s     ll change 0.00317
  Iteration 6    time lapse 7.78111s     ll change 0.02268
  Iteration 7    time lapse 7.95682s     ll change 0.13413
  Iteration 8    time lapse 7.87189s     ll change 0.38677
  Iteration 9    time lapse 7.75204s     ll change 0.53651
  Iteration 10   time lapse 7.73964s     ll change 0.46236
  Iteration 11   time lapse 7.75558s     ll change 0.55855
  Iteration 12   time lapse 7.75457s     ll change 0.57340
  Iteration 13   time lapse 7.77386s     ll change 0.21811
  Iteration 14   time lapse 7.75011s     ll change 0.09917
  Iteration 15   time lapse 7.78765s     ll change 0.06162
  Iteration 16   time lapse 7.81858s     ll change 0.04783
  Iteration 17   time lapse 7.76057s     ll change 0.04079
  Iteration 18   time lapse 7.73551s     ll change 0.03687
  Iteration 19   time lapse 7.82454s     ll change 0.03416
  Iteration 20   time lapse 7.78091s     ll change 0.02830
  Iteration 21   time lapse 7.78215s     ll change 0.02189
  Iteration 22   time lapse 7.75392s     ll change 0.01775
  Iteration 23   time lapse 7.77399s     ll change 0.01523
  Iteration 24   time lapse 7.73693s     ll change 0.01342
  Iteration 25   time lapse 7.74950s     ll change 0.01205
  Iteration 26   time lapse 7.73767s     ll change 0.01107
  Iteration 27   time lapse 7.79130s     ll change 0.01021
  Iteration 28   time lapse 7.76402s     ll change 0.00916
  Iteration 29   time lapse 7.77638s     ll change 0.00799
  Iteration 30   time lapse 7.76722s     ll change 0.00695

When it terminates, what error do you get? Have you managed to reproduce the problem with a smaller amount of data?