Python - Multi-peak fitting - Histogram

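I want to fit my data with a sum of Gaussian functions, but the program does not converge. I don't know whether the problem is in my code or in my data.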

import math
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# My function: sum of two Gaussians, each scaled by an area parameter (A, A1)
def gauss2(x, *p):
    A, mu, sigma, A1, mu1, sigma1 = p
    return (A / (math.sqrt(2 * math.pi) * sigma)) * np.exp(-(x - mu) ** 2 / (2. * sigma ** 2)) + (A1 / (math.sqrt(2 * math.pi) * sigma1)) * np.exp(-(x - mu1) ** 2 / (2. * sigma1 ** 2))

# Histogram (`data` is my 1-D sample), normalized to unit area
hist, bin_edges = np.histogram(data, density=True)
# I consider the center of each column of the histogram for the fit
bin_centres = (bin_edges[:-1] + bin_edges[1:]) / 2
# Guess: A, mu, sigma, A1, mu1, sigma1
p0 = [2., 50., 0.05, 2., 52., 1.]
# Fit using curve_fit
coeff, var_matrix = curve_fit(gauss2, bin_centres, hist, p0=p0)

# For the plot: 10000 x values from -13.99 to 86 in steps of 0.01
xx = np.linspace(-13.99, 86., 10000)
hist_fit = gauss2(xx, *coeff)
plt.plot(xx, hist_fit, 'b')
The result of the fit is:

1:[  1.45724361e+05   3.14206364e+03  -2.95328767e+02   8.89521631e-01
   5.20036421e+01   5.79493687e-01]!
My data should peak at around 50.5 and 52.


Is there a different procedure for fitting the function than curve_fit?
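In the result above, the third coefficient (sigma) is negative and the second (mu) is far outside the data range, which suggests the optimizer wandered off rather than converged. One thing that may be worth trying before giving up on curve_fit is a finer histogram (np.histogram uses 10 bins by default) together with the bounds argument. The following is only a sketch under assumptions: the data are synthetic, drawn around the peak positions mentioned in the question, and the bin count, starting values and bounds are illustrative choices, not taken from the original post.

```python
import numpy as np
from scipy.optimize import curve_fit

# Same model as in the question: sum of two area-scaled Gaussians
def gauss2(x, A, mu, sigma, A1, mu1, sigma1):
    g = A / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))
    g1 = A1 / (np.sqrt(2 * np.pi) * sigma1) * np.exp(-(x - mu1) ** 2 / (2.0 * sigma1 ** 2))
    return g + g1

# ASSUMPTION: synthetic stand-in for the real data, peaks near 50.5 and 52
rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(50.5, 0.3, 3000), rng.normal(52.0, 0.5, 2000)])

# More bins than the default of 10, so each peak spans several bins
hist, bin_edges = np.histogram(data, bins=60, density=True)
bin_centres = (bin_edges[:-1] + bin_edges[1:]) / 2

# Illustrative starting point and box constraints (order: A, mu, sigma, A1, mu1, sigma1)
p0 = [0.5, 50.5, 0.3, 0.5, 52.0, 0.5]
lower = [0.0, 49.0, 0.01, 0.0, 49.0, 0.01]
upper = [10.0, 54.0, 5.0, 10.0, 54.0, 5.0]

# With bounds given, curve_fit keeps sigma positive and mu inside the data range
coeff, var_matrix = curve_fit(gauss2, bin_centres, hist, p0=p0, bounds=(lower, upper))
print(coeff)
```

Whether this helps on the real data depends on how well the two peaks are separated. The answers below avoid the histogram altogether.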

Here is a sketch in pseudocode, not real code, of the EM algorithm. You don't need the histogram at all.

function M_step (x, responsibility, i)
  total[i] = sum (responsibility[i, j], j, 1, n)
    where n = length(x)
  bump_mean[i] = sum (x[j]*responsibility[i, j], j, 1, n) / total[i]
  bump_mean_x2[i] = sum (x[j]**2 * responsibility[i, j], j, 1, n) / total[i]
  bump_variance[i] = bump_mean_x2[i] - bump_mean[i]**2
  mixing_proportion[i] = total[i] / n

function E_step (x, means, variances, mixing weights)
  responsibility[i, j] = p(bump i | x[j])
    for each bump i and datum x[j]

function EM (x)
  for many times:
    call E_step for data x and current parameter estimates
      to obtain responsibility values
    call M_step with responsibility values for each bump
      to update parameters

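I've left out a lot of details and I'm working from memory, so there might be errors. But in summary: E-step = estimate the responsibility of each bump for each datum, then M-step = estimate the bump parameters and mixing weights given the responsibilities. The M-step is exactly the same as computing the mean and variance with weighted data.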
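The E-step and M-step described above can be sketched in NumPy roughly as follows. This is a minimal illustration rather than a reference implementation: the function names, the fixed iteration count, the starting values and the synthetic sample are all assumptions made for the example, and there is no convergence test or protection against a bump collapsing onto a single point.

```python
import numpy as np

def gaussian_pdf(x, mean, variance):
    # Normal density, parameterized by variance as in the pseudocode
    return np.exp(-(x - mean) ** 2 / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)

def e_step(x, means, variances, weights):
    # responsibility[i, j] = p(bump i | x[j]), via Bayes' rule
    weighted = weights[:, None] * gaussian_pdf(x[None, :], means[:, None], variances[:, None])
    return weighted / weighted.sum(axis=0, keepdims=True)

def m_step(x, responsibility):
    # Weighted mean and variance per bump, plus updated mixing proportions
    total = responsibility.sum(axis=1)
    means = responsibility @ x / total
    mean_x2 = responsibility @ x ** 2 / total
    variances = mean_x2 - means ** 2
    weights = total / x.size
    return means, variances, weights

def fit_em(x, means, variances, weights, n_iter=200):
    # Alternate E and M steps a fixed number of times (hypothetical stopping rule)
    for _ in range(n_iter):
        responsibility = e_step(x, means, variances, weights)
        means, variances, weights = m_step(x, responsibility)
    return means, variances, weights

# ASSUMPTION: synthetic sample with peaks near 50.5 and 52, standing in for the real data
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(50.5, 0.3, 600), rng.normal(52.0, 0.5, 400)])

means, variances, weights = fit_em(
    sample,
    means=np.array([50.0, 53.0]),
    variances=np.array([1.0, 1.0]),
    weights=np.array([0.5, 0.5]),
)
print(means, np.sqrt(variances), weights)
```

Note that the raw sample goes in directly; no binning is involved at any point.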

I solved it by minimizing the negative log-likelihood function, as shown in the following pseudocode:

import math
import numpy as np
from scipy.optimize import minimize

# Gaussian function (normalized probability density)
def mygauss(x, *p):
    mu, sigma = p
    return (1 / (math.sqrt(2 * math.pi) * sigma)) * np.exp(-(x - mu) ** 2 / (2. * sigma ** 2))

# Model to calculate the likelihood
def pdf_model(x, p):
    mu1, sig1 = p
    return mygauss(x, mu1, sig1)

# Negative log-likelihood function
def log_likelihood_two_1d_gauss(p, sample):
    h = 0
    for x in sample:
        h = h + math.log(pdf_model(x, p))
    return -h

# Guess (a and b are placeholders for the initial values of mu and sigma)
p0 = np.array([a, b])
# My data (1-D array of samples, not a histogram)
mydata = mydata

res = minimize(log_likelihood_two_1d_gauss, x0=p0, args=(mydata,), method='nelder-mead')
print(res.success)
print(res.message)
print(res.x)
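The code above, as posted, evaluates a single Gaussian even though the function name mentions two. A two-component version of the same idea might look like the sketch below. The parameter layout (mu1, sig1, mu2, sig2, w), the helper names, the guard that returns infinity for invalid parameters, the starting values and the synthetic sample are all assumptions for illustration, not part of the original answer.

```python
import numpy as np
from scipy.optimize import minimize

def gauss_pdf(x, mu, sigma):
    # Normalized Gaussian density, vectorized over x
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def mixture_pdf(x, p):
    # Two Gaussians mixed with weight w and (1 - w), so the total area stays 1
    mu1, sig1, mu2, sig2, w = p
    return w * gauss_pdf(x, mu1, sig1) + (1.0 - w) * gauss_pdf(x, mu2, sig2)

def neg_log_likelihood(p, sample):
    mu1, sig1, mu2, sig2, w = p
    # Reject invalid parameters so Nelder-Mead steps back into the valid region
    if sig1 <= 0 or sig2 <= 0 or not 0.0 < w < 1.0:
        return np.inf
    pdf = mixture_pdf(sample, p)
    if np.any(pdf <= 0):
        return np.inf
    return -np.sum(np.log(pdf))

# ASSUMPTION: synthetic sample with peaks near 50.5 and 52, standing in for the real data
rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(50.5, 0.3, 600), rng.normal(52.0, 0.5, 400)])

# Illustrative starting point: mu1, sig1, mu2, sig2, w
p0 = np.array([50.3, 0.4, 52.2, 0.4, 0.5])
res = minimize(neg_log_likelihood, x0=p0, args=(sample,),
               method='nelder-mead', options=dict(maxiter=5000, maxfev=5000))
print(res.success)
print(res.x)
```

Vectorizing the sum with NumPy instead of looping over the sample keeps each evaluation cheap, which matters because Nelder-Mead calls the function many times. Extending this to three Gaussians adds three more parameters, and the result tends to depend more strongly on the initial guess.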

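The usual way to fit a Gaussian mixture is the so-called expectation-maximization (EM) algorithm. It's really pretty simple. More generally, use the so-called maximum likelihood method, of which EM is one instance, to fit probability distributions to data.

Thank you @RobertDodier for the suggestion. I'm following the prescription described in this link.

Thank you @RobertDodier! I'm following this link. My code is now:

    # Log likelihood function
    def log_likelihood_two_1d_gauss(p, sample):
        h = 0
        for i in range(len(sample)):
            h = math.log(pdf_model(sample[i], p)) + h
        return -h

    from scipy.optimize import minimize
    x_histo = np.array(x_histo)
    res = minimize(log_likelihood_two_1d_gauss, x0=p0, args=(x_histo,), method='nelder-mead', options=dict(maxiter=10000, maxfev=2e4))
    print(res.x)

But it does not work for three Gaussians... probably because of my data.

The free parameters of the fitted model are the mean and variance of each Gaussian bump, and the mixing proportions. So for n bumps there are 3n free parameters (3n - 1 independent ones, because the mixing proportions sum to 1). The histogram does not enter the picture at all. Also, EM is an iterative algorithm. I will edit my answer to include some pseudocode.

Thank you @RobertDodier, your comments have been very useful to me. In the end I chose another way, as I explain in my answer, which was simpler for me.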