Python: two-Gaussian mixture fit swaps parameter values. Global optimization
I have data that I want to fit with two Gaussians while keeping one mean global. I have written a Python program using the scipy, lmfit and numpy libraries. These are my fitted results (least squares):
My fitting function:
y0 + sqrt(2/PI)*A1/w1*exp(-2*(x-xc1)^2/w1^2) + sqrt(2/PI)*A2/w2*exp(-2*(x-xc2)^2/w2^2)
Sorry, I don't know how to turn this into a normal math formula.
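To make the meaning of the parameters in this formula concrete, here is a minimal sketch that evaluates one Gaussian term numerically. It relies only on the formula above: with this normalization, A is the area under the peak and w is twice the conventional standard deviation, so the "sd" columns in the tables correspond to w. The grid and the variable names are chosen for illustration only.

```python
import numpy as np

def gauss(x, xc, w, A):
    """One term of the model: sqrt(2/pi) * A / w * exp(-2 * (x - xc)^2 / w^2)."""
    return np.sqrt(2 / np.pi) * A / w * np.exp(-2 * (x - xc) ** 2 / w ** 2)

# Illustrative grid, wide and fine enough to capture the whole peak.
x = np.linspace(-100.0, 150.0, 25001)
y = gauss(x, 25.0, 20.0, 30000.0)

dx = x[1] - x[0]
area = float(np.sum(y) * dx)   # Riemann sum, approximates the integral
peak_height = float(y.max())   # equals sqrt(2/pi) * A / w at x = xc
sigma = 20.0 / 2               # conventional standard deviation is w / 2

print(area, peak_height, sigma)
```

So a test value of A2 = 30000 with sd2 = 20 describes a peak whose height is only about 1200, which matches the scale of the data columns.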
This is a test, so the correct answer must be:
set  mean1  sd1  A1     mean2  sd2  A2     y0
1    12     10   27000  25     20   30000  500
2    21     10   27000  25     20   30000  500
3    30     10   27000  25     20   30000  500
4    39     10   27000  25     20   30000  500
5    48     10   27000  25     20   30000  500
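As you can see, the independent fits work correctly. The problem is that my fitting program sometimes swaps the parameter values of the first Gaussian with those of the second. This means that if I now try to fix mean2 across all data sets, it will go wrong, because the third and fifth data sets are swapped and mean2 will therefore be incorrect (though I am not sure). For example, mean2 must always be 25. The problem is even more severe with real data.
Basically, as I understand it, because my function is f = y0 + gauss1 + gauss2 and both Gaussians have the same form, the fit sees no difference between gauss1 and gauss2 and sometimes mixes them up.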
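The symmetry described here can be checked directly. The sketch below uses the same Gaussian form as the formula in the question and the test values of the third data set; the helper name `model` is hypothetical. It shows that exchanging the two parameter triples leaves the model curve unchanged, so the residual has two equally good minima.

```python
import numpy as np

def gauss(x, mean, sd, A):
    """Same Gaussian form as in the question."""
    return np.sqrt(2 / np.pi) * A / sd * np.exp(-2 * ((x - mean) / sd) ** 2)

def model(x, p):
    """Hypothetical helper: offset plus two Gaussians from a flat tuple."""
    mean1, sd1, A1, mean2, sd2, A2, y0 = p
    return y0 + gauss(x, mean1, sd1, A1) + gauss(x, mean2, sd2, A2)

x = np.arange(50.0)

# Test values of data set 3, and the same values with the two peaks exchanged.
p = (30, 10, 27000, 25, 20, 30000, 500)
p_swapped = (25, 20, 30000, 30, 10, 27000, 500)

same_curve = bool(np.allclose(model(x, p), model(x, p_swapped)))
print(same_curve)
```

Because both labelings describe the data equally well, a least-squares solver has no reason to prefer one. When both components also start from identical initial values, which label ends up on which peak is likely decided by small numerical details rather than by the data.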
Output of the global fit:
mean1 sd1 A1 mean2 sd2 A2 y0
12.28 10.31 28483 25.90 19.77 29169 508.60
21.42 10.42 29148 25.90 20.51 28746 505.21
30.61 9.99 26045 25.90 20.26 32149 499.46
39.84 10.11 26605 25.90 21.44 33000 475.15
48.87 9.49 25000 25.90 23.00 33000 485.45
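The fitted mean1 values above are all roughly 2% larger than the test values. One possible contributor, assuming the data columns are sampled at unit-spaced positions (the first data column increases in steps of 1) while the script builds its axis with `np.linspace(0, 50, 50)`, is that 50 points over [0, 50] have a spacing of 50/49 rather than 1. The sketch below only demonstrates that spacing; whether it accounts for the whole offset is an assumption.

```python
import numpy as np

x_script = np.linspace(0, 50, 50)   # axis as built in the script
x_unit = np.arange(50)              # unit-spaced alternative: 0, 1, ..., 49

step = float(x_script[1] - x_script[0])   # 50/49, slightly more than 1

# Where peaks placed at unit-spaced positions would appear on the script axis.
true_means = np.array([12, 21, 30, 39, 48, 25])
apparent_means = true_means * step

print(step)
print(np.round(apparent_means, 2))
```

If this is the cause, using an axis with unit spacing (or the actual values of the first data column) should bring the fitted means closer to the test values.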
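Test data (tab-separated):
My script (uncomment the marked section for the global fit):
So, how can I improve my code?
Does the global fit really converge to the wrong solution? It ends up fairly close to 25, but I have no tool to check it.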
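Also, my values are a bit off from the true values. Is that normal? For example, I do not get mean2 = 25; I get 25.5 for every data set.
First, here is a plot of your data:
When you start from identical parameters for both Gaussian curves, it is clear that the computer cannot know which component in the data is supposed to be which. So, what can you do?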
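I can also confirm that the offset is approximately 1, at least for column 2. The x values in the Gaussian function appear to differ from the x values in the data.
Thanks for your answer. 1. I cannot set one to a low value and the other to a high value, because if there is a large offset, it cannot fit the data (solution?). 2. After a non-global fit I can easily do the swap. But I want to do a global fit, and in that case it goes straight into the solver. Check the commented lines in the code to see how the global fit is done. 3. I never know where the second peak is. In the real system it can shift by +/- values, which is why I need the global fit: to find the best value. That is why having the correct numbers in the second column, in order to compute the approximate value, is so important.
Hmm, now I understand. This is hard. I also do not know why the global fit does not handle the global peak correctly. Maybe you can fit two data sets and find the one peak common to both, i.e. the two peaks with the smallest distance. Sorry, I have no more ideas, but I have upvoted the question.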
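The last suggestion, finding the peak common to several data sets as the one with the smallest distance between fits, can be sketched as a relabeling step applied to independent fit results. The function name and the input layout are hypothetical. The example values are not real fit output: they are the test values from the question with data sets 3 and 5 deliberately swapped.

```python
from itertools import product

def relabel_common_peak(fits):
    """Reorder the two components of each fit as (moving, common).

    fits: one entry per data set, each a pair of (mean, sd, A) tuples.
    The common peak is chosen as the selection, one component per data
    set, whose means have the smallest spread (max minus min).
    """
    best_choice, best_spread = None, float("inf")
    for choice in product((0, 1), repeat=len(fits)):
        means = [fit[k][0] for fit, k in zip(fits, choice)]
        spread = max(means) - min(means)
        if spread < best_spread:
            best_choice, best_spread = choice, spread
    return [(fit[1 - k], fit[k]) for fit, k in zip(fits, best_choice)]

# Illustrative input: test values with data sets 3 and 5 swapped.
fits = [
    ((12, 10, 27000), (25, 20, 30000)),
    ((21, 10, 27000), (25, 20, 30000)),
    ((25, 20, 30000), (30, 10, 27000)),   # swapped
    ((39, 10, 27000), (25, 20, 30000)),
    ((25, 20, 30000), (48, 10, 27000)),   # swapped
]

relabeled = relabel_common_peak(fits)
moving_means = [moving[0] for moving, common in relabeled]
common_means = [common[0] for moving, common in relabeled]
print(moving_means, common_means)
```

The brute-force search tries 2^n selections, which is cheap for five data sets. The relabeled values could then serve as starting values for a global fit. With noisy fits, a moving peak that passes close to the common one (as in data set 2, where the means are 21 and 25) can make the choice ambiguous.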
321 759 568 567 567 567
322 877 587 585 585 585
323 1033 610 606 606 606
324 1231 639 632 632 632
325 1471 675 662 662 662
326 1745 721 697 697 697
327 2043 780 737 737 737
328 2346 855 782 782 782
329 2632 954 833 833 833
330 2877 1080 889 889 889
331 3061 1241 951 949 949
332 3168 1440 1017 1014 1014
333 3194 1682 1089 1083 1083
334 3142 1962 1166 1154 1154
335 3025 2275 1250 1226 1226
336 2863 2605 1341 1298 1298
337 2676 2933 1442 1369 1369
338 2485 3236 1558 1437 1437
339 2308 3488 1691 1500 1500
340 2155 3668 1848 1558 1556
341 2031 3759 2031 1608 1605
342 1936 3756 2243 1651 1644
343 1865 3662 2482 1686 1673
344 1812 3490 2739 1715 1691
345 1770 3261 3003 1740 1697
346 1734 2997 3255 1764 1691
347 1697 2722 3473 1794 1673
348 1657 2453 3633 1836 1645
349 1611 2204 3716 1896 1606
350 1560 1983 3710 1983 1560
351 1501 1791 3611 2099 1506
352 1437 1628 3425 2245 1450
353 1369 1490 3168 2418 1393
354 1298 1372 2863 2605 1341
355 1226 1269 2533 2790 1299
356 1154 1177 2202 2953 1274
357 1083 1095 1891 3071 1274
358 1014 1021 1613 3126 1306
359 949 952 1376 3103 1376
360 889 890 1180 3000 1488
361 833 833 1024 2821 1641
362 782 782 903 2582 1831
363 737 737 810 2301 2043
364 697 697 740 2003 2261
365 662 662 686 1711 2461
366 632 632 645 1440 2621
367 606 606 613 1205 2718
368 585 585 588 1011 2739
369 567 567 569 859 2679
import numpy as np
import matplotlib.pyplot as plt
from lmfit import minimize, Parameters, report_fit
# python 3.3
# Unofficial Windows Binaries for Python Extension Packages
# http://www.lfd.uci.edu/~gohlke/pythonlibs/
# VARIABLES
show_plot = 1
size_cols = 11
size_rows = 50
nm_start = 320
data_sets = 5
file_name = "5_testas.txt"
# read the whitespace-separated file into a size_rows x size_cols table
intens = [[[0] for i in range(size_cols)] for j in range(size_rows)]
with open(file_name) as f:
    for row in range(0, size_rows):
        datal = f.readline()
        data = datal.split()
        col = 0
        for datab in data:
            intens[row][col] = datab
            col = col + 1
#def gauss(x, amp, cen, sigma):
#    "basic gaussian"
def gauss(x, mean, sd, A):
    """basic gaussian"""
    return np.sqrt(2/np.pi)*A/sd*np.exp(-2*np.power(((x-mean)/sd), 2))

def gauss_dataset(params, i, x):
    """calc offset plus both gaussians from params for data set i
    using simple, hardwired naming convention"""
    mean1 = params['mean1_%i' % (i+1)].value
    sd1 = params['sd1_%i' % (i+1)].value
    A1 = params['A1_%i' % (i+1)].value
    mean2 = params['mean2_%i' % (i+1)].value
    sd2 = params['sd2_%i' % (i+1)].value
    A2 = params['A2_%i' % (i+1)].value
    y0 = params['y0_%i' % (i+1)].value
    return y0 + gauss(x, mean1, sd1, A1) + gauss(x, mean2, sd2, A2)

def gauss_dataset_a(params, i, x):
    """calc offset plus first gaussian only from params for data set i
    using simple, hardwired naming convention"""
    mean1 = params['mean1_%i' % (i+1)].value
    sd1 = params['sd1_%i' % (i+1)].value
    A1 = params['A1_%i' % (i+1)].value
    y0 = params['y0_%i' % (i+1)].value
    return y0 + gauss(x, mean1, sd1, A1)

def gauss_dataset_b(params, i, x):
    """calc offset plus second gaussian only from params for data set i
    using simple, hardwired naming convention"""
    mean2 = params['mean2_%i' % (i+1)].value
    sd2 = params['sd2_%i' % (i+1)].value
    A2 = params['A2_%i' % (i+1)].value
    y0 = params['y0_%i' % (i+1)].value
    return y0 + gauss(x, mean2, sd2, A2)

def objective(params, x, data):
    """calculate total residual for fits to several data sets held
    in a 2-D array, and modeled by Gaussian functions"""
    ndata, nx = data.shape
    resid = 0.0*data[:]
    # make residual per data set
    for i in range(ndata):
        resid[i, :] = data[i, :] - gauss_dataset(params, i, x)
    # now flatten this to a 1D array, as minimize() needs
    return resid.flatten()

x = np.linspace(0, 50, 50)
data = []
# dummy data
for i in np.arange(data_sets):
    dat = gauss(x, 1, 1, 1)
    data.append(dat)
# data has shape (data_sets, size_rows)
data = np.array(data)
# Rearrange data, skipping the first column of the file.
for col in range(0, data_sets):
    for row in range(0, size_rows):
        data[col][row] = intens[row][col+1]
# create 5 sets of parameters, one per data set
fit_params = Parameters()
for iy, y in enumerate(data):
    fit_params.add('mean1_%i' % (iy+1), value=26.0, min=0.0, max=50.0)
    fit_params.add('mean2_%i' % (iy+1), value=26.0, min=0.0, max=50.0)
    fit_params.add('A1_%i' % (iy+1), value=28500.0, min=25000.0, max=33000.0)
    fit_params.add('A2_%i' % (iy+1), value=28500.0, min=25000.0, max=33000.0)
    fit_params.add('sd1_%i' % (iy+1), value=15.0, min=7.0, max=23.0)
    fit_params.add('sd2_%i' % (iy+1), value=15.0, min=7.0, max=23.0)
    fit_params.add('y0_%i' % (iy+1), value=1000.0, min=300.0, max=1500.0)
# UNCOMMENT FOR GLOBAL FIT
#for iy in range(2, data_sets+1):
#    fit_params['mean2_%i' % iy].expr='mean2_1'
# run the global fit to all the data sets
# (recent lmfit versions return the fitted values in result.params
# and leave fit_params unchanged)
result = minimize(objective, fit_params, args=(x, data))
fitted = result.params
# print the fitted parameters, one row per data set
print('mean1\tsd1\tA1\tmean2\tsd2\tA2\ty0')
row_fmt = '%0.2f\t%0.2f\t%0.0f\t%0.2f\t%0.2f\t%0.0f\t%0.2f'
names = ('mean1', 'sd1', 'A1', 'mean2', 'sd2', 'A2', 'y0')
for i in range(data_sets):
    print(row_fmt % tuple(fitted['%s_%i' % (name, i+1)].value for name in names))
# plot the data sets and fits
if show_plot == 1:
    plt.figure()
    for i in range(data_sets):
        y_fit = gauss_dataset(fitted, i, x)
        y_fit_a = gauss_dataset_a(fitted, i, x)
        y_fit_b = gauss_dataset_b(fitted, i, x)
        plt.plot(x, data[i, :], 'o', x, y_fit, '-')
        plt.plot(x, data[i, :], 'o', x, y_fit_a, '-')
        plt.plot(x, data[i, :], 'o', x, y_fit_b, '-')
    plt.show()