Python 3.x: How do I pass parameters through a chain of nested functions to compute a result?
My question is quick, but I have provided a sizeable chunk of code to better illustrate my problem, since I have not yet understood the answer from reading related posts.

The code below is for selecting optimized parameters that are part of an args list. The args list should be a single entry. I am hoping to find the right combination of args that best fits the data. The scipy optimize module should vary the values of the parameters to find the combination that minimizes the error the most. But I am having trouble passing parameters from one function to another.

Sometimes I put a `*` or a `**`, but my success rate is more miss than hit. I would like to know how to pass parameters from one function to another while allowing them to change value, so that their optimized values can be found (the optimized values reduce the error, as described below). I have functions that serve as inputs to other functions, and I am missing a key concept here. Does something like this require kwargs? If the args are a tuple, can they still change value so that the optimized parameters can be found? I know some similar questions have been asked here, but I have not been able to work it out from those resources yet.
The code is explained below (after the imports).
I generate a random sample of 1000 data points drawn from a Gaussian distribution with mean mu = 48 and standard deviation sigma = 7. I can histogram the data, and my goal is to find the parameters mu, sigma, and normc (a scale factor or normalization constant) that best fit the histogram of the sample data. There are many error-analysis methods, but for my purposes the best fit is defined as the fit that minimizes the chi-squared error (described further below). I know the code is long (maybe too long), but my question requires some setup.
import random
import numpy as np

## generate data sample
a, b = 48, 7 ## mu, sigma
randg = []
for index in range( 1000 ):
    randg.append( random.gauss(a,b) )
data = sorted( randg )
small = min( data )
big = max( data )
domain = np.linspace(small,big,3000) ## for fitted plot overlay on histogram of data
Then I set up my bins in preparation for histogramming the data:
numbins = 30 ## number of bins

def binbounder( small , big , numbins ):
    ## generates list of bin bounds for histogram ++ bincount
    binwide = ( big - small ) / numbins ## binwidth
    binleft = [] ## left edges of bins
    for index in range( numbins ):
        binleft.append( small + index * binwide )
    binbound = [val for val in binleft]
    binbound.append( big ) ## all bin edges
    return binbound

binborders = binbounder( small , big , numbins )
## useful if one performs plt.hist(data, bins = binborders, ...)

def binmidder( small , big , numbins ):
    ## all midpoints of bins
    ## for x-ticks on histogram
    ## useful to visualize over/under -estimate of error
    binbound = binbounder( small , big , numbins )
    binwide = ( big - small ) / numbins
    binmiddles = []
    for index in range( len( binbound ) - 1 ):
        binmiddles.append( small + binwide/2 + index * binwide )
    return binmiddles

binmids = binmidder( small , big , numbins )
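To sanity-check the binning helper, here is a compact, self-contained replica of the same edge construction, run on round illustrative numbers (0 to 10 in 5 bins; these values are hypothetical, not the sample's), and compared against `np.linspace`, which builds the same equidistant edges in one call:

```python
import numpy as np

def binbounder(small, big, numbins):
    # same construction as above: numbins equal-width bins -> numbins + 1 edges
    binwide = (big - small) / numbins
    edges = [small + i * binwide for i in range(numbins)]
    edges.append(big)
    return edges

edges = binbounder(0.0, 10.0, 5)
print(edges)                                        # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(np.allclose(edges, np.linspace(0.0, 10.0, 6)))  # True
```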
To do a chi-squared analysis, one must supply the expectation value per bin (E_i) and the multiplicity of observed values per bin (O_i), along with the square of their difference for each bin.
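In symbols, the statistic being built up is chi² = Σ_i (O_i − E_i)² / E_i. A small self-contained check of that formula against `scipy.stats.chisquare` (the counts here are made-up illustrative numbers; note scipy expects observed and expected totals to agree):

```python
from scipy.stats import chisquare

# hypothetical observed counts per bin and matching expected counts
obs = [8, 12, 20, 12, 8]
exp = [10, 15, 14, 13, 8]   # same total (60) as obs

# chi-squared statistic by hand: sum over bins of (O_i - E_i)^2 / E_i
stat = sum((o - e) ** 2 / e for o, e in zip(obs, exp))

# scipy returns (statistic, p-value); index [0] is the statistic
scipy_stat = chisquare(obs, f_exp=exp)[0]
print(abs(stat - scipy_stat) < 1e-12)  # True
```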
def countsperbin( xdata , args = [ small , big , numbins ]):
    ## calculates multiplicity of observed values per bin
    binborders = binbounder( small , big , numbins )
    binmids = binmidder( small , big , numbins )
    values = sorted( xdata ) ## function(xdata) ~ f(x)
    bincount = []
    for jndex in range( len( binborders ) - 1 ):
        summ = 0
        for val in values:
            if val > binborders[ jndex ] and val <= binborders[ jndex + 1 ]:
                summ += 1
        bincount.append( summ )
    return bincount

obsperbin = countsperbin( data ) ## multiplicity of observed values per bin
Since we are integrating f(x)dx, the individual data points themselves are irrelevant here:
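The integration step is where extra parameters get threaded through: `quad` integrates over its first positional argument and forwards whatever is in `args` to the integrand as the remaining arguments. A minimal, self-contained sketch with a normalized Gaussian density (the numbers are illustrative, chosen to integrate over one sigma around the mean):

```python
from math import exp, pi, sqrt
from scipy.integrate import quad

def gauss_pdf(x, mu, sigma):
    # normalized Gaussian density N(mu, sigma)
    return 1.0 / (sigma * sqrt(2 * pi)) * exp(-(x - mu) ** 2 / (2 * sigma ** 2))

# quad integrates over x and calls gauss_pdf(x, 48, 7) at each evaluation point
area, _ = quad(gauss_pdf, 48 - 7, 48 + 7, args=(48, 7))
print(round(area, 4))  # probability mass within one sigma, ~0.6827
```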
def GaussDistrib( xdata , args = [ mu , sigma , normc ] ): ## G(x)
    return normc * exp( (-1) * (xdata - mu)**2 / (2 * sigma**2) )

def expectperbin( args ):
    ## calculates expectation values per bin
    ## needed with observation values per bin for ChiSquared
    ## expectation value of a single bin is equal to the area under the Gaussian curve from left bin edge to right bin edge
    ## area under curve for ith bin = integral G(x)dx from x_i (left edge) to x_i+1 (right edge)
    ans = []
    for index in range(len(binborders)-1): # ith index does not exist for rightmost boundary
        ans.append( quad( GaussDistrib , binborders[ index ] , binborders[ index + 1 ], args = [ mu , sigma , normc ])[0])
    return ans
My function chisq calls chisquare from the scipy module to return a result:
def chisq( args ):
    ## args[0] = mu
    ## args[1] = sigma
    ## args[2] = normc
    ## last subscript [0] gives chi-squared value, [1] gives 0 ≤ p-value ≤ 1
    ## can also minimize negative p-value to find best fitting chi square
    return chisquare( obsperbin , expectperbin( args[0] , args[1] , args[2] ))[0]
I don't know how, but I would also like to impose constraints on my system. Specifically, the maximum of the list of binned-data heights must be greater than zero (and so must the chi-squared value, because of the exponential term that remains after differentiation).
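One common way to keep parameters in a legal range (for instance, forcing sigma to stay positive) is to hand box bounds to a bounded minimizer rather than constraining the chi-squared value itself. A minimal sketch, assuming a hypothetical quadratic objective (not the chi-squared function above) whose minimum sits at the true mu = 48, sigma = 7:

```python
from scipy.optimize import minimize

# hypothetical stand-in objective with its minimum at mu = 48, sigma = 7
def objective(x):
    mu, sigma = x
    return (mu - 48.0) ** 2 + (sigma - 7.0) ** 2

# L-BFGS-B accepts simple box bounds; (None, None) leaves mu free,
# (1e-6, None) keeps sigma strictly positive during the search
res = minimize(objective, x0=[40.0, 3.0], method="L-BFGS-B",
               bounds=[(None, None), (1e-6, None)])
print(res.x)  # converges near [48., 7.]
```

basinhopping can forward the same bounds to its local step via `minimizer_kwargs={"method": "L-BFGS-B", "bounds": ...}`.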
The point of the minimization is to find the optimized parameters that give the best fit.
tl;dr: I want `result`, which calls the function `minimize`, which calls a scipy module to minimize the chi-squared error using the guess values. The chi square and the guesses call other functions, and so on. How do I pass the parameters the right way?

You have access to all of the information returned from `optimize.basinhopping`.

I have abstracted away the generation of the random sample and reduced the number of functions to the 5 really needed to run the optimization.

The only "tricky" part in the parameter passing is handing the parameters mu and sigma to the GaussDistrib function inside the quad call, and that is handled by quad's args parameter. Apart from that, I see no real problem with your parameter passing here.

Your lengthy use of normc is mistaken: you cannot get correct values from the Gaussian that way (and there is no need to vary 3 independent parameters when 2 suffice). Also, to get correct values for the chi-square test you must multiply the probability mass from the Gaussian by the sample count (you are comparing the absolute counts of obsperbin with probabilities under the Gaussian, which is plainly wrong).

chisquare from initial guess: 3772.70822797
chisquare after optimization: 26.35128491178447
mu, sigma after optimization: 48.2701027439 , 7.046156286
By the way, basinhopping is overkill for this kind of problem. I would stay with fmin (Nelder-Mead).
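Since basinhopping is heavier machinery than a smooth 2-parameter fit needs, a plain simplex search typically suffices. A minimal sketch of `fmin` (Nelder-Mead) on a hypothetical quadratic objective, not the chi-squared function itself; the values 48 and 7 mirror the sample generator, and the start point mirrors the intentionally wrong guesses:

```python
from scipy.optimize import fmin

# hypothetical smooth objective whose minimum sits at mu = 48, sigma = 7
def objective(params):
    mu, sigma = params
    return (mu - 48.0) ** 2 + (sigma - 7.0) ** 2

# Nelder-Mead simplex search: derivative-free, no random restarts needed
best = fmin(objective, x0=[78.5, 27.0], disp=False)
print(best)  # close to [48., 7.]
```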
Just a quick point about `args = [ mu , sigma , normc ]`: `args` is usually a tuple, not a list. Internally it becomes `args1 = (x,) + args`, and your function is called as `function(*args1)`.

If I put the parameters to be optimized in a tuple, can they still vary? I thought the entries of a tuple cannot be changed? Also, does this mean I should do the following?
def chisq( args ):
    ## args[0] = mu
    ## args[1] = sigma
    ## args[2] = normc
    ## last subscript [0] gives chi-squared value, [1] gives 0 ≤ p-value ≤ 1
    ## can also minimize negative p-value to find best fitting chi square
    return chisquare( obsperbin , expectperbin( args[0] , args[1] , args[2] ))[0]

def miniz( chisq , chisqguess , niter = 200 ):
    minimizer = basinhopping( chisq , chisqguess , niter = 200 )
    ## Minimization methods available via https://docs.scipy.org/doc/scipy-0.18.1/reference/optimize.html
    return minimizer

expperbin = expectperbin( args = [mu , sigma , normc] )
# chisqmin = chisquare( obsperbin , expperbin )[0]
# chisqmin = result.fun

""" OPTIMIZATION """
print("")
print("initial guess of optimal parameters")
initial_mu, initial_sigma, initial_normc = np.mean(data)+30 , np.std(data)+20 , maxbin( (obsperbin) )
## check optimized result against: mu = 48, sigma = 7 (via random number generator for Gaussian Distribution)

chisqguess = chisquare( obsperbin , expectperbin( args[0] , args[1] , args[2] ))[0]
## initial guess for optimization

result = miniz( chisqguess, args = [mu, sigma, normc] )
print(result)
print("")

optmu , optsigma , optnormc = result.x[0], abs(result.x[1]), result.x[2]
chisqcheck = chisquare(obsperbin, expperbin)
chisqmin = result.fun
print("chisqmin -- ",chisqmin," ",chisqcheck," -- check chi sq")
print("")

## CHECK
checkbins = bstat(xdata, xdata, statistic = 'sum', bins = binborders) ## via SCIPY (imports)
binsum = checkbins[0]
binedge = checkbins[1]
binborderindex = checkbins[2]
print("binsum",binsum)
print("")
print("binedge",binedge)
print("")
print("binborderindex",binborderindex)
# Am I doing this part right?
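On the tuple question from the comments: scipy's optimizers never mutate `args`. Only the first argument `x` is varied, and the objective is called as `func(x, *args)`, so a tuple works fine for fixed parameters; anything that should be optimized belongs in `x0` instead. A minimal sketch (the objective and its parameter names are hypothetical):

```python
from scipy.optimize import minimize

# the optimizer varies only x; everything in `args` is forwarded unchanged
# to each call as objective(x, *args) -- the tuple itself is never mutated
def objective(x, target, weight):
    return weight * (x[0] - target) ** 2

res = minimize(objective, x0=[0.0], args=(5.0, 2.0), method="Nelder-Mead")
print(round(res.x[0], 3))  # ~5.0
```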
from math import exp
from math import pi
from scipy.integrate import quad
from scipy.stats import chisquare
from scipy.optimize import basinhopping

# smallest value in the sample
small = 26.55312337811099
# largest value in the sample
big = 69.02965763016027

# a random sample from N(48, 7) with 999 sample
# values binned into 30 equidistant bins ranging
# from 'small' (bin[0] lower bound) to 'big'
# (bin[29] upper bound)
obsperbin = [ 1, 1, 2, 4, 8, 10, 13, 29, 35, 45,
              51, 56, 63, 64, 96, 89, 68, 80, 61, 51,
              49, 30, 34, 19, 22, 3, 7, 5, 1, 2]

numbins = len(obsperbin) # 30
numobs = sum(obsperbin) # 999

# intentionally wrong guesses of mu and sigma
# to be provided as optimizer's initial values
initial_mu, initial_sigma = 78.5, 27.0

def binbounder( small , big , numbins ):
    ## generates list of bin bounds for histogram ++ bincount
    binwide = ( big - small ) / numbins ## binwidth
    binleft = [] ## left edges of bins
    for index in range( numbins ):
        binleft.append( small + index * binwide )
    binbound = [val for val in binleft]
    binbound.append( big ) ## all bin edges
    return binbound

# set up the bin borders
binborders = binbounder( small , big , numbins )

def GaussDistrib( x , mu , sigma ):
    return 1/(sigma * (2*pi)**(1/2)) * exp( (-1) * (x - mu)**2 / ( 2 * (sigma**2) ) )

def expectperbin( musigma ):
    ## musigma[0] = mu
    ## musigma[1] = sigma
    ## calculates expectation values per bin
    ## expectation value of a single bin is equal to the area under the Gaussian
    ## from left bin edge to right bin edge multiplied by the sample size
    e = []
    for i in range(len(binborders)-1): # ith index does not exist for rightmost boundary
        e.append( quad( GaussDistrib , binborders[ i ] , binborders[ i + 1 ],
                        args = ( musigma[0] , musigma[1] ))[0] * numobs)
    return e

def chisq( musigma ):
    ## first subscript [0] gives chi-squared value, [1] gives 0 ≤ p-value ≤ 1
    return chisquare( obsperbin , expectperbin( musigma ))[0]

def miniz( chisq , musigma ):
    return basinhopping( chisq , musigma , niter = 200 )

## chisquare value for initial parameter guess
chisqguess = chisquare( obsperbin , expectperbin( [initial_mu , initial_sigma] ))[0]

res = miniz( chisq, [initial_mu , initial_sigma] )

print("chisquare from initial guess:" , chisqguess)
print("chisquare after optimization:" , res.fun)
print("mu, sigma after optimization:" , res.x[0], ",", res.x[1])