How to extract posterior samples of the log likelihood from PyStan?


I need posterior samples of the log likelihood terms to run PSIS-LOO, whose input is documented as:

log_lik : ndarray
    Array of size n x m containing n posterior samples of the log likelihood
    terms :math:`p(y_i|\theta^s)`.
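To make the expected layout concrete, here is a minimal sketch of a shape check for such an array. The helper name check_log_lik is hypothetical (it is not part of PyStan or of any PSIS code), and the 2000 x 8 example assumes 4 chains of 1000 iterations with half of each chain discarded as warmup, and 8 observations.

```python
import numpy as np

def check_log_lik(log_lik):
    """Hypothetical helper: validate an (n draws) x (m observations) array."""
    arr = np.asarray(log_lik, dtype=float)
    if arr.ndim != 2:
        raise ValueError("log_lik must be 2-D: draws in rows, observations in columns")
    if not np.all(np.isfinite(arr)):
        raise ValueError("log_lik contains non-finite values")
    n_draws, n_obs = arr.shape
    return n_draws, n_obs

# Stand-in array with the assumed shape: 4 chains x 500 kept draws, 8 observations
n_draws, n_obs = check_log_lik(np.zeros((2000, 8)))
print(n_draws, n_obs)
```

A 1-D array, such as a single log density value per draw, fails this check because it carries no per-observation terms.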

A small example, after installing PyStan with

pip install pystan

is:

import pystan
schools_code = """
data {
    int<lower=0> J; // number of schools
    real y[J]; // estimated treatment effects
    real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
    real mu;
    real<lower=0> tau;
    real eta[J];
}
transformed parameters {
    real theta[J];
    for (j in 1:J)
        theta[j] = mu + tau * eta[j];
}
model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
}
"""

schools_dat = {'J': 8,
               'y': [28,  8, -3,  7, -1,  1, 18, 12],
               'sigma': [15, 10, 16, 11,  9, 11, 10, 18]}

sm = pystan.StanModel(model_code=schools_code)
fit = sm.sampling(data=schools_dat, iter=1000, chains=4)


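How can I obtain posterior samples of the log likelihood from a fitted PyStan model?

You can obtain posterior samples of the log likelihood as follows:

logp = fit.extract()['lp__']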

I believe the correct way to compute the log likelihood in this case is as follows:

generated quantities {
    vector[J] log_lik;
    for (i in 1:J)
        // log likelihood of observation i only
        log_lik[i] = normal_lpdf(y[i] | theta[i], sigma[i]);
}

After that, you can run psisloo as follows:

loo, loos, ks = psisloo(fit['log_lik'])
print('PSIS-LOO value: {:.2f}'.format(loo))
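The pointwise terms that log_lik is meant to hold can also be cross-checked outside Stan. The sketch below recomputes them in NumPy from draws of theta. The function name pointwise_log_lik is an assumption for illustration, and the random theta_draws array is a synthetic stand-in for fit.extract()['theta'], not real posterior output.

```python
import numpy as np

def pointwise_log_lik(theta_draws, y, sigma):
    """Normal log density of each y[i] at theta[i] with scale sigma[i], per draw."""
    theta_draws = np.asarray(theta_draws, dtype=float)  # shape (n, J)
    y = np.asarray(y, dtype=float)                      # shape (J,)
    sigma = np.asarray(sigma, dtype=float)              # shape (J,)
    z = (y - theta_draws) / sigma                       # broadcasts to (n, J)
    return -0.5 * np.log(2.0 * np.pi) - np.log(sigma) - 0.5 * z ** 2

y = [28, 8, -3, 7, -1, 1, 18, 12]
sigma = [15, 10, 16, 11, 9, 11, 10, 18]

# Synthetic stand-in for fit.extract()['theta']; not real posterior draws
rng = np.random.default_rng(0)
theta_draws = rng.normal(loc=8.0, scale=5.0, size=(2000, 8))

log_lik = pointwise_log_lik(theta_draws, y, sigma)
print(log_lik.shape)  # (2000, 8)
```

Each column holds the draws for one school, which matches the n x m layout that psisloo documents.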

The full code then becomes:

import pystan
from psis import psisloo  # assumes a psis.py module defining psisloo is importable
schools_code = """
data {
    int<lower=0> J;            // number of schools
    real y[J];                 // estimated treatment effects
    real<lower=0> sigma[J];    // s.e. of effect estimates
}
parameters {
    real mu;
    real<lower=0> tau;
    real eta[J];
}
transformed parameters {
    real theta[J];
    for (j in 1:J)
       theta[j] = mu + tau * eta[j];
}
model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
}
generated quantities {
    vector[J] log_lik;
    for (i in 1:J)
        // log likelihood of observation i only
        log_lik[i] = normal_lpdf(y[i] | theta[i], sigma[i]);
}
"""

schools_dat = {'J': 8,
               'y': [28,  8, -3,  7, -1,  1, 18, 12],
               'sigma': [15, 10, 16, 11,  9, 11, 10, 18]}

sm = pystan.StanModel(model_code=schools_code) 
fit = sm.sampling(data=schools_dat, iter=1000, chains=4)
loo, loos, ks = psisloo(fit['log_lik'])
print('PSIS-LOO value: {:.2f}'.format(loo))
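The same n x m log_lik matrix is also the input that WAIC needs. As a rough sketch, assuming the usual definition (log pointwise predictive density minus the per-observation variance of the log likelihood terms), it could be computed as below. The function name waic is hypothetical and the toy matrix stands in for fit['log_lik'].

```python
import numpy as np

def waic(log_lik):
    """Return (elpd_waic, p_waic) from an (n draws) x (m observations) matrix."""
    log_lik = np.asarray(log_lik, dtype=float)
    # Log pointwise predictive density via a numerically stable log-mean-exp
    max_ll = log_lik.max(axis=0)
    lppd_i = max_ll + np.log(np.exp(log_lik - max_ll).mean(axis=0))
    # Effective number of parameters: sample variance across draws, per observation
    p_waic_i = log_lik.var(axis=0, ddof=1)
    return (lppd_i - p_waic_i).sum(), p_waic_i.sum()

# Toy input: identical draws, so the variance penalty is zero
toy = np.full((10, 3), -2.0)
elpd, p_waic = waic(toy)
print(elpd, p_waic)
```

On real output the call would be waic(fit['log_lik']), with the result on the same expected log predictive density scale as the PSIS-LOO value.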


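Extracting lp__ is wrong here. lp__ is the log posterior density, not the log likelihood. Therefore lp__ also includes the contribution of the prior density, as well as any Jacobian adjustments needed to allow sampling on the unconstrained scale. PSIS-LOO (and WAIC) require the log likelihood, not the log posterior. The R package loo also has a convenience function that is relevant to your example.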
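To see the distinction numerically, the sketch below splits the log posterior of the eight schools model into its likelihood, prior and Jacobian parts for a single made-up draw. The function names are hypothetical, the draw is invented for illustration, and the Jacobian term assumes the lower=0 bound on tau is handled with a log transform. Stan also drops additive constants, so these numbers would not match lp__ exactly.

```python
import numpy as np

def normal_logpdf(x, loc, scale):
    """Log density of Normal(loc, scale) at x, elementwise."""
    x, loc, scale = (np.asarray(a, dtype=float) for a in (x, loc, scale))
    return -0.5 * np.log(2.0 * np.pi) - np.log(scale) - 0.5 * ((x - loc) / scale) ** 2

def log_posterior_pieces(mu, tau, eta, y, sigma):
    """Split one draw's log posterior into likelihood, prior and Jacobian parts."""
    eta = np.asarray(eta, dtype=float)
    theta = mu + tau * eta                          # transformed parameters block
    log_lik = normal_logpdf(y, theta, sigma)        # one term per school, shape (J,)
    log_prior = normal_logpdf(eta, 0.0, 1.0).sum()  # eta ~ normal(0, 1); mu and tau have flat priors
    log_jacobian = np.log(tau)                      # assumes tau = exp(u) on the unconstrained scale
    return log_lik, log_prior, log_jacobian

y = [28, 8, -3, 7, -1, 1, 18, 12]
sigma = [15, 10, 16, 11, 9, 11, 10, 18]

# One made-up draw, for illustration only
log_lik, log_prior, log_jacobian = log_posterior_pieces(
    mu=8.0, tau=5.0, eta=np.zeros(8), y=y, sigma=sigma)

log_posterior = log_lik.sum() + log_prior + log_jacobian
print("sum of log_lik terms:", log_lik.sum())
print("log posterior (up to a constant):", log_posterior)
```

Only the log_lik vector, kept as one term per observation, is what PSIS-LOO and WAIC consume. The scalar total mixes in the prior and Jacobian terms and loses the per-observation structure.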