Python PyMC3 hierarchical binomial model: divergences after tuning


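I am trying to build a simple Bayesian hierarchical model for some experimental data using PyMC3. I have two datasets, but for one of them the sampler does not converge, and I have not been able to find a solution.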

The setup is as follows:

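There are two experimental conditions (unimaginatively called A and B) and two groups of individuals, each tested in one of the conditions (group A and group B).
Each individual did as many trials as they liked, so not all individuals have the same number of trials.
Each trial has a binary outcome (1 or 0).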

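The performance data for each subject is therefore a string of 1s and 0s, and I want to estimate each individual's underlying rate of 1s from the observed data.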

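Because I have very few trials for some subjects, I decided to use a hierarchical Bayesian model. The model I chose is inspired by the hierarchical partial pooling example linked in the code below.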
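To make the motivation concrete, here is a minimal sketch of the two non-hierarchical extremes for dataset A: the unpooled estimate (each individual's own hits divided by trials) and the completely pooled estimate (one rate for everyone). The data are copied from the example in this post; the helper name `raw_rates` is only illustrative.

```python
# Dataset A, copied from the example in this post.
n_trials_A = [21, 15, 6, 5, 10, 6, 4, 6, 5, 7, 14, 12, 15, 4, 4, 6, 6, 9, 7, 6, 11, 10]
hits_A = [21, 14, 6, 0, 6, 6, 3, 6, 5, 6, 14, 9, 15, 4, 4, 5, 6, 8, 7, 4, 8, 10]


def raw_rates(hits, trials):
    """Unpooled estimate: each individual's observed fraction of 1s."""
    return [h / n for h, n in zip(hits, trials)]


rates_A = raw_rates(hits_A, n_trials_A)

# Complete pooling: a single rate shared by all individuals.
pooled_A = sum(hits_A) / sum(n_trials_A)

# Individuals whose raw rate sits exactly on a boundary (all 0s or all 1s).
n_extreme = sum(r in (0.0, 1.0) for r in rates_A)

print(f"pooled rate: {pooled_A:.3f}")
print(f"raw rate of exactly 0 or 1: {n_extreme} of {len(rates_A)} individuals")
```

With only a handful of trials per person, many raw rates land exactly on 0 or 1, which is the kind of estimate a hierarchical model is meant to pull back toward the group-level rate.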

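The model runs fine on one of the two datasets (B), but on the other one (A) the sampler does not converge. I read online that a possible solution is to switch to a different parameterization, but I don't know how to implement that here.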
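One reparameterization that is often suggested for hierarchical models is the non-centered form. The sketch below is not the Beta-Binomial model from this post: as an assumption, it swaps in a logit-normal population distribution, because that has a simple non-centered form. The names `mu`, `sigma`, and `z` are hypothetical and do not appear in the original model.

```python
import math
import random


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def thetas_centered(mu, sigma, m, rng):
    # Centered form: draw each individual's logit directly from Normal(mu, sigma).
    return [sigmoid(rng.gauss(mu, sigma)) for _ in range(m)]


def thetas_noncentered(mu, sigma, m, rng):
    # Non-centered form: draw standard-normal offsets z first,
    # then shift by mu and scale by sigma deterministically.
    z = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return [sigmoid(mu + sigma * zi) for zi in z]


# Same seed for both, so the two forms can be compared draw by draw.
centered = thetas_centered(2.0, 1.0, 5, random.Random(0))
noncentered = thetas_noncentered(2.0, 1.0, 5, random.Random(0))
print(centered)
print(noncentered)
```

Both functions describe the same distribution over rates. The difference is which variables a sampler would explore: in the non-centered form it works with `z`, whose prior does not depend on `mu` or `sigma`. In a PyMC3 model this would presumably mean giving `z` a standard normal prior and defining the rates as a `pm.Deterministic` transform of `mu`, `sigma`, and `z`.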

Below is a minimal working example and its output.


import numpy as np
import pymc3 as pm
import theano.tensor as tt
import matplotlib.pyplot as plt




def run():
    # Define data
    datasets_names = ['A', 'B']
    number_of_individuals = [22, 17]  # per experimental condition

    # Number of trials and number of successes (1) of each individual
    n_trials_A = [21, 15,  6,  5, 10,  6,  4,  6,  5,  7, 14, 12, 15,  4,  4,  6,  6,  9,  7,  6, 11, 10]
    hits_A = [21, 14,  6,  0,  6,  6,  3,  6,  5,  6, 14,  9, 15,  4,  4,  5,  6,  8,  7,  4,  8, 10]

    n_trials_B = [5,  5, 33,  4, 13, 18, 24,  8,  8,  9,  9,  7, 14,  8, 15,  9, 11]
    hits_B = [2,  5, 26,  3,  7,  7, 13,  6,  1,  5,  4,  2,  7,  5,  9,  4,  1]

    datasets = [(number_of_individuals[0], n_trials_A, hits_A), (number_of_individuals[1], n_trials_B, hits_B)]

    # Model each dataset separately
    for i, (m, n, h) in enumerate(datasets):
        print('Modelling dataset: ', datasets_names[i])

        # pyMC3 model
        with pm.Model() as model:
            # The model is from: https://docs.pymc.io/notebooks/hierarchical_partial_pooling.html

            # Define hyperpriors
            phi = pm.Uniform('phi', lower=0.0, upper=1.0)

            kappa_log = pm.Exponential('kappa_log', lam=1.5)
            kappa = pm.Deterministic('kappa', tt.exp(kappa_log))

            # define second level of hierarchical model
            thetas = pm.Beta('thetas', alpha=phi*kappa, beta=(1.0-phi)*kappa, shape=m)

            # Likelihood
            y = pm.Binomial('y', n=n, p=thetas, observed=h)

            # Fit
            trace = pm.sample(6000, tune=2000, nuts_kwargs={'target_accept': 0.95}) 

        # Show traceplot
        pm.traceplot(trace)
    plt.show()




if __name__ == "__main__":
    run()
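Conditional on the hyperparameters, the Beta prior on each rate is conjugate to the Binomial likelihood, so an individual with `h` hits in `n` trials has conditional posterior Beta(alpha + h, beta + n - h), where alpha = phi * kappa and beta = (1 - phi) * kappa as in the model above. The sketch below only does this arithmetic; the hyperparameter values are made up for illustration, and the function name is hypothetical.

```python
def conditional_posterior(phi, kappa, hits, trials):
    """Beta parameters of one individual's rate, given fixed phi and kappa."""
    alpha = phi * kappa
    beta = (1.0 - phi) * kappa
    return alpha + hits, beta + (trials - hits)


# Illustrative hyperparameter values, not estimates from the data.
phi, kappa = 0.9, 1.5

# An individual from dataset A with 21 hits in 21 trials.
a_perfect, b_perfect = conditional_posterior(phi, kappa, 21, 21)

# An individual from dataset A with 0 hits in 5 trials.
a_zero, b_zero = conditional_posterior(phi, kappa, 0, 5)

print(f"21/21 -> Beta({a_perfect:.2f}, {b_perfect:.2f})")
print(f"0/5   -> Beta({a_zero:.2f}, {b_zero:.2f})")
```

For an individual with no failures, the second Beta parameter is just (1 - phi) * kappa, which becomes small when phi is close to 1 and kappa is close to its lower bound of 1. That pushes the conditional posterior hard against the boundary at 1. Whether this is what makes dataset A difficult for the sampler is a conjecture, not something established in this post.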

This is what is printed to the console when the script runs:


Modelling dataset:  A
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [thetas, kappa_log, phi]
Sampling 4 chains: 100%|██████████| 32000/32000 [00:52<00:00, 610.30draws/s]
There were 928 divergences after tuning. Increase `target_accept` or reparameterize.
There were 818 divergences after tuning. Increase `target_accept` or reparameterize.
There were 885 divergences after tuning. Increase `target_accept` or reparameterize.
There were 842 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
Modelling dataset:  B
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [thetas, kappa_log, phi]
Sampling 4 chains: 100%|██████████| 32000/32000 [00:35<00:00, 899.07draws/s]
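To put the warnings for dataset A in proportion, the following sketch works out what fraction of the retained draws were divergent. The counts are taken from the log above and the draw settings from the `pm.sample` call; the variable names are only illustrative.

```python
chains = 4
draws_per_chain = 6000  # first argument to pm.sample
tune_per_chain = 2000   # tune=2000

# Divergences reported per chain for dataset A.
divergences_A = [928, 818, 885, 842]

# Total iterations, which should match the 32000 shown in the progress bar.
total_iterations = chains * (draws_per_chain + tune_per_chain)

# The warnings count divergences after tuning, so compare against kept draws.
kept_draws = chains * draws_per_chain
divergent_fraction = sum(divergences_A) / kept_draws

print(f"total iterations: {total_iterations}")
print(f"divergent draws: {sum(divergences_A)} of {kept_draws}")
print(f"fraction divergent: {divergent_fraction:.1%}")
```

Dataset B produced no divergence warnings with the same settings, so the problem appears specific to the data in A rather than to the sampler configuration.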

