Why is my Python implementation of the Metropolis algorithm (MCMC) so slow?

Tags: python, performance, machine-learning, random, mcmc

I am trying to implement this algorithm in Python (a simple version of the Metropolis-Hastings algorithm).

Here is my implementation:

import numpy as np

def Metropolis_Gaussian(p, z0, sigma, n_samples=100, burn_in=0, m=1):
    """
    Metropolis Algorithm using a Gaussian proposal distribution.
    p: distribution that we want to sample from (can be unnormalized)
    z0: Initial sample
    sigma: standard deviation of the proposal normal distribution.
    n_samples: number of final samples that we want to obtain.
    burn_in: number of initial samples to discard.
    m: this number is used to take every mth sample at the end
    """
    # List of samples, check feasibility of first sample and set z to first sample
    sample_list = [z0]
    _ = p(z0) 
    z = z0
    # set a counter of samples for burn-in
    n_sampled = 0

    while len(sample_list[::m]) < n_samples:
        # Sample a candidate from Normal(z, sigma), draw a uniform sample, compute the acceptance probability
        cand = np.random.normal(loc=z, scale=sigma)
        u = np.random.rand()
        try:
            prob = min(1, p(cand) / p(z))
        except (OverflowError, ValueError):
            continue
        n_sampled += 1

        if prob > u:
            z = cand  # accept and make candidate the new sample

        # do not add burn-in samples
        if n_sampled > burn_in:
            sample_list.append(z)

    # Finally, take every m-th sample in order to achieve independence
    return np.array(sample_list)[::m]
This code takes quite a long time to run, and I am not sure why. In my Metropolis_Gaussian code, I have tried to improve efficiency by:

  • not adding duplicate samples to the list
  • not recording burn-in samples

The function pdf_t is defined as follows:

    from scipy.stats import t
    def pdf_t(x, df=10):
        return t.pdf(x, df=df)
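
For reference, the sampler is then invoked along these lines (a minimal sketch; the argument values mirror the timing runs quoted at the end of this post):

    # assumes Metropolis_Gaussian and pdf_t defined above are in scope
    samples = Metropolis_Gaussian(pdf_t, z0=3, sigma=1, n_samples=100, burn_in=100, m=10)
    print(samples.shape)  # (100,)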
    
I have answered a similar question before. A lot of the things I mentioned there (not computing the current likelihood on every iteration, pre-computing the random innovations, etc.) can be used here.

Another improvement to the implementation is to not use a list to store your samples. Instead, you should pre-allocate the memory for the samples and store them in an array. Something like samples = np.zeros(n_samples) is more efficient than appending to a list at every iteration.
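
To make the difference concrete, here is a minimal sketch of the two storage patterns (the loop body is just a stand-in for the sampler's work):

    import numpy as np

    n = 100_000

    # what the question's code does: grow a Python list, convert at the end
    out = []
    for i in range(n):
        out.append(i * 0.5)
    a = np.array(out)

    # the suggested alternative: pre-allocate once, fill in place
    b = np.zeros(n)
    for i in range(n):
        b[i] = i * 0.5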

You have already mentioned that you tried to improve efficiency by not recording the burn-in samples. That is a good idea. You can achieve a similar refinement by only recording every m-th sample, since these are the only ones you keep anyway in the return statement np.array(sample_list)[::m]. You can do this by changing:

    # do not add burn-in samples
    if n_sampled > burn_in:
        sample_list.append(z)
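
to something along these lines (the target snippet is missing from the post as extracted; this reconstruction mirrors the modulo check used in the full rewrite below):

    # keep only post-burn-in samples, and of those only every m-th
    # (the loop condition can then become len(sample_list) < n_samples)
    if n_sampled > burn_in and n_sampled % m == 0:
        sample_list.append(z)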
    

It is also worth noting that you do not need to compute min(1, p(cand) / p(z)), but only p(cand) / p(z). I realize that formally the min is necessary (to ensure that the acceptance probability lies between 0 and 1). Computationally, however, the min is unnecessary: if p(cand) / p(z) > 1, then p(cand) / p(z) is always greater than u.
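
Since u is drawn from [0, 1), the two acceptance tests agree on every draw. A quick numerical check (a sketch, using a lognormal variable as a stand-in for the density ratio):

    import numpy as np

    np.random.seed(0)
    ratio = np.random.lognormal(size=100_000)  # stand-in for p(cand) / p(z); can exceed 1
    u = np.random.rand(100_000)                # uniform draws on [0, 1)
    assert np.array_equal(np.minimum(1.0, ratio) > u, ratio > u)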

Putting all of this together, along with pre-computing the random innovations and the acceptance draws u, and only computing the likelihood when you really need it, I came up with:

    def my_Metropolis_Gaussian(p, z0, sigma, n_samples=100, burn_in=0, m=1):
        # Pre-allocate memory for samples (much more efficient than using append)
        samples = np.zeros(n_samples)

        # Initialize the first slot (overwritten once the first sample is recorded)
        samples[0] = z0
        z = z0
        # Compute the current likelihood
        l_cur = p(z)

        # Iteration counter
        i = 0
        # Total number of iterations needed to achieve the desired number of samples
        iters = (n_samples * m) + burn_in

        # Pre-compute the random innovations and uniform draws outside the loop
        innov = np.random.normal(loc=0, scale=sigma, size=iters)
        u = np.random.rand(iters)

        while i < iters:
            # Random-walk innovation on z
            cand = z + innov[i]

            # Compute the candidate likelihood
            l_cand = p(cand)

            # Accept or reject the candidate
            if l_cand / l_cur > u[i]:
                z = cand
                l_cur = l_cand

            # Only keep iterations after burn-in, and of those only every m-th
            # (using >= ensures samples[0] is overwritten even when burn_in > 0)
            if i >= burn_in and i % m == 0:
                samples[(i - burn_in) // m] = z

            i += 1

        return samples
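
As a quick sanity check (a sketch, not part of the original answer), both samplers should give a sample mean near 0 for the Student-t density defined earlier:

    np.random.seed(42)
    old = Metropolis_Gaussian(pdf_t, 3, 1, n_samples=1000, burn_in=100, m=10)
    new = my_Metropolis_Gaussian(pdf_t, 3, 1, n_samples=1000, burn_in=100, m=10)
    print(old.mean(), new.mean())  # both should be close to 0 for t(df=10)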
    

A possible duplicate of this question has already been asked on this site. While the title may not sound like the same question, the answer I gave there is the same one I would give here. I emphasize once more that not including the duplicates produced by failed acceptances is asymptotically incorrect, and leads to an over-representation of low-likelihood sample values.
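
To make that point concrete, here is a minimal toy chain (a sketch against a hypothetical unnormalized standard-normal target, not code from the original posts) that records the current state on every iteration, whether or not the proposal was accepted:

    import numpy as np

    def p_toy(x):
        # hypothetical unnormalized standard-normal target, for illustration only
        return np.exp(-0.5 * x**2)

    rng = np.random.default_rng(0)
    z, chain = 0.0, []
    for _ in range(1000):
        cand = z + rng.normal()
        if p_toy(cand) / p_toy(z) > rng.random():
            z = cand
        chain.append(z)  # append z even after a rejection; the repeats belong to the chain

For reference, the timings below compare the two implementations: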
    In [1]: %timeit Metropolis_Gaussian(pdf_t, 3, 1, n_samples=100, burn_in=100, m=10)
    205 ms ± 2.16 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    In [2]: %timeit my_Metropolis_Gaussian(pdf_t, 3, 1, n_samples=100, burn_in=100, m=10)
    102 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)