Python 3 nested loops: optimizing a simple simulation for speed

Background

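I came across a puzzle. Here it is:

One day, an alien comes to Earth. Every day, each alien does one of four things, each with equal probability:

Kill itself
Do nothing
Split itself into two aliens (while killing itself)
Split itself into three aliens (while killing itself)

What is the probability that the alien species eventually dies out entirely?

Unfortunately, I haven't been able to solve the problem theoretically, so I moved on to simulating it with a basic Markov chain and a Monte Carlo simulation.

This was not asked of me in an interview. I learned the problem from a friend, then found the link above while searching for a mathematical answer.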

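Reinterpreting the problem

We start with the number of aliens n = 1. For each alien, n has a chance to stay unchanged, decrease by 1, increase by 1, or increase by 2, with probability 25% each. If n increases, i.e. the aliens multiply, we repeat this procedure n times. This corresponds to each alien doing its own thing again. I have to set an upper limit, though, so that we can stop the simulation and avoid a crash: n may keep increasing, and we loop n times again and again.

If the aliens somehow go extinct, we stop simulating, since there is nothing left to simulate.

After n reaches zero or the upper limit, we also record the population (it will be either zero or some number >= max_pop).

I repeat this many times and record each result. In the end, the number of zeros divided by the total number of results should give me an approximation of the extinction probability.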
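The asker's own code did not survive on this page, so the following is only a minimal sketch of the procedure described above, not the original implementation. The names `simulate_once` and `estimate_extinction`, the `seed` parameter, and the use of the standard-library `random` module are assumptions made for illustration. It shows the three nested loops from the title: one over repetitions, one `while` loop over days, and one over the aliens alive on a given day.

```python
import random

def simulate_once(max_pop, rng):
    """Run one population until it dies out or reaches max_pop; return the final size."""
    n = 1
    while 0 < n < max_pop:
        change = 0
        for _ in range(n):              # every alien alive today acts once
            action = rng.randrange(4)   # 0, 1, 2 or 3, each with probability 1/4
            if action == 0:
                change -= 1             # the alien dies
            elif action == 2:
                change += 1             # splits into two: net +1
            elif action == 3:
                change += 2             # splits into three: net +2
            # action == 1: the alien does nothing
        n += change
    return n

def estimate_extinction(iterations, max_pop, seed=None):
    """Fraction of simulated populations that ended at zero (hypothetical helper)."""
    rng = random.Random(seed)
    results = [simulate_once(max_pop, rng) for _ in range(iterations)]
    return results.count(0) / iterations
```

Calling `estimate_extinction(100_000, 100)` would correspond to the setup discussed in this thread. Expect it to be slow, since every alien costs one pure-Python loop iteration.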

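Results

I'm fine with the results, but I can't say the same about the time the code takes: about 16-17 seconds :)

How can I improve the speed? How can I optimize the loops (especially the while loop)? Maybe there is a better approach or a better model?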

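You can vectorize the inner loop by generating all n random integers at once with numpy (which is much faster), and get rid of all the if statements by using arithmetic instead of boolean logic: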

while...: 
    #population changes by (-1, 0, +1, +2) for each alien
    n += np.random.randint(-1,3, size=n).sum()

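Using your exact code for everything else (you could probably find other optimizations elsewhere), this one change took me from 21.2 seconds to 4.3 seconds.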
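To try the vectorized update end to end, a self-contained sketch could look like the one below. It is not the asker's code with the change applied, since that code is not shown here. The function names and the `seed` parameter are made up for illustration, and it uses NumPy's newer `default_rng` / `Generator.integers` API instead of `np.random.randint`, which is a substitution on my part.

```python
import numpy as np

def simulate_once_vectorized(max_pop, rng):
    """One population run; all aliens of a day are drawn in a single NumPy call."""
    n = 1
    while 0 < n < max_pop:
        # one draw per living alien from {-1, 0, +1, +2}; the upper bound 3 is exclusive
        n += int(rng.integers(-1, 3, size=n).sum())
    return n

def estimate_extinction_vectorized(iterations, max_pop, seed=None):
    """Fraction of runs that end with zero aliens (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    extinct = sum(simulate_once_vectorized(max_pop, rng) == 0
                  for _ in range(iterations))
    return extinct / iterations
```

The outer loop over repetitions and the `while` loop over days stay in Python; only the per-alien loop is replaced by array arithmetic.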

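Without changing the algorithm (i.e. solving the problem with a method other than Monte Carlo), I can't see any other sweeping change that would make it faster until you start compiling to machine code (which, fortunately, is very easy if you have numba installed).

I won't give a full tutorial on the just-in-time compilation that numba performs, but I'll share my code and note the changes I made: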

from time import time
import numpy as np
from numpy.random import randint
from numba import njit, int32, prange

@njit('i4(i4)')
def simulate(pop_max): #move simulation of one population to a function for parallelization
    n = 1
    while 0 < n < pop_max:
        n += np.sum(randint(-1,3,n))
    return n

@njit('i4[:](i4,i4)', parallel=True)
def solve(pop_max, iter_max):
    #this could be easily simplified to just return the ratio of populations that die off vs survive to pop_max
    # which would save you some ram (though the speed is about the same)
    results = np.zeros(iter_max, dtype=int32) #numba needs int32 here rather than python int
    for i in prange(iter_max): #prange specifies that this loop can be parallelized
        results[i] = simulate(pop_max)
    return results

pop_max = 100
iter_max = 100000

t = time()
print( np.bincount(solve(pop_max, iter_max))[0] / iter_max )
print('time elapsed: ', time()-t)


On my system, compiling with parallelization brings the computation time down to about 0.15 seconds.
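As a sanity check on the simulated value, the puzzle can also be treated as a standard branching process. This is not part of the answer above, just a sketch under that assumption. Each alien leaves 0, 1, 2 or 3 aliens for the next day with probability 1/4 each, so the extinction probability q is the smallest non-negative solution of q = (1 + q + q^2 + q^3) / 4. Rearranged, q^3 + q^2 - 3q + 1 = (q - 1)(q^2 + 2q - 1) = 0, which gives q = sqrt(2) - 1, about 0.4142. The function names below are hypothetical; the code finds the same root numerically by iterating the generating function from 0.

```python
import math

def offspring_pgf(s):
    """Generating function of one alien's offspring count (0, 1, 2 or 3, each with probability 1/4)."""
    return (1 + s + s**2 + s**3) / 4

def extinction_probability(iterations=200):
    """Iterate q <- f(q) from q = 0; q after k steps is P(extinct within k days)."""
    q = 0.0
    for _ in range(iterations):
        q = offspring_pgf(q)
    return q

# Closed-form root for comparison: sqrt(2) - 1
closed_form = math.sqrt(2) - 1
```

A Monte Carlo estimate with a population cap of 100 should land close to this value, because a population that has reached 100 aliens almost never dies out afterwards.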
A solution without numpy; 100k simulations take about 5 seconds:
from random import choices

def simulate_em():
    def spwn(aliens):
        return choices(range(-1,3), k=aliens)

    aliens = {1:1}
    i = 1
    while aliens[i] > 0 and aliens[i] < 100:    
        i += 1
        num = aliens[i-1]
        aliens[i] = num + sum(spwn(num))

    # commented for speed
    # print(f"Round {i:<5} had {aliens[i]:>20} alien alive.")
    return (i,aliens[i])

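So 100k test runs take about 5 seconds. The timing loop and the output of 2 sample runs are listed at the end of this page.

By adding sum(choices(range(-1,3), k=number_of_aliens)) to the current number of aliens, you might be able to do the summation more easily and fill your record faster. If the number of aliens drops to 0, they are extinct.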
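
Comments:

You could also post this on the site where, I imagine, the speed freaks all hang out ;-)

@BramVanroy Good call, will do.

Yours took 4.3 seconds, including numba? My pure-Python, no-numpy solution finishes 100k samples in 4.84 s, though I don't know what specs the PyFiddle env has.

@PatrickArtner OP's solution: 21.2 s; my solution: 4.3 s; my numba solution: 0.6 s (including compilation). (i5-6300U @ 2.4 GHz)

You can use cache=True to avoid the recompilation step every time you restart the interpreter...

@max9111 Caching is incompatible with loop parallelization, which in this case gives us a much greater speedup.

I didn't know we could use randint. That alone sped things up a lot. I didn't know about numba either, thanks for the insight!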

from datetime import datetime

t = datetime.now()    
d = {}
wins = 0
test = 100000
for k in range(test):
    d[k] = simulate_em()
    wins += d[k][1]>=100

print(1-wins/test)         # 0.41532
print(datetime.now()-t)    # 0:00:04.840127

Round 1     had                    1 alien alive.
Round 2     had                    3 alien alive.
Round 3     had                    6 alien alive.
Round 4     had                    9 alien alive.
Round 5     had                    7 alien alive.
Round 6     had                   13 alien alive.
Round 7     had                   23 alien alive.
Round 8     had                   20 alien alive.
Round 9     had                   37 alien alive.
Round 10    had                   54 alien alive.
Round 11    had                   77 alien alive.
Round 12    had                  118 alien alive.

Round 1     had                    1 alien alive.
Round 2     had                    0 alien alive.