Python Monty Hall: multiprocessing is slower than direct processing


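I'm trying out multiprocessing for my Monty Hall game simulation to improve performance. The game is played 10 million times and takes about 17 seconds when run directly; my multiprocessing implementation, however, takes much longer. I'm clearly doing something wrong, but I can't figure out what.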

import multiprocessing
from MontyHall.game import Game
from MontyHall.player import Player
from Timer.timer import Timer


def doWork(input, output):
    while True:
        try:
            f = input.get(timeout=1)
            res = f()
            output.put(res)
        except:
            break


def main():
    # game setup
    player_1 = Player(True) # always switch strategy
    game_1 = Game(player_1)

    input_queue = multiprocessing.Queue()
    output_queue = multiprocessing.Queue()

    # total simulations
    for i in range(10000000):
        input_queue.put(game_1.play_game)

    with Timer('timer') as t:
        # initialize 5 child processes
        processes = []
        for i in range(5):
            p = multiprocessing.Process(target=doWork, args=(input_queue, output_queue))
            processes.append(p)
            p.start()
        
        # terminate the processes
        for p in processes:
            p.join()

        results = []
        while len(results) != 10000000:
            r = output_queue.get()
            results.append(r)

    win = results.count(True) / len(results)
    loss = results.count(False) / len(results)

    print(len(results))
    print(win)
    print(loss)


if __name__ == '__main__':
    main()
This is my first post. Advice on posting etiquette is also appreciated. Thanks, everyone.

Code for the classes:

import random


class Player(object):
    def __init__(self, switch_door=False):
        self._switch_door = switch_door

    @property
    def switch_door(self):
        return self._switch_door

    @switch_door.setter
    def switch_door(self, iswitch):
        self._switch_door = iswitch

    def choose_door(self):
        return random.randint(0, 2)

class Game(object):
    def __init__(self, player):
        self.player = player

    def non_prize_door(self, door_with_prize, player_choice):
        """Returns a door that doesn't contain the prize and that isn't the players original choice"""
        x = 1
        while x == door_with_prize or x == player_choice:
            x = (x + 1) % 3  # assuming there are only 3 doors. Can be modified for more doors
        return x

    def switch_function(self, open_door, player_choice):
        """Returns the door that isn't the original player choice and isn't the opened door """
        x = 1
        while x == open_door or x == player_choice:
            x = (x + 1) % 3  # assuming there are only 3 doors. Can be modified for more doors
        return x

    def play_game(self):
        """Game Logic"""
        # randomly places the prize behind one of the three doors
        door_with_prize = random.randint(0, 2)

        # player chooses a door
        player_choice = self.player.choose_door()

        # host opens a door that doesn't contain the prize
        open_door = self.non_prize_door(door_with_prize, player_choice)

        # final player choice
        if self.player.switch_door:
            player_choice = self.switch_function(open_door, player_choice)

        # Result
        return player_choice == door_with_prize

Code for running it without multiprocessing:


from MontyHall.game import Game
from MontyHall.player import Player
from Timer.timer import Timer


def main():

    # Setting up the game
    player_2 = Player(True)  # always switch
    game_1 = Game(player_2)

    # Testing out the hypothesis
    with Timer('timer_1') as t:
        results = []
        for i in range(10000000):
            results.append(game_1.play_game())

        win = results.count(True) / len(results)
        loss = results.count(False) / len(results)

        print(
            f'When switch strategy is {player_2.switch_door}, the win rate is {win:.2%} and the loss rate is {loss:.2%}')

if __name__ == '__main__':
    main()
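Since you haven't given us complete code that we can run locally, I can only speculate. My guess is that you are passing an object (a method of your game) to the other processes, so pickling and unpickling take too much time. Unlike multithreading, where data can be "shared", multiprocessing has to pack the data up and send it to the other process.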

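However, there is one rule I always follow when I try to optimize my code: profile before you optimize! Knowing what is slow is much better than guessing.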

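This is a multiprocessing program, so there aren't many profiling tools to choose from. You could try viztracer, which supports multiprocessing: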

pip install viztracer
viztracer --log_multiprocess your_program.py
It will generate a result.html, which you can open with Chrome. Alternatively, you can run:

vizviewer result.html
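I would suggest reducing the number of iterations so that you can see the whole picture (viztracer uses a circular buffer, and 10 million iterations will definitely overflow it). Even if you don't, you still get the last part of your program's execution, which should be enough to help you understand what is going on.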

I ran viztracer on it, since you have now given the whole code.

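This is one iteration in the worker process. As you can see, the part that does the actual work is very small (the yellowish slice in the middle, `P.`). Most of the time is spent receiving and putting data, which wipes out the advantage of parallelization.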
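To get a rough feel for that overhead without a profiler, one option is to time a cheap function called directly against the same number of round trips through a `multiprocessing.Queue`. This is only an illustrative sketch: `play_once`, `time_direct` and `time_queue_round_trip` are made-up names, `play_once` is a stand-in rather than the real game logic, and the absolute numbers will vary by machine.

```python
import multiprocessing
import random
import time


def play_once():
    """Stand-in for one cheap unit of work (NOT the real game logic)."""
    return random.randint(0, 2) == random.randint(0, 2)


def time_direct(n):
    """Time n direct calls of the cheap function."""
    start = time.perf_counter()
    for _ in range(n):
        play_once()
    return time.perf_counter() - start


def time_queue_round_trip(n):
    """Time n put/get round trips of a small item through a multiprocessing.Queue."""
    q = multiprocessing.Queue()
    start = time.perf_counter()
    for i in range(n):
        q.put(i)  # pickled and written to a pipe by a background feeder thread
        q.get()   # read back from the pipe and unpickled
    elapsed = time.perf_counter() - start
    q.close()
    q.join_thread()
    return elapsed


if __name__ == '__main__':
    n = 20000
    print(f'direct calls:      {time_direct(n):.3f} s')
    print(f'queue round trips: {time_queue_round_trip(n):.3f} s')
```

If the queue round trips cost much more than the direct calls on your machine, sending one item per game cannot pay off, however many workers you start.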

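The right way to do this is in batches. Also, since this game doesn't actually need any input data, you should simply tell each process "I want this done 1000 times" and let it get on with it, instead of sending the method over one call at a time.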
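A minimal sketch of the batching idea, assuming a `multiprocessing.Pool` instead of hand-rolled queues. The names `play_game_switch`, `run_batch`, `split_batches` and `simulate` are hypothetical, and the game is reduced to the always-switch strategy (switching wins exactly when the first pick was wrong) so that the example is self-contained; it is not the `Game` class from the question.

```python
import multiprocessing
import random


def play_game_switch():
    """One Monty Hall round with the always-switch strategy.

    Switching wins exactly when the first pick was not the prize door.
    """
    prize = random.randint(0, 2)
    choice = random.randint(0, 2)
    return choice != prize


def run_batch(num_games):
    """Play num_games rounds in this process and return only the win count."""
    return sum(play_game_switch() for _ in range(num_games))


def split_batches(total_games, num_batches):
    """Split total_games into num_batches sizes that add up to total_games."""
    base, extra = divmod(total_games, num_batches)
    return [base + (1 if i < extra else 0) for i in range(num_batches)]


def simulate(total_games, num_workers=4):
    """Run one batch per worker; only small integers cross the process boundary."""
    batches = split_batches(total_games, num_workers)
    with multiprocessing.Pool(num_workers) as pool:
        wins = sum(pool.map(run_batch, batches))
    return wins / total_games


if __name__ == '__main__':
    print(f'win rate when switching: {simulate(1_000_000):.2%}')
```

Each worker receives one integer and returns one integer, so the pickling cost is negligible compared with the simulation itself.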

With viztracer you can also easily spot another interesting problem:


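This is the big picture of the worker process. Notice the large stretch of "nothing" at the end? Your workers rely on a timeout to finish, and that is the time they spend waiting for it. You should come up with a better way to shut the worker processes down gracefully.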
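One common way to avoid waiting on a timeout is to put an explicit "no more work" marker on the task queue, one per worker. The sketch below assumes `None` as that marker and uses hypothetical names (`SENTINEL`, `run_batch`, `worker`, `simulate`); the game is again simplified to the always-switch strategy rather than the classes from the question.

```python
import multiprocessing
import random

SENTINEL = None  # assumed "no more work" marker; any value that is never a real task works


def run_batch(num_games):
    """Play num_games always-switch rounds and return the number of wins."""
    return sum(random.randint(0, 2) != random.randint(0, 2) for _ in range(num_games))


def worker(tasks, results):
    """Process batch sizes from tasks until the sentinel arrives, then return."""
    # iter(callable, sentinel) keeps calling tasks.get() until it returns SENTINEL
    for num_games in iter(tasks.get, SENTINEL):
        results.put(run_batch(num_games))


def simulate(batches, num_workers=4):
    """Run every batch size in the list `batches` and return the total win count."""
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    for num_games in batches:
        tasks.put(num_games)
    for _ in range(num_workers):
        tasks.put(SENTINEL)  # one sentinel per worker so that every worker stops

    procs = [multiprocessing.Process(target=worker, args=(tasks, results))
             for _ in range(num_workers)]
    for p in procs:
        p.start()

    # Drain the results before joining: a child that still has queued data
    # to flush may not exit, so joining first can deadlock.
    wins = sum(results.get() for _ in batches)
    for p in procs:
        p.join()
    return wins


if __name__ == '__main__':
    batches = [250_000] * 4
    print(f'win rate when switching: {simulate(batches) / sum(batches):.2%}')
```

Each worker exits as soon as it reads its sentinel, so there is no idle second at the end, and the parent knows exactly how many results to expect.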

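I've updated my code. I had fundamentally misunderstood how multiprocessing should be used.

import multiprocessing

from MontyHall.game import Game
from MontyHall.player import Player
from Timer.timer import Timer


def do_work(input, output):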
    """Generic function that takes an input function and argument and runs it"""
    while True:
        try:
            f, args = input.get(timeout=1)
            results = f(*args)
            output.put(results)
        except:
            output.put('Done')
            break


def run_sim(game, num_sim):
    """Runs the game the given number of times"""
    res = []
    for i in range(num_sim):
        res.append(game.play_game())
    return res


def main():
    input_queue = multiprocessing.Queue()
    output_queue = multiprocessing.Queue()
    g = Game(Player(False))  # set up game and player
    num_sim = 2000000

    for i in range(5):
        input_queue.put((run_sim, (g, num_sim)))  # run sim with game object and number of simulations passed into
    # the queue

    with Timer('Monty Hall Timer: ') as t:
        processes = []  # list to save processes
        for i in range(5):
            p = multiprocessing.Process(target=do_work, args=(input_queue, output_queue))
            processes.append(p)
            p.start()

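        results = []
        done_count = 0
        # every worker sends 'Done' exactly once, so wait for all of them
        # instead of stopping at the first one
        while done_count < len(processes):
            r = output_queue.get()
            if r != 'Done':
                results.append(r)
            else:
                done_count += 1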

        # terminate processes
        for p in processes:
            p.terminate()

    # combining the five returned list
    flat_list = [item for sublist in results for item in sublist]
    print(len(flat_list))
    print(len(results))


if __name__ == '__main__':
    main()

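How can we reproduce your results? You haven't provided all of the code. Please also add a code example showing how you do this without multiprocessing. Thanks.

While not exactly what I was looking for, your answer helped me see the bigger picture and arrive at a solution.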