How does Python multiprocessing.Process() know how many concurrent processes to open?

python multithreading parallel-processing

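I am running a script that gets a list of database tables, checks the row count of each table, and appends the result of each query to a dictionary. I am using multiprocessing to speed it up: Manager creates a shareable list and a shareable dictionary that the processes can read from and append to, and Process sets up the processes.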

from multiprocessing import Process, Manager

def main():
    mgr = Manager()
    # Function to get the list of tables
    table_list = mgr.list(get_table_list())

    counts = mgr.dict()
    for table in table_list:
        # get_table_count runs a 'SELECT COUNT(*) FROM <table>' and appends
        # the result to the counts dict
        p = Process(target=select_star, args=(table, counts, 'prod'))
        p.start()
        p.join()

2 - Using Pool.map() with functools.partial()

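multiprocessing.Process has no knowledge of how many other processes are open, and it does nothing to manage the number of running Process objects. You need to use multiprocessing.Pool to get that functionality.

When you use Process directly, the subprocess starts as soon as you call p.start(), and the parent waits for it to exit when you call p.join(). So in your sample code you only ever run one process at a time, but over the course of the loop you launch len(table_list) different processes.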

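This is not a good approach: because you only launch one process at a time, nothing actually runs concurrently. It will end up slower than the regular single-process approach because of the overhead of launching the subprocesses and accessing the Manager.dict. You should just use a Pool instead: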
from functools import partial
from multiprocessing import Manager, Pool

def select_star(table, counts, type_):
    # counts and type_ are always the shared counts dict and "prod", respectively.
    pass

def main():
    mgr = Manager()
    counts = mgr.dict()

    # Bind the extra parameters by keyword so that map() supplies each table
    # name as the first positional argument of select_star.
    func = partial(select_star, counts=counts, type_="prod")
    with Pool() as pool:  # With no argument, the worker count is based on the CPU count.
        # No need for a managed list: the whole list is never passed to the children.
        # get_table_list() is the helper from the question.
        pool.map(func, get_table_list())

if __name__ == "__main__":
    main()
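As a variation on the Pool approach, here is a minimal runnable sketch in which each worker returns its result instead of writing to a Manager.dict, and the pool size is set explicitly. The names `count_rows`, `collect_counts`, `FAKE_ROW_COUNTS`, and the `workers` parameter are hypothetical and stand in for the real database query; only `Pool`, `Pool.map`, and `functools.partial` come from the standard library.

```python
from functools import partial
from multiprocessing import Pool

# Hypothetical stand-in data; a real version would query the database.
FAKE_ROW_COUNTS = {"users": 3, "orders": 5, "items": 8}

def count_rows(table, env):
    """Return (table, row_count) for one table in the given environment."""
    # Assumption: the real code would run 'SELECT COUNT(*) FROM <table>' here.
    return table, FAKE_ROW_COUNTS[table]

def collect_counts(tables, env="prod", workers=None):
    """Count rows for every table using at most `workers` processes.

    workers=None lets Pool choose its default, which is based on the CPU count.
    """
    func = partial(count_rows, env=env)  # map() fills in `table`
    with Pool(processes=workers) as pool:
        # map() returns results in the same order as the input tables.
        return dict(pool.map(func, tables))

if __name__ == "__main__":
    print(collect_counts(["users", "orders", "items"], workers=2))
```

Returning values through `map()` avoids the round trip to the manager process on every write, and the `processes` argument is where the number of concurrent workers is actually decided.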

From the documentation:

In multiprocessing, processes are spawned by creating a Process object and then calling its start() method.

In short, it does not manage the number of open processes. It just spawns a process when you call start().
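To illustrate that nothing caps the number of Process objects, the following sketch starts several children that block on an Event and then checks how many are alive at once. The helper names `wait_for_release` and `count_started` are made up for this example.

```python
import multiprocessing as mp

def wait_for_release(release):
    # Each child simply blocks until the parent sets the event.
    release.wait()

def count_started(n):
    """Start n processes at once and return how many are alive together."""
    release = mp.Event()
    procs = [mp.Process(target=wait_for_release, args=(release,)) for _ in range(n)]
    for p in procs:
        p.start()  # Spawns immediately; nothing limits how many are running.
    alive = sum(p.is_alive() for p in procs)
    release.set()  # Let every child exit.
    for p in procs:
        p.join()
    return alive

if __name__ == "__main__":
    print(count_started(4))
```

Every call to `start()` produces a live child regardless of how many already exist, so any limit has to come from the calling code or from a Pool.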
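As an alternative to using a Pool, he could also stop calling join immediately after each call to start. Instead, call join on all the processes in a separate loop after all of them have been started.

@beetea True, but that would result in len(table_list) processes running concurrently, which could be hundreds or thousands of processes. That would almost certainly bring the system to a halt, and it adds a lot of process start-up overhead. You really never need more than multiprocessing.cpu_count() processes running concurrently.

He was just suggesting another option, since he seems to realize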
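The start-everything-then-join alternative discussed in the comments above can be sketched as follows. This is only an illustration under assumptions: `record_length` and `run_all` are hypothetical names, and the "count" stored is just the length of the name rather than a real query result.

```python
from multiprocessing import Manager, Process

def record_length(name, counts):
    # Hypothetical stand-in for the real query: stores len(name) as the "count".
    counts[name] = len(name)

def run_all(names):
    """Start one process per name, then join them all in a second loop."""
    with Manager() as mgr:
        counts = mgr.dict()
        procs = [Process(target=record_length, args=(n, counts)) for n in names]
        for p in procs:
            p.start()  # All processes are launched before any join.
        for p in procs:
            p.join()   # Wait for every process in a separate loop.
        # Copy the results out before the manager shuts down.
        return dict(counts)

if __name__ == "__main__":
    print(run_all(["users", "orders"]))
```

This does run the work concurrently, but it launches one process per item, which is exactly the concern raised in the reply: with hundreds of tables, a Pool with a bounded number of workers is the safer choice.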