How to parallelize or use multiple cores to speed up a while loop in Python?


python python-3.x


I have an instance with a 16-core processor, and I have a while loop like the one below:

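import os

import numpy as np
from tqdm import tqdm

count = 200000
num = 0

pbar = tqdm(total=count)
lst = []
while num < count:  # stop once exactly `count` files are collected
    # `path` is the root folder holding the sub-folders (defined elsewhere)
    random_folder = os.path.join(path, np.random.choice(os.listdir(path)))
    # pick the file from inside the chosen sub-folder, not from `path`
    file_path = os.path.join(random_folder, np.random.choice(os.listdir(random_folder)))
    if not os.path.isdir(file_path):
        lst.append(file_path)
        pbar.update(1)
        num += 1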

You can use multiprocessing to make use of all the cores at the same time.

Something like this:
from multiprocessing import Pool

def get_random_file(num_of_files):
    # your logic goes here
    count = 0
    random_files = []
    while count < num_of_files:
        count += 1
        # get a random file and append it to 'random_files'
    return random_files

if __name__ == '__main__':
    total, workers = 200000, 16
    with Pool(workers) as p:
        # one integer chunk size per worker (200000 // 16 == 12500)
        num_of_files = [total // workers for _ in range(workers)]
        random_files = p.map(get_random_file, num_of_files)
        # random_files is a list of lists - merge them into one list
        merged = [f for chunk in random_files for f in chunk]
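The placeholder in `get_random_file` could be filled in roughly as follows. This is only a sketch under a few assumptions: `root` plays the role of the question's `path` and contains at least one sub-folder with at least one file, the helper names `pick_random_files`, `split_evenly` and `collect_random_files` are made up for illustration, and the standard-library `random` module stands in for `np.random.choice`. Each worker gets its own seed so that the processes do not all draw the same sequence.

```python
import os
import random
from multiprocessing import Pool


def pick_random_files(args):
    """Worker: pick `n` random non-directory paths two levels below `root`."""
    root, n, seed = args
    rng = random.Random(seed)  # per-worker seed so workers draw different files
    folders = [entry.path for entry in os.scandir(root) if entry.is_dir()]
    picked = []
    while len(picked) < n:
        folder = rng.choice(folders)
        names = os.listdir(folder)
        if not names:  # skip empty sub-folders
            continue
        candidate = os.path.join(folder, rng.choice(names))
        if not os.path.isdir(candidate):
            picked.append(candidate)
    return picked


def split_evenly(total, workers):
    """Split `total` into `workers` integer chunk sizes that sum to `total`."""
    base, extra = divmod(total, workers)
    return [base + (1 if i < extra else 0) for i in range(workers)]


def collect_random_files(root, total, workers):
    """Run the workers in parallel and flatten their results into one list."""
    tasks = [(root, n, seed) for seed, n in enumerate(split_evenly(total, workers))]
    with Pool(workers) as pool:
        chunks = pool.map(pick_random_files, tasks)
    return [path for chunk in chunks for path in chunk]
```

For the question's numbers this would be called as `collect_random_files(path, 200000, 16)` from under an `if __name__ == '__main__':` guard. Note that the work here is mostly directory listing, so the actual speed-up depends on the file system rather than only on the core count.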

What is your code doing?

@BalajiAmbresh So, the path variable is a folder name that contains multiple sub-folders. I pick a random folder from those sub-folders and store it in a variable called random_folder. After getting a random folder, I get the list of all files in that random folder (mostly .pdf files), pick a random file from it, and store it in the file_path variable. I keep doing this until the count reaches 200000.

I guess you want to pick count number of pdf files. Make sure to account for duplicates. For example, two iterations of the loop can pick the same pdf file under the same folder.

@BalajiAmbresh Yes, of course. I completely forgot about that too.

In my case, what should the input to the function get_random_file be? How do I write my while condition here? Also, how do I stop it when it reaches the count value of 200000?

Code was updated. Each copy of the function will create 200000/16 random files. These will run in parallel.

Thanks a lot. Let me check.

@user_12 I am not very familiar with tqdm. First, make it work without tqdm. If you run into problems, please share the code. (The code was modified - have a look and try again.)
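The duplicate concern raised in the comments can be handled by collecting picks in a set. A minimal single-process sketch, assuming the same folder layout as in the question; `pick_unique_files` and its parameters are hypothetical names, not part of the original answer. It lists every sub-folder once up front, which also avoids calling `os.listdir` on every iteration, and it checks that enough files exist so the loop cannot run forever.

```python
import os
import random


def pick_unique_files(root, count, seed=None):
    """Pick `count` distinct files: random sub-folder first, then random file."""
    rng = random.Random(seed)
    # List each sub-folder once; keep only sub-folders that contain files.
    listing = {}
    for entry in os.scandir(root):
        if entry.is_dir():
            files = [f.path for f in os.scandir(entry.path) if f.is_file()]
            if files:
                listing[entry.path] = files
    available = sum(len(files) for files in listing.values())
    if count > available:
        raise ValueError(f"asked for {count} files but only {available} exist")
    folders = list(listing)
    seen = set()
    while len(seen) < count:
        folder = rng.choice(folders)
        seen.add(rng.choice(listing[folder]))  # a repeated pick changes nothing
    return sorted(seen)
```

When `count` is close to the number of available files, most draws are repeats and the loop slows down; in that case sampling directly from the full file list with `random.sample` may be the better choice, at the cost of a different selection distribution.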