
Multiprocessing in a Python loop


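I am generating negative pairs from positive pairs. I want to speed the process up by using all of the CPU's cores. On a single CPU core it takes almost five days, running day and night.

I would like to convert the following code to use multiprocessing. At the moment I also do not have the "positives_negatives.csv" file.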

Code to be modified

def multi_func(iden, negatives):
    for combo in tqdm(itertools.combinations(iden.values(), 2), desc="Negatives"):
        for cross_sample in itertools.product(combo[0], combo[1]):
            negatives = negatives.append(pd.Series({"file_x": cross_sample[0], "file_y": cross_sample[1]}).T,
                                         ignore_index=True)
Used as follows

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    with concurrent.futures.ProcessPoolExecutor() as executor:
        secs = [5, 4, 3, 2, 1]
        results = executor.map(multi_func(identities, negatives), secs)

    negatives["decision"] = "No"
    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)

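The best approach is to use the ProcessPoolExecutor class and move the per-pair work into a separate function. You could implement it like this: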

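import itertools
from concurrent.futures import ProcessPoolExecutor
from os import cpu_count
from pathlib import Path

import more_itertools
import pandas as pd
from tqdm import tqdm

def compute_cross_samples(x):
    # x is a pair of file lists; build every (file_x, file_y) combination
    return pd.DataFrame(itertools.product(*x), columns=["file_x", "file_y"])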
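
To make the unit of work concrete, here is a minimal sketch of what a single call to the worker function operates on. It uses only the standard library and plain tuples instead of a DataFrame; the `identities` dict and its file names are hypothetical toy data, not the asker's real input.

```python
import itertools

# Hypothetical toy data: identity name -> list of image files
identities = {
    "alice": ["a1.jpg", "a2.jpg"],
    "bob": ["b1.jpg"],
}

# One unit of work is a pair of file lists, one list per identity
pair = (identities["alice"], identities["bob"])

# itertools.product(*pair) yields every cross-identity file pairing
rows = list(itertools.product(*pair))
```

Each tuple in `rows` corresponds to one (file_x, file_y) row of the negatives table.
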
Modified code

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    # collect results and concatenate once; DataFrame.append was removed in pandas 2.0
    frames = [negatives]
    with ProcessPoolExecutor() as pool:
        # take cpu_count combinations from identities.values
        for combos in tqdm(more_itertools.ichunked(itertools.combinations(identities.values(), 2), cpu_count())):
            # for each chunk of combinations, compute the cross products in the worker processes
            for cross_samples in pool.map(compute_cross_samples, combos):
                # each "cross_samples" is a DataFrame of file pairs; queue it for concatenation
                frames.append(cross_samples)
    negatives = pd.concat(frames)

    negatives["decision"] = "No"

    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)
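
For readers who want to try the pattern without the original data, the following is a rough, self-contained sketch of the same idea. It is not the answer's exact code: `cross_pairs`, `build_negatives`, and `TOY_IDENTITIES` are hypothetical names, results are plain tuples rather than DataFrames, and batching is done with the `chunksize` argument of `Executor.map` instead of `more_itertools.ichunked`. Note also that the function handed to `map` must be the function object itself, not the result of calling it.

```python
import itertools
from concurrent.futures import ProcessPoolExecutor


def cross_pairs(pair):
    """Return every (file_x, file_y) combination for one pair of identities."""
    files_a, files_b = pair
    return list(itertools.product(files_a, files_b))


def build_negatives(identities, pool, chunksize=64):
    """Fan identity pairs out to `pool` and flatten the results into one list."""
    pairs = itertools.combinations(identities.values(), 2)
    negatives = []
    # chunksize batches several pairs per task when a process pool is used
    for rows in pool.map(cross_pairs, pairs, chunksize=chunksize):
        negatives.extend(rows)
    return negatives


# Hypothetical toy data standing in for the real `identities` dict
TOY_IDENTITIES = {
    "alice": ["a1.jpg", "a2.jpg"],
    "bob": ["b1.jpg"],
    "carol": ["c1.jpg", "c2.jpg"],
}

if __name__ == "__main__":
    # The guard matters on platforms that start worker processes with "spawn"
    with ProcessPoolExecutor() as pool:
        negatives = build_negatives(TOY_IDENTITIES, pool)
    print(len(negatives))
```

Passing the pool in as an argument keeps the worker function at module level, which it needs to be so that it can be pickled and sent to the worker processes.
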

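Your best bet is to break the work into subgroups and then use multiprocessing from there.

If possible, please give me an example related to the "else" clause.

Actually... maybe start from ...?

Actually, I have tried that twice, but neither attempt worked. Could you add some more detail?

What kind of speedup did you see? Also, you should be able to mark your own answer as accepted after 2 days, using the check mark on the left.

@ Yes. I will add more details later.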
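
The subgroup idea from the comments can be sketched with the standard library alone. This is only an illustration under assumptions: `chunked` is a hypothetical helper playing the role of `more_itertools.ichunked`, and the `identities` dict is toy data.

```python
import itertools


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical toy identities; each value is a list of file names
identities = {"a": ["a1"], "b": ["b1"], "c": ["c1"], "d": ["d1"]}

# All unordered pairs of identities, split into subgroups of at most 4 pairs
pairs = itertools.combinations(identities.values(), 2)
chunks = list(chunked(pairs, 4))
```

Each element of `chunks` is a batch of identity pairs that could be handed to a process pool in one go.
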