Why does my Python multiprocessing code return the same random number results?


I am analyzing a large graph, so I split it into several chunks, hoping the computation would be faster on a multi-core CPU. However, my model is stochastic, so the results should differ from run to run. While testing this idea I kept getting identical results, so I am wondering whether my code is correct.

Here is my code:

from multiprocessing import Process, Queue

# split a list into evenly sized chunks

def chunks(l, n):
    return [l[i:i+n] for i in range(0, len(l), n)]

def multiprocessing_icm(queue, nodes):
    queue.put(independent_cascade_igraph(twitter_igraph, nodes, steps=1))

def dispatch_jobs(data, job_number):
    total = len(data)
    chunk_size = total // job_number   # integer division: chunk size must be an int
    slices = chunks(data, chunk_size)  # renamed from `slice`, which shadows the built-in
    jobs = []
    queue = Queue()
    for s in slices:
        j = Process(target=multiprocessing_icm, args=(queue, s))
        jobs.append(j)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()

    # the results still have to be read out with queue.get(); for large results,
    # read them before joining so the children are not blocked from exiting
    return queue

dispatch_jobs(['121817564', '121817564'], 2)
In case you are wondering what independent_cascade_igraph is, here is the code:

import copy
import random

def independent_cascade_igraph(G, seeds, steps=0):
    # init activation probabilities
    for e in G.es():
        if 'act_prob' not in e.attributes():
            e['act_prob'] = 0.1
        elif e['act_prob'] > 1:
            raise Exception("edge activation probability:", e['act_prob'], "cannot be larger than 1")

    # perform diffusion
    A = copy.deepcopy(seeds)  # prevent side effect
    if steps <= 0:
        # perform diffusion until no more nodes can be activated
        return _diffuse_all(G, A)
    # perform diffusion for at most "steps" rounds
    return _diffuse_k_rounds(G, A, steps)

def _diffuse_all(G, A):
    tried_edges = set()
    layer_i_nodes = [ ]
    layer_i_nodes.append([i for i in A])  # prevent side effect
    while True:
        len_old = len(A)
        (A, activated_nodes_of_this_round, cur_tried_edges) = _diffuse_one_round(G, A, tried_edges)
        layer_i_nodes.append(activated_nodes_of_this_round)
        tried_edges = tried_edges.union(cur_tried_edges)
        if len(A) == len_old:
            break
    return layer_i_nodes

def _diffuse_k_rounds(G, A, steps):
    tried_edges = set()
    layer_i_nodes = [ ]
    layer_i_nodes.append([i for i in A])
    while steps > 0 and len(A) < G.vcount():
        len_old = len(A)
        (A, activated_nodes_of_this_round, cur_tried_edges) = _diffuse_one_round(G, A, tried_edges)
        layer_i_nodes.append(activated_nodes_of_this_round)
        tried_edges = tried_edges.union(cur_tried_edges)
        if len(A) == len_old:
            break
        steps -= 1
    return layer_i_nodes

def _diffuse_one_round(G, A, tried_edges):
    activated_nodes_of_this_round = set()
    cur_tried_edges = set()
    for s in A:
        for nb in G.successors(s):
            if nb in A or (s, nb) in tried_edges or (s, nb) in cur_tried_edges:
                continue
            if _prop_success(G, s, nb):
                activated_nodes_of_this_round.add(nb)
            cur_tried_edges.add((s, nb))
    activated_nodes_of_this_round = list(activated_nodes_of_this_round)
    A.extend(activated_nodes_of_this_round)
    return A, activated_nodes_of_this_round, cur_tried_edges

def _prop_success(G, src, dest):
    '''
    act_prob = 0.1
    for e in G.es():
        if (src, dest) == e.tuple:
            act_prob = e['act_prob']
            break
    '''
    return random.random() <= 0.1
But here is an example of what happens if I run independent_cascade_igraph twice:

independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
 [514,
  1773,
  1540,
  1878,
  2057,
  1035,
  1550,
  2064,
  1042,
  533,
  1558,
  1048,
  1054,
  544,
  545,
  1061,
  1067,
  1885,
  1072,
  350,
  1592,
  1460,...

independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
 [1027,
  2055,
  8,
  1452,
  1546,
  1038,
  532,
  1045,
  542,
  546,
  1059,
  549,
  1575,
  1576,
  2030,
  1067,
  1068,
  1071,
  564,
  573,
  575,
  1462,
  584,
  1293,
  1105,
  595,
  599,
  1722,
  1633,
  1634,
  614,
  1128,
  1131,
  1286,
  621,
  1647,
  1648,
  627,
  636,
  1662,
  1664,
  1665,
  130,
  1671,
  1677,
  656,
  1169,
  148,
  1686,
  1690,
  667,
  1186,
  163,
  1700,
  1191,
  1705,
  1711,...

So what I hope to get out of this is: if I have a list of 500 IDs, the first CPU computes the first 250, the second CPU computes the last 250, and then the results are merged. I am not sure whether I have understood multiprocessing correctly.
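For the split/merge part specifically, here is a minimal sketch of the same fan-out done with multiprocessing.Pool, which handles the queueing, joining and result collection. It assumes twitter_igraph, independent_cascade_igraph and chunks are defined as above; icm_worker and dispatch_jobs_pool are made-up names, and the random.seed() call anticipates the answers below.

import random
from multiprocessing import Pool

def icm_worker(nodes):
    # re-seed inside the worker so every process draws different random numbers
    random.seed()
    return independent_cascade_igraph(twitter_igraph, nodes, steps=1)

def dispatch_jobs_pool(data, job_number):
    chunk_size = max(1, len(data) // job_number)
    with Pool(processes=job_number) as pool:
        # one independent-cascade result per chunk, collected in chunk order
        results = pool.map(icm_worker, chunks(data, chunk_size))
    return results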

As already mentioned, on *nix, for example, child processes inherit the state of the RNG from the parent. Call random.seed() in each child process yourself, initializing it either with a per-process seed or randomly.
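A minimal sketch of that fix applied to the worker from the question; the names are unchanged, only the seeding line is added:

import os
import random

def multiprocessing_icm(queue, nodes):
    # children forked on *nix start with a copy of the parent's RNG state,
    # so give every process its own seed before drawing any random numbers
    random.seed(os.getpid())   # or random.seed() to seed from the OS entropy source
    queue.put(independent_cascade_igraph(twitter_igraph, nodes, steps=1))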

I have not read your program in detail, but my general feeling is that you have a random-number-generator seeding problem. If you run the program twice on the same CPU, the second run starts from a different RNG state. But when it runs in two separate worker processes (for example on two different CPUs), both generators may be initialized from the same inherited default seed, and you get identical results.
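A self-contained way to see this effect, assuming a fork-based start method (the default on Linux): the two children that do not re-seed print the same number, while the two that call random.seed() print different ones.

import random
from multiprocessing import Process

def worker(reseed):
    if reseed:
        random.seed()          # re-seed this child from the OS
    print(random.random())

if __name__ == '__main__':
    random.random()            # advance the parent's RNG state before forking
    for reseed in (False, False, True, True):
        p = Process(target=worker, args=(reseed,))
        p.start()
        p.join()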
