为什么我的python多处理代码返回相同的随机数结果?
我在分析一个大图表。因此,我将图形分成若干块,希望使用多核CPU时速度会更快。然而,我的模型是一个随机模型,所以每次运行的结果可能不一样。我正在测试这个想法,我一直得到相同的结果,所以我想知道我的代码是否正确 这是我的密码为什么我的python多处理代码返回相同的随机数结果?,python,multiprocessing,Python,Multiprocessing,我在分析一个大图表。因此,我将图形分成若干块,希望使用多核CPU时速度会更快。然而,我的模型是一个随机模型,所以每次运行的结果可能不一样。我正在测试这个想法,我一直得到相同的结果,所以我想知道我的代码是否正确 这是我的密码 from multiprocessing import Process, Queue # split a list into evenly sized chunks def chunks(l, n): return [l[i:i+n] for i in range
from multiprocessing import Process, Queue
# split a list into evenly sized chunks
def chunks(l, n):
return [l[i:i+n] for i in range(0, len(l), n)]
def multiprocessing_icm(queue, nodes):
queue.put(independent_cascade_igraph(twitter_igraph, nodes, steps=1))
def dispatch_jobs(data, job_number):
total = len(data)
chunk_size = total / job_number
slice = chunks(data, chunk_size)
jobs = []
processes = []
queue = Queue()
for i, s in enumerate(slice):
j = Process(target=multiprocessing_icm, args=(queue, s))
jobs.append(j)
for j in jobs:
j.start()
for j in jobs:
j.join()
return queue
dispatch_jobs(['121817564', '121817564'], 2)
如果您想知道什么是independent\u cascade\u igraph
。这是密码
def independent_cascade_igraph(G, seeds, steps=0):
# init activation probabilities
for e in G.es():
if 'act_prob' not in e.attributes():
e['act_prob'] = 0.1
elif e['act_prob'] > 1:
raise Exception("edge activation probability:", e['act_prob'], "cannot be larger than 1")
# perform diffusion
A = copy.deepcopy(seeds) # prevent side effect
if steps <= 0:
# perform diffusion until no more nodes can be activated
return _diffuse_all(G, A)
# perform diffusion for at most "steps" rounds
return _diffuse_k_rounds(G, A, steps)
def _diffuse_all(G, A):
tried_edges = set()
layer_i_nodes = [ ]
layer_i_nodes.append([i for i in A]) # prevent side effect
while True:
len_old = len(A)
(A, activated_nodes_of_this_round, cur_tried_edges) = _diffuse_one_round(G, A, tried_edges)
layer_i_nodes.append(activated_nodes_of_this_round)
tried_edges = tried_edges.union(cur_tried_edges)
if len(A) == len_old:
break
return layer_i_nodes
def _diffuse_k_rounds(G, A, steps):
tried_edges = set()
layer_i_nodes = [ ]
layer_i_nodes.append([i for i in A])
while steps > 0 and len(A) < G.vcount():
len_old = len(A)
(A, activated_nodes_of_this_round, cur_tried_edges) = _diffuse_one_round(G, A, tried_edges)
layer_i_nodes.append(activated_nodes_of_this_round)
tried_edges = tried_edges.union(cur_tried_edges)
if len(A) == len_old:
break
steps -= 1
return layer_i_nodes
def _diffuse_one_round(G, A, tried_edges):
activated_nodes_of_this_round = set()
cur_tried_edges = set()
for s in A:
for nb in G.successors(s):
if nb in A or (s, nb) in tried_edges or (s, nb) in cur_tried_edges:
continue
if _prop_success(G, s, nb):
activated_nodes_of_this_round.add(nb)
cur_tried_edges.add((s, nb))
activated_nodes_of_this_round = list(activated_nodes_of_this_round)
A.extend(activated_nodes_of_this_round)
return A, activated_nodes_of_this_round, cur_tried_edges
def _prop_success(G, src, dest):
'''
act_prob = 0.1
for e in G.es():
if (src, dest) == e.tuple:
act_prob = e['act_prob']
break
'''
return random.random() <= 0.1
但这里有一个例子,如果我运行两次Independent\u cascade\u igraph
independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
[514,
1773,
1540,
1878,
2057,
1035,
1550,
2064,
1042,
533,
1558,
1048,
1054,
544,
545,
1061,
1067,
1885,
1072,
350,
1592,
1460,...
independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
[1027,
2055,
8,
1452,
1546,
1038,
532,
1045,
542,
546,
1059,
549,
1575,
1576,
2030,
1067,
1068,
1071,
564,
573,
575,
1462,
584,
1293,
1105,
595,
599,
1722,
1633,
1634,
614,
1128,
1131,
1286,
621,
1647,
1648,
627,
636,
1662,
1664,
1665,
130,
1671,
1677,
656,
1169,
148,
1686,
1690,
667,
1186,
163,
1700,
1191,
1705,
1711,...
因此,我希望从中得到的是,如果我有一个500个ID的列表,我希望第一个CPU计算前250个,第二个CPU计算最后250个,然后合并结果。我不确定我是否正确理解了多重处理 如前所述,例如,in*nix子进程继承RNG的状态。调用每个子进程,自己将其初始化为每个进程种子,或随机初始化。尚未详细阅读您的程序,但我的总体感觉是,您可能有随机数生成器种子问题。如果在同一个CPU上运行两次该程序,则第二次运行时,随机数生成器的状态将不同。但是,如果在两个不同的CPU上运行它,可能会使用相同的默认种子初始化生成器,从而得到相同的结果
independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
[514,
1773,
1540,
1878,
2057,
1035,
1550,
2064,
1042,
533,
1558,
1048,
1054,
544,
545,
1061,
1067,
1885,
1072,
350,
1592,
1460,...
independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
[1027,
2055,
8,
1452,
1546,
1038,
532,
1045,
542,
546,
1059,
549,
1575,
1576,
2030,
1067,
1068,
1071,
564,
573,
575,
1462,
584,
1293,
1105,
595,
599,
1722,
1633,
1634,
614,
1128,
1131,
1286,
621,
1647,
1648,
627,
636,
1662,
1664,
1665,
130,
1671,
1677,
656,
1169,
148,
1686,
1690,
667,
1186,
163,
1700,
1191,
1705,
1711,...