Executing functions in parallel in Python using the multiprocessing module


python parallel-processing


I am using the multiprocessing module in Python to run a function in parallel.

The function is named:

Parallel_Solution_Combination_Method(subset, i):

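The subset argument is one element of a list, and each element is a tuple of chromosome objects.

chromosome is a class I defined in the same script. I am running on Lubuntu, a Linux-based operating system. The code I use to try to run the function in parallel is: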

pool = mp.Pool(processes=2)
results = [pool.apply_async(Parallel_Solution_Combination_Method,
                            args=(subsets[i], i))
           for i in range(len(subsets))]

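However, the problem I am running into is that whenever I specify more than one process, the results are not what I expect. For example, if I pass a list of 10 subsets and use: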

processes=2

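then the first two outputs have exactly the same values, outputs 3 and 4 are identical, and so on. If instead I set the number of processes to: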

processes = 1

which is essentially a sequential run, the results are correct, as expected (the same as a plain for loop without multiprocessing).

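I do not understand why my results get mixed up, even though I explicitly send a different tuple from the collection, selected by the for-loop index i: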

 args=(subsets[i],i,)

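I am running on hardware with two cores, so I expected to be able to run two instances of the function in parallel, but instead I get duplicated results. I cannot figure out my mistake. Please help! Thanks.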
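One possible explanation for the pairwise duplicates, offered as an assumption rather than a confirmed diagnosis: on Linux, multiprocessing creates its workers by forking, so each worker can begin with a copy of the parent's random-number state. (Python 3.7 and later reseed the random module in forked children; the Python 2 code in this question predates that.) The Python 3 sketch below imitates two workers that start from the same state. simulated_worker and the candidate list are hypothetical stand-ins, not part of the original script.

```python
import random


def simulated_worker(candidates, inherited_state):
    """Mimic a forked worker that starts from the parent's RNG state."""
    rng = random.Random()
    rng.setstate(inherited_state)  # same starting state as the parent
    return rng.choice(candidates)


# Snapshot of the parent's generator state, taken once before any "fork".
parent_state = random.Random(12345).getstate()

candidates = list(range(100))
first = simulated_worker(candidates, parent_state)
second = simulated_worker(candidates, parent_state)

# Both workers make the same "random" choice because their states match.
print(first == second)  # prints True
```

With two workers, this would produce the same pattern as described: the first task handled by each worker draws the same value, then the second task handled by each worker draws the same value, and so on.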

def Parallel_Solution_Combination_Method(subset, counter):
    print 'entered parallel sol comb'
    child_chromosome = chromosome()
    combination_model_offset = 300

    attempts = 0
    while True:

        template1 = subset[0].record_template
        template2 = subset[1].record_template
        template_child = template1

        template_gap1 = find_allIndices(template1, '-')
        template_gap2 = find_allIndices(template2, '-')

        if(len(template_gap1) !=0 and len(template_gap2) != 0):
            template_gap_difference = find_different_indicies(template_gap1, template_gap2)
            if(len(template_gap_difference) != 0):        
                template_slice_point = random.choice(template_gap_difference)

                if(template_gap2[template_slice_point -1] < template_gap1[template_slice_point]):
                    #swap template1 template2 values as well as their respective gap indices
                    #so that in crossover the gaps would not collide with each other.
                    temp_template = template1
                    temp_gap = template_gap1

                    template1 = template2
                    template2  = temp_template
                    template_gap1 = template_gap2
                    template_gap2 = temp_gap

                #the crossing over takes the first part of the child sequence to be up until
                #the crossing point without including it. this way it ensures that the resulting
                #child sequence is different from both of the parents by at least one point.

                child_template_gap = template_gap1[:template_slice_point]+template_gap2[template_slice_point:]
                child_gap_part1 = child_template_gap[:template_slice_point]
                child_gap_part2 = child_template_gap[template_slice_point:]           

                if template_slice_point == 0:
                    template_child = template2
                else:
                    template_child = template1[:template_gap1[template_slice_point]]            

                    template_residues_part1 = str(template_child).translate(None, '-')
                    template_residues_part2 = str(template2).translate(None, '-')
                    template_residues_part2 = template_residues_part2[len(template_residues_part1):]


                    for i in range(template_gap1[template_slice_point-1], len(template1)):
                        if i in child_gap_part2:
                            template_child = template_child + '-'
                        else:
                            template_child = template_child + template_residues_part2[0:1]
                            template_residues_part2 = template_residues_part2[1:]                       


        target1 = subset[0].record_target
        target2 = subset[1].record_target
        target_child = target1

        target_gap1 = find_allIndices(target1, '-')
        target_gap2 = find_allIndices(target2, '-')    

        if(len(target_gap1) !=0 and len(target_gap2) != 0):
            target_gap_difference = find_different_indicies(target_gap1, target_gap2)
            if(len(target_gap_difference) !=0):
                target_slice_point = random.choice(target_gap_difference)        
                if(target_gap2[target_slice_point -1] < target_gap1[target_slice_point]):
                    #swap target1 target2 values as well as their respective gap indices
                    #so that in crossover the gaps would not collide with each other.
                    temp_target = target1
                    temp_gap = target_gap1

                    target1 = target2
                    target2  = temp_target
                    target_gap1 = target_gap2
                    target_gap2 = temp_gap

                #the crossing over takes the first part of the child sequence to be up until
                #the crossing point without including it. this way it ensures that the resulting
                #child sequence is different from both of the parents by at least one point.

                child_target_gap = target_gap1[:target_slice_point]+target_gap2[target_slice_point:]
                child_gap_part1 = child_target_gap[:target_slice_point]
                child_gap_part2 = child_target_gap[target_slice_point:]           

                if target_slice_point == 0:
                    target_child = target2
                else:
                    target_child = target1[:target_gap1[target_slice_point]]            

                    target_residues_part1 = str(target_child).translate(None, '-')
                    target_residues_part2 = str(target2).translate(None, '-')
                    target_residues_part2 = target_residues_part2[len(target_residues_part1):]


                    for i in range(target_gap1[target_slice_point-1], len(target1)):
                        if i in child_gap_part2:
                            target_child = target_child + '-'
                        else:
                            target_child = target_child + target_residues_part2[0:1]
                            target_residues_part2 = target_residues_part2[1:]    

        # Stop once the child is not already in Reference_Set, or after 100 attempts.
        if not any(y.record_template == template_child and y.record_target == target_child for y in Reference_Set) or attempts >= 100:
            break
        attempts +=1


    child_chromosome.record_template = template_child
    #print template_child
    child_chromosome.record_target = target_child
    #print target_child
    generate_PIR(template_header, template_description, child_chromosome.record_template,
                 target_header, target_description, child_chromosome.record_target)

    output_values = start_model(template_id, target_id, 'PIR_input.ali', combination_model_offset + counter)
    child_chromosome.molpdf_score = output_values['molpdf']
    #print output_values['molpdf']
    mdl = complete_pdb(env, '1BBH.B99990' + str(combination_model_offset + counter) + '.pdb')
    child_chromosome.normalized_dope_score = mdl.assess_normalized_dope()
    #print mdl.assess_normalized_dope()

    return child_chromosome

Note that all of this is in the same Python script.

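Did you write the Parallel_Solution_Combination_Method function? What does it do? It sounds like it is already doing something in parallel, unless it is misnamed. Also, specifically, do you know whether it uses numpy arrays?

Yes, I wrote the function Parallel_Solution_Combination_Method. I named it that way to distinguish it from other functions (a bad name). I think numpy arrays are used: I am using Biopython Seq objects (protein sequences), which are based on NumPy objects. Would they cause any problems when I use multiprocessing? Thanks @MikeMcKerns

If you provide the code for Parallel_Solution_Combination_Method, or at least a simplified representation of it, you may get more help. However, from what you have posted, my first guess is that it is a combination of numpy arrays and random number generators. See: .

Sure, I have edited the post and added the function and the class I defined, whose instances are passed as arguments to the function. Thanks @MikeMcKerns

OK, the code is long... Is this the line you expect to make the parallel runs differ from one another: random.choice(template_gap_difference)? If so, you probably just need to seed the random number generator differently for each parallel call of the function (as I mentioned earlier).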
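The suggestion above, seeding the generator differently for each parallel call, might look like the following Python 3 sketch. seeded_choice, run_parallel, BASE_SEED, and the candidate lists are hypothetical names introduced for illustration. In the original function, the same idea would amount to building a generator from counter at the top of Parallel_Solution_Combination_Method and using its choice method. If NumPy's global generator were also involved, it would need its own seeding, which is not shown here.

```python
import multiprocessing as mp
import random

BASE_SEED = 2014  # hypothetical constant; any integer works


def seeded_choice(candidates, counter):
    """Pick an element using a generator seeded per task, not per process."""
    rng = random.Random(BASE_SEED + counter)  # distinct seed for each call
    return rng.choice(candidates)


def run_parallel(subsets, processes=2):
    """Submit one task per subset and collect results in submission order."""
    with mp.Pool(processes=processes) as pool:
        pending = [pool.apply_async(seeded_choice, args=(subsets[i], i))
                   for i in range(len(subsets))]
        # Fetch every result before the pool is torn down.
        return [p.get() for p in pending]


if __name__ == "__main__":
    subsets = [list(range(1000)) for _ in range(10)]
    print(run_parallel(subsets))
```

Because the seed depends only on counter, each task produces the same result no matter which worker runs it or how many processes the pool has.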
class chromosome():
    """Basic solution representation that holds alignments and its evaluations."""

    def __init__(self):
        self.record_template = ''
        self.record_target = ''
        self.molpdf_score = 0.0
        self.ga341_score = 0.0
        self.dope_score = 0.0
        self.normalized_dope_score = 0.0
        self.flag_value = 0
        self.distance_value = 0

    def add_molpdf(self, molpdf):
        self.molpdf_score = molpdf

    def add_ga341(self, ga341):
        self.ga341_score = ga341

    def add_dope(self, dope):
        self.dope_score = dope

    def add_normalized_dope(self, normalized_dope):
        self.normalized_dope_score = normalized_dope

    def add_records(self, records):
        # template_id and target_id are globals defined elsewhere in the script.
        self.seq_records = records
        for rec in self.seq_records:
            if rec.id == template_id:
                self.record_template = rec.seq
            elif rec.id == target_id:
                self.record_target = rec.seq

    def set_flag(self, flag):
        self.flag_value = flag

    def add_distance(self, distance):
        self.distance_value = distance