Python: why, with py3k multiprocessing on Linux, do I get more threads than the processes I asked for in my pool?


I am trying to parallelize some work that runs fine on my Mac (Python 3.2.2 under Mac OS 10.7) but fails with the errors below on a Linux cluster, where I requested 4 cores and have access to Python 3.2. The error messages keep repeating until I interrupt execution manually:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/n/sw/python-3.2/lib/python3.2/threading.py", line 736, in _bootstrap_inner
    self.run()
  File "/n/sw/python-3.2/lib/python3.2/threading.py", line 689, in run
    self._target(*self._args, **self._kwargs)
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/pool.py", line 338, in _handle_tasks
    put(task)
_pickle.PicklingError: Can't pickle <class 'function'>: attribute lookup builtins.function failed

Process PoolWorker-2:
Process PoolWorker-4:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/process.py", line 259, in _bootstrap
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/process.py", line 259, in _bootstrap
Process PoolWorker-1:
Traceback (most recent call last):
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/process.py", line 259, in _bootstrap
Process PoolWorker-12:
Traceback (most recent call last):
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/process.py", line 259, in _bootstrap
    self.run()
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/pool.py", line 102, in worker
Process PoolWorker-11:
Traceback (most recent call last):
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/process.py", line 259, in _bootstrap
    self.run()
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/n/sw/python-3.2/lib/python3.2/multiprocessing/pool.py", line 102, in worker

OK, I had "inadvertently" left cProfile wrapped around the parallel run on the cluster, while offline I had only done plain test runs. The code itself runs fine; it is the profiling that breaks, as it apparently always does for parallel scripts. It has nothing to do with the cluster or with LSF. Sorry.
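For anyone who lands on the same traceback: the `Can't pickle <class 'function'>` error comes from `Pool.map`'s feeder thread trying to pickle the payload it sends to the workers. A plain module-level function pickles fine (it is stored by name), but a lambda, a closure, or a callable wrapped by a profiler does not. A minimal sketch of the difference (the names here are illustrative):

```python
import pickle

def module_level(x):
    return x * 2

# A module-level function pickles fine: pickle stores it by qualified name.
payload = pickle.dumps(module_level)

# A lambda (or a closure, or a profiler-wrapped callable) has no importable
# name, so pickling it fails -- exactly what Pool.map's feeder thread hits.
try:
    pickle.dumps(lambda x: x * 2)
    picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    picklable = False
```

Running the whole script under cProfile can therefore break a `Pool.map` that works perfectly on its own.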

Is `time_overlap_projected_graph_parallel` being called more than once?

No, it is only called once. (In case anyone is wondering, it projects a bipartite graph onto a neighbor graph.) By the way, I re-posted this as a new question, since it may be an issue with the multiprocessing module and/or LSF, our scheduling software.

Remember to accept your own "answer", so this question doesn't sit on the unanswered list until the end of time. Thanks, Jack. In fact I tried to accept it, but there is a time limit and I was busy over the weekend. Done now.
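As background for the comment above: projecting a bipartite graph onto one of its node sets means linking two nodes of that set whenever they share a neighbor on the other side. The function below is a pure-Python sketch of that idea (the code in the question additionally records overlap attributes on each edge):

```python
# Bipartite adjacency: people -> set of cells they appear in.
B = {1: {'a'}, 2: {'a', 'b'}, 3: {'b'}}

def project(adj):
    """Link two nodes of the same side iff they share at least one neighbor."""
    nodes = sorted(adj)
    edges = set()
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if adj[u] & adj[v]:   # any mutual cell creates an edge
                edges.add((u, v))
    return edges
```

Here `project(B)` links 1 and 2 (they share cell `'a'`) and 2 and 3 (cell `'b'`), but not 1 and 3.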
import csv
import networkx as nx
import time
import shutil
import datetime
import pydot
import os
import re
import logging
from operator import itemgetter
import numpy as np
from multiprocessing import Pool
import itertools

# Dictionary for edge attributes in projected graph:
# 0: overlap_length
# 1: overlap_start
# 2: overlap_end
# 3: cell
# 4: level

def chunks(l, n):
    """Yield successive chunks of at most `n` nodes from iterable `l`."""
    l_c = iter(l)
    while True:
        x = tuple(itertools.islice(l_c, n))
        if not x:
            return
        yield x

def overlaps(G,B,u,nbrs2):
    l = []
    for v in nbrs2:
        for mutual_cell in set(B[u]) & set(B[v]):
            for uspell in B.get_edge_data(u,mutual_cell).values():
                ustart = uspell[1]
                uend = uspell[2]
                for vspell in B.get_edge_data(v,mutual_cell).values():
                    vstart = vspell[1]
                    vend = vspell[2]
                    if uend > vstart and vend > ustart:
                        ostart = max(ustart,vstart)
                        oend = min(uend,vend)
                        olen = (oend-ostart+1)/86400
                        ocell = mutual_cell
                        if (v not in G[u] or ostart not in [ edict[1] for edict in G[u][v].values() ]):
                            l.append((u,v,{0: olen,1: ostart,2: oend,3: ocell}))
    return l

def _pmap1(arg_tuple):
    """Pool.map only accepts a function of a single argument, so this
    wrapper unpacks a (G, B, u, nbrs2) tuple and forwards it to overlaps().
    """
    return overlaps(*arg_tuple)

def time_overlap_projected_graph_parallel(B, nodes):
    G = nx.MultiGraph()
    G.add_nodes_from((n, B.node[n]) for n in nodes)
    p = Pool(processes=4)
    node_divisor = len(p._pool)
    for u in nodes:
        unbrs = set(B[u])
        nbrs2 = set(n for nbr in unbrs for n in B[nbr]) - set([u])
        # Split the second neighborhood into one chunk per worker and
        # process the chunks in parallel.  Note that G and B are pickled
        # and shipped to the workers on every map() call.
        chunk_size = max(1, len(nbrs2) // node_divisor)
        node_chunks = list(chunks(nbrs2, chunk_size))
        num_chunks = len(node_chunks)
        pedgelists = p.map(_pmap1,
                           zip([G]*num_chunks,
                               [B]*num_chunks,
                               [u]*num_chunks,
                               node_chunks))
        # Compile one long edge list and add all edges in a single step.
        ll = []
        for l in pedgelists:
            ll.extend(l)
        G.add_edges_from(ll)
    return G
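If profiling the parallel version is still wanted, one option (a sketch, not something I have run under LSF) is to profile inside the worker function rather than wrapping the whole script in cProfile, so that no profiler-wrapped callable ever crosses the pickle boundary. `work` and `profiled_work` below are illustrative stand-ins for the real worker:

```python
import cProfile
import io
import pstats

def work(n):
    return sum(i * i for i in range(n))

def profiled_work(n):
    """Run `work` under its own profiler.  In a real Pool worker this would
    dump stats to a per-process file instead of returning the report."""
    pr = cProfile.Profile()
    pr.enable()
    result = work(n)
    pr.disable()
    buf = io.StringIO()
    pstats.Stats(pr, stream=buf).sort_stats('cumulative').print_stats(5)
    return result, buf.getvalue()

result, report = profiled_work(1000)
```

Because the profiler object lives entirely inside the worker process, only plain picklable arguments and results travel through the pool.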