Python-wrapped C++ class: performing operations on each element of a vector


python c++ multithreading parallel-processing


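Consider a C++ `Tree` class that holds a vector of pointers to `Branch` objects and is wrapped for use from Python. In this program I add and remove branches at random. Each branch has two properties, `prop1` and `prop2`. An `operation1` function performs an operation on `prop1` involving the `getProperty` and `setProperty` functions, and an `operation2` function does the same on `prop2`.

What I want is for one processor (or thread) to carry out each operation. Since the program calls an external C++ library continuously, should I use threading rather than multiprocessing?

How should I implement the parallelization? I tried to take inspiration from this, but when I use either threading or multiprocessing I get a slower program.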

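I'd recommend using `threading`: since the main work is done in your wrapped C++ functions, you should be able to release the GIL and have everything run reliably in parallel.

A good general rule is to create new threads as few times as possible (this can be a slow operation) and then feed them data until you're done. You say in the comments that you don't care at all about the order in which the operations are executed (good!). With that in mind, I'd suggest putting lambda functions containing the operations into a `Queue` and having the threads pick them off and run them.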
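The queue-and-workers pattern described here can be sketched as a self-contained program. This is only an illustration under stated assumptions: `operation1` and `operation2` are pure-Python stand-ins for the wrapped C++ calls, `props`, `worker` and `run_simulation` are hypothetical names, and the worker threads are kept alive across time steps by waiting on `Queue.join()` at the end of each step instead of shutting the threads down.

```python
import os
import threading
from queue import Queue

# Stand-in state and operations (hypothetical). The real operation1/2 would
# call the wrapped C++ methods; a lock protects the shared dict here.
props = {}
props_lock = threading.Lock()

def operation1(j, name):
    with props_lock:
        props[(j, name)] = props.get((j, name), 1.0) * 1.1

def operation2(j, name):
    with props_lock:
        props[(j, name)] = props.get((j, name), 1.0) + 0.5

def worker(q):
    """Run callables taken from the queue until a None sentinel arrives."""
    while True:
        task = q.get()
        try:
            if task is None:
                return
            task()
        finally:
            q.task_done()  # lets q.join() know this item is finished

def run_simulation(total_time, number_of_branches, no_threads=None):
    # One thread per core is a common starting point (an assumption, tune it).
    no_threads = no_threads or os.cpu_count() or 4
    q = Queue()
    threads = [threading.Thread(target=worker, args=(q,)) for _ in range(no_threads)]
    for t in threads:
        t.start()
    for _ in range(total_time):
        for j in range(number_of_branches):
            # j=j binds the current value of j to each lambda
            q.put(lambda j=j: operation1(j, "prop1"))
            q.put(lambda j=j: operation2(j, "prop2"))
        q.join()  # wait until every task of this time step has run
    for _ in threads:
        q.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return props
```

Because the stand-ins hold a lock, this sketch shows the control flow only. Any real speed-up depends on the wrapped C++ calls releasing the GIL.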

The code that fills the queue then goes inside the `for i in range(TotalTime):` loop, in place of the `for j in range(NumberOfBranches):` loop.

To allow the threads to actually work in parallel, you need to ensure that the GIL is released in the Cython wrapper:

def operation1(a,b):
   with nogil:
      cplusplus_operation1(a,b)
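The effect of releasing the GIL can be illustrated in plain Python. In the sketch below, `time.sleep` stands in for a C++ call made inside a `with nogil:` block, since it also releases the GIL while it waits. The names `blocking_call` and `timed_run` are hypothetical and the timing is only indicative.

```python
import threading
import time

def blocking_call(seconds):
    # time.sleep releases the GIL while waiting, much like a nogil C++ call.
    time.sleep(seconds)

def timed_run(n_threads, seconds):
    """Start n_threads threads that each block for `seconds`; return wall time."""
    threads = [threading.Thread(target=blocking_call, args=(seconds,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

elapsed = timed_run(4, 0.2)
print(f"4 threads x 0.2 s took {elapsed:.2f} s")
```

If the calls could not overlap, four of them would take about 0.8 s in total. Because the GIL is released, the wall time stays close to that of a single call.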

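Two caveats about using `multiprocessing` instead:

It's not very portable (it works differently on Windows).

If `operation1/2` modify the C++ data, you may find that the modified data isn't shared between processes without special effort on your part (which would defeat the point of doing the operation).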

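Comments:

Are `operationX` Python functions, Cython functions, C++ functions or something else? Does `operation1` on `j+1` depend on `operation2` on `j` (or vice versa)?

@DavidW `operationX` are Python functions that perform simple calculations using the wrapped C++ methods. `operation1` for `j+1` can be started while `operation2` for `j` is still in progress. The order in which we traverse the tree doesn't matter at all. In fact, it would be best if the calculation for every branch were done in parallel! Suppose `prop1/2` are the length and radius of a branch, and `operation1/2` are the axial and radial growth mechanisms. If we aim for maximum biological relevance, the growth of each branch should be computed in parallel.

@DavidW Of course, the properties are not completely independent (e.g. the length/radius ratio may lie within a certain range), and neither are the branches: the growth of one branch affects the growth of the others (e.g. by putting other branches in the shade).

Thanks for your answer. I'll try your suggestion. I also tried parallelizing by dividing the N branches among 8 cores. When the number of branches is large (~10⁶) and `operation1/2` are costly (e.g. matrix inversion), `multiprocessing` is much faster than `threading`. However, I'm interested in performing many very simple calculations (property changes are mostly driven by simple differential equations). I've been advised to use GPU parallelism with CUDA. What do you think?

I think the main differences in speed between `multiprocessing` and `threading` are 1) communication between processes is slower in `multiprocessing`; 2) if you're running pure Python, `multiprocessing` will be faster because `threading` only works well when the GIL can be released (e.g. when calling C++). It sounds like CUDA might be a good option. `numba` for Python is a fairly easy way of using it, so it may be worth a look. Transferring data to and from the GPU is always a bit slow if you need to do it often. Most non-professional GPUs are much better at `float` than `double`, so think about the precision you need.

In the end I got better results using CUDA and the numba package. However, I appreciate your answer because it clarified how to use multithreading in Python. Why should the number of threads be similar to the number of cores I have?
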
// Tree class from the question (the Branch class definition is not shown).
#include <string>
#include <vector>

class Branch;

class Tree {
public:
  std::vector<Branch*> tree;

  void addBranch(int index);     // adds a branch to the tree vector at index
  void removeBranch(int index);  // removes the branch at index and its descendants
  double getProperty(int index, std::string name);  // gets the value of property `name` of the branch at index
  void addProperty(int index, std::string name, double value);
  void setProperty(int index, std::string name, double value);
};

# Simulation loop from the question.
import random

tree = PyTree()
for i in range(TotalTime):
    k = random.random()
    if k > 0.1:
        tree.addBranch(random_index)  # it is not important how we get the index
        tree.addProperty(random_index, 'prop1', 1)
        tree.addProperty(random_index, 'prop2', 1)
    k = random.random()
    if k > 0.9:
        tree.removeBranch(random_index)
    for j in range(NumberOfBranches):  # it's not important how we get the number of branches
        operation1(j, 'prop1')  # assume these functions were defined
        operation2(j, 'prop2')

# Queue-based worker threads from the answer.
import threading
from queue import Queue

q = Queue()

def thread_func():
    """A function for each thread to run.

    It waits to get an item off the queue and then runs that item.
    None is used to indicate that we're done. If we get None, we
    put it back on the Queue to ensure every thread terminates."""
    while True:
        f = q.get()
        if f is None:
            q.put(None)
            return
        f()

# make and start the threads.
# "no_threads" should be something similar to the number of cores you have.
# 4-8 might be a good number to try?
threads = [threading.Thread(target=thread_func) for n in range(no_threads)]
for t in threads:
    t.start()

# put the required operations on the Queue
for j in range(NumberOfBranches):
    # note the awkward syntax to
    # ensure we capture j: http://stackoverflow.com/a/7514158/4657412
    q.put(lambda j=j: operation1(j, "prop1"))
    q.put(lambda j=j: operation2(j, "prop2"))

q.put(None)  # to terminate

# wait for threads to finish
for t in threads:
    t.join()
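The `j=j` default argument in the listing above matters because a lambda created in a loop looks up `j` when it is called, not when it is defined. A minimal sketch of the difference (the function names `make_tasks_late` and `make_tasks_bound` are hypothetical):

```python
def make_tasks_late(n):
    tasks = []
    for j in range(n):
        tasks.append(lambda: j)      # j is looked up when the lambda runs
    return tasks

def make_tasks_bound(n):
    tasks = []
    for j in range(n):
        tasks.append(lambda j=j: j)  # current value of j stored as a default
    return tasks

late = [f() for f in make_tasks_late(3)]
bound = [f() for f in make_tasks_bound(3)]
print(late)   # [2, 2, 2]
print(bound)  # [0, 1, 2]
```

Without the default argument, every queued task would operate on the last branch index by the time a worker thread gets round to running it.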
