Python: Why doesn't my numpy code run in parallel with threads?

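I need to perform some calculations on a raster (matrix) for the neighbourhoods of several points. My idea is to run these calculations in parallel threads and then add up the resulting rasters. My problem is that the execution does not seem to run in parallel: when I double the number of points, the execution time doubles as well. What am I doing wrong?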

from threading import Lock, Thread
import numpy as np
import time

SIZE = 1000000
THREADS = 8
my_lock=Lock()
results = np.zeros(SIZE,dtype=np.float64)

def do_job(j):
    global results
    s_time = time.time()  
    print("Starting... "+str(j))

    #do some calculations
    c_r=np.zeros(SIZE,dtype=np.float64)
    for i in range(SIZE):
        c_r[i]=np.exp(-0.001*i)

    print("\t Calculation at job "+str(j)+" lasted: {:3.3f}".format(time.time()-s_time))

    #sum up the results
    if my_lock.acquire(blocking=True):
        results = np.add(results,c_r)
        my_lock.release()

    print("\t Job "+str(j)+" lasted: {:3.3f}".format(time.time()-s_time))



def main():
    global THREADS
    s_time = time.time()  
    threads=[]

    while THREADS>0:

        p = Thread(target=do_job,args=(THREADS,))
        threads.append(p)
        p.start()
        THREADS = THREADS-1

    print("Start finished after : {:3.3f}".format(time.time()-s_time))
    for p in threads:
        p.join()

    print("Total run diuration: {:3.3f}".format(time.time()-s_time))


if __name__ == "__main__":
    main()

When I run the code with THREADS=4, I get:

Starting... 4
Starting... 3
Starting... 2
Starting... 1
Start finished after : 0.069
         Calculation at job 4 lasted: 5.805
         Job 4 lasted: 5.887
         Calculation at job 3 lasted: 6.230
         Job 3 lasted: 6.237
         Calculation at job 1 lasted: 6.585
         Job 1 lasted: 6.595
         Calculation at job 2 lasted: 6.737
         Job 2 lasted: 6.738
Total run diuration: 6.760

When I switch to THREADS=8, the execution time roughly doubles:

Starting... 8
Starting... 7
Starting... 6
Starting... 5
Starting... 4
Starting... 3
Starting... 1
Start finished after : 0.182
Starting... 2
         Calculation at job 7 lasted: 11.883
         Job 7 lasted: 11.939
         Calculation at job 8 lasted: 13.096
         Job 8 lasted: 13.144
         Calculation at job 1 lasted: 13.548
         Job 1 lasted: 13.576
         Calculation at job 3 lasted: 13.723
         Job 3 lasted: 13.748
         Calculation at job 2 lasted: 14.231
         Job 2 lasted: 14.268
         Calculation at job 5 lasted: 14.698
         Job 5 lasted: 14.708
         Calculation at job 4 lasted: 15.000
         Job 4 lasted: 15.015
         Calculation at job 6 lasted: 15.133
         Job 6 lasted: 15.135
Total run diuration: 15.136

You are being hit by the Global Interpreter Lock (GIL).

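Only one thread can execute Python bytecode at a time. Your code spends most of its time in the

for i in range(SIZE)

loop, which is executed by the Python interpreter itself. A thread gives up the GIL only when it blocks on IO, calls a C function that releases the GIL, or is preempted at the interpreter's periodic switch interval. Switching between threads also adds overhead compared with the work each thread gets done between switches. That is why adding more threads slows execution down.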
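The effect described above can be observed with a small timing experiment. This is only a sketch, assuming a standard CPython build with the GIL enabled; the names `cpu_bound` and `run_in_threads` are made up for this illustration and are not part of the question's code. If the threads ran in parallel, the elapsed time would stay roughly constant as the thread count grows; under the GIL it tends to grow with the number of threads instead.

```python
import math
import threading
import time


def cpu_bound(n):
    """Pure-Python loop: the thread holds the GIL for nearly all of this work."""
    total = 0.0
    for i in range(n):
        total += math.exp(-0.001 * i)
    return total


def run_in_threads(num_threads, n):
    """Run cpu_bound(n) once per thread; return (elapsed_seconds, results)."""
    results = [None] * num_threads

    def worker(slot):
        # Each thread writes to its own slot, so no lock is needed.
        results[slot] = cpu_bound(n)

    threads = [threading.Thread(target=worker, args=(k,))
               for k in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results


if __name__ == "__main__":
    # Timings depend on the machine, so none are listed here.
    for num_threads in (1, 2, 4):
        elapsed, _ = run_in_threads(num_threads, 500_000)
        print(f"{num_threads} thread(s): {elapsed:.3f} s")
```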

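According to the numpy documentation, many operations release the GIL. So if you vectorize the operation, forcing the program to spend more of its time inside numpy, you can get an advantage from threads.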
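As a rough sketch of what that could look like for the question's program: each thread makes a single vectorized call instead of looping in Python, and the rasters are added up in the calling thread, so neither the `global` nor the `Lock` is needed. The helper names `compute_raster` and `sum_rasters_threaded` are invented for this example. Most of the gain comes from the vectorization itself; how much the threads add on top depends on the numpy build, the operation, and memory bandwidth, so treat any speedup as something to measure rather than assume.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def compute_raster(size):
    """Vectorized form of the question's loop: one call into numpy's C code."""
    return np.exp(-0.001 * np.arange(size))


def sum_rasters_threaded(num_threads, size):
    """Compute one raster per thread, then add them up in the calling thread."""
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        futures = [executor.submit(compute_raster, size)
                   for _ in range(num_threads)]
        total = np.zeros(size, dtype=np.float64)
        for future in futures:
            # Only this thread touches `total`, so no Lock is required.
            total += future.result()
    return total


if __name__ == "__main__":
    total = sum_rasters_threaded(num_threads=8, size=1_000_000)
    print(total[:3])
```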


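Try changing:

for i in range(SIZE):
        c_r[i]=np.exp(-0.001*i)

to:

c_r = np.exp(-0.001*np.arange(SIZE))

Because Python threads still effectively run one at a time. Threading gives you an advantage if you have many IO operations with long waiting times, but not for computation. Use the multiprocessing package instead.

Thanks. I wasn't aware of the GIL. After the modification you suggested, performance went up dramatically :)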
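For work that cannot be vectorized, the multiprocessing suggestion above might look roughly like the following. This is a minimal sketch, not a tuned implementation: `compute_raster` and `sum_rasters_multiprocess` are hypothetical names, and the per-element loop is kept on purpose to stand in for a calculation that has to stay in Python. Each worker process has its own interpreter and its own GIL, at the price of pickling every result array back to the parent.

```python
import multiprocessing as mp

import numpy as np


def compute_raster(size):
    """Per-element Python loop, standing in for work that cannot be vectorized."""
    c_r = np.zeros(size, dtype=np.float64)
    for i in range(size):
        c_r[i] = np.exp(-0.001 * i)
    return c_r


def sum_rasters_multiprocess(num_jobs, size):
    """Run one job per worker process and sum the returned rasters."""
    with mp.Pool(processes=num_jobs) as pool:
        # Each result array is pickled and sent back to the parent process.
        rasters = pool.map(compute_raster, [size] * num_jobs)
    return np.sum(rasters, axis=0)


if __name__ == "__main__":
    # The guard is required on platforms that start workers by re-importing
    # this module (for example Windows and macOS).
    total = sum_rasters_multiprocess(num_jobs=4, size=100_000)
    print(total[:3])
```

The worker has to be a module-level function so that it can be pickled and found by the child processes.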