How to reduce the time taken by a loop in Python?

I have the following Python code:

import itertools
import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]

D1 = [0.01,0.02,0.1,0.01]
D2 = [0.1,0.3,0.01,0.4]

Tp = np.sum(D1)
Tn = np.sum(D2)

T = []
append2 = T.append
E = []
append3 = E.append

# itertools.izip exists only on Python 2; on Python 3 use the built-in zip
for h1,h2 in itertools.izip(H1,H2):
    Err = []
    append1 = Err.append
    for v in h1:

        # 0/1 indicators: which entries are at least the threshold v
        L1 = [1 if i>=v else 0 for i in h1]
        L2 = [1 if i>=v else 0 for i in h2]

        Sp = np.dot(D1,L1)
        Sn = np.dot(D2,L2)

        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        append1(err)

    # keep the threshold with the smallest error, and that error
    b = np.argmin(Err)
    append2(h1[b])
    append3(Err[b])

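This is only sample code. I need to run the inner for loop about 20,000 times (here it runs only twice), but the inner for loop takes too long to be practical. The line profiler shows that these lines are the most time-consuming:

Sp = np.dot(D1,L1)
Sn = np.dot(D2,L2)
b = np.argmin(Err)

How can I reduce the time taken by the code above?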

Any help would be appreciated.

Thanks.

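There is some low-hanging fruit in your list comprehensions:

L1 = [1 if i>=v else 0 for i in h1]
L2 = [1 if i>=v else 0 for i in h2]

The above can be written as:

L1 = [i>=v for i in h1]
L2 = [i>=v for i in h2]

because bool is a subclass of int, and True and False are already 1 and 0, just in fancy clothes.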
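As a quick check of the claim above, here is a minimal sketch (written for Python 3, using the first row of the question's data and an arbitrarily chosen threshold `v`) showing that the boolean comprehension gives the same weighted sum as the explicit 0/1 version. The names `flags_int`, `flags_bool`, `sp_int` and `sp_bool` are illustrative only.

```python
# bool is a subclass of int, so comparison results can be used directly as 0/1 weights.
h1 = [0.04, 0.03, 0.01, 0.002]
D1 = [0.01, 0.02, 0.1, 0.01]
v = 0.03  # arbitrary threshold picked for the demonstration

flags_int = [1 if i >= v else 0 for i in h1]   # original form
flags_bool = [i >= v for i in h1]              # shorter form

print(issubclass(bool, int))       # True
print(flags_bool == flags_int)     # True, because True == 1 and False == 0

# Weighted sums computed with either list of flags are identical.
sp_int = sum(d * f for d, f in zip(D1, flags_int))
sp_bool = sum(d * f for d, f in zip(D1, flags_bool))
print(sp_int == sp_bool)           # True
```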

err = min(Sp+Tn-Sn, Sn+Tp-Sp)
append1(err)

You can combine the two lines above into append1(min(Sp+Tn-Sn, Sn+Tp-Sp)) to avoid the variable assignment and lookup.

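If you put the code in a function, every local variable access will be slightly faster. In addition, any global functions or methods you use (for example, min and np.dot) can be turned into local names by passing them as default arguments in the function signature. np.dot is a particularly slow name to call (on top of the time the operation itself takes) because it involves an attribute lookup. This is the same kind of optimization you have already applied to the list append method.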
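The default-argument trick described above might look like the following sketch (Python 3). The function name `row_errors` and the parameter name `dot` are made up for illustration; only `np.dot`, `min` and the data come from the question.

```python
import numpy as np

D1 = np.array([0.01, 0.02, 0.1, 0.01])
D2 = np.array([0.1, 0.3, 0.01, 0.4])
Tp = D1.sum()
Tn = D2.sum()


def row_errors(h1, h2, D1=D1, D2=D2, Tp=Tp, Tn=Tn, dot=np.dot, min=min):
    """Return the error for every threshold v taken from h1.

    The defaults are evaluated once, when the function is defined, so
    np.dot, min and the constants become fast local names inside the loop.
    """
    errs = []
    append = errs.append  # same trick the question already uses for append
    for v in h1:
        Sp = dot(D1, [i >= v for i in h1])
        Sn = dot(D2, [i >= v for i in h2])
        append(min(Sp + Tn - Sn, Sn + Tp - Sp))
    return errs


errs = row_errors([0.04, 0.03, 0.01, 0.002], [0.06, 0.02, 0.02, 0.004])
print(errs)
```

How much this saves depends on how cheap the loop body is; with two `np.dot` calls per iteration the lookups are likely a small share of the total.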


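Now, I imagine none of this will affect performance very much, since your question really seems to be "how do I make NumPy faster?" (which others are sorting out for you), but these changes may have some effect and are worth doing.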

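If you use NumPy functions with NumPy arrays instead of lists, you can get a sizable speedup. Most NumPy functions convert lists to arrays internally, which adds a lot of runtime overhead. Here is a simple example: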

In [16]: a = range(10)

In [17]: b = range(10)

In [18]: aa = np.array(a)

In [19]: bb = np.array(b)

In [20]: %timeit np.dot(a, b)
10000 loops, best of 3: 54 us per loop

In [21]: %timeit np.dot(aa, bb)
100000 loops, best of 3: 3.4 us per loop

In this case numpy.dot runs about 16 times faster when it is called with arrays. Also, once you use NumPy arrays you can simplify some of the code, which should help it run faster too. For example, if h1 is an array, L1 = [1 if i>=v else 0 for i in h1] can be written as h1 >= v, which returns an array and should run faster. Below, I've replaced your lists with arrays so you can see what that looks like:

import numpy as np

H1 = np.array([[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]])
H2 = np.array([[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]])

D1 = np.array([0.01,0.02,0.1,0.01])
D2 = np.array([0.1,0.3,0.01,0.4])

Tp = np.sum(D1)
Tn = np.sum(D2)

# preallocate the results instead of appending to lists
T = np.zeros(H1.shape[0])
E = np.zeros(H1.shape[0])

for i in range(len(H1)):
    h1 = H1[i]
    h2 = H2[i]
    Err = np.zeros(len(h1))

    for j in range(len(h1)):
        v = h1[j]

        # boolean arrays; >= matches the comparison in the question
        L1 = h1 >= v
        L2 = h2 >= v

        Sp = np.dot(D1, L1)
        Sn = np.dot(D2, L2)

        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        Err[j] = err

    b = np.argmin(Err)
    T[i] = h1[b]
    E[i] = Err[b]

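Once you are more comfortable with NumPy arrays, you may want to look into using broadcasting, at least to express the inner loop. For some applications broadcasting can be much more efficient than Python loops. Good luck, hope this helps.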
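One possible way to express the inner loop with broadcasting is sketched below (Python 3). This is an illustration under the assumption that each row is short enough for an n-by-n boolean mask to be cheap; it is not code from the answer. The name `thresholds` is introduced here for clarity.

```python
import numpy as np

H1 = np.array([[0.04, 0.03, 0.01, 0.002], [0.02, 0.04, 0.001, 0.5]])
H2 = np.array([[0.06, 0.02, 0.02, 0.004], [0.8, 0.09, 0.6, 0.1]])
D1 = np.array([0.01, 0.02, 0.1, 0.01])
D2 = np.array([0.1, 0.3, 0.01, 0.4])
Tp = D1.sum()
Tn = D2.sum()

T = np.empty(len(H1))
E = np.empty(len(H1))
for i, (h1, h2) in enumerate(zip(H1, H2)):
    # Column vector of thresholds: comparing a (n,) row against a (n, 1)
    # column broadcasts to an (n, n) mask, one mask row per threshold v.
    thresholds = h1[:, np.newaxis]
    Sp = (h1 >= thresholds).dot(D1)   # Sp for every threshold at once
    Sn = (h2 >= thresholds).dot(D2)
    Err = np.minimum(Sp + Tn - Sn, Sn + Tp - Sp)
    b = Err.argmin()
    T[i] = h1[b]
    E[i] = Err[b]

print(T)
print(E)
```

This replaces the per-threshold `np.dot` calls with two matrix-vector products per row, so the number of NumPy calls no longer grows with the row length.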

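If I understand correctly what np.dot() does on two one-dimensional lists, then it seems to me the following code should be equivalent to yours. Could you test its speed?

It works by using indices rather than the list elements, and by relying on how lists defined as function default values behave: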

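import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]

D1 = [0.01,0.02,0.1,0.01]
D2 = [0.1,0.3,0.01,0.4]

Tp = np.sum(D1)
Tn = np.sum(D2)

T,E = [],[]
append2 = T.append
append3 = E.append

# shared buffers: zoui() holds references to these exact list objects
ONE,TWO = [],[]

def zoui(v, ONE=ONE,TWO=TWO,
         D1=D1,D2=D2,Tp=Tp,Tn=Tn,tu0123 = (0,1,2,3)):
    # diff is Sp - Sn, so Sp+Tn-Sn == Tn+diff and Sn+Tp-Sp == Tp-diff
    diff =  sum(D1[i] if ONE[i]>=v else 0 for i in tu0123)\
           -sum(D2[i] if TWO[i]>=v else 0 for i in tu0123)
    #or maybe (parentheses needed: * binds tighter than >=)
    #diff =  sum(D1[i] * (ONE[i]>=v) for i in tu0123)\
    #       -sum(D2[i] * (TWO[i]>=v) for i in tu0123)

    return min(Tn+diff,Tp-diff)

# Python 2 code: on Python 3 use range() and list(map(zoui, ONE))
for n in xrange(len(H1)):
    # slice assignment refills the buffers in place, so zoui() sees the new row
    ONE[:] = H1[n]
    TWO[:] = H2[n]
    Err = map(zoui,ONE)
    b = np.argmin(Err)
    append2(ONE[b])
    append3(Err[b])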
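The default-value behaviour this answer relies on can be seen in isolation in the small sketch below (Python 3). The names `shared` and `count_at_least` are invented for the demonstration and have nothing to do with the question's data.

```python
shared = []


def count_at_least(v, data=shared):
    # `data` is bound to the list object `shared` once, at definition time.
    return sum(1 for x in data if x >= v)


shared[:] = [1, 2, 3]         # slice assignment mutates the same list object
first = count_at_least(2)     # sees [1, 2, 3]

shared[:] = [5, 6, 7]         # refill in place again
second = count_at_least(2)    # sees [5, 6, 7]

shared = [0, 0, 0]            # rebinding the name creates a different list...
third = count_at_least(2)     # ...so the default still sees [5, 6, 7]

print(first, second, third)   # 2 3 3
```

This is why the answer writes `ONE[:] = H1[n]` rather than `ONE = H1[n]`: a plain assignment would leave the function looking at the old, empty list.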

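You need to keep your data in ndarray type. When you perform NumPy operations on a list, a new array has to be constructed every time. I modified your code to run a variable number of times and found that it also takes about 1 s for 10,000 iterations. Changing the data type to ndarray reduced that by roughly a factor of two, and I think there is still some room for improvement (the first version of this had a bug that made it run too fast).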
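The answer's modified code is not shown, so the following is only a rough sketch (Python 3) of how such a comparison could be set up. The functions `run_lists` and `run_arrays`, the row count `n_rows` and the random test data are all assumptions made for illustration; the timings it prints will vary by machine and are not the figures quoted above.

```python
import timeit
import numpy as np


def run_lists(H1, H2, D1, D2):
    """Question's loop on plain lists: np.dot converts its inputs on every call."""
    Tp, Tn = np.sum(D1), np.sum(D2)
    T, E = [], []
    for h1, h2 in zip(H1, H2):
        Err = []
        for v in h1:
            Sp = np.dot(D1, [i >= v for i in h1])
            Sn = np.dot(D2, [i >= v for i in h2])
            Err.append(min(Sp + Tn - Sn, Sn + Tp - Sp))
        b = int(np.argmin(Err))
        T.append(h1[b])
        E.append(Err[b])
    return T, E


def run_arrays(H1, H2, D1, D2):
    """Same loop with the data kept in ndarrays throughout."""
    Tp, Tn = np.sum(D1), np.sum(D2)
    T = np.empty(len(H1))
    E = np.empty(len(H1))
    for n in range(len(H1)):
        h1, h2 = H1[n], H2[n]
        Err = np.empty(len(h1))
        for j in range(len(h1)):
            v = h1[j]
            Sp = np.dot(D1, h1 >= v)
            Sn = np.dot(D2, h2 >= v)
            Err[j] = min(Sp + Tn - Sn, Sn + Tp - Sp)
        b = Err.argmin()
        T[n] = h1[b]
        E[n] = Err[b]
    return T, E


n_rows = 2000  # hypothetical size; raise it to approach the 20,000 in the question
rng = np.random.default_rng(0)
H1a, H2a = rng.random((n_rows, 4)), rng.random((n_rows, 4))
D1a = np.array([0.01, 0.02, 0.1, 0.01])
D2a = np.array([0.1, 0.3, 0.01, 0.4])
H1l, H2l, D1l, D2l = H1a.tolist(), H2a.tolist(), D1a.tolist(), D2a.tolist()

# Both variants should pick the same thresholds and errors.
res_lists = run_lists(H1l, H2l, D1l, D2l)
res_arrays = run_arrays(H1a, H2a, D1a, D2a)

t_lists = timeit.timeit(lambda: run_lists(H1l, H2l, D1l, D2l), number=1)
t_arrays = timeit.timeit(lambda: run_arrays(H1a, H2a, D1a, D2a), number=1)
print("lists:  %.3f s" % t_lists)
print("arrays: %.3f s" % t_arrays)
```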

