How can I reduce the time taken by loops in Python?
I have the following Python code:
import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]
D1 = [0.01,0.02,0.1,0.01]
D2 = [0.1,0.3,0.01,0.4]
Tp = np.sum(D1)
Tn = np.sum(D2)
T = []
append2 = T.append
E = []
append3 = E.append
# zip pairs the rows of H1 and H2 (on Python 2, itertools.izip does the same lazily)
for h1,h2 in zip(H1,H2):
    Err = []
    append1 = Err.append
    for v in h1:
        L1 = [1 if i>=v else 0 for i in h1]
        L2 = [1 if i>=v else 0 for i in h2]
        Sp = np.dot(D1,L1)
        Sn = np.dot(D2,L2)
        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        append1(err)
    b = np.argmin(Err)
    append2(h1[b])
    append3(Err[b])
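This is only sample code. I need to run the inner for loop about 20000 times (here it runs just twice), but the inner for loop takes too long to be usable in practice.
According to the line profiler, the lines Sp=np.dot(D1,L1), Sn=np.dot(D2,L2) and b=np.argmin(Err) are the most time-consuming.
How can I reduce the time taken by the code above?
Any help would be appreciated.
Thanks

There is some low-hanging fruit in your list comprehensions: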
L1 = [1 if i>=v else 0 for i in h1]
L2 = [1 if i>=v else 0 for i in h2]
The above can be written as:
L1 = [i>=v for i in h1]
L2 = [i>=v for i in h2]
because bool is a subclass of int, and True and False are already 1 and 0, just in fancy clothing.
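As a quick check of that claim, here is a minimal sketch using the first row of the question's data. The threshold v = 0.03 and the names flags_int, flags_bool, sp_int and sp_bool are chosen only for illustration.

```python
# bool is a subclass of int, so True and False act as 1 and 0 in arithmetic.
h1 = [0.04, 0.03, 0.01, 0.002]
D1 = [0.01, 0.02, 0.1, 0.01]
v = 0.03  # illustrative threshold

flags_int = [1 if i >= v else 0 for i in h1]
flags_bool = [i >= v for i in h1]

# The two lists compare equal element by element.
same = flags_int == flags_bool

# A weighted sum gives the same value with either list.
sp_int = sum(d * f for d, f in zip(D1, flags_int))
sp_bool = sum(d * f for d, f in zip(D1, flags_bool))

print(same, sp_int == sp_bool)
```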
err = min(Sp+Tn-Sn, Sn+Tp-Sp)
append1(err)
You can combine the two lines above to avoid the variable assignment and lookup.
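If you put the code inside a function, every local variable access becomes slightly faster. In addition, any global functions or methods you use (for example min and np.dot) can be turned into local variables by passing them as default arguments in the function signature. np.dot is a particularly slow call (beyond the time the operation itself needs) because it involves an attribute lookup. This is similar to the optimization you have already made with the list append method.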
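Now, I don't imagine any of this will affect performance very much, since your question really seems to be "how do I make NumPy faster?" (and others are on top of that for you), but these changes may have some impact and are worth making.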
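The suggestions above could be put together roughly as follows. This is only a sketch: the function name best_threshold and its parameter layout are hypothetical, and the speed gain has not been measured here.

```python
import numpy as np

def best_threshold(h1, h2, D1, D2, Tp, Tn, dot=np.dot, min=min):
    """Return (threshold, error) for one pair of rows.

    dot and min are bound as default arguments, so inside the loop they
    are local names rather than global or attribute lookups.
    """
    errs = []
    append = errs.append
    for v in h1:
        L1 = [i >= v for i in h1]  # booleans already act as 0/1
        L2 = [i >= v for i in h2]
        Sp = dot(D1, L1)
        Sn = dot(D2, L2)
        append(min(Sp + Tn - Sn, Sn + Tp - Sp))  # no intermediate variable
    b = int(np.argmin(errs))
    return h1[b], errs[b]

H1 = [[0.04, 0.03, 0.01, 0.002], [0.02, 0.04, 0.001, 0.5]]
H2 = [[0.06, 0.02, 0.02, 0.004], [0.8, 0.09, 0.6, 0.1]]
D1 = [0.01, 0.02, 0.1, 0.01]
D2 = [0.1, 0.3, 0.01, 0.4]
Tp = np.sum(D1)
Tn = np.sum(D2)

results = [best_threshold(h1, h2, D1, D2, Tp, Tn) for h1, h2 in zip(H1, H2)]
T = [t for t, _ in results]
E = [e for _, e in results]
print(T, E)
```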

If you use NumPy functions with NumPy arrays instead of lists, you can get a substantial speed boost. Most numpy functions convert a list to an array internally, which adds a lot of overhead to the run time. Here is a simple example:
In [16]: a = range(10)
In [17]: b = range(10)
In [18]: aa = np.array(a)
In [19]: bb = np.array(b)
In [20]: %timeit np.dot(a, b)
10000 loops, best of 3: 54 us per loop
In [21]: %timeit np.dot(aa, bb)
100000 loops, best of 3: 3.4 us per loop
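numpy.dot runs about 16 times faster in this case when called with arrays. Also, once you use numpy arrays you can simplify some of your code, which should also help it run faster. For example, if h1 is an array, L1 = [1 if i>=v else 0 for i in h1] can be written as h1 >= v, which returns an array and should run faster. Below, I have replaced your lists with arrays so you can see what it looks like: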
import numpy as np

H1 = np.array([[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]])
H2 = np.array([[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]])
D1 = np.array([0.01,0.02,0.1,0.01])
D2 = np.array([0.1,0.3,0.01,0.4])
Tp = np.sum(D1)
Tn = np.sum(D2)
T = np.zeros(H1.shape[0])
E = np.zeros(H1.shape[0])
for i in range(len(H1)):
    h1 = H1[i]
    h2 = H2[i]
    Err = np.zeros(len(h1))
    for j in range(len(h1)):
        v = h1[j]
        # >= matches the comparison used in the question
        L1 = h1 >= v
        L2 = h2 >= v
        Sp = np.dot(D1, L1)
        Sn = np.dot(D2, L2)
        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        Err[j] = err
    b = np.argmin(Err)
    T[i] = h1[b]
    E[i] = Err[b]
Once you are more comfortable with numpy arrays, you may want to look into using them for at least the inner loop. For some applications, broadcasting can be much more efficient than Python loops. Good luck, and I hope this helps.
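As an illustration of the broadcasting idea, the sketch below removes both Python loops for the question's data. It is one possible formulation, not something taken from the answer itself: the names M1, M2 and rows are hypothetical, and it assumes every row has the same length so that H1 and H2 are rectangular arrays.

```python
import numpy as np

H1 = np.array([[0.04, 0.03, 0.01, 0.002], [0.02, 0.04, 0.001, 0.5]])
H2 = np.array([[0.06, 0.02, 0.02, 0.004], [0.8, 0.09, 0.6, 0.1]])
D1 = np.array([0.01, 0.02, 0.1, 0.01])
D2 = np.array([0.1, 0.3, 0.01, 0.4])
Tp = D1.sum()
Tn = D2.sum()

# M1[r, j, k] is True when H1[r, k] >= H1[r, j], i.e. element k of row r
# passes the threshold v = H1[r, j]. M2 does the same for H2 against the
# same thresholds.
M1 = H1[:, None, :] >= H1[:, :, None]
M2 = H2[:, None, :] >= H1[:, :, None]

# Weighted sums over the last axis: Sp[r, j] and Sn[r, j].
Sp = np.dot(M1, D1)
Sn = np.dot(M2, D2)

# Error for every (row, threshold) pair, then the best threshold per row.
Err = np.minimum(Sp + Tn - Sn, Sn + Tp - Sp)
b = Err.argmin(axis=1)
rows = np.arange(len(H1))
T = H1[rows, b]
E = Err[rows, b]
print(T, E)
```

The intermediate masks have shape (rows, columns, columns), so with many rows and wide rows it may be necessary to process the rows in chunks to keep memory use in check.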

If I have understood correctly what np.dot() does on two one-dimensional lists, then it seems to me that the following code should do the same as yours. Could you test its speed, please?
Its principle is to use indices instead of the elements of the lists, and to rely on the behaviour of lists that are bound as default values of a function's parameters.
import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]
D1 = [0.01,0.02,0.1,0.01]
D2 = [0.1,0.3,0.01,0.4]
Tp = np.sum(D1)
Tn = np.sum(D2)
T,E = [],[]
append2 = T.append
append3 = E.append
ONE,TWO = [],[]

# ONE and TWO are bound once as defaults; the loop below refills them in place
def zoui(v, ONE=ONE,TWO=TWO,
         D1=D1,D2=D2,Tp=Tp,Tn=Tn,tu0123 = (0,1,2,3)):
    # diff is Sp - Sn from the original code
    diff = sum(D1[i] if ONE[i]>=v else 0 for i in tu0123)\
           -sum(D2[i] if TWO[i]>=v else 0 for i in tu0123)
    #or maybe
    #diff = sum(D1[i] * (ONE[i]>=v) for i in tu0123)\
    #       -sum(D2[i] * (TWO[i]>=v) for i in tu0123)
    return min(Tn+diff,Tp-diff)

for n in range(len(H1)):
    ONE[:] = H1[n]
    TWO[:] = H2[n]
    Err = list(map(zoui,ONE))
    b = np.argmin(Err)
    append2(ONE[b])
    append3(Err[b])
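The default-value behaviour that this code relies on can be seen in isolation in the small sketch below. The names buf and first are hypothetical and exist only for this illustration.

```python
buf = []

def first(buf=buf):
    # The default is evaluated once, at definition time, so this buf is
    # the same list object as the module-level buf defined above.
    return buf[0]

buf[:] = [10, 20]   # slice assignment changes the list in place
a = first()         # the function sees the new contents

buf = [99]          # rebinding the name creates a different list
b = first()         # the default still refers to the original list

print(a, b)
```

This is why the loop assigns with ONE[:] = H1[n] rather than ONE = H1[n]: only the in-place form is visible to the function.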
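
You need to keep your data in ndarray form. When you perform numpy operations on a list, a new array has to be constructed every time. I modified your code to run a variable number of times and found that it takes about 1 s for 10000 iterations. Changing the data type to ndarray reduced that by roughly a factor of two, and I think there is still room for improvement (the first version of this answer had a bug that made it run too fast).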
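A comparison of this kind could be set up along the following lines. This is a sketch under assumptions, not the code behind the numbers quoted above: the functions run_lists and run_arrays, the row count n_rows, and the choice to repeat the question's two sample rows are all hypothetical, and timings will vary by machine.

```python
import time
import numpy as np

def run_lists(H1, H2, D1, D2):
    """The question's loop, with plain lists passed to np.dot."""
    Tp, Tn = np.sum(D1), np.sum(D2)
    T, E = [], []
    for h1, h2 in zip(H1, H2):
        Err = []
        for v in h1:
            Sp = np.dot(D1, [i >= v for i in h1])
            Sn = np.dot(D2, [i >= v for i in h2])
            Err.append(min(Sp + Tn - Sn, Sn + Tp - Sp))
        b = int(np.argmin(Err))
        T.append(h1[b])
        E.append(Err[b])
    return T, E

def run_arrays(H1, H2, D1, D2):
    """The same loop, with every operand already an ndarray."""
    Tp, Tn = D1.sum(), D2.sum()
    T = np.empty(len(H1))
    E = np.empty(len(H1))
    for r in range(len(H1)):
        h1, h2 = H1[r], H2[r]
        Err = np.empty(len(h1))
        for j in range(len(h1)):
            v = h1[j]
            Sp = np.dot(D1, h1 >= v)
            Sn = np.dot(D2, h2 >= v)
            Err[j] = min(Sp + Tn - Sn, Sn + Tp - Sp)
        b = Err.argmin()
        T[r] = h1[b]
        E[r] = Err[b]
    return T, E

n_rows = 2000  # hypothetical size; raise it to approach 20000
H1_arr = np.tile([[0.04, 0.03, 0.01, 0.002], [0.02, 0.04, 0.001, 0.5]], (n_rows // 2, 1))
H2_arr = np.tile([[0.06, 0.02, 0.02, 0.004], [0.8, 0.09, 0.6, 0.1]], (n_rows // 2, 1))
D1_arr = np.array([0.01, 0.02, 0.1, 0.01])
D2_arr = np.array([0.1, 0.3, 0.01, 0.4])

start = time.perf_counter()
T_l, E_l = run_lists(H1_arr.tolist(), H2_arr.tolist(), D1_arr.tolist(), D2_arr.tolist())
t_lists = time.perf_counter() - start

start = time.perf_counter()
T_a, E_a = run_arrays(H1_arr, H2_arr, D1_arr, D2_arr)
t_arrays = time.perf_counter() - start

print("lists:  %.3f s" % t_lists)
print("arrays: %.3f s" % t_arrays)
```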