Python 计算交点时避免numpy循环
我想加速下面的计算,处理Python 计算交点时避免numpy循环,python,arrays,performance,numpy,vectorization,Python,Arrays,Performance,Numpy,Vectorization,我想加速下面的计算,处理r光线和n球体。以下是我到目前为止得到的信息: # shape of mu1 and mu2 is (r, n) # shape of rays is (r, 3) # note that intersections has 2n columns because for every sphere one can # get up to two intersections (secant, tangent, no intersection) intersections =
r
光线和n
球体。以下是我到目前为止得到的信息:
# shape of mu1 and mu2 is (r, n)
# shape of rays is (r, 3)
# note that intersections has 2n columns because for every sphere one can
# get up to two intersections (secant, tangent, no intersection)
intersections = np.empty((r, 2*n, 3))
for col in range(n):
intersections[:, col, :] = rays * mu1[:, col][:, np.newaxis]
intersections[:, col + n, :] = rays * mu2[:, col][:, np.newaxis]
# [...]
# calculate euclidean distance from the center of gravity (0,0,0)
distances = np.empty((r, 2 * n))
for col in range(n):
distances[:, col] = np.linalg.norm(intersections[:, col], axis=1)
distances[:, col + n] = np.linalg.norm(intersections[:, col + n], axis=1)
我试图通过避免
for
-循环来加快速度,但却不知道如何正确地广播数组,所以我只需要一个函数调用。非常感谢您的帮助。以下是使用-
最后一步可以用类似的方法来代替-
distances = np.sqrt(np.einsum('ijk,ijk->ij',intersections,intersections))
mu = np.hstack((mu1,mu2))
distances = np.sqrt(np.einsum('ij,ij,ik,ik->ij',mu,mu,rays,rays))
或者用np.einsum
替换几乎整个内容,以另一种矢量化的方式,如下所示-
distances = np.sqrt(np.einsum('ijk,ijk->ij',intersections,intersections))
mu = np.hstack((mu1,mu2))
distances = np.sqrt(np.einsum('ij,ij,ik,ik->ij',mu,mu,rays,rays))
运行时测试和验证输出-
def original_app(mu1,mu2,rays):
intersections = np.empty((r, 2*n, 3))
for col in range(n):
intersections[:, col, :] = rays * mu1[:, col][:, np.newaxis]
intersections[:, col + n, :] = rays * mu2[:, col][:, np.newaxis]
distances = np.empty((r, 2 * n))
for col in range(n):
distances[:, col] = np.linalg.norm(intersections[:, col], axis=1)
distances[:, col + n] = np.linalg.norm(intersections[:, col + n], axis=1)
return distances
def vectorized_app1(mu1,mu2,rays):
intersections = np.hstack((mu1,mu2))[...,None]*rays[:,None,:]
return np.sqrt((intersections**2).sum(2))
def vectorized_app2(mu1,mu2,rays):
intersections = np.hstack((mu1,mu2))[...,None]*rays[:,None,:]
return np.sqrt(np.einsum('ijk,ijk->ij',intersections,intersections))
def vectorized_app3(mu1,mu2,rays):
mu = np.hstack((mu1,mu2))
return np.sqrt(np.einsum('ij,ij,ik,ik->ij',mu,mu,rays,rays))
时间安排-
In [101]: # Inputs
...: r = 1000
...: n = 1000
...: mu1 = np.random.rand(r, n)
...: mu2 = np.random.rand(r, n)
...: rays = np.random.rand(r, 3)
In [102]: np.allclose(original_app(mu1,mu2,rays),vectorized_app1(mu1,mu2,rays))
Out[102]: True
In [103]: np.allclose(original_app(mu1,mu2,rays),vectorized_app2(mu1,mu2,rays))
Out[103]: True
In [104]: np.allclose(original_app(mu1,mu2,rays),vectorized_app3(mu1,mu2,rays))
Out[104]: True
In [105]: %timeit original_app(mu1,mu2,rays)
...: %timeit vectorized_app1(mu1,mu2,rays)
...: %timeit vectorized_app2(mu1,mu2,rays)
...: %timeit vectorized_app3(mu1,mu2,rays)
...:
1 loops, best of 3: 306 ms per loop
1 loops, best of 3: 215 ms per loop
10 loops, best of 3: 140 ms per loop
10 loops, best of 3: 136 ms per loop
谢谢!我采用了
np.einsum
方法,在整个过程中实现了将近2倍的加速program@rldw很好,很乐意帮忙!