How to speed up a "for" loop when using list indexing? (Python)

python performance numpy loops

I'm trying to speed up this code by using NumPy functions or vectorization instead of a for loop:

sommes = []
for j in range(vertices.shape[0]):
    terme = new_vertices[j] - new_vertices[vertex_neighbors[j]]
    somme_j = np.sum(terme)
    sommes.append(somme_j)
E_int = np.sum(sommes)

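(This is part of an iterative algorithm and there are a lot of vertices, so I think the for loop takes too long.)

For example, to compute `terme` for j = 0, I have: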

In: new_vertices[0]
Out: array([ 10.2533888 , -42.32279717,  68.27230793])

In: vertex_neighbors[0]
Out: [1280, 2, 1511, 511, 1727, 1887, 759, 509, 1023]

In: new_vertices[vertex_neighbors[0]]
Out: array([[ 10.47121043, -42.00123956,  68.218715  ],
            [ 10.2533888 , -43.26905874,  62.59473849],
            [ 10.69773735, -41.26464083,  68.09594854],
            [ 10.37030712, -42.16729601,  68.24639107],
            [ 10.12158146, -42.46624547,  68.29621598],
            [  9.81850836, -42.71158695,  68.33710623],
            [  9.97615447, -42.59625943,  68.31788497],
            [ 10.37030712, -43.11676015,  62.54960623],
            [ 10.55512696, -41.82622703,  68.18954624]])

In: new_vertices[0] - new_vertices[vertex_neighbors[0]]
Out: array([[-0.21782162, -0.32155761,  0.05359293],
             [ 0.        ,  0.94626157,  5.67756944],
             [-0.44434855, -1.05815634,  0.17635939],
             [-0.11691832, -0.15550116,  0.02591686],
             [ 0.13180734,  0.1434483 , -0.02390805],
             [ 0.43488044,  0.38878979, -0.0647983 ],
             [ 0.27723434,  0.27346227, -0.04557704],
             [-0.11691832,  0.79396298,  5.7227017 ],
             [-0.30173816, -0.49657014,  0.08276169]])

The problem is that new_vertices[vertex_neighbors[j]] does not always have the same size. For example, for j = 7:

In: new_vertices[7]
Out: array([ 10.74106112, -63.88592276, -70.15593947])

In: vertex_neighbors[7]
Out: [1546, 655, 306, 1879, 920, 925]

In: new_vertices[vertex_neighbors[7]]
Out: array([[  9.71830698, -69.07323638, -83.10229623],
           [ 10.71123017, -64.06983438, -70.09345104],
           [  9.74836003, -68.88820555, -83.16187474],
           [ 10.78982867, -63.70552665, -70.2169896 ],
           [  9.74627177, -60.87823935, -60.13032811],
           [  9.79419242, -60.69528267, -60.182843  ]])

In: new_vertices[7] - new_vertices[vertex_neighbors[7]]
Out: array([[  1.02275414,   5.18731363,  12.94635676],
             [  0.02983095,   0.18391163,  -0.06248843],
             [  0.99270108,   5.0022828 ,  13.00593527],
             [ -0.04876756,  -0.18039611,   0.06105013],
             [  0.99478934,  -3.00768341, -10.02561137],
             [  0.94686869,  -3.19064009,  -9.97309648]])

Is this possible without a for loop? I'm running out of ideas, so any help would be much appreciated.

Thank you.

Yes, it is possible. The idea is to use np.repeat to create an array in which each row is repeated a variable number of times. Here is the code:

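# The two following lines can be executed only once (precomputed)
# if the indices stay the same between iterations
counts = np.array([len(e) for e in vertex_neighbors])
flatten_indices = np.concatenate(vertex_neighbors)

E_int = np.sum(np.repeat(new_vertices, counts, axis=0) - new_vertices[flatten_indices])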
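To make the alignment trick concrete, here is a minimal sketch on toy data. The 4-vertex array and the ragged neighbor list are made up for illustration; only the np.repeat / np.concatenate pattern comes from the code above.

```python
import numpy as np

# Hypothetical toy data: 4 vertices in 3D and a ragged neighbor list.
new_vertices = np.arange(12, dtype=float).reshape(4, 3)
vertex_neighbors = [[1, 2], [0], [0, 1, 3], [2]]

counts = np.array([len(e) for e in vertex_neighbors])   # [2, 1, 3, 1]
flatten_indices = np.concatenate(vertex_neighbors)      # [1, 2, 0, 0, 1, 3, 2]

# Row j of new_vertices is repeated counts[j] times, so `repeated` lines up
# row by row with new_vertices[flatten_indices].
repeated = np.repeat(new_vertices, counts, axis=0)      # shape (7, 3)
vectorized = np.sum(repeated - new_vertices[flatten_indices])

# Reference: the loop from the question.
looped = sum(
    np.sum(new_vertices[j] - new_vertices[vertex_neighbors[j]])
    for j in range(new_vertices.shape[0])
)

print(vectorized, looped)
```

Both values should agree, since the vectorized expression computes the same differences, just stacked into one array.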

Here is a benchmark:

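import numpy as np
from time import time

# Random test data: n vertices, each with 1 to 9 random neighbors
n = 32768
vertices = np.random.rand(n, 3)
indices = []

count = np.random.randint(1, 10, size=n)

for i in range(n):
    indices.append(np.random.randint(0, n, size=count[i]))

def initial_version(vertices, vertex_neighbors):
    sommes = []
    for j in range(vertices.shape[0]):
        terme = vertices[j] - vertices[vertex_neighbors[j]]
        somme_j = np.sum(terme)
        sommes.append(somme_j)
    return np.sum(sommes)

def optimized_version(vertices, vertex_neighbors):
    # The two following lines can be precomputed
    counts = np.array([len(e) for e in vertex_neighbors])
    flatten_indices = np.concatenate(vertex_neighbors)

    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

def more_optimized_version(vertices, vertex_neighbors, counts, flatten_indices):
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

timesteps = 20

a = time()
for t in range(timesteps):
    res = initial_version(vertices, indices)
b = time()
print("V1: time:", b - a)
print("V1: result", res)

a = time()
for t in range(timesteps):
    res = optimized_version(vertices, indices)
b = time()
print("V2: time:", b - a)
print("V2: result", res)

a = time()
# Precompute the counts and flattened indices once, outside the loop
counts = np.array([len(e) for e in indices])
flatten_indices = np.concatenate(indices)
for t in range(timesteps):
    res = more_optimized_version(vertices, indices, counts, flatten_indices)
b = time()
print("V3: time:", b - a)
print("V3: result", res)

Here are the benchmark results on my machine: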

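V1: time: 3.656714916229248
V1: result -395.8416223057596
V2: time: 0.19800186157226562
V2: result -395.8416223057595
V3: time: 0.07983255386352539
V3: result -395.8416223057595

As you can see, the optimized version is about 18 times faster than the reference implementation, and the version that precomputes the indices is about 46 times faster.

Note that the optimized version should require more RAM (especially when the number of neighbors per vertex is large), since np.repeat and the fancy indexing each build a temporary array with one row per (vertex, neighbor) pair.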
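If that memory cost matters, one possible variant relies on the fact that only the grand total is needed, so the sum can be rearranged to work on per-vertex coordinate sums. This is a sketch that is not part of the benchmark above and has not been timed; the function name e_int_low_memory and the test data are hypothetical.

```python
import numpy as np

def e_int_low_memory(vertices, counts, flatten_indices):
    # Sum of the coordinates of each vertex, shape (n,).
    row_sums = vertices.sum(axis=1)
    # Vertex j appears counts[j] times on the left side of the subtraction;
    # every neighbor occurrence subtracts its own coordinate sum.
    return counts @ row_sums - row_sums[flatten_indices].sum()

# Hypothetical test data, built the same way as in the benchmark.
rng = np.random.default_rng(0)
n = 1000
vertices = rng.random((n, 3))
neighbors = [rng.integers(0, n, size=k) for k in rng.integers(1, 10, size=n)]
counts = np.array([len(e) for e in neighbors])
flatten_indices = np.concatenate(neighbors)

reference = np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])
result = e_int_low_memory(vertices, counts, flatten_indices)
print(reference, result)
```

The only large temporary left is a 1D array of neighbor sums instead of two 2D arrays. The result can differ from the np.repeat version in the last few digits because the floating-point additions happen in a different order.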