How to avoid for loops in PyTorch? Is there a function for efficient computation?

Tags: performance, pytorch

I have the following code in my PyTorch neural network:

cos = nn.CosineSimilarity(dim=1)
d = torch.zeros(batch_sz, n, n).to(device="cuda")

for i in range(n):
    for j in range(n):
        d[:, i, j] = cos(q[:, i, :], k[:, j, :])
Both q and k have shape (batch_sz, n, m). This code clearly slows my program down, and I would like to know whether PyTorch provides any function that could make it more efficient.


Many thanks!

I am not sure how to vectorize nn.CosineSimilarity itself, but you can use the vectorized implementation below. It computes cosine similarity the same way PyTorch's built-in module does:

import torch
import torch.nn as nn

# some dummy inputs
n = 20
m = 30
batch_sz = 10

k = torch.rand(batch_sz, n, m)
q = torch.rand(batch_sz, n, m)
d = torch.zeros(batch_sz, n, n)


# reference: the original loop-based version
cos = nn.CosineSimilarity(dim=1)

for i in range(n):
    for j in range(n):
        d[:, i, j] = cos(q[:, i, :], k[:, j, :])



# dot product (numerator)
out = torch.bmm(q, k.transpose(1, 2))

# compute the denominator in the next three steps

# compute the norms and restore the broadcastable dimensions
q_norm = q.norm(dim=2).unsqueeze(2)  # (batch_sz, n, 1)
k_norm = k.norm(dim=2).unsqueeze(1)  # (batch_sz, 1, n)

# This repeats the norms along dim 2 for q and dim 1 for k
q_norm_expanded = q_norm.expand(batch_sz, n, n)
k_norm_expanded = k_norm.expand(batch_sz, n, n)

# elementwise product of the norms
norms = q_norm_expanded * k_norm_expanded

# cosine similarity (small epsilon for numerical stability)
out = out / (norms + 1e-9)

print(torch.allclose(d, out))
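Since the question is about speed, here is a small self-contained timing sketch comparing the two versions (same dummy sizes as above; absolute timings depend on your hardware, so no expected numbers are given):

```python
import time

import torch
import torch.nn as nn

n, m, batch_sz = 20, 30, 10
q = torch.rand(batch_sz, n, m)
k = torch.rand(batch_sz, n, m)

# loop-based version
cos = nn.CosineSimilarity(dim=1)
start = time.perf_counter()
d = torch.zeros(batch_sz, n, n)
for i in range(n):
    for j in range(n):
        d[:, i, j] = cos(q[:, i, :], k[:, j, :])
loop_time = time.perf_counter() - start

# vectorized version: bmm numerator, broadcast-multiplied norms as denominator
start = time.perf_counter()
out = torch.bmm(q, k.transpose(1, 2))
norms = q.norm(dim=2).unsqueeze(2) * k.norm(dim=2).unsqueeze(1)
out = out / (norms + 1e-9)
vec_time = time.perf_counter() - start

print(f"loop: {loop_time:.6f}s, vectorized: {vec_time:.6f}s")
print(torch.allclose(d, out, atol=1e-6))
```

At realistic sizes (large n and batch_sz, on GPU) the gap grows much larger, because the loop launches n*n small kernels while the vectorized version launches a handful of large ones.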
Expanding and multiplying the norms is really just computing a batched outer product, so you can also use:

norms = torch.bmm(q_norm, k_norm)
instead of

q_norm_expanded = q_norm.expand(batch_sz, n, n)
k_norm_expanded = k_norm.expand(batch_sz, n, n)

norms = q_norm_expanded * k_norm_expanded
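A quick sanity check (with freshly generated dummy tensors) that the bmm outer product and the expand-and-multiply form agree:

```python
import torch

batch_sz, n, m = 10, 20, 30
q = torch.rand(batch_sz, n, m)
k = torch.rand(batch_sz, n, m)

q_norm = q.norm(dim=2).unsqueeze(2)  # (batch_sz, n, 1)
k_norm = k.norm(dim=2).unsqueeze(1)  # (batch_sz, 1, n)

# expand-and-multiply
norms_expand = q_norm.expand(batch_sz, n, n) * k_norm.expand(batch_sz, n, n)

# batched outer product: (b, n, 1) @ (b, 1, n) -> (b, n, n)
norms_bmm = torch.bmm(q_norm, k_norm)

print(torch.allclose(norms_expand, norms_bmm))  # True
```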
I just realized that you can normalize the vectors beforehand to get an even cleaner and more numerically stable version:

# normalize beforehand (epsilon guards against zero norms)
q_norm = q.norm(dim=2) + 1e-9
k_norm = k.norm(dim=2) + 1e-9
q = q / q_norm.unsqueeze(2)
k = k / k_norm.unsqueeze(2)

# cosine similarity is then just a batched matrix product
out = torch.bmm(q, k.transpose(1, 2))
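As an aside, in recent PyTorch releases torch.nn.functional.cosine_similarity supports broadcasting (check your version), so the whole pairwise computation can also be written as a single call. A minimal sketch verifying it against the normalize-then-bmm approach:

```python
import torch
import torch.nn.functional as F

batch_sz, n, m = 10, 20, 30
q = torch.rand(batch_sz, n, m)
k = torch.rand(batch_sz, n, m)

# broadcast (b, n, 1, m) against (b, 1, n, m), reducing over the feature dim
out = F.cosine_similarity(q.unsqueeze(2), k.unsqueeze(1), dim=-1)  # (b, n, n)

# reference: normalize beforehand, then take a batched matrix product
q_n = q / (q.norm(dim=2, keepdim=True) + 1e-9)
k_n = k / (k.norm(dim=2, keepdim=True) + 1e-9)
ref = torch.bmm(q_n, k_n.transpose(1, 2))

print(torch.allclose(out, ref, atol=1e-6))  # True
```

The two differ only in how the epsilon is applied (F.cosine_similarity clamps the norms, the reference adds 1e-9), which is negligible unless the vectors are nearly zero.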