PyTorch: vectorized patch selection from a batch of images

python pytorch

Suppose I have a batch of images as a tensor, for example:

images = torch.zeros(64, 3, 1024, 1024)

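Now I want to select one patch from each image. All patches have the same size, but the starting position is different for each image in the batch.

size_x = 100
size_y = 100
# integer dtype, so the values can be used as indices
start_x = torch.zeros(64, dtype=torch.long)
start_y = torch.zeros(64, dtype=torch.long)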

I can get the desired result like this:

result = []
for i in range(images.shape[0]):
    result.append(images[i, :, start_x[i]:start_x[i]+size_x, start_y[i]:start_y[i]+size_y])
result = torch.stack(result, dim=0)

The question is: can the same thing be done faster, without the loop? Perhaps there is some form of advanced indexing, or a PyTorch function that does this?

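You can use torch.take to get rid of the for loop. First, though, you need to build an index array with the following function. Note that torch.take treats its input as a flattened 1-D tensor, so the indices have to include the offset of each image and each channel.

import numpy as np

def convert_inds(img_a,img_b,patch_a,patch_b,start_x,start_y,channels=3):
    # img_a, img_b: image height and width; patch_a, patch_b: patch height and width
    all_patches = np.zeros((len(start_x),channels,patch_a,patch_b),dtype=np.int64)

    # flat offsets of the patch pixels relative to its top-left corner (one channel)
    patch_src = img_b * np.arange(patch_a)[:,None] + np.arange(patch_b)[None,:]
    # flat offset of each channel plane inside one image
    chan_src = img_a * img_b * np.arange(channels)[:,None,None]
    for ind,info in enumerate(zip(start_x,start_y)):

        x,y = int(info[0]),int(info[1])
        # reject patches that would run past the image border
        if x + patch_a > img_a: return False
        if y + patch_b > img_b: return False
        # offset of image `ind` in the flattened batch plus the patch corner
        start_ind = ind * channels * img_a * img_b + img_b * x + y
        all_patches[ind] = start_ind + chan_src + patch_src

    return all_patches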

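As you can see, this function essentially builds the indices of every patch to be sliced. With it, the problem can be solved by

import torch

size_x = 100
size_y = 100
# integer dtype, so the values can be used as indices
start_x = torch.zeros(64, dtype=torch.long)
start_y = torch.zeros(64, dtype=torch.long)

images = torch.zeros(64, 3, 1024, 1024)
selected_inds = convert_inds(1024,1024,100,100,start_x,start_y)
selected_inds = torch.tensor(selected_inds)
# the result has the same shape as the index tensor: (64, 3, 100, 100)
res = torch.take(images,selected_inds)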
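
As a side note, the flat indices that torch.take needs can also be built without a Python loop, by broadcasting a few torch.arange tensors. The following is only a sketch under that assumption; `take_patches` and the small demo tensors are hypothetical names and sizes chosen for illustration, not part of the answer above.

```python
import torch

def take_patches(images, start_x, start_y, size_x, size_y):
    """Select one (size_x, size_y) patch per image using torch.take on flat indices."""
    B, C, H, W = images.shape
    dev = images.device
    start_x = start_x.long().view(B, 1, 1, 1)
    start_y = start_y.long().view(B, 1, 1, 1)
    # Offset of each image and each channel plane in the flattened batch.
    img_off = torch.arange(B, device=dev).view(B, 1, 1, 1) * (C * H * W)
    chan_off = torch.arange(C, device=dev).view(1, C, 1, 1) * (H * W)
    # Absolute row and column of every pixel of every patch.
    rows = start_x + torch.arange(size_x, device=dev).view(1, 1, size_x, 1)
    cols = start_y + torch.arange(size_y, device=dev).view(1, 1, 1, size_y)
    flat = img_off + chan_off + rows * W + cols  # shape (B, C, size_x, size_y)
    return torch.take(images, flat)

# Small demo with random start positions that keep every patch inside the image.
torch.manual_seed(0)
images = torch.rand(4, 3, 32, 32)
start_x = torch.randint(0, 32 - 8 + 1, (4,))
start_y = torch.randint(0, 32 - 8 + 1, (4,))
patches = take_patches(images, start_x, start_y, 8, 8)
```

The index tensor still has to be rebuilt for every new set of start positions, so this mainly removes the Python-level loop rather than the indexing work itself.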

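Update

The OP's observation is correct: the approach above is not faster than the naive one. To avoid building the indices every time, here is another solution based on unfold.

First, build a tensor of all possible patches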

# create all possible patches
all_patches = images.unfold(2,size_x,1).unfold(3,size_y,1)
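
To make the layout of the unfolded tensor concrete, here is a small sketch on a tiny tensor (the sizes are made up for illustration). It shows that the two unfold calls append the patch dimensions at the end, and that the result is a view on the original storage rather than a copy.

```python
import torch

# Tiny batch: 2 images, 1 channel, 5 rows, 6 columns.
images = torch.arange(2 * 1 * 5 * 6, dtype=torch.float32).reshape(2, 1, 5, 6)
size_x, size_y = 3, 2

all_patches = images.unfold(2, size_x, 1).unfold(3, size_y, 1)

# Shape is (batch, channels, H - size_x + 1, W - size_y + 1, size_x, size_y).
print(all_patches.shape)  # torch.Size([2, 1, 3, 5, 3, 2])

# unfold returns a view, so no patch data is copied at this point.
shares_storage = all_patches.data_ptr() == images.data_ptr()

# all_patches[i, c, px, py] is the patch whose top-left corner is (px, py).
window = all_patches[1, 0, 2, 4]
same_as_slice = torch.equal(window, images[1, 0, 2:5, 4:6])
```

Because it is only a view, building all_patches for the 1024x1024 images should be cheap; calling something like .contiguous() on it would materialize every patch and is best avoided.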

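Then slice the desired patches out of all_patches

# start_x and start_y must be integer (long) tensors
img_ind = torch.arange(images.shape[0])
selected_patches = all_patches[img_ind,:,start_x,start_y,:,:]

Thanks for your answer! However, my goal is to make the code faster by removing the loop, not just to remove the loop for its own sake. Since your solution has to rebuild the indices every time patches are selected with a new set of positions, I don't think it will be faster than the original solution.

@DLunin You're welcome. Your observation is correct. I have updated my post to use unfold; the idea is to first create a tensor of all possible patches and then slice it directly with start_x and start_y.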
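
For anyone who wants to check the unfold-based selection against the original loop, here is a minimal self-contained sketch. The function names and the small random test tensors are assumptions made for illustration only.

```python
import torch

def select_patches_unfold(images, start_x, start_y, size_x, size_y):
    """One patch per image via unfold plus advanced indexing (no Python loop)."""
    all_patches = images.unfold(2, size_x, 1).unfold(3, size_y, 1)
    img_ind = torch.arange(images.shape[0], device=images.device)
    # The three index tensors are broadcast together and their dimension comes first,
    # so the result has shape (batch, channels, size_x, size_y).
    return all_patches[img_ind, :, start_x, start_y, :, :]

def select_patches_loop(images, start_x, start_y, size_x, size_y):
    """Reference implementation: the loop from the question."""
    return torch.stack([
        images[i, :, int(x):int(x) + size_x, int(y):int(y) + size_y]
        for i, (x, y) in enumerate(zip(start_x, start_y))
    ], dim=0)

torch.manual_seed(0)
images = torch.rand(8, 3, 64, 64)
size_x, size_y = 10, 12
# Random integer start positions that keep every patch inside the image.
start_x = torch.randint(0, 64 - size_x + 1, (8,))
start_y = torch.randint(0, 64 - size_y + 1, (8,))

fast = select_patches_unfold(images, start_x, start_y, size_x, size_y)
slow = select_patches_loop(images, start_x, start_y, size_x, size_y)
print(fast.shape)               # torch.Size([8, 3, 10, 12])
print(torch.equal(fast, slow))  # True
```

Whether this is actually faster than the loop depends on batch size, patch size, and device, so it is worth timing on your own data.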