Python: 3D conv on non-overlapping blocks of a 3D tensor, then remapping them (PyTorch)
Hi, I have a 3D tensor of size (128,128,128) as the input to my model. When it enters the model, it has shape (8, 4, 128, 128, 128), i.e. (Batch, Channels, H, W, D).

I want to separate the channels and perform a convolution on each (32,32,32) block of this (128,128,128) input. Then I want to take the conv weights, multiply them with the values that were fed into the conv, and remap the results back into the (128,128,128) volume.

My current, inefficient solution (many for loops, plus conversions between NumPy arrays, tensors, and scikit-image) is shown below, but it takes too long and uses too much memory. What is the best way to do this with tensors?
import torch
import torch.nn as nn
from skimage.util.shape import view_as_blocks

class LFBlock(nn.Module):
    def __init__(self, input_shape=(128,128,128), kernel_size=(1,1,1), blk_div=4):
        super(LFBlock,self).__init__()
        # Divides the (128,128,128)//4 -> (32,32,32)
        self.block_shape = (input_shape[0]//blk_div, input_shape[1]//blk_div, input_shape[2]//blk_div)
        # Blocks per axis, (4,4,4) by default, so 64 blocks in total
        self.grid = (input_shape[0]//self.block_shape[0], input_shape[1]//self.block_shape[1],
                     input_shape[2]//self.block_shape[2])
        self.num_blocks = self.grid[0]*self.grid[1]*self.grid[2]
        # One Conv3d per block
        conv_list = []
        for n in range(self.num_blocks):
            conv_list.append(nn.Conv3d(1,1, kernel_size=kernel_size, stride=1, padding=0, bias=True))
        self.conv1x1s = nn.ModuleList(conv_list)

    def forward(self, lf_in):
        # Batch
        for i in range(lf_in.shape[0]):
            # Modality
            for ch in range(lf_in.shape[1]):
                # Move the volume to the CPU as a NumPy array and view it as blocks
                x_np = lf_in[i,ch].detach().cpu().numpy()
                lf_blocks = view_as_blocks(x_np, block_shape=self.block_shape)
                # Do Conv3d on each block
                for x in range(lf_blocks.shape[0]):
                    for y in range(lf_blocks.shape[1]):
                        for z in range(lf_blocks.shape[2]):
                            # Flatten (x,y,z) into a unique index into the conv list
                            conv_idx = (x*lf_blocks.shape[1] + y)*lf_blocks.shape[2] + z
                            # Convolve the block, then multiply with the weight of the block.
                            tensor_img = torch.from_numpy(lf_blocks[x,y,z])[None, None,:]
                            conv = self.conv1x1s[conv_idx](tensor_img.to(lf_in.device))
                            # w * x.
                            # view_as_blocks returns a view so modifications are done in-place
                            lf_blocks[x,y,z] = tensor_img*self.conv1x1s[conv_idx].weight.data.cpu()
                # Copy the result back (needed when lf_in is on the GPU, where .cpu() made a copy)
                lf_in[i,ch] = torch.from_numpy(x_np).to(lf_in.device)
        # Linearly sum the modalities together
        # out = w0*x0 + w1*x1 + w2*x2 + w3*x3
        out = (lf_in[:,0]+lf_in[:,1]+lf_in[:,2]+lf_in[:,3])[:,None]
        return out
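For reference, here is a rough sketch of what I think the loops above are meant to compute, written with tensor ops only. It rests on several assumptions that are mine: the kernel is (1,1,1), so each block's conv weight is a single scalar; the same per-block weights are shared across batch items and channels (as in the loop code, where the conv index does not depend on the channel); and the weights are stored as one parameter initialized to ones rather than as 64 separate `Conv3d` modules. The class name `BlockScale3d` is hypothetical.

```python
import torch
import torch.nn as nn


class BlockScale3d(nn.Module):
    """Hypothetical vectorized variant: one learnable scalar per block.

    Assumes kernel_size=(1,1,1), so every per-block Conv3d(1, 1) weight is a
    single scalar, shared across batch items and channels.
    """

    def __init__(self, input_shape=(128, 128, 128), blk_div=4):
        super().__init__()
        # Block size per axis, e.g. (128,128,128)//4 -> (32,32,32)
        self.block_shape = tuple(s // blk_div for s in input_shape)
        # Number of blocks per axis, e.g. (4,4,4)
        self.grid = tuple(s // b for s, b in zip(input_shape, self.block_shape))
        # One weight per block, laid out on the block grid.
        # Initializing to ones is an assumption, not Conv3d's default init.
        self.weight = nn.Parameter(torch.ones(self.grid))

    def forward(self, lf_in):
        # lf_in: (batch, channels, H, W, D)
        b, c = lf_in.shape[:2]
        (gx, gy, gz), (bx, by, bz) = self.grid, self.block_shape
        # Split every spatial axis into (block index, offset inside block).
        blocks = lf_in.reshape(b, c, gx, bx, gy, by, gz, bz)
        # Broadcast each block's weight over all voxels of that block.
        w = self.weight.view(1, 1, gx, 1, gy, 1, gz, 1)
        # Merging the split axes again remaps the blocks to the full volume.
        scaled = (blocks * w).reshape(b, c, gx * bx, gy * by, gz * bz)
        # Linearly sum the modalities: out = w*x0 + w*x1 + ...
        return scaled.sum(dim=1, keepdim=True)
```

With the defaults, `BlockScale3d()(torch.randn(8, 4, 128, 128, 128))` should give a tensor of shape (8, 1, 128, 128, 128), with no NumPy round trip and no Python loops over blocks. If the kernel were larger than (1,1,1) this broadcasting trick would no longer apply as written.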
Any help is appreciated. Thanks, everyone!