Deep learning: Output size of a custom LSTM model in PyTorch
I have a custom LSTM model in PyTorch, defined as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F

# device was not defined in the original snippet; this is the usual pattern
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

hidden_size = 32
num_layers = 1
num_classes = 2

class customModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(customModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.bilstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
        self.fcl = nn.Linear(hidden_size*2, num_classes)

    def forward(self, x):
        # Set initial hidden and cell states
        h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
        # Forward propagate LSTM
        out, hidden = self.bilstm(x, (h0, c0))
        fw_bilstm = out[-1, :, :self.hidden_size]
        bk_bilstm = out[0, :, :self.hidden_size]
        concat_fw_bw = torch.cat((fw_bilstm, bk_bilstm), dim = 1)
        fc = self.fcl(concat_fw_bw)
        x = F.softmax(F.relu(fc))
        return x
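I can pass an input of type torch.Tensor to this model. The input has length 67349, and each element is a 300-dimensional vector.
After initializing the model and running a prediction, I get an output vector of length 1: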
model = customModel(300, hidden_size, num_layers, num_classes)
output = model(input_torch)
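When I print it, the output shows tensor([[0.5020, 0.4980]], grad_fn=)
Why does the output have length 1? It seems I should not use batch_first=True in my model, but changing it requires changing the other input dimensions, and I don't know how to do that.
Please suggest how to get an output vector of length 67349 (the input length) instead of 1.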
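For reference, batch_first only changes the order in which nn.LSTM expects the first two dimensions. The sketch below uses small made-up sizes (4 sequences of 7 time steps, not the real data) and suggests that the same tensor can be fed to either layout after a transpose; the variable names are illustrative only.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (not the real data): 4 sequences, 7 time steps, 300 features
batch_size, seq_len, input_size, hidden_size = 4, 7, 300, 32

x = torch.rand(batch_size, seq_len, input_size)  # [batch, seq, feature]

# batch_first=True: input and output are laid out as [batch, seq, ...]
lstm_bf = nn.LSTM(input_size, hidden_size, batch_first=True, bidirectional=True)
out_bf, _ = lstm_bf(x)
shape_bf = tuple(out_bf.shape)  # (batch, seq, 2 * hidden)

# batch_first=False (the default): input and output are laid out as [seq, batch, ...]
lstm_sf = nn.LSTM(input_size, hidden_size, batch_first=False, bidirectional=True)
out_sf, _ = lstm_sf(x.transpose(0, 1).contiguous())
shape_sf = tuple(out_sf.shape)  # (seq, batch, 2 * hidden)

print(shape_bf)
print(shape_sf)
```

In both layouts the last dimension is 2 * hidden_size because the LSTM is bidirectional; only the order of the batch and sequence axes differs.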
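Clarification
I see that @gorjan suggested some changes to the network's forward method, so I would like to clarify further what I am trying to build.
I have annotated the def forward(…) method in your module; please take a look: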
def forward(self, x):
    # Set initial hidden and cell states
    h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
    c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
    # Forward propagate LSTM
    out, hidden = self.bilstm(x, (h0, c0))  # out has size [batch_size, sequence_length, hidden_size * num_directions]
    fw_bilstm = out[-1, :, :self.hidden_size]  # Wrong: with batch_first=True this selects only the last batch element
    bk_bilstm = out[0, :, :self.hidden_size]  # Wrong: this selects only the first batch element (and :hidden_size is again the forward half)
    concat_fw_bw = torch.cat((fw_bilstm, bk_bilstm), dim = 1)  # Not needed if you want the hidden states for all elements in the sequence
    fc = self.fcl(concat_fw_bw)  # Wrong as well, because of the issues above
    x = F.softmax(F.relu(fc))  # Wrong: ReLU before softmax zeroes the negative logits; apply softmax alone
    return x
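To make the indexing problem concrete, here is a small shape trace of the original forward pass. It assumes the input was shaped [67349, 1, 300], which the question does not state but which would be consistent with the printed [1, 2] tensor; n_items is a small stand-in for 67349, and the layers are rebuilt here only for illustration.

```python
import torch
import torch.nn as nn

hidden_size = 32
n_items = 5  # stand-in for 67349; the [n_items, 1, 300] layout is an assumption

lstm = nn.LSTM(300, hidden_size, num_layers=1, batch_first=True, bidirectional=True)
fcl = nn.Linear(hidden_size * 2, 2)

x = torch.rand(n_items, 1, 300)
out, _ = lstm(x)  # [5, 1, 64]: batch first, then sequence, then 2 * hidden

fw = out[-1, :, :hidden_size]  # indexes dim 0, i.e. the last batch element -> [1, 32]
bk = out[0, :, :hidden_size]   # the first batch element -> [1, 32]
logits = fcl(torch.cat((fw, bk), dim=1))  # [1, 2], regardless of n_items

print(out.shape, fw.shape, logits.shape)
```

Under that assumption the first index picks a single batch element, so the result has one row no matter how many items are passed in.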
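Now, regarding your question:
"Please suggest how to get an output vector of length 67349 (the input length) instead of 1"
I assume you want to obtain the hidden state for every element of every sequence in the batch. Here is how you should structure your forward pass: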
def forward(self, x):
    # Set initial hidden and cell states on the same device as the input
    h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size, device=x.device)
    c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size, device=x.device)
    # Forward propagate LSTM
    out, hidden = self.bilstm(x, (h0, c0))  # out has size [batch_size, sequence_length, hidden_size * num_directions]
    fc = self.fcl(out)  # fc has size [batch_size, sequence_length, num_classes]
    # Softmax alone gives the class probabilities; dim=-1 normalises over the classes
    # (omitting dim is deprecated and can pick the wrong dimension for 3-D input)
    x = F.softmax(fc, dim=-1)
    return x
If we test the updated model, the result is as follows:
# Assuming 32 elements in the batch, each element is a sequence of 177 steps, and each step is a vector of size 300
inputs = torch.rand(32, 177, 300)
# Obtaining the outputs from the model
outputs = model(inputs)
# The size is as expected: torch.Size([32, 177, 2])
print(outputs.shape)
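As a self-contained sketch of the per-step model described above, the block below can be run on its own. The class name PerStepClassifier and the small input sizes are hypothetical. It relies on nn.LSTM starting from zero states when no (h0, c0) is passed, and it gives F.softmax an explicit dim so that the probabilities are normalised over the classes rather than over an inferred dimension.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PerStepClassifier(nn.Module):  # hypothetical name, not from the original post
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super().__init__()
        self.bilstm = nn.LSTM(input_size, hidden_size, num_layers,
                              batch_first=True, bidirectional=True)
        self.fcl = nn.Linear(hidden_size * 2, num_classes)

    def forward(self, x):
        # With no explicit (h0, c0), nn.LSTM uses zero initial states
        out, _ = self.bilstm(x)           # [batch, seq, 2 * hidden]
        logits = self.fcl(out)            # [batch, seq, num_classes]
        return F.softmax(logits, dim=-1)  # normalise over the class dimension


model = PerStepClassifier(300, 32, 1, 2)
probs = model(torch.rand(3, 10, 300))  # 3 sequences of 10 steps (illustrative sizes)
print(probs.shape)
print(probs.sum(dim=-1))  # each entry should be close to 1
```

If the model is trained with nn.CrossEntropyLoss, the usual choice is to return the raw logits instead, since that loss applies log-softmax internally.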
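One more thing to keep in mind. You said:
"The input has length 67349, and each element is a 300-dimensional vector"
If that means a single sequence of 67349 steps, it is a very long sequence. Your model will likely perform poorly, since LSTMs struggle to carry information across that many steps, and I expect training to take a very long time. That, however, is a separate problem and should be discussed in its own thread.
Please see my explanation. Could you post the expected input size and the expected output size, along with a description of what each dimension means?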