Python PyTorch BiLSTM part-of-speech tagging: RuntimeError: input.size(-1) must be equal to input_size. Expected 6, got 12
I have an NLP dataset. Following the official PyTorch tutorial, I converted it into word_to_idx and tag_to_idx mappings, as shown below:
word_to_idx = {'I': 0, 'have': 1, 'used': 2, 'transfers': 3, 'on': 4, 'three': 5, 'occasions': 6, 'now': 7, 'and': 8, 'each': 9, 'time': 10}
tag_to_idx = {'PRON': 0, 'VERB': 1, 'NOUN': 2, 'ADP': 3, 'NUM': 4, 'ADV': 5, 'CONJ': 6, 'DET': 7, 'ADJ': 8, 'PRT': 9, '.': 10, 'X': 11}
I want to use a BiLSTM for the part-of-speech tagging task. Here is my BiLSTM code:
class LSTMTagger(nn.Module):
    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
        super(LSTMTagger, self).__init__()
        self.hidden_dim = hidden_dim
        self.word_embeddings = nn.Embedding(vocab_size, tagset_size)
        # The LSTM takes word embeddings as inputs, and outputs hidden states
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True)
        # The linear layer that maps from hidden state space to tag space
        self.hidden2tag = nn.Linear(in_features=hidden_dim * 2, out_features=tagset_size)

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)
        lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))
        tag_space = self.hidden2tag(lstm_out.view(len(sentence), -1))
        # tag_scores = F.softmax(tag_space, dim=1)
        tag_scores = F.log_softmax(tag_space, dim=1)
        return tag_scores
Then I ran the training code in PyCharm:
EMBEDDING_DIM = 6
HIDDEN_DIM = 6
NUM_EPOCHS = 3
model = LSTMTagger(embedding_dim=EMBEDDING_DIM,
                   hidden_dim=HIDDEN_DIM,
                   vocab_size=len(word_to_idx),
                   tagset_size=len(tag_to_idx))
loss_function = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)
# See what the scores are before training
with torch.no_grad():
    inputs = prepare_sequence(training_data[0][0], word_to_idx)
    tag_scores = model(inputs)
    print(tag_scores)
    print(tag_scores.size())
However, it raises an error at the line tag_scores = model(inputs) and at the line lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1)).
The error is:
Traceback (most recent call last):
  line 140, in <module>
    tag_scores = model(inputs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  line 115, in forward
    lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 559, in forward
    return self.forward_tensor(input, hx)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 539, in forward_tensor
    output, hidden = self.forward_impl(input, hx, batch_sizes, max_batch_size, sorted_indices)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 519, in forward_impl
    self.check_forward_args(input, hx, batch_sizes)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 490, in check_forward_args
    self.check_input(input, batch_sizes)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 153, in check_input
    self.input_size, input.size(-1)))
RuntimeError: input.size(-1) must be equal to input_size. Expected 6, got 12
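I don't know how to debug this. Can anyone help me fix it? Thanks in advance.

The error is here:
self.word_embeddings = nn.Embedding(vocab_size, tagset_size)
Instead of the embedding dimension, you are passing the number of tags, which is 12, not the 6 that the LSTM layer expects.

Thank you very much! Solved by changing the embedding dimension to 12.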
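
For anyone landing here with the same error, here is a minimal sketch of the model with the embedding layer sized by embedding_dim, so that its output width matches the LSTM's input_size. The prepare_sequence helper is not shown in the question; the version below is an assumption modelled on the one in the PyTorch tutorial, and the sample sentence is made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

word_to_idx = {'I': 0, 'have': 1, 'used': 2, 'transfers': 3, 'on': 4, 'three': 5,
               'occasions': 6, 'now': 7, 'and': 8, 'each': 9, 'time': 10}
tag_to_idx = {'PRON': 0, 'VERB': 1, 'NOUN': 2, 'ADP': 3, 'NUM': 4, 'ADV': 5,
              'CONJ': 6, 'DET': 7, 'ADJ': 8, 'PRT': 9, '.': 10, 'X': 11}


def prepare_sequence(seq, to_idx):
    # Assumed helper (not in the question): map tokens to a LongTensor of indices.
    return torch.tensor([to_idx[w] for w in seq], dtype=torch.long)


class LSTMTagger(nn.Module):
    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
        super().__init__()
        self.hidden_dim = hidden_dim
        # Fix: the embedding width is embedding_dim, not tagset_size,
        # so it matches the input_size given to nn.LSTM below.
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True)
        # A bidirectional LSTM concatenates both directions: 2 * hidden_dim features.
        self.hidden2tag = nn.Linear(hidden_dim * 2, tagset_size)

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)                      # (seq_len, embedding_dim)
        lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))   # (seq_len, 1, 2 * hidden_dim)
        tag_space = self.hidden2tag(lstm_out.view(len(sentence), -1))
        return F.log_softmax(tag_space, dim=1)                       # (seq_len, tagset_size)


model = LSTMTagger(embedding_dim=6, hidden_dim=6,
                   vocab_size=len(word_to_idx), tagset_size=len(tag_to_idx))

sentence = ['I', 'have', 'used', 'transfers']  # made-up sample input
with torch.no_grad():
    inputs = prepare_sequence(sentence, word_to_idx)
    tag_scores = model(inputs)
print(tag_scores.size())  # torch.Size([4, 12])
```

Setting EMBEDDING_DIM to 12 also makes the error go away, but only because it then happens to equal the number of tags. Passing embedding_dim to nn.Embedding keeps the embedding width and the tag set size independent of each other.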