PyTorch BERT: misshaped inputs

Tags: python, pytorch, huggingface-transformers

I'm running into a problem evaluating huggingface's BERT model ('bert-base-uncased') on large input sequences:

import torch
from transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased', output_hidden_states=True)
token_ids = [101, 1014, 1016, ...] # len(token_ids) == 33286
token_tensors = torch.tensor([token_ids]) # shape == [1, 33286]
segment_tensors = torch.tensor([[1] * len(token_ids)]) # shape == [1, 33286]
model(token_tensors, segment_tensors)

Traceback
self.model(token_tensors, segment_tensors)
  File "/home/.../python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/.../python3.8/site-packages/transformers/modeling_bert.py", line 824, in forward
    embedding_output = self.embeddings(
  File "/home/.../python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/.../python3.8/site-packages/transformers/modeling_bert.py", line 211, in forward
    embeddings = inputs_embeds + position_embeddings + token_type_embeddings
RuntimeError: The size of tensor a (33286) must match the size of tensor b (512) at non-singleton dimension 1
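I noticed that model.embeddings.position_embeddings.weight.shape == (512, 768). That is, when I restrict the input size to model(token_tensors[:, :10], segment_tensors[:, :10]), it works. Am I misunderstanding how token_tensors and segment_tensors should be shaped? I thought they should be of size (batch_size, sequence_length).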


Thanks for the help.

I just found out that huggingface's pretrained BERT models have a maximum input length of 512 tokens.
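
One way to stay within that limit is to split the long id list into windows of at most 512 ids and evaluate each window separately. The following is only a sketch under stated assumptions, not something the library does for you: the helper name chunk_token_ids is made up for illustration, 101 and 102 are the [CLS] and [SEP] ids in the bert-base-uncased vocabulary, and the windows do not overlap, so the model sees no context across window boundaries. Whether that is acceptable depends on the task.

```python
from typing import List

CLS_ID = 101   # [CLS] id in the bert-base-uncased vocabulary
SEP_ID = 102   # [SEP] id in the bert-base-uncased vocabulary
MAX_LEN = 512  # rows in the position-embedding table, i.e. the longest accepted sequence


def chunk_token_ids(token_ids: List[int], max_len: int = MAX_LEN) -> List[List[int]]:
    """Split a long id list into windows of at most max_len ids (hypothetical helper).

    Each window is wrapped in its own [CLS] ... [SEP] pair, so only
    max_len - 2 ids per window come from the original sequence.
    """
    step = max_len - 2
    if step <= 0:
        raise ValueError("max_len must leave room for [CLS] and [SEP]")

    body = list(token_ids)
    # Drop the special tokens of the original sequence; they are re-added per window.
    if body and body[0] == CLS_ID:
        body = body[1:]
    if body and body[-1] == SEP_ID:
        body = body[:-1]

    return [
        [CLS_ID] + body[start:start + step] + [SEP_ID]
        for start in range(0, len(body), step)
    ]


# Stand-in for the 33286-id sequence from the question (contents are dummy ids).
demo_ids = [CLS_ID] + [1014] * 33284 + [SEP_ID]
chunks = chunk_token_ids(demo_ids)
print(len(chunks), max(len(c) for c in chunks))
```

Each window can then be passed to the model on its own, for example as torch.tensor([chunk]), and the per-window outputs combined in whatever way suits the task. Rather than hard-coding 512, the limit can be read from model.config.max_position_embeddings.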
