BERT pre-trained model giving random output each time (Python 3.x)
I am trying to add an additional layer after the huggingface BERT transformer, so I used BertForSequenceClassification inside my nn.Module network. However, I see that the model gives me random outputs compared to loading the model directly.
Model 1:
from transformers import BertForSequenceClassification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5) # as we have 5 classes
import torch
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
input_ids = torch.tensor(tokenizer.encode(texts[0], add_special_tokens=True, max_length = 512)).unsqueeze(0) # Batch size 1; texts is my list of input strings
print(model(input_ids))
Output:
torch.Size([1, 512])
torch.Size([1, 5])
(tensor([[-0.3729, -0.2192,  0.1183,  0.0778, -0.2820]],
       grad_fn=<AddmmBackward>),)
Does BERT have some specific parameter for this, and if so, how do I get reproducible outputs?
Why do the two models give me different outputs? Am I doing something wrong?
The reason is the random initialization of BERT's classifier layer. If you print the model, the tail of the output looks like this:
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
(dropout): Dropout(p=0.1, inplace=False)
(classifier): Linear(in_features=768, out_features=5, bias=True)
)
There is a classifier layer at the end, which is added on top of the bert-base model. You are now expected to train that layer for your downstream task.
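Since the run-to-run randomness comes from how that head is initialized (and, at inference time, from dropout), you can make it reproducible by seeding PyTorch's RNG before the layer is constructed. Below is a minimal sketch using a plain nn.Linear as a stand-in for the classifier head; it is not code from the original answer:

```python
import torch
import torch.nn as nn

def make_head(seed):
    # Stand-in for the classifier layer that from_pretrained initializes randomly.
    # Seeding before construction makes the random init deterministic.
    torch.manual_seed(seed)
    return nn.Linear(768, 5)

a = make_head(0)
b = make_head(0)   # same seed -> identical weights
c = make_head(1)   # different seed -> different weights

print(torch.equal(a.weight, b.weight))  # True
print(torch.equal(a.weight, c.weight))  # False
```

The same idea applies to the real model: call torch.manual_seed(...) before BertForSequenceClassification.from_pretrained(...), and put the model in model.eval() mode so dropout does not add extra noise between forward passes.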
If you want more information:
model, li = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5, output_loading_info=True) # as we have 5 classes
print(li)
You can see that classifier.weight and classifier.bias appear under missing_keys, so every time you call BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=5) the classifier layer gets a fresh random initialization.
Output:
{'missing_keys': ['classifier.weight', 'classifier.bias'], 'unexpected_keys': ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias'], 'error_msgs': []}
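If you need two instances of the model to agree before fine-tuning, one workaround is to save the state_dict after the first load and restore it into later instances, so they share the same (random) head weights. This is a sketch with a toy module standing in for BertForSequenceClassification; the class name, shapes, and file name are illustrative:

```python
import torch
import torch.nn as nn

# Toy stand-in for BertForSequenceClassification: a "body" plus a head that
# would otherwise be re-initialized randomly on every construction.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Linear(768, 768)
        self.classifier = nn.Linear(768, 5)

    def forward(self, x):
        return self.classifier(torch.tanh(self.body(x)))

m1 = TinyClassifier()
torch.save(m1.state_dict(), 'weights.pt')      # persist one particular init

m2 = TinyClassifier()                          # fresh random head...
m2.load_state_dict(torch.load('weights.pt'))   # ...overwritten with the saved one

x = torch.randn(1, 768)
print(torch.allclose(m1(x), m2(x)))  # True: both instances now agree
```

With the real model, the pattern is the same: load BertForSequenceClassification.from_pretrained(...) once, torch.save(model.state_dict(), ...), and load_state_dict that file into any later instance.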