Metrics mismatch between the PyTorch BertForSequenceClassification class and my custom Bert classification


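I implemented my custom Bert binary classification model class by adding a classifier layer on top of the Bert model (see below). However, the accuracy/metrics are significantly different from what I get when I train with the official BertForSequenceClassification model, which makes me wonder whether I am missing something in my class.

I have a few doubts: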

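When loading the official BertForSequenceClassification with from_pretrained, are the classifier weights also initialized from the pretrained model, or are they randomly initialized? I ask because in my custom class they are randomly initialized.

    import torch.nn as nn
    from transformers import AutoConfig, AutoModel

    class MyCustomBertClassification(nn.Module):
        def __init__(self, encoder='bert-base-uncased',
                     num_labels=2,
                     hidden_dropout_prob=0.1):
            super(MyCustomBertClassification, self).__init__()
            self.config = AutoConfig.from_pretrained(encoder)
            self.encoder = AutoModel.from_config(self.config)
            self.dropout = nn.Dropout(hidden_dropout_prob)
            self.classifier = nn.Linear(self.config.hidden_size, num_labels)

        def forward(self, input_sent):
            outputs = self.encoder(input_ids=input_sent['input_ids'],
                                   attention_mask=input_sent['attention_mask'],
                                   token_type_ids=input_sent['token_type_ids'],
                                   return_dict=True)
            # outputs[1] is the pooled [CLS] representation
            pooled_output = self.dropout(outputs[1])
            # for both tasks
            logits = self.classifier(pooled_output)
            return logits
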
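When you use the from_pretrained method, each model tells you via a warning message which layers are randomly initialized:

    from transformers import BertForSequenceClassification

    b = BertForSequenceClassification.from_pretrained('bert-base-uncased')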

Output:
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
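
If you would rather check this programmatically than read the warning, from_pretrained accepts output_loading_info=True and then also returns a dictionary whose "missing_keys" entry lists the parameters that were not found in the checkpoint. The following is only a minimal sketch under some assumptions: to stay offline it saves a tiny, randomly initialized BertModel to a temporary directory as a stand-in for a checkpoint without a classification head (such as bert-base-uncased); the config sizes and variable names are illustrative, not taken from the question.

```python
import tempfile

from transformers import BertConfig, BertForSequenceClassification, BertModel

# Tiny illustrative config (an assumption); stands in for 'bert-base-uncased'.
tiny_config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                         num_attention_heads=2, intermediate_size=64)

with tempfile.TemporaryDirectory() as checkpoint_dir:
    # Save a checkpoint that contains the encoder but no classifier weights.
    BertModel(tiny_config).save_pretrained(checkpoint_dir)

    # Load it into the classification model and ask for the loading report.
    model, loading_info = BertForSequenceClassification.from_pretrained(
        checkpoint_dir, output_loading_info=True)

# Parameters absent from the checkpoint, i.e. the newly initialized ones.
missing_keys = sorted(loading_info["missing_keys"])
print(missing_keys)
```

With a real checkpoint you would pass its name (for example 'bert-base-uncased') instead of the temporary directory.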


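The difference between your implementation and BertForSequenceClassification is that yours does not use any pretrained weights at all. The from_config method builds the architecture from the configuration but does not load the pretrained weights from a state_dict:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoConfig

    b2 = AutoModelForSequenceClassification.from_config(AutoConfig.from_pretrained('bert-base-uncased'))
    b3 = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')

    print("Does from_config provides pretrained weights: {}".format(torch.equal(b.bert.embeddings.word_embeddings.weight, b2.base_model.embeddings.word_embeddings.weight)))
    print("Does from_pretrained provides pretrained weights: {}".format(torch.equal(b.bert.embeddings.word_embeddings.weight, b3.base_model.embeddings.word_embeddings.weight)))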

Output:
Does from_config provides pretrained weights: False
Does from_pretrained provides pretrained weights: True
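
The same contrast can be reproduced without downloading anything. This is a minimal sketch, assuming a tiny illustrative BertConfig in place of bert-base-uncased (the sizes and variable names are made up for the example): two models built with from_config get independent random weights, whereas from_pretrained restores exactly what was saved.

```python
import tempfile

import torch
from transformers import AutoModel, BertConfig

# Tiny illustrative config (an assumption); stands in for 'bert-base-uncased'.
tiny_config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                         num_attention_heads=2, intermediate_size=64)

reference = AutoModel.from_config(tiny_config)  # plays the role of the "pretrained" model
from_cfg = AutoModel.from_config(tiny_config)   # same architecture, fresh random weights

with tempfile.TemporaryDirectory() as checkpoint_dir:
    reference.save_pretrained(checkpoint_dir)
    reloaded = AutoModel.from_pretrained(checkpoint_dir)  # reads the saved weights


def word_embeddings(model):
    """Return the word embedding matrix of a BERT encoder."""
    return model.embeddings.word_embeddings.weight


same_from_config = torch.equal(word_embeddings(reference), word_embeddings(from_cfg))
same_from_pretrained = torch.equal(word_embeddings(reference), word_embeddings(reloaded))
print(same_from_config, same_from_pretrained)
```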


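Therefore, you probably want to change your class to:

    import torch
    from torch import nn
    from transformers import AutoConfig, AutoModel

    class MyCustomBertClassification(nn.Module):
        def __init__(self, encoder='bert-base-uncased',
                     num_labels=2,
                     hidden_dropout_prob=0.1):
            super(MyCustomBertClassification, self).__init__()
            self.config = AutoConfig.from_pretrained(encoder)
            # from_pretrained (not from_config) loads the pretrained encoder weights
            self.encoder = AutoModel.from_pretrained(encoder)
            self.dropout = nn.Dropout(hidden_dropout_prob)
            self.classifier = nn.Linear(self.config.hidden_size, num_labels)

        def forward(self, input_sent):
            outputs = self.encoder(input_ids=input_sent['input_ids'],
                                   attention_mask=input_sent['attention_mask'],
                                   token_type_ids=input_sent['token_type_ids'],
                                   return_dict=True)
            # outputs[1] is the pooled [CLS] representation
            pooled_output = self.dropout(outputs[1])
            # for both tasks
            logits = self.classifier(pooled_output)
            return logits

    myB = MyCustomBertClassification()
    print(torch.equal(b.bert.embeddings.word_embeddings.weight, myB.encoder.embeddings.word_embeddings.weight))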

Output:
True

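Yes, that was indeed a major problem: using from_config instead of from_pretrained. However, I still see a performance gap relative to the official class, which makes me wonder whether something special happens during the initialization of the classifier network. Thoughts?

You can check the code. But I am not sure whether this explains the difference. @伪戒酒者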
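
On the initialization question: as far as I can tell from the transformers source, the BERT models initialize every nn.Linear, including the classifier, with weights drawn from a normal distribution with standard deviation config.initializer_range (0.02 by default) and with zero biases, while a bare nn.Linear in a custom class keeps PyTorch's default initialization. Below is a minimal sketch of applying the same scheme to a custom head. The helper name init_classifier_like_bert is hypothetical, and whether this accounts for the remaining gap is not established here.

```python
import torch
import torch.nn as nn


def init_classifier_like_bert(classifier: nn.Linear,
                              initializer_range: float = 0.02) -> None:
    """Hypothetical helper: normal(0, initializer_range) weights, zero bias."""
    nn.init.normal_(classifier.weight, mean=0.0, std=initializer_range)
    if classifier.bias is not None:
        nn.init.zeros_(classifier.bias)


torch.manual_seed(0)
default_head = nn.Linear(768, 2)     # PyTorch default initialization
bert_style_head = nn.Linear(768, 2)  # re-initialized the way BERT does it
init_classifier_like_bert(bert_style_head)

# Compare the spread of the weights and the bias values of both heads.
print(default_head.weight.std().item(), bert_style_head.weight.std().item())
print(default_head.bias, bert_style_head.bias)
```

In the custom class this would be called once on self.classifier at the end of __init__, passing self.config.initializer_range.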