Python: change the config and load a Hugging Face model fine-tuned on a downstream task

Tags: python, pytorch, bert-language-model, huggingface-transformers

I am using a HuggingFace model for a TokenClassification task. I have the following label2id mapping, and I am using version 3.3.0 of the library:
label2id = {
"B-ADD": 4,
"B-ARRESTED": 7,
"B-CRIME": 2,
"B-INCIDENT_DATE": 3,
"B-SUSPECT": 9,
"B-VICTIMS": 1,
"B-WPN": 5,
"I-ADD": 8,
"I-ARRESTED": 13,
"I-CRIME": 11,
"I-INCIDENT_DATE": 10,
"I-SUSPECT": 14,
"I-VICTIMS": 12,
"I-WPN": 6,
"O": 0
}
The following scenario works fine, and the model loads correctly:
from transformers import AutoModelForTokenClassification, AutoTokenizer, AutoConfig
pretrained_model_name = "bert-base-cased"
config = AutoConfig.from_pretrained(pretrained_model_name)
id2label = {y:x for x,y in label2id.items()}
config.label2id = label2id
config.id2label = id2label
config._num_labels = len(label2id)
model = AutoModelForTokenClassification.from_pretrained(pretrained_model_name, config=config)
model
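One way to sanity-check that the head really was resized is to inspect the classifier layer directly. A minimal sketch, using a deliberately tiny, randomly initialised BertConfig so it runs without downloading bert-base-cased (the real model exposes the same attributes):

```python
from transformers import BertConfig, BertForTokenClassification

# Tiny config purely for illustration; only num_labels matters here.
config = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                    intermediate_size=64, vocab_size=100, num_labels=15)
model = BertForTokenClassification(config)

print(model.config.num_labels)        # 15
print(model.classifier.out_features)  # 15
```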
I get the following output. The final layer is correctly initialised with 15 neurons (the number of token classes to predict).

But if I change the pretrained model name to "dbmdz/bert-large-cased-finetuned-conll03-english", I get the following error:
loading weights file https://cdn.huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english/pytorch_model.bin from cache at C:\Users\anu10961/.cache\torch\transformers\4b02c1fe04cf7f7e6972536150e9fb329c7b3d5720b82afdac509bd750c705d2.6dcb154688bb97608a563afbf68ba07ae6f7beafd9bd98b5a043cd269fcc02b4
All model checkpoint weights were used when initializing BertForTokenClassification.
All the weights of BertForTokenClassification were initialized from the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use BertForTokenClassification for predictions without further training.
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-15-2969a8092bf4> in <module>
----> 1 model = AutoModelForTokenClassification.from_pretrained(pretrained_model_name, config=config)
C:\ProgramData\Anaconda3\envs\arcgis183\lib\site-packages\transformers\modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1372 if type(config) in MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING.keys():
1373 return MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING[type(config)].from_pretrained(
-> 1374 pretrained_model_name_or_path, *model_args, config=config, **kwargs
1375 )
1376
C:\ProgramData\Anaconda3\envs\arcgis183\lib\site-packages\transformers\modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1047 raise RuntimeError(
1048 "Error(s) in loading state_dict for {}:\n\t{}".format(
-> 1049 model.__class__.__name__, "\n\t".join(error_msgs)
1050 )
1051 )
RuntimeError: Error(s) in loading state_dict for BertForTokenClassification:
size mismatch for classifier.weight: copying a param with shape torch.Size([9, 1024]) from checkpoint, the shape in current model is torch.Size([15, 1024]).
size mismatch for classifier.bias: copying a param with shape torch.Size([9]) from checkpoint, the shape in current model is torch.Size([15]).
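The mismatch itself is just PyTorch refusing to copy a (9, 1024) checkpoint tensor into the (15, 1024) parameter the config created. A stripped-down reproduction with plain linear layers:

```python
import torch.nn as nn

head_from_checkpoint = nn.Linear(1024, 9)   # classifier head saved in the checkpoint
head_from_config = nn.Linear(1024, 15)      # head that a 15-label config asks for

try:
    head_from_config.load_state_dict(head_from_checkpoint.state_dict())
except RuntimeError as e:
    print(e)  # size mismatch for weight ... and for bias ...
```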
But I still think the last layer of this model can be replaced so that it can be used for my specific task (although I would need to train the model first before using it for inference).

Once a layer is part of a saved pretrained checkpoint, its hyperparameters cannot be changed. By passing both the pretrained model name and your config, you are asking for a model that classifies into 15 classes while initialising it from a checkpoint that uses 9 classes, which cannot work.

If I understand correctly, you want to initialise the underlying BERT from a different classifier. A workaround that achieves this:
from transformers import AutoModel, AutoModelForTokenClassification
bert = AutoModel.from_pretrained('dbmdz/bert-large-cased-finetuned-conll03-english')
classifier = AutoModelForTokenClassification.from_config(config)
classifier.bert = bert
I think the third line should be classifier = AutoModelForTokenClassification.from_config(config). You're right, thanks.

I didn't ask this in my original question, but what if I want to make the code above generic enough to work with any fine-tuned model, whatever the architecture (BERT, RoBERTa, XLNet, etc.)?
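One architecture-agnostic way to do this is to rely on base_model_prefix, the attribute name every transformers model class uses for its encoder ("bert", "roberta", "transformer", ...). A sketch, not a battle-tested implementation; load_encoder_with_new_head is a helper name made up here:

```python
from transformers import AutoConfig, AutoModel, AutoModelForTokenClassification

def load_encoder_with_new_head(checkpoint, label2id):
    """Load only the encoder from a fine-tuned checkpoint and attach a
    freshly initialised token-classification head sized for label2id."""
    config = AutoConfig.from_pretrained(checkpoint)
    config.label2id = label2id
    config.id2label = {i: label for label, i in label2id.items()}
    config.num_labels = len(label2id)

    encoder = AutoModel.from_pretrained(checkpoint)                   # pretrained weights
    classifier = AutoModelForTokenClassification.from_config(config)  # randomly initialised head
    # Swap the pretrained encoder into the classifier; base_model_prefix
    # resolves to "bert", "roberta", etc. depending on the architecture.
    setattr(classifier, classifier.base_model_prefix, encoder)
    return classifier
```

The resulting model still needs fine-tuning on the 15-label data, since the classification head is randomly initialised.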
For reference, here is the label2id mapping of the dbmdz/bert-large-cased-finetuned-conll03-english checkpoint; its 9 entries are what produce the torch.Size([9, 1024]) classifier weight in the error above:

label2id = {
"B-LOC": 7,
"B-MISC": 1,
"B-ORG": 5,
"B-PER": 3,
"I-LOC": 8,
"I-MISC": 2,
"I-ORG": 6,
"I-PER": 4,
"O": 0
}