Python: copy one layer's weights from one Hugging Face BERT model to another


I have a pretrained model that I load like this:

from transformers import BertForSequenceClassification, AdamW, BertConfig, BertModel
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", # Use the 12-layer BERT model, with an uncased vocab.
    num_labels = 2, # The number of output labels--2 for binary classification.
                    # You can increase this for multi-class tasks.   
    output_attentions = False, # Whether the model returns attention weights.
    output_hidden_states = False, # Whether the model returns all hidden-states.
)
I want to create a new model with the same architecture and randomly initialized weights, except for the embedding layer:

==== Embedding Layer ====

bert.embeddings.word_embeddings.weight                  (30522, 768)
bert.embeddings.position_embeddings.weight                (512, 768)
bert.embeddings.token_type_embeddings.weight                (2, 768)
bert.embeddings.LayerNorm.weight                              (768,)
bert.embeddings.LayerNorm.bias                                (768,)
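
(For reference, a listing like the one above can be produced by iterating over the model's named parameters. The snippet below is a minimal sketch, assuming the model loaded above; it is not part of the original question.)

# Illustrative sketch: print the name and shape of each embedding parameter
# of the `model` loaded above. Parameters of the embedding layer all have
# names starting with "bert.embeddings".
for name, param in model.named_parameters():
    if name.startswith("bert.embeddings"):
        print(f"{name:55} {tuple(param.shape)}")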
It seems I can do this to create a new model with the same architecture, but then all of the weights are random:

configuration   = model.config
untrained_model = BertForSequenceClassification(configuration)

So how do I copy model's embedding layer weights into the new untrained_model?

The weights and bias are just tensors, and you can simply copy them like this:

from transformers import BertForSequenceClassification, BertConfig

jetfire = BertForSequenceClassification.from_pretrained('bert-base-cased')
config = BertConfig.from_pretrained('bert-base-cased')
optimus = BertForSequenceClassification(config)

parts = ['bert.embeddings.word_embeddings.weight',
         'bert.embeddings.position_embeddings.weight',
         'bert.embeddings.token_type_embeddings.weight',
         'bert.embeddings.LayerNorm.weight',
         'bert.embeddings.LayerNorm.bias']

def joltElectrify(jetfire, optimus, parts):
    # Map parameter names to tensors for both models.
    target = dict(optimus.named_parameters())
    source = dict(jetfire.named_parameters())

    # Copy each listed tensor in place from the pretrained model (jetfire)
    # into the freshly initialized one (optimus).
    for part in parts:
        target[part].data.copy_(source[part].data)

joltElectrify(jetfire, optimus, parts)
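
As a quick sanity check (not part of the original answer), you can confirm the copy worked by comparing tensors between the two models; a minimal sketch using torch.equal:

import torch

# The copied embedding tensors should now be identical in both models...
src = dict(jetfire.named_parameters())
tgt = dict(optimus.named_parameters())
for part in parts:
    assert torch.equal(tgt[part].data, src[part].data)

# ...while an untouched, randomly initialized layer will almost certainly differ.
name = 'bert.encoder.layer.0.attention.self.query.weight'
print(torch.equal(tgt[name].data, src[name].data))  # expected: False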

This seems to work, thanks! The only thing I did differently from you is that I still used
configuration = model.config; untrained_model = BertForSequenceClassification(configuration)
to create the untrained model.

To copy the pretrained model you could also do it that way. I only did it this way to show that you can use anything, like config. @Russell Richie
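
Putting both pieces together, a minimal sketch of the variant described in this comment, assuming model, parts, and joltElectrify as defined above:

# Variant from the comment: reuse the pretrained model's own config to build
# the untrained model, then copy the embedding tensors with the same helper.
configuration   = model.config
untrained_model = BertForSequenceClassification(configuration)
joltElectrify(model, untrained_model, parts)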