Copying a layer's weights from one HuggingFace BERT model to another
Tags: python, bert-language-model, huggingface-transformers

I have a pre-trained model which I load like this:
from transformers import BertForSequenceClassification, AdamW, BertConfig, BertModel

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",          # Use the 12-layer BERT model, with an uncased vocab.
    num_labels = 2,               # The number of output labels -- 2 for binary classification.
                                  # You can increase this for multi-class tasks.
    output_attentions = False,    # Whether the model returns attention weights.
    output_hidden_states = False, # Whether the model returns all hidden states.
)
I want to create a new model with the same architecture and random initial weights, except for the embedding layer:
==== Embedding Layer ====
bert.embeddings.word_embeddings.weight (30522, 768)
bert.embeddings.position_embeddings.weight (512, 768)
bert.embeddings.token_type_embeddings.weight (2, 768)
bert.embeddings.LayerNorm.weight (768,)
bert.embeddings.LayerNorm.bias (768,)
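A listing like the one above can be produced by iterating over `named_parameters()`. As a sketch, the example below builds a deliberately small from-scratch `BertConfig` (an assumption, to avoid downloading `bert-base-uncased`); with the real pretrained model the shapes match the listing above.

```python
from transformers import BertConfig, BertForSequenceClassification

# Small, hypothetical config so no pretrained download is needed.
config = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=128, num_labels=2)
model = BertForSequenceClassification(config)

# Collect every parameter that belongs to the embedding layer.
embedding_params = [(name, tuple(p.shape)) for name, p in model.named_parameters()
                    if name.startswith("bert.embeddings")]
for name, shape in embedding_params:
    print(name, shape)
```

Note that `bert.embeddings.position_ids` does not appear: it is a registered buffer, not a parameter, so `named_parameters()` skips it.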
It seems I can do this to create a new model with the same architecture, but with all weights random:
configuration = model.config
untrained_model = BertForSequenceClassification(configuration)
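To see that constructing a model from a config really does give the same architecture but fresh random weights, here is a small sketch (again using a hypothetical tiny config, an assumption made so the example runs without a download):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Hypothetical small config; with model.config the behavior is the same.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)

model_a = BertForSequenceClassification(config)
model_b = BertForSequenceClassification(config)

# Same architecture: identical parameter names and shapes...
shapes_a = {n: tuple(p.shape) for n, p in model_a.named_parameters()}
shapes_b = {n: tuple(p.shape) for n, p in model_b.named_parameters()}
print(shapes_a == shapes_b)  # True

# ...but independently initialized weights.
w_a = model_a.bert.embeddings.word_embeddings.weight
w_b = model_b.bert.embeddings.word_embeddings.weight
print(torch.equal(w_a, w_b))  # almost surely False (independent random init)
```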
So how do I copy model's embedding layer weights into the new untrained_model?

Weights and biases are just tensors, so you can simply copy them like this:
from transformers import BertForSequenceClassification, BertConfig

jetfire = BertForSequenceClassification.from_pretrained('bert-base-cased')
config = BertConfig.from_pretrained('bert-base-cased')
optimus = BertForSequenceClassification(config)

parts = ['bert.embeddings.word_embeddings.weight'
        ,'bert.embeddings.position_embeddings.weight'
        ,'bert.embeddings.token_type_embeddings.weight'
        ,'bert.embeddings.LayerNorm.weight'
        ,'bert.embeddings.LayerNorm.bias']

def joltelectrify(jetfire, optimus, parts):
    target = dict(optimus.named_parameters())
    source = dict(jetfire.named_parameters())

    for part in parts:
        # copy_ (with the trailing underscore) copies in place
        target[part].data.copy_(source[part].data)

joltelectrify(jetfire, optimus, parts)
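The same copy pattern can be seen with plain PyTorch modules, which makes it easy to verify without any pretrained download. This is a minimal sketch using a hypothetical `Tiny` module of my own (not part of the question's models): only the listed parameters are copied, everything else keeps its own random initialization.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the BERT models above.
class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(10, 4)  # the layer we want to copy
        self.head = nn.Linear(4, 2)       # a layer we leave untouched

source_model, target_model = Tiny(), Tiny()

source = dict(source_model.named_parameters())
target = dict(target_model.named_parameters())

# In-place copy of just the named parameters -- same idea as joltelectrify.
for part in ["embed.weight"]:
    target[part].data.copy_(source[part].data)

print(torch.equal(target_model.embed.weight, source_model.embed.weight))  # True
print(torch.equal(target_model.head.weight, source_model.head.weight))    # almost surely False
```

The trailing underscore in `copy_` matters: it is PyTorch's convention for in-place operations, so the target tensor's storage is overwritten rather than replaced.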
This seems to work, thanks! The only thing I did differently from you is that I still used configuration = model.config; untrained_model = BertForSequenceClassification(configuration)
That works for copying from a pre-trained model too. I did it this way just to illustrate that you can use anything, such as a config. @Russell Richie