Python: How do I get the intermediate layer parameters of a pretrained BERT model?


I tried:

import torch
import transformers
tokenizer = transformers.AlbertTokenizer.from_pretrained('albert-base-v2', do_lower_case=True)
transformer = transformers.AlbertModel.from_pretrained("albert-base-v2")
However, this gives me the parameters of all layers:


I need to access the input of the last linear layer, the one with out_features=768:

(pooler): Linear(in_features=768, out_features=768, bias=True)
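For context, a minimal sketch of listing every intermediate layer's parameters by name (assuming the albert-base-v2 model loaded above) is to iterate over named_parameters():

import transformers

transformer = transformers.AlbertModel.from_pretrained("albert-base-v2")

# Walk over all parameter tensors; the dotted names match the printed module
# tree, so individual layers such as the pooler can be picked out by name.
for name, param in transformer.named_parameters():
    print(name, tuple(param.shape))

# A single intermediate layer's weights can then be reached by attribute path:
print(transformer.pooler.weight.shape)  # torch.Size([768, 768])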


What exactly do you need, the parameters of the layer or the input to the layer? I need the out_features number, "768". OK, then you can get it easily by typing
transformer.pooler.weight.shape[0]
Thanks a lot. What about the following code:
import transformers
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("nghuyong/ernie-tiny")
transformer = AutoModelForMaskedLM.from_pretrained("nghuyong/ernie-tiny")
I want the number of output values of the last layer, which is 50006.
transformer.cls.predictions.decoder.out_features
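As a minimal sketch of that idea (assuming nghuyong/ernie-tiny loads with a BERT-style masked-LM head, as in the snippet above), the size of a Linear layer can be read either from its out_features attribute or from its weight shape:

import transformers

albert = transformers.AlbertModel.from_pretrained("albert-base-v2")
# The pooler is an nn.Linear, so its output size is stored on the module itself.
print(albert.pooler.out_features)      # 768
print(albert.pooler.weight.shape[0])   # 768, same value read from the weight matrix

ernie = transformers.AutoModelForMaskedLM.from_pretrained("nghuyong/ernie-tiny")
# The masked-LM decoder is also an nn.Linear; its out_features is the vocabulary size.
print(ernie.cls.predictions.decoder.out_features)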
transformer.num_parameters
<bound method ModuleUtilsMixin.num_parameters of AlbertModel(
  (embeddings): AlbertEmbeddings(
    (word_embeddings): Embedding(30000, 128, padding_idx=0)
    (position_embeddings): Embedding(512, 128)
    (token_type_embeddings): Embedding(2, 128)
    (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0, inplace=False)
  )
  (encoder): AlbertTransformer(
    (embedding_hidden_mapping_in): Linear(in_features=128, out_features=768, bias=True)
    (albert_layer_groups): ModuleList(
      (0): AlbertLayerGroup(
        (albert_layers): ModuleList(
          (0): AlbertLayer(
            (full_layer_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (attention): AlbertAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (attention_dropout): Dropout(p=0, inplace=False)
              (output_dropout): Dropout(p=0, inplace=False)
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            )
            (ffn): Linear(in_features=768, out_features=3072, bias=True)
            (ffn_output): Linear(in_features=3072, out_features=768, bias=True)
            (dropout): Dropout(p=0, inplace=False)
          )
        )
      )
    )
  )
  (pooler): Linear(in_features=768, out_features=768, bias=True)
  (pooler_activation): Tanh()
)>
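Since the question also asks for the input of that last linear layer, one possible approach (only a sketch, assuming the pooler's input, i.e. the final hidden state of the [CLS] token, is what is wanted) is to register a forward hook on the pooler:

import torch
import transformers

tokenizer = transformers.AlbertTokenizer.from_pretrained('albert-base-v2', do_lower_case=True)
transformer = transformers.AlbertModel.from_pretrained("albert-base-v2")

captured = {}

def save_pooler_input(module, inputs, output):
    # inputs is the tuple of tensors passed to the pooler's forward();
    # for AlbertModel this is the final hidden state of the first token.
    captured["pooler_input"] = inputs[0]

hook = transformer.pooler.register_forward_hook(save_pooler_input)

encoded = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    transformer(**encoded)

hook.remove()
print(captured["pooler_input"].shape)  # (batch_size, 768)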