PyTorch: Hugging Face BERT model without a classification layer

I want to build a classifier on a joint embedding of VGG16 and BERT.

The problem with Hugging Face Transformers is that BertForSequenceClassification has a classification layer of num_labels dimensions.

However, I need the output of BertPooler (768 dimensions), which I will use as the text embedding in the extended model.

from transformers import BertForSequenceClassification

# Loads BERT with a sequence-classification head on top
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
This gives the following model:

BertForSequenceClassification(
...
...
        (11): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (pooler): BertPooler(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (activation): Tanh()
    )
  )
  (dropout): Dropout(p=0.1, inplace=False)
  (classifier): Linear(in_features=768, out_features=2, bias=True)
)
How do I get rid of the classifier?

from transformers import BertModel

# Loads the bare BERT encoder, with no task-specific head
model = BertModel.from_pretrained('bert-base-uncased')
Output:

(11): BertLayer(
        (attention): BertAttention(
          (self): BertSelfAttention(
            (query): Linear(in_features=768, out_features=768, bias=True)
            (key): Linear(in_features=768, out_features=768, bias=True)
            (value): Linear(in_features=768, out_features=768, bias=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (output): BertSelfOutput(
            (dense): Linear(in_features=768, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (intermediate): BertIntermediate(
          (dense): Linear(in_features=768, out_features=3072, bias=True)
        )
        (output): BertOutput(
          (dense): Linear(in_features=3072, out_features=768, bias=True)
          (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
  )
  (pooler): BertPooler(
    (dense): Linear(in_features=768, out_features=768, bias=True)
    (activation): Tanh()
  )
)

Check the BertModel definition.
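
For example, here is a minimal sketch of pulling out the 768-dimensional pooled embedding (the example sentence and variable names are illustrative; recent versions of transformers return an output object with a pooler_output field, while older versions return a tuple):

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

inputs = tokenizer("a caption to embed", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# pooler_output is the BertPooler output: shape (batch_size, 768)
text_embedding = outputs.pooler_output
print(text_embedding.shape)  # torch.Size([1, 768])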

With the joint embedding, do you want to concatenate the VGG16 and BERT vectors and train on that? Yes, something like that; we will have to experiment.
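
For reference, one hypothetical way to wire that up (all class and parameter names here are illustrative, not from the original exchange): drop VGG16's final classification layer to get a 4096-dimensional image vector, concatenate it with BERT's 768-dimensional pooled output, and feed the result to a single linear classifier:

import torch
import torch.nn as nn
import torchvision.models as models
from transformers import BertModel

class JointClassifier(nn.Module):
    """Hypothetical joint model: VGG16 image features + BERT pooled text features."""
    def __init__(self, num_classes):
        super().__init__()
        vgg = models.vgg16(pretrained=True)
        # Reuse VGG16 up to, but not including, its final 1000-way classifier,
        # leaving a 4096-dimensional image embedding
        self.vgg = nn.Sequential(
            vgg.features, vgg.avgpool, nn.Flatten(),
            *list(vgg.classifier.children())[:-1]
        )
        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.classifier = nn.Linear(4096 + 768, num_classes)

    def forward(self, images, input_ids, attention_mask):
        img_emb = self.vgg(images)                        # (batch, 4096)
        txt_emb = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask).pooler_output  # (batch, 768)
        joint = torch.cat([img_emb, txt_emb], dim=1)      # (batch, 4864)
        return self.classifier(joint)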