TensorFlow implementation of Hugging Face BERT for sentence classification

Tags: tensorflow, text-classification, huggingface-transformers, bert-language-model

I am trying to train a real-disaster tweet prediction model (a Kaggle competition) that classifies tweets using a Hugging Face BERT model.

I have followed many tutorials and tried many BERT models, but none of them run in Colab without hitting errors.

My code is:

!pip install transformers
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras.callbacks import ModelCheckpoint
from transformers import DistilBertTokenizer, RobertaTokenizer


train = pd.read_csv("/content/drive/My Drive/Kaggle_disaster/train.csv")
test = pd.read_csv("/content/drive/My Drive/Kaggle_disaster/test.csv")


roberta = 'distilbert-base-uncased'
tokenizer = DistilBertTokenizer.from_pretrained(roberta, do_lower_case = True, add_special_tokens = True, max_length = 128, pad_to_max_length = True)


def tokenize(sentences, tokenizer):
  input_ids, input_masks, input_segments = [], [], []
  for sentence in sentences:
    inputs = tokenizer.encode_plus(sentence, add_special_tokens = True, max_length = 128, pad_to_max_length = True, return_attention_mask = True, return_token_type_ids = True)
    input_ids.append(inputs['input_ids'])
    input_masks.append(inputs['attention_mask'])
    input_segments.append(inputs['token_type_ids'])
  return np.asarray(input_ids, dtype = "int32"), np.asarray(input_masks, dtype = "int32"), np.asarray(input_segments, dtype = "int32")


input_ids, input_masks, input_segments = tokenize(train.text.values, tokenizer)

from transformers import TFDistilBertForSequenceClassification, DistilBertConfig, TFDistilBertModel

distil_bert = 'distilbert-base-uncased'

config = DistilBertConfig(dropout=0.2, attention_dropout=0.2)
config.output_hidden_states = False
transformer_model = TFDistilBertModel.from_pretrained(distil_bert, config = config)

input_ids_in = tf.keras.layers.Input(shape=(128,), name='input_token', dtype=tf.int32)
input_masks_in = tf.keras.layers.Input(shape=(128,), name='masked_token', dtype=tf.int32) 
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
X = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))(embedding_layer)
X = tf.keras.layers.GlobalMaxPool1D()(X)
X = tf.keras.layers.Dense(50, activation='relu')(X)
X = tf.keras.layers.Dropout(0.2)(X)
X = tf.keras.layers.Dense(1, activation='sigmoid')(X)
model = tf.keras.Model(inputs=[input_ids_in, input_masks_in], outputs = X)
model.compile(Adam(lr = 1e-5), loss = 'binary_crossentropy', metrics = ['accuracy'])
for layer in model.layers[:3]:
  layer.trainable = False

bert_input = [
    input_ids,
    input_masks
]


checkpoint = ModelCheckpoint('/content/drive/My Drive/disaster_model/model_hugging_face.h5', monitor = 'val_loss', save_best_only= True)


train_history = model.fit(
    bert_input,
    validation_split = 0.2,
    batch_size = 16,
    epochs = 10,
    callbacks = [checkpoint]
)

When I run the code above in Colab, I get the following error:

Epoch 1/10
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-91-9df711c91040> in <module>()
      9     batch_size = 16,
     10     epochs = 10,
---> 11     callbacks = [checkpoint]
     12 )

10 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    966           except Exception as e:  # pylint:disable=broad-except
    967             if hasattr(e, "ag_error_metadata"):
--> 968               raise e.ag_error_metadata.to_exception(e)
    969             else:
    970               raise

ValueError: in user code:

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:571 train_function  *
        outputs = self.distribute_strategy.run(
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:951 run  **
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:541 train_step  **
        self.trainable_variables)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1804 _minimize
        trainable_variables))
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:521 _aggregate_gradients
        filtered_grads_and_vars = _filter_grads(grads_and_vars)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:1219 _filter_grads
        ([v.name for _, v in grads_and_vars],))

    ValueError: No gradients provided for any variable: ['tf_distil_bert_model_23/distilbert/embeddings/word_embeddings/weight:0', 'tf_distil_bert_model_23/distilbert/embeddings/position_embeddings/embeddings:0', 'tf_distil_bert_model_23/distilbert/embeddings/LayerNorm/gamma:0', 'tf_distil_bert_model_23/distilbert/embeddings/LayerNorm/beta:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/q_lin/kernel:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/q_lin/bias:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/k_lin/kernel:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/k_lin/bias:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/v_lin/kernel:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/v_lin/bias:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/out_lin/kernel:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/attention/out_lin/bias:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/sa_layer_norm/gamma:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/sa_layer_norm/beta:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/ffn/lin1/kernel:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/ffn/lin1/bias:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/ffn/lin2/kernel:0', 'tf_distil_bert_model_23/distilbert/transformer/layer_._0/ffn/lin2/bias:0', 'tf_...

Follow this tutorial on text classification with BERT:

It has working code for Google Colab (using a GPU) and Kaggle for binary, multi-class, and multi-label text classification with BERT.

Hope it helps.
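
For reference, here is a minimal sketch (not the linked tutorial's exact code) of fine-tuning DistilBERT on the disaster tweets with the built-in TFDistilBertForSequenceClassification head, using a recent transformers API; the file path and the 0/1 "target" label column are assumptions taken from the question's setup:

# A minimal sketch, not the linked tutorial's code. Assumes a recent version of
# transformers and that train.csv has a 0/1 "target" label column (assumption).
import pandas as pd
import tensorflow as tf
from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification

train = pd.read_csv("/content/drive/My Drive/Kaggle_disaster/train.csv")

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
model = TFDistilBertForSequenceClassification.from_pretrained(
    'distilbert-base-uncased', num_labels=2)

# Tokenize all tweets at once; this returns input_ids and attention_mask.
encodings = tokenizer(list(train.text.values), truncation=True,
                      padding='max_length', max_length=128, return_tensors='tf')

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    # the classification head outputs logits, so compute the loss from logits
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# The labels go in as the second argument of fit().
model.fit(dict(encodings), train.target.values,
          validation_split=0.2, batch_size=16, epochs=3)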

  • You need to use a GPU
  • Try this =

  • Use torch.no_grad():

  • In fit ... I don't see your target ... model.fit(X, y) (a sketch of this fix follows after this list)
  • Thank you for the help! After passing the label data I face a new error: Epoch 1/10 381/381 [==============================] - ETA: 0s - loss: 0.4822 - accuracy: 0.7778 --------------------------------------------------------------------- NotImplementedError Traceback (most recent call last) in () batch_size = 16, 11 epochs = 10, --> 12 callbacks = [checkpoint] 13 ) NotImplementedError:
  • Please update the question accordingly, and do not post further questions in the comments.
  • hlw, do you have any git link or colab link?
  • Yes, for binary text classification: , for multi-class text classification: , for multi-label text classification:
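
A minimal sketch of the fix suggested in the first comment, assuming the question's code has already been run (so model, bert_input, train, and checkpoint exist) and that train.csv contains a 0/1 "target" column, which the question never shows:

# Hedged sketch of the fix: pass the targets as the second argument to fit().
# Assumes train.csv has a "target" column with 0/1 labels (not shown in the question).
labels = train.target.values.astype("int32")

train_history = model.fit(
    bert_input,            # [input_ids, input_masks] from the question
    labels,                # the missing targets behind "No gradients provided"
    validation_split=0.2,
    batch_size=16,
    epochs=10,
    callbacks=[checkpoint]
)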