NLP keyphrase extraction using BERT forces all labels to zero. Something seems wrong with how the embeddings are used

I am extracting keyphrases from documents using BERT embeddings followed by span-based features. The training data has candidate phrases identified using part-of-speech tags. Here are the implementation details:
import tensorflow as tf
from transformers import TFBertModel

max_len = 146  # sequence length used throughout

encoder = TFBertModel.from_pretrained("bert-base-uncased")

input_ids = tf.keras.layers.Input(shape=(max_len,), dtype=tf.int32)
attention_mask = tf.keras.layers.Input(shape=(max_len,), dtype=tf.int32)
embedding = encoder(input_ids, attention_mask=attention_mask)[0]

# merge_mode=None keeps the forward and backward outputs as a list
# of two (batch, max_len, 40) tensors.
bilstm1 = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(40, dropout=0.1, return_sequences=True),
    merge_mode=None)(embedding)

# Start/end token indices of the candidate phrases.
pos_mask = tf.keras.layers.Input(shape=(2, 146), dtype='int32')
# Note: indexing pos_mask[0] selects only the first example's
# indices and applies them to the whole batch.
mask_start = pos_mask[0][0]
mask_end = pos_mask[0][1]

start_rep_fr = tf.gather(bilstm1[0], mask_start, axis=1)
start_rep_bk = tf.gather(bilstm1[1], mask_start, axis=1)
end_rep_fr = tf.gather(bilstm1[0], mask_end, axis=1)
end_rep_bk = tf.gather(bilstm1[1], mask_end, axis=1)  # was bilstm1[0], a copy-paste slip

# Span features: endpoint representations plus their difference and
# element-wise product, in both directions.
span_fe_diff_fr = start_rep_fr - end_rep_fr
span_fe_prod_fr = tf.math.multiply(start_rep_fr, end_rep_fr)
span_fe_diff_bk = start_rep_bk - end_rep_bk
span_fe_prod_bk = tf.math.multiply(start_rep_bk, end_rep_bk)

span_fe = tf.keras.layers.concatenate(
    [start_rep_fr, end_rep_fr, start_rep_bk, end_rep_bk,
     span_fe_diff_fr, span_fe_diff_bk,
     span_fe_prod_fr, span_fe_prod_bk], 2)

bilstm2 = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(10, return_sequences=True, dropout=0.1),
    merge_mode='ave')(span_fe)

output = tf.keras.layers.Dense(2, activation='softmax')(bilstm2)

kpe_model = tf.keras.models.Model(
    inputs=[input_ids, attention_mask, pos_mask], outputs=output)
# Intended to freeze the BERT encoder.
kpe_model.layers[3].trainable = False

opt = tf.keras.optimizers.Adam(learning_rate=0.00005)
kpe_model.compile(optimizer=opt,
                  loss=loss_function,    # defined elsewhere
                  metrics=[ac_metrics])  # defined elsewhere
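One thing worth checking is the span-gathering step. The shapes below (batch of 2, sequence length 5, hidden size 3) and the index tensors are hypothetical, but the sketch contrasts a per-example gather via `batch_dims=1` with indexing the first batch element only, which applies one example's candidate positions to every sequence in the batch:

```python
import tensorflow as tf

# Hypothetical batch of 2 sequence representations, length 5, hidden 3.
seq = tf.reshape(tf.range(2 * 5 * 3, dtype=tf.float32), (2, 5, 3))

# Hypothetical per-example start indices for 2 candidate spans each.
starts = tf.constant([[0, 2],
                      [1, 4]])

# batch_dims=1 gathers each example's own indices.
per_example = tf.gather(seq, starts, axis=1, batch_dims=1)
print(per_example.shape)  # (2, 2, 3)

# Indexing starts[0] instead applies example 0's indices to the
# whole batch, so example 1 gets the wrong positions.
shared = tf.gather(seq, starts[0], axis=1)
print(shared.shape)  # (2, 2, 3)
```

Here `per_example[1, 0]` is row 1 of example 1, while `shared[1, 0]` is row 0 of example 1, taken because example 0's first index is 0.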
The output represents the probability of a candidate phrase being a keyphrase. I am not sure which part of this is incorrect. The model converges within 2-3 steps and forces all probabilities to the zero label.
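When a classifier collapses to the zero class this quickly, one common (though not certain) culprit is label imbalance: if most candidates are negatives, predicting zero everywhere is already a low-loss solution. The labels and `pos_weight` below are hypothetical; the sketch shows checking the positive fraction and a weighted cross-entropy that up-weights the rare class:

```python
import numpy as np
import tensorflow as tf

# Hypothetical flattened candidate labels: 1 = keyphrase, 0 = not.
y = np.array([0, 0, 0, 0, 1, 0, 0, 1, 0, 0])
pos_frac = y.mean()
print(pos_frac)  # 0.2 -> most candidates are negatives

# Sketch of a per-candidate weighted cross-entropy; pos_weight is an
# assumption to tune, e.g. the inverse positive fraction.
def weighted_loss(y_true, y_pred, pos_weight=5.0):
    ce = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
    w = tf.where(tf.equal(tf.cast(y_true, tf.int32), 1), pos_weight, 1.0)
    return ce * tf.cast(w, ce.dtype)
```

A loss like this could be passed as `loss=weighted_loss` in `compile`, in place of the unspecified `loss_function` above, if imbalance turns out to be the issue.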