NLP: How do I set label names when using the Hugging Face TextClassificationPipeline?
I'm using a Hugging Face model, fine-tuned on my company's data, for class prediction. The labels predicted by this pipeline default to LABEL_0, LABEL_1, and so on. Is there a way to provide a label mapping to the TextClassificationPipeline object so that the output reflects that mapping?

Environment:
- tensorflow==2.3.1
- transformers==4.3.2

Sample code:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' # or any {'0', '1', '2'}
from transformers import TextClassificationPipeline, TFAutoModelForSequenceClassification, AutoTokenizer
MODEL_DIR = r"path\to\my\fine-tuned\model"  # raw string so the backslashes are not treated as escape sequences
# Text classification pipeline
model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
pipeline = TextClassificationPipeline(model=model,
                                      tokenizer=tokenizer,
                                      framework='tf',
                                      device=0)
result = pipeline("It was a good watch. But a little boring.")[0]
Output:
In [2]: result
Out[2]: {'label': 'LABEL_1', 'score': 0.8864616751670837}
The easiest way to add such a mapping is to edit the model's config.json to include an id2label field, like so:
{
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "id2label": {
    "0": "negative",
    "1": "positive"
  },
  "attention_dropout": 0.1,
  ...
}
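If several checkpoints need this fix, the config.json edit can also be scripted. The following is a sketch of my own (not part of the original answer) using only the standard library; it assumes the model directory contains a config.json and also fills in the matching label2id field, which transformers keeps alongside id2label:

```python
import json
import os

def set_label_names(model_dir, labels):
    """Rewrite config.json so id2label/label2id carry the given label names."""
    config_path = os.path.join(model_dir, "config.json")
    with open(config_path) as f:
        config = json.load(f)
    # config.json stores id2label as an object keyed by string ids
    config["id2label"] = {str(i): name for i, name in enumerate(labels)}
    config["label2id"] = {name: i for i, name in enumerate(labels)}
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)

# Usage (hypothetical path):
# set_label_names(r"path\to\my\fine-tuned\model", ["negative", "positive"])
```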
The in-code way to set this mapping is to add the id2label parameter to the from_pretrained call, like so:
model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_DIR, id2label={0: 'negative', 1: 'positive'})
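If modifying the model or its config is not an option, the generic LABEL_k names can also be remapped after the fact. This is a workaround of my own, not part of the answer above; it simply post-processes the pipeline's output dicts:

```python
def remap_labels(results, id2label):
    """Replace generic 'LABEL_<k>' names in pipeline output with real names."""
    remapped = []
    for r in results:
        idx = int(r["label"].split("_")[-1])  # 'LABEL_1' -> 1
        remapped.append({**r, "label": id2label[idx]})
    return remapped

# Example with output shaped like the pipeline's:
raw = [{"label": "LABEL_1", "score": 0.8864616751670837}]
print(remap_labels(raw, {0: "negative", 1: "positive"}))
# [{'label': 'positive', 'score': 0.8864616751670837}]
```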
Here is a suggestion I raised to add this to the transformers XForSequenceClassification documentation. Setting configuration parameters like this is actually already well documented.