NLP: How do I set label names when using Hugging Face's TextClassificationPipeline?


I am using a Hugging Face model fine-tuned on my company's data to make class predictions. Currently, the labels this pipeline predicts default to LABEL_0, LABEL_1, and so on. Is there a way to supply a label mapping to the TextClassificationPipeline object so that the output reflects that mapping?

Environment:

  • tensorflow==2.3.1
  • transformers==4.3.2

Sample code:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # or any {'0', '1', '2'}

from transformers import TextClassificationPipeline, TFAutoModelForSequenceClassification, AutoTokenizer

MODEL_DIR = r"path\to\my\fine-tuned\model"  # raw string so backslashes aren't treated as escapes

# Feature extraction pipeline
model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)

pipeline = TextClassificationPipeline(model=model,
                                      tokenizer=tokenizer,
                                      framework='tf',
                                      device=0)

result = pipeline("It was a good watch. But a little boring.")[0]
Output:

In [2]: result
Out[2]: {'label': 'LABEL_1', 'score': 0.8864616751670837}
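As a quick client-side workaround, the default LABEL_k strings can be renamed after the fact, without touching the model or its config. The mapping below (negative/positive, in that order) is an assumption about how the model was trained:

```python
# Assumed label order from fine-tuning: class 0 = negative, class 1 = positive.
DEFAULT_TO_NAME = {"LABEL_0": "negative", "LABEL_1": "positive"}

def rename_label(result):
    """Rewrite a pipeline result dict like {'label': 'LABEL_1', 'score': ...}."""
    return {**result, "label": DEFAULT_TO_NAME[result["label"]]}

print(rename_label({"label": "LABEL_1", "score": 0.8864616751670837}))
# → {'label': 'positive', 'score': 0.8864616751670837}
```

This only papers over the output, though; the approaches below bake the mapping into the model itself.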

The simplest way to add such a mapping is to edit the model's config.json to include an id2label field, like so:

{
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "id2label": {
    "0": "negative",
    "1": "positive"
  },
  "attention_dropout": 0.1,
  .
  .
}
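Since config.json is plain JSON, this edit can also be scripted. Here is a minimal sketch using only the standard library; the helper name is illustrative, and it assumes `labels[i]` is the name for class id i:

```python
import json

def add_label_mapping(config_path, labels):
    """Patch a model's config.json in place so pipelines report readable labels."""
    with open(config_path) as f:
        config = json.load(f)
    # config.json keys class ids as strings; label2id is the inverse mapping.
    config["id2label"] = {str(i): name for i, name in enumerate(labels)}
    config["label2id"] = {name: i for i, name in enumerate(labels)}
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
```

After running, e.g., `add_label_mapping(config_path, ["negative", "positive"])` against the model directory's config.json, reloading the model picks up the new names.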

The in-code way to set this mapping is to add an id2label argument to the from_pretrained call, like so:


model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_DIR, id2label={0: 'negative', 1: 'positive'})

Here is the suggestion I filed for adding this to the transformers.XForSequenceClassification documentation.

The setting of such configuration parameters is actually already well documented.