Deep learning: How to use the GLUE metric for 'cola' from the HuggingFace nlp library


I have been trying to use the GLUE metric from the HuggingFace nlp library to check whether a given sentence is a grammatical English sentence, but I ran into an error and cannot proceed.

What I have tried so far:

The reference and the prediction are two text sentences.

!pip install transformers
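The question omits the full setup, so here is a hypothetical reconstruction consistent with the traceback below. The sentences, tokenizer, and config name are assumptions: the acc_and_f1 call in the traceback implies an "mrpc"- or "qqp"-style config, and encoding raw sentences yields token IDs that the metric then treats as class labels.

from nlp import load_metric
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Assumed example sentences (the originals are not shown in the question):
reference = "The boy is playing in the park."
prediction = "The boy are playing in the park."

# Encoding produces lists of token IDs, not class labels:
encoded_reference = tokenizer.encode(reference)
encoded_prediction = tokenizer.encode(prediction)

glue_metric = load_metric('glue', 'mrpc')  # assumption: acc_and_f1 implies 'mrpc' or 'qqp'
glue_score = glue_metric.compute(encoded_prediction, encoded_reference)  # raises the ValueError below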
The error I get:


ValueError                                Traceback (most recent call last)
<ipython-input-9-4c3a3ce7b583> in <module>()
----> 1 glue_score = glue_metric.compute(encoded_prediction, encoded_reference)

6 frames
/usr/local/lib/python3.6/dist-packages/nlp/metric.py in compute(self, predictions, references, timeout, **metrics_kwargs)
    198         predictions = self.data["predictions"]
    199         references = self.data["references"]
--> 200         output = self._compute(predictions=predictions, references=references, **metrics_kwargs)
    201         return output
    202 

/usr/local/lib/python3.6/dist-packages/nlp/metrics/glue/27b1bc63e520833054bd0d7a8d0bc7f6aab84cc9eed1b576e98c806f9466d302/glue.py in _compute(self, predictions, references)
    101             return pearson_and_spearman(predictions, references)
    102         elif self.config_name in ["mrpc", "qqp"]:
--> 103             return acc_and_f1(predictions, references)
    104         elif self.config_name in ["sst2", "mnli", "mnli_mismatched", "mnli_matched", "qnli", "rte", "wnli", "hans"]:
    105             return {"accuracy": simple_accuracy(predictions, references)}

/usr/local/lib/python3.6/dist-packages/nlp/metrics/glue/27b1bc63e520833054bd0d7a8d0bc7f6aab84cc9eed1b576e98c806f9466d302/glue.py in acc_and_f1(preds, labels)
     60 def acc_and_f1(preds, labels):
     61     acc = simple_accuracy(preds, labels)
---> 62     f1 = f1_score(y_true=labels, y_pred=preds)
     63     return {
     64         "accuracy": acc,

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in f1_score(y_true, y_pred, labels, pos_label, average, sample_weight, zero_division)
   1097                        pos_label=pos_label, average=average,
   1098                        sample_weight=sample_weight,
-> 1099                        zero_division=zero_division)
   1100 
   1101 

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in fbeta_score(y_true, y_pred, beta, labels, pos_label, average, sample_weight, zero_division)
   1224                                                  warn_for=('f-score',),
   1225                                                  sample_weight=sample_weight,
-> 1226                                                  zero_division=zero_division)
   1227     return f
   1228 

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in precision_recall_fscore_support(y_true, y_pred, beta, labels, pos_label, average, warn_for, sample_weight, zero_division)
   1482         raise ValueError("beta should be >=0 in the F-beta score")
   1483     labels = _check_set_wise_labels(y_true, y_pred, average, labels,
-> 1484                                     pos_label)
   1485 
   1486     # Calculate tp_sum, pred_sum, true_sum ###

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in _check_set_wise_labels(y_true, y_pred, average, labels, pos_label)
   1314             raise ValueError("Target is %s but average='binary'. Please "
   1315                              "choose another average setting, one of %r."
-> 1316                              % (y_type, average_options))
   1317     elif pos_label not in (None, 1):
   1318         warnings.warn("Note that pos_label (set to %r) is ignored when "

ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].

However, I was able to get results for 'stsb' (Pearson and Spearman) with the same approach as above.
Some help and a workaround for 'cola' would be much appreciated. Thanks.

Generally, if you see this error in HuggingFace, you are trying to use the F1 score as the metric for a text classification problem with more than two classes. Choose another metric instead, such as 'accuracy'.
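The underlying sklearn failure is easy to reproduce in isolation (an illustrative snippet, not the asker's code): token IDs look like a multiclass target, and f1_score defaults to average='binary', which only supports two classes.

from sklearn.metrics import f1_score

# Token IDs look like a multiclass target; the default average='binary'
# only supports two classes, so this raises the same ValueError:
f1_score(y_true=[101, 2023, 102], y_pred=[101, 2024, 102])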

Regarding this specific problem:

Regardless of what you feed in, it is trying to compute the F1 score. From there, you should set the metric name as:

metric_name = "pearson" if task == "stsb" else "matthews_correlation" if task == "cola" else "accuracy"
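For 'cola' itself, a minimal sketch of the intended usage (assuming the old nlp library; the label values below are illustrative) would pass integer acceptability labels rather than token IDs, and the metric then returns the Matthews correlation:

from nlp import load_metric

glue_metric = load_metric('glue', 'cola')

# CoLA labels: 1 = grammatically acceptable, 0 = unacceptable
predictions = [1, 0, 1, 1]  # labels predicted by your model (illustrative values)
references = [1, 0, 0, 1]   # gold labels (illustrative values)

glue_score = glue_metric.compute(predictions, references)
# -> {'matthews_correlation': ...}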

I cannot reproduce this problem. Could you please extend your code to give me a complete example that leads to it? – @cronoik, sure, I will add the code as an edit. – I get a different error message (the references are too long). When I remove tokens from the reference, it runs fine.
