Text classifier training data not loaded correctly by the spacy debug-data CLI


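Background

I am trying to train a multi-class text classification model (the labels are mutually exclusive) with spaCy in a Google Colab notebook. The classes are:

  • Positive
  • Negative
  • Neutral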
I formatted the training data in the specified annotation format. Here is a sample of my annotations:

[.
.
["Happy #MothersDay to all ... ", {'cats': {'NEUTRAL': 1.0}}],
["Happy mothers day ..", {"cats": {"POSITIVE": 1.0}}],
.
.]
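A quick structural check of annotations in this shape can be sketched as below. The helper name `check_annotations` and the `LABELS` set are hypothetical and not part of spaCy; the sketch only confirms that each entry is a `[text, {"cats": {...}}]` pair with exactly one label set to 1.0. It says nothing about whether a given CLI command can read the file.

```python
# Hypothetical label set, taken from the three classes named in the question.
LABELS = {"POSITIVE", "NEGATIVE", "NEUTRAL"}

# Two entries copied from the sample above; the real file has many more.
TRAIN_DATA = [
    ["Happy #MothersDay to all ... ", {"cats": {"NEUTRAL": 1.0}}],
    ["Happy mothers day ..", {"cats": {"POSITIVE": 1.0}}],
]


def check_annotations(data, labels=LABELS):
    """Return a list of (index, problem) pairs for malformed examples."""
    problems = []
    for i, example in enumerate(data):
        if len(example) != 2:
            problems.append((i, "expected [text, annotations]"))
            continue
        text, annotations = example
        cats = annotations.get("cats") if isinstance(annotations, dict) else None
        if not isinstance(text, str) or not isinstance(cats, dict):
            problems.append((i, "expected a string and a dict with a 'cats' key"))
            continue
        unknown = set(cats) - labels
        if unknown:
            problems.append((i, "unknown labels: %s" % sorted(unknown)))
        # Mutually exclusive classes: exactly one label should be switched on.
        positive = [label for label, value in cats.items() if value == 1.0]
        if len(positive) != 1:
            problems.append((i, "expected exactly one label with value 1.0"))
    return problems


problems = check_annotations(TRAIN_DATA)
print(len(TRAIN_DATA), "examples,", len(problems), "problems")
```

An empty result means the pairs are well formed as Python data, which is a separate question from whether the file on disk is in the format a particular tool expects.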
Problem

When I try to debug the data with the debug-data command of the spaCy CLI (run from a Jupyter notebook), I get the following output:

=========================== Data format validation ===========================
✔ Corpus is loadable

=============================== Training stats ===============================
Training pipeline: textcat
Starting with blank model 'en'
0 training docs
0 evaluation docs
✘ No evaluation docs
✔ No overlap between training and evaluation data
✘ Low number of examples to train from a blank model (0)

============================== Vocab & Vectors ==============================
ℹ 0 total words in the data (0 unique)
ℹ No word vectors present in the model

============================ Text Classification ============================
ℹ Text Classification: 0 new label(s), 0 existing label(s)
ℹ The train data contains only instances with mutually-exclusive
classes.

================================== Summary ==================================
✔ 2 checks passed
✘ 2 errors
It does not read the data correctly, even though I have checked the file and it contains at least 1000 samples like the ones shown above.

(Links, including one to the JSON data, were attached to the original post.)


I cannot find anything wrong with my data. Can someone point out the mistake? Thanks in advance.

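The spacy debug-data command expects data in spaCy's internal JSON training format, which is described in the spaCy documentation.

Some examples are available, and the conversion script in the same directory shows how to convert from a JSONL format that is very similar to the TRAIN_DATA-style format used in the example scripts.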
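To make the difference between the two formats concrete, here is a minimal, dependency-free sketch of such a conversion. It rests on several assumptions: the field names (`id`, `paragraphs`, `raw`, `sentences`, `tokens`, `orth`, `brackets`, `cats`, `label`, `value`) follow the spaCy v2 JSON training format as I recall it and should be checked against the documentation for the installed version; the function name `to_spacy_json` is hypothetical; and whitespace splitting stands in for a real tokenizer. In practice it is probably safer to run the texts through spaCy itself and use its own conversion helper, as the conversion script mentioned above does.

```python
import json

# TRAIN_DATA-style input, using the two samples from the question.
TRAIN_DATA = [
    ["Happy #MothersDay to all ... ", {"cats": {"NEUTRAL": 1.0}}],
    ["Happy mothers day ..", {"cats": {"POSITIVE": 1.0}}],
]


def to_spacy_json(train_data):
    """Convert [text, {"cats": {...}}] pairs into a list of training docs.

    Assumption: this layout mirrors spaCy v2's internal JSON training format.
    """
    docs = []
    for doc_id, (text, annotations) in enumerate(train_data):
        # Whitespace splitting is a stand-in for spaCy's tokenizer.
        tokens = [{"id": i, "orth": word} for i, word in enumerate(text.split())]
        cats = [
            {"label": label, "value": value}
            for label, value in annotations["cats"].items()
        ]
        docs.append(
            {
                "id": doc_id,
                "paragraphs": [
                    {
                        "raw": text,
                        "sentences": [{"tokens": tokens, "brackets": []}],
                        "cats": cats,
                    }
                ],
            }
        )
    return docs


converted = to_spacy_json(TRAIN_DATA)
# Show the first converted document; json.dump(converted, file) would save all.
print(json.dumps(converted[0], indent=2))
```

The point of the sketch is the shape: each text becomes a document with nested paragraphs, sentences and tokens, and the categories become a list of label/value objects. A flat list of `[text, {"cats": ...}]` pairs has none of that nesting, which would be consistent with debug-data reporting 0 training docs.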
