scikit-learn: _count_vocab is throwing an empty vocabulary error

I am passing two strings, for example:
$1-2$$3-4$5-6$
&
$7-8$$9-10$$10-11$

In this case, the _count_vocab function throws the error:

empty vocabulary: perhaps the document contains only stop words

So is there a problem with the $ symbol?


Is $1-2$ not considered a token?

The definition of a token is determined by the constructor's token_pattern parameter (a regular expression):

Regular expression denoting what constitutes a "token", only used
if `tokenize == 'word'`. The default regexp select tokens of 2
or more alphanumeric characters (punctuation is completely ignored
and always treated as a token separator).
This clearly does not match what you have, so define a different regular expression for your data.
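
To see why the vocabulary comes out empty, it can help to apply the token regular expression to the strings directly. The sketch below uses only Python's re module. It assumes the default pattern is (?u)\b\w\w+\b, as in recent scikit-learn releases (check the documentation for your installed version). The replacement pattern \d+-\d+ and the tokenize helper are hypothetical, chosen only for illustration; whether "1-2" or "$1-2$" should count as a token depends on your data.

```python
import re

# Assumed default token_pattern of the vectorizer (recent scikit-learn
# releases); verify against the docs of the version you have installed.
DEFAULT_PATTERN = r"(?u)\b\w\w+\b"

# Hypothetical replacement: treat each "digits-digits" pair as one token.
CUSTOM_PATTERN = r"\d+-\d+"

docs = ["$1-2$$3-4$5-6$", "$7-8$$9-10$$10-11$"]


def tokenize(pattern, text):
    """Return all non-overlapping matches of pattern in text."""
    return re.findall(pattern, text)


default_tokens = [tokenize(DEFAULT_PATTERN, d) for d in docs]
custom_tokens = [tokenize(CUSTOM_PATTERN, d) for d in docs]

# The first string has no run of two or more word characters, so the
# default pattern finds nothing in it.
print(default_tokens)
print(custom_tokens)
```

If the custom pattern produces the tokens you expect, the same string can be passed as the token_pattern argument of the vectorizer constructor, for example CountVectorizer(token_pattern=r"\d+-\d+"), which should then build a non-empty vocabulary.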