Python: Naive Bayes classifier not working for sentiment analysis

python pandas scikit-learn

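I am trying to train a Naive Bayes classifier to predict whether a movie review is positive or negative. I am following this tutorial, but I run into an error when I try to train the model.

I followed every step up to the point of training the model. My data and code look like this: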

                                                 Reviews  Labels
0      For fans of Chris Farley, this is probably his...       1
1      Fantastic, Madonna at her finest, the film is ...       1
2      From a perspective that it is possible to make...       1
3      What is often neglected about Harold Lloyd is ...       1
4      You'll either love or hate movies such as this...       1
                                              ...     ...
14995  This is perhaps the worst movie I have ever se...       0
14996  I was so looking forward to seeing this film t...       0
14997  It pains me to see an awesome movie turn into ...       0
14998  "Grande Ecole" is not an artful exploration of...       0
14999  I felt like I was watching an example of how n...       0

gnb = MultinomialNB()
gnb.fit(all_train_set['Reviews'], all_train_set['Labels'])

However, when I try to fit the model, I get the following error:

ValueError: could not convert string to float: 'For fans of Chris Farley, this is probably his best film. David Spade pl

I would greatly appreciate it if someone could help me work out why following this tutorial goes wrong.

Many thanks.

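Indeed, with scikit-learn you have to convert text to numbers before calling a classifier. You can do so with, for example, CountVectorizer or TfidfVectorizer.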
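A minimal sketch of that conversion step, assuming scikit-learn's CountVectorizer as the vectorizer and a small made-up data frame (toy_train_set, hypothetical) standing in for all_train_set. Wrapping the vectorizer and the classifier in a pipeline lets fit receive the raw review strings directly:

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in for all_train_set: a text column and 0/1 labels.
toy_train_set = pd.DataFrame({
    "Reviews": [
        "a fantastic film with a great cast",
        "great story and fantastic acting",
        "the worst movie I have ever seen",
        "a boring plot and terrible acting",
    ],
    "Labels": [1, 1, 0, 0],
})

# CountVectorizer turns each review into a vector of non-negative word counts,
# which is the kind of numeric input MultinomialNB expects.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(toy_train_set["Reviews"], toy_train_set["Labels"])

# The pipeline applies the same vectorization to new text before predicting.
predictions = model.predict(["fantastic acting", "terrible boring movie"])
print(predictions)
```

Swapping CountVectorizer() for TfidfVectorizer() in the pipeline should work the same way, since both produce non-negative features.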

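If you want to use more modern word embeddings, you can use the Zeugma package, which you can install from a terminal with

pip install zeugma

The embedding code further down shows an example.

I hope that helps.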

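That is a very poorly written tutorial. The input to the classifier should have been the features data frame, not the original data frame. It did confuse me; otherwise, what would be the point of processing anything?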

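from sklearn.naive_bayes import GaussianNB
from zeugma.embeddings import EmbeddingTransformer

# Load a pretrained GloVe embedding model
embedding = EmbeddingTransformer('glove')

# Convert each review into a dense numeric vector
X = embedding.transform(all_train_set['Reviews'])
y = all_train_set['Labels']

# Embedding vectors can contain negative values, which MultinomialNB rejects,
# so a Gaussian naive Bayes model is used here instead
gnb = GaussianNB()
gnb.fit(X, y)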
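One caveat when pairing dense word embeddings with naive Bayes: MultinomialNB expects non-negative features such as word counts, while embedding vectors generally contain negative components. The sketch below uses synthetic random vectors (an assumption, standing in for real GloVe output) to show MultinomialNB refusing such input and GaussianNB, one possible alternative, accepting it:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# Synthetic stand-in for document embeddings: dense, real-valued, partly negative.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=1.0, scale=0.5, size=(20, 5)),   # "positive" reviews
    rng.normal(loc=-1.0, scale=0.5, size=(20, 5)),  # "negative" reviews
])
y = np.array([1] * 20 + [0] * 20)

# MultinomialNB raises a ValueError when the features contain negative values.
try:
    MultinomialNB().fit(X, y)
    multinomial_error = None
except ValueError as exc:
    multinomial_error = str(exc)

# GaussianNB models each feature as a normal distribution, so sign does not matter.
gaussian_model = GaussianNB().fit(X, y)
train_accuracy = gaussian_model.score(X, y)

print("MultinomialNB error:", multinomial_error)
print("GaussianNB training accuracy:", train_accuracy)
```

Whether GaussianNB is a good fit for real embedding features is a separate question; the sketch only illustrates the input constraint.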