Python: Naive Bayes classifier not working for sentiment analysis
I am trying to train a Naive Bayes classifier to predict whether a movie review is good or bad. I am following this tutorial, but I get an error when I try to train the model. I followed every step up to the point of training the model. My data and code are shown below:
Reviews Labels
0 For fans of Chris Farley, this is probably his... 1
1 Fantastic, Madonna at her finest, the film is ... 1
2 From a perspective that it is possible to make... 1
3 What is often neglected about Harold Lloyd is ... 1
4 You'll either love or hate movies such as this... 1
... ...
14995 This is perhaps the worst movie I have ever se... 0
14996 I was so looking forward to seeing this film t... 0
14997 It pains me to see an awesome movie turn into ... 0
14998 "Grande Ecole" is not an artful exploration of... 0
14999 I felt like I was watching an example of how n... 0
from sklearn.naive_bayes import MultinomialNB
gnb = MultinomialNB()
gnb.fit(all_train_set['Reviews'], all_train_set['Labels'])
However, when I try to fit the model, I get the following error:
ValueError: could not convert string to float: 'For fans of Chris Farley, this is probably his best film. David Spade pl
I would be grateful if someone could help me work out what went wrong while following this tutorial.
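Many thanks.

Indeed, with scikit-learn you have to convert the text to numbers before calling the classifier. This can be done, for example, with a text vectorizer such as CountVectorizer or TfidfVectorizer. If you want to use more modern word embeddings, you can use the zeugma package (install it from a terminal with pip install zeugma), for example:

from sklearn.naive_bayes import GaussianNB
from zeugma.embeddings import EmbeddingTransformer

# Represent each review by its averaged pre-trained GloVe word vectors
embedding = EmbeddingTransformer('glove')
X = embedding.transform(all_train_set['Reviews'])
y = all_train_set['Labels']

# GloVe vectors contain negative values, which MultinomialNB rejects,
# so GaussianNB is used here instead
gnb = GaussianNB()
gnb.fit(X, y)

I hope that helps.

This is a poorly written tutorial. The input to the classifier should be the features DataFrame, not the raw DataFrame. It really confused me too; otherwise, what would be the point of processing anything?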
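To make the "convert the text to numbers first" step concrete, here is a minimal sketch that puts CountVectorizer in front of MultinomialNB using a scikit-learn pipeline. The toy DataFrame, the name toy_train_set and the two test sentences are made up for illustration; only the Reviews and Labels column names come from the question, and the tutorial may well build its features differently.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for all_train_set (same column names).
toy_train_set = pd.DataFrame({
    "Reviews": [
        "A fantastic film, I loved every minute",
        "Wonderful acting and a great story",
        "This is the worst movie I have ever seen",
        "Terrible plot and awful acting",
    ],
    "Labels": [1, 1, 0, 0],
})

# CountVectorizer turns each review into non-negative word counts,
# which is the kind of numeric input MultinomialNB can be fitted on.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(toy_train_set["Reviews"], toy_train_set["Labels"])

# The pipeline applies the same vectorization to new raw text before predicting.
predictions = model.predict(["a great and wonderful film", "awful, the worst plot"])
print(predictions)
```

Wrapping both steps in one pipeline means raw strings can be passed to fit and predict directly, so the vectorizer fitted on the training reviews is reused unchanged at prediction time. Swapping CountVectorizer for TfidfVectorizer should work the same way.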