Python: supervised machine learning with scikit-learn

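This is my first time doing supervised machine learning. It is a fairly advanced topic (at least for me), and I find it hard to pose a precise question because I am not sure exactly what is going wrong.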

# Create a training list and test list (looks something like this):
train = [('this hostel was nice',2),('i hate this hostel',1)]
test = [('had a wonderful time',2),('terrible experience',1)]

# Loading modules
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn import metrics

# Use a BOW representation of the reviews
vectorizer = CountVectorizer(stop_words='english') 
train_features = vectorizer.fit_transform([r[0] for r in train]) 
test_features = vectorizer.fit([r[0] for r in test])

# Fit a naive bayes model to the training data
nb = MultinomialNB()
nb.fit(train_features, [r[1] for r in train])

# Use the classifier to predict classification of test dataset
predictions = nb.predict(test_features)
actual=[r[1] for r in test]

Here I get an error:

float() argument must be a string or a number, not 'CountVectorizer'

This confuses me, because the original ratings that I zipped together with the reviews have this type:

type(ratings_new[0])
int

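Do you have a stack trace and/or a line number for where the error occurs?

Is this the information you are looking for? Traceback (most recent call last) in () ----> 1 predictions = nb.predict(test_features)

You should change the line

test_features = vectorizer.fit([r[0] for r in test])

to:

test_features = vectorizer.transform([r[0] for r in test])

The reason is that you have already fitted the vectorizer on the training data, so there is no need to fit it again on the test data. Instead, you need to transform the test data with it. Note also that vectorizer.fit returns the vectorizer itself rather than a feature matrix, which is why nb.predict complains about a 'CountVectorizer' argument.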
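
For reference, here is a minimal sketch of the corrected pipeline, using the toy data from the question. The names refit_result and returns_vectorizer are illustrative additions that are not in the original code; they only demonstrate that fit() hands back the vectorizer object, while transform() produces a count matrix with the same columns as the training features.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

# Toy data from the question: (review text, rating)
train = [('this hostel was nice', 2), ('i hate this hostel', 1)]
test = [('had a wonderful time', 2), ('terrible experience', 1)]

vectorizer = CountVectorizer(stop_words='english')

# fit_transform learns the vocabulary from the training texts
# and returns a sparse matrix of word counts.
train_features = vectorizer.fit_transform([text for text, _ in train])

# Illustration only (hypothetical names): fit() returns the vectorizer
# itself, which is what the original code passed to nb.predict().
refit_result = CountVectorizer(stop_words='english').fit([text for text, _ in test])
returns_vectorizer = isinstance(refit_result, CountVectorizer)

# transform() reuses the vocabulary learned from the training data,
# so the test matrix has the same columns as the training matrix.
test_features = vectorizer.transform([text for text, _ in test])

# Fit a naive Bayes model on the training data
nb = MultinomialNB()
nb.fit(train_features, [label for _, label in train])

# Predict on the test set and compare with the true labels
predictions = nb.predict(test_features)
actual = [label for _, label in test]
accuracy = metrics.accuracy_score(actual, predictions)

print(returns_vectorizer, train_features.shape, test_features.shape)
print(predictions, accuracy)
```

With only two training sentences, none of the test words occur in the learned vocabulary, so the test rows are all zeros and the predictions carry no real information. The point of the sketch is only that predict now receives a matrix whose columns match the training features.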