Python: accuracy score and precision score raise a ValueError


python machine-learning scikit-learn


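I am new to machine learning and to making predictions with the Boston dataset. Everything works fine except the accuracy score and the precision score. This is what I have done: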

import pandas as pd 
import sklearn 
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing,cross_validation, svm
from sklearn.datasets import load_boston
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, classification_report, confusion_matrix

boston = load_boston()
df = pd.DataFrame(boston.data)
df.columns= boston.feature_names
df['Price']= boston.target

X = np.array(df.drop(['Price'],axis=1), dtype=np.float64)
X = preprocessing.scale(X)

y = np.array(df['Price'], dtype=np.float64)

print (len(X[:,6:7]),len(y))

X_train,X_test,y_train,y_test=cross_validation.train_test_split(X,y,test_size=0.30)

clf =LinearRegression()
clf.fit(X_train,y_train)
y_predict = clf.predict(X_test)

print(y_predict,len(y_predict))
print (accuracy_score(y_test, y_predict))
print(precision_score(y_test, y_predict,average = 'macro'))

Now I get the following error:

File "LinearRegression.py", line 33, in <module>
    accuracy = accuracy_score(y_test, y_predict)
File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/classification.py", line 172, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/classification.py", line 89, in _check_targets
    raise ValueError("{0} is not supported".format(y_type))
ValueError: continuous is not supported

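You are using a linear regression model,

clf = LinearRegression()

which predicts continuous values, for example 1.2 or 1.3,

whereas

accuracy_score(y_test, y_predict)

expects class labels: boolean values (1 or 0, true or false) or categorical values such as 1, 2, 3, 4, where the numbers stand for categories.

That is why you get the error.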
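The difference can be reproduced in a few lines. This is a minimal sketch with made-up labels and prices (the numbers are illustrative, not taken from the Boston data); it only assumes that scikit-learn is installed.

```python
from sklearn.metrics import accuracy_score

# Class labels: each prediction is either exactly right or wrong,
# so the fraction of correct predictions is well defined.
labels_true = [0, 1, 1, 0]
labels_pred = [0, 1, 0, 0]
label_accuracy = accuracy_score(labels_true, labels_pred)
print(label_accuracy)  # fraction of matching labels (3 of 4 here)

# Continuous targets (illustrative prices): accuracy_score refuses them.
prices_true = [24.0, 21.6, 34.7]
prices_pred = [23.1, 22.4, 30.9]
try:
    accuracy_score(prices_true, prices_pred)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

The exact wording of the error message may vary between scikit-learn versions, which is why the sketch only relies on the exception type.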

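How to fix this

Because you are trying to predict the Price in the Boston data, which is a continuous value, I suggest changing the error metric from accuracy to RMSE or MSE (mean squared error).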

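Replace:

print(accuracy_score(y_test, y_predict))

with:

from sklearn.metrics import mean_squared_error
print(mean_squared_error(y_test, y_predict))

This will solve your problem. The precision_score call fails for the same reason, so remove it as well.

But even if I change the classifier to SVM or RandomForestClassifier, I get the same result. Do they predict in the same way?

@harshi RandomForestClassifier predicts 1 and 0, or categories such as 0, 1, 2, 3, 4. Here you need to predict the price, which may be 1.2 or 0.24. These are continuous values.

@harshi Read more about the difference between classification problems and regression problems.

I found that I can use clf.score(X_test, y_test) to get the accuracy. But computing the RMSE does not give me an accuracy. So how do I calculate the precision score?

@harshi The clf.score() method does not give you accuracy. For LinearRegression, score returns the R-squared coefficient. I think your concept of classification versus regression is not clear.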
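To make the regression metrics mentioned in this thread concrete, here is a small hand-rolled sketch in plain Python. The helper name regression_metrics and the sample values are hypothetical, chosen only for illustration. In practice, sklearn.metrics.mean_squared_error and sklearn.metrics.r2_score compute the same MSE and R-squared quantities, and RMSE is the square root of the MSE.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R-squared) for two equal-length sequences."""
    n = len(y_true)
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    mse = sum(squared_errors) / n      # mean squared error
    rmse = math.sqrt(mse)              # same units as the target
    mean_true = sum(y_true) / n
    # Total variation of the true values around their mean.
    total_ss = sum((t - mean_true) ** 2 for t in y_true)
    # R-squared: share of that variation explained by the predictions.
    r2 = 1.0 - sum(squared_errors) / total_ss
    return mse, rmse, r2

# Illustrative values, not taken from the Boston data.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(mse, rmse, r2)
```

None of these numbers is an accuracy. MSE and RMSE measure how far the predictions are from the true values (lower is better), while R-squared is at most 1, and a value close to 1 indicates a good fit.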