Python sklearn model returns a mean absolute error of 0. Why?

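I'm playing around with sklearn and want to predict TSLA closing prices for a few days using the Open, High, and Low prices and the Volume. I used a very basic model to predict the closing prices, and the predictions are supposedly 100% accurate. I don't know why. An error of 0% makes me feel like I haven't set the model up correctly.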

Code:

from os import X_OK
from numpy.lib.shape_base import apply_along_axis
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

tsla_data_path = "/Users/simon/Documents/PythonVS/ML/tsla.csv"
tsla_data = pd.read_csv(tsla_data_path)
tsla_features = ['Open', 'High', 'Low', 'Volume']
y = tsla_data.Close
X = tsla_data[tsla_features]
# define model
tesla_model = DecisionTreeRegressor(random_state=1)
# fit model
tesla_model.fit(X, y)
# print results
print('making predictions for the following five dates')
print(X.head())
print('________________________________________________')
print('the predictions are')
print(tesla_model.predict(X.head()))
print('________________________________________________')
print('the error is')
print(mean_absolute_error(y.head(), tesla_model.predict(X.head())))

Output:

making predictions for the following five dates
        Open       High        Low    Volume
0  67.054001  67.099998  65.419998  39737000
1  66.223999  66.786003  65.713997  27778000
2  66.222000  66.251999  65.500000  12328000
3  65.879997  67.276001  65.737999  30372500
4  66.524002  67.582001  66.438004  32868500
________________________________________________
the predictions are
[65.783997 66.258003 65.987999 66.973999 67.239998]
________________________________________________
the error is
0.0
Data:


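Measuring a model's performance on the same dataset that was used to train it is a mistake. A decision tree with no depth limit can memorize its training rows, so predicting on those rows simply reproduces the targets it was given.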
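
To see the effect in isolation, here is a minimal sketch on synthetic data. The random features and noisy target are an assumption standing in for the TSLA file, which is not reproduced here. It compares the error on the training rows with the error on rows the tree never saw.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the price data (an assumption, not the real TSLA file)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(200, 4))
y = X[:, 0] + rng.normal(0.0, 5.0, size=200)  # noisy target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# With default settings the tree keeps splitting until its leaves are pure,
# which here means it can reproduce every training target
model = DecisionTreeRegressor(random_state=1)
model.fit(X_train, y_train)

train_mae = mean_absolute_error(y_train, model.predict(X_train))
test_mae = mean_absolute_error(y_test, model.predict(X_test))
print("MAE on training rows:", train_mae)
print("MAE on held-out rows:", test_mae)
```

The training error should come out at essentially zero while the held-out error does not, which is the same pattern as the 0.0 in the question.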

If you want a meaningful evaluation metric, split the dataset into two parts: one to train the model and one to measure its performance. You can split it with sklearn.model_selection.train_test_split(), as follows:

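from sklearn.model_selection import train_test_split
tesla_model = DecisionTreeRegressor(random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Fit on the training features and the training targets only
tesla_model.fit(X_train, y_train)
# Evaluate on rows the model has not seen
mae = mean_absolute_error(y_test, tesla_model.predict(X_test))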
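
A single split gives one number that depends on which rows happen to land in the test set. As a possible variation (not part of the original answer), k-fold cross-validation repeats the split several times. The sketch below uses synthetic data as an assumed stand-in for the TSLA frame.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for X and y (an assumption, not the real TSLA data)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(200, 4))
y = X[:, 0] + rng.normal(0.0, 5.0, size=200)

model = DecisionTreeRegressor(random_state=1)

# The scorer returns the negated MAE (higher is better), so flip the sign
scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
fold_mae = -scores

print("MAE per fold:", fold_mae)
print("Mean MAE:", fold_mae.mean())
```

Each fold is scored on rows that were left out of that fold's training, so none of the five values should collapse to zero the way the training-set error does.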

Take a look at the Wikipedia article that explains the different datasets used in ML.
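
As a rough illustration of the training, validation, and test sets described there, the sketch below calls train_test_split twice to get a 60/20/20 split. The synthetic data, the split ratios, and the candidate max_depth values are all assumptions made for the example.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for X and y (an assumption, not the real TSLA data)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(200, 4))
y = X[:, 0] + rng.normal(0.0, 5.0, size=200)

# Hold out 20% as the test set, then 25% of the remainder as validation
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42
)

# Use the validation set to choose a tree depth (candidate values are arbitrary)
val_mae = {}
for depth in (2, 4, 8, None):
    candidate = DecisionTreeRegressor(max_depth=depth, random_state=1)
    candidate.fit(X_train, y_train)
    val_mae[depth] = mean_absolute_error(y_val, candidate.predict(X_val))
best_depth = min(val_mae, key=val_mae.get)

# Touch the test set only once, after the depth has been chosen
final_model = DecisionTreeRegressor(max_depth=best_depth, random_state=1)
final_model.fit(X_train, y_train)
test_mae = mean_absolute_error(y_test, final_model.predict(X_test))
print("chosen max_depth:", best_depth)
print("test MAE:", test_mae)
```

Keeping the test rows out of both fitting and depth selection means the final number is not influenced by any choice made while tuning.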

You are making predictions on the same dataset that you passed to fit.