Python sklearn linear regression produces incorrect coefficient values


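I am trying to find the slope and y-intercept coefficients of a linear equation. I created a test domain and range to make sure the numbers I get back are correct. The equation should be y = 2x + 1, but the model reports a slope of 24 and a y-intercept of 40.3125. The model accurately predicts every value I give it, but I would like to know how to get the correct coefficient values.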

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.arange(0, 40)
y = (2 * X) + 1

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=0)
X_train = [[i] for i in X_train]
X_test = [[i] for i in X_test]

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

regr = linear_model.LinearRegression()

regr.fit(X_train, y_train)

y_pred = regr.predict(X_test)

print('Coefficients: \n', regr.coef_)
print('Y-intercept: \n', regr.intercept_)
print('Mean squared error: %.2f'
      % mean_squared_error(y_test, y_pred))
print('Coefficient of determination: %.2f'
      % r2_score(y_test, y_pred))

plt.scatter(X_test, y_test,  color='black')
plt.plot(X_test, y_pred, color='blue', linewidth=3)
print(X_test)

plt.xticks()
plt.yticks()

plt.show()

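This happens because you scaled your training and test data. Even though you generated y as a linear function of X, you transformed X_train and X_test onto a different scale by standardizing them (subtracting the mean and dividing by the standard deviation). The fitted coefficients therefore describe y as a function of the standardized feature, not of the original X.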
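To make the effect of standardization concrete: if z = (x - mu) / sigma, then y = 2x + 1 = (2 * sigma) * z + (2 * mu + 1), so the scaled model reports a slope of 2 * sigma and an intercept of 2 * mu + 1. The sketch below, which is not part of the original answer, fits the model on the scaled feature and then converts the coefficients back to the original scale. It assumes a single feature and relies on the `mean_` and `scale_` attributes that `StandardScaler` stores after fitting; the variable names `mu`, `sigma`, `slope`, and `intercept` are chosen here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.arange(0, 40).reshape(-1, 1)  # one feature column
y = 2 * X.ravel() + 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Fit on the standardized feature, as in the question
sc = StandardScaler()
regr = LinearRegression().fit(sc.fit_transform(X_train), y_train)

# Mean and standard deviation learned from the training data
mu, sigma = sc.mean_[0], sc.scale_[0]

# Undo the scaling: y = coef * (x - mu) / sigma + intercept
slope = regr.coef_[0] / sigma
intercept = regr.intercept_ - regr.coef_[0] * mu / sigma

print('Slope on the original scale:', round(slope, 6))
print('Intercept on the original scale:', round(intercept, 6))
```

Up to floating-point error, the recovered values should match the 2 and 1 used to generate the data.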

If we run your code but leave out the lines that scale the data, you get the expected result:

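import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split

X = np.arange(0, 40)
y = (2 * X) + 1

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=0)
X_train = [[i] for i in X_train]
X_test = [[i] for i in X_test]

# Skip the scaling of X_train and X_test
# sc = StandardScaler()
# X_train = sc.fit_transform(X_train)
# X_test = sc.transform(X_test)

regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)
y_pred = regr.predict(X_test)

print('Coefficients: \n', regr.coef_)
# > Coefficients:
#  [2.]

print('Y-intercept: \n', regr.intercept_)
# > Y-intercept:
#  1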
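The question notes that the scaled model still predicted every value correctly. A short check of that observation, offered as a sketch rather than as part of the original answer, is to fit one model on the raw feature and one on the standardized feature and compare their predictions. The names `raw`, `scaled`, `pred_raw`, and `pred_scaled` are introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.arange(0, 40).reshape(-1, 1)
y = 2 * X.ravel() + 1
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Model fitted on the raw feature
raw = LinearRegression().fit(X_train, y_train)

# Model fitted on the standardized feature
sc = StandardScaler().fit(X_train)
scaled = LinearRegression().fit(sc.transform(X_train), y_train)

pred_raw = raw.predict(X_test)
pred_scaled = scaled.predict(sc.transform(X_test))

# Both comparisons use a floating-point tolerance
print('Same predictions:', np.allclose(pred_raw, pred_scaled))
print('Predictions match y_test:', np.allclose(pred_raw, y_test))
```

Standardization only changes the units in which the coefficients are expressed, so it matters when you want to read the slope and intercept directly, not when you only need predictions.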