Regression plot looks wrong (Python)
Tags: python, pandas, scikit-learn
My program reads MPG versus weight data and plots what the fitted regression curve should look like, but as you can see, the plot does not look right.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#read txt file
dataframe= pd.read_table('auto_data71.txt',delim_whitespace=True,names=['MPG','Cylinder','Displacement','Horsepower','Weight','acceleration','Model year','Origin','Car Name'])
dataframe.dropna(inplace=True)
#filter the un-necessary columns
X = dataframe.iloc[:,4:5].values
Y = dataframe.iloc[:,0:1].values
#scale data
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_Y= StandardScaler()
X = sc_X.fit_transform(X)
Y = sc_Y.fit_transform(Y)
#split data into train and test set
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(X,Y,test_size=0.2)
#create model
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
poly_reg = PolynomialFeatures(degree=2)
poly_X = poly_reg.fit_transform(x_train)
poly_reg.fit(poly_X,y_train)
regressor2= LinearRegression()
regressor2.fit(poly_X,y_train)
#graph
result = regressor2.predict(poly_X)
plt.scatter(x_train,y_train,color='red')
plt.plot(x_train, result,color='blue')
plt.show()
The output is as follows:
As you can see, the regression line does not look right. Any help would be greatly appreciated.
#auto_data.txt(part of data...)
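****Note: I only use the weight and mpg columns in this code
File columns (mpg, cylinders, displacement, horsepower, weight, acceleration, model year, origin, name)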
You need to sort the values before plotting. Use this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
# read the whitespace-delimited file (expects 'weight' and 'mpg' header columns)
data = pd.read_csv('data.txt', sep=r'\s+')
data.dropna(inplace=True)
X = data['weight'].values.reshape(-1, 1)
Y = data['mpg'].values.reshape(-1, 1)
#scale data
sc_X = StandardScaler()
sc_Y = StandardScaler()
X = sc_X.fit_transform(X)
Y = sc_Y.fit_transform(Y)
#split data into train and test set
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
#create model
poly_reg = PolynomialFeatures(degree=2)
poly_X = poly_reg.fit_transform(x_train)
regressor2 = LinearRegression()
regressor2.fit(poly_X, y_train)
#graph
# sort x first, then build the polynomial features from the sorted values;
# sorting poly_X column by column would break the pairing of x and x**2
x_sorted = np.sort(x_train, axis=0)
result = regressor2.predict(poly_reg.transform(x_sorted))
plt.scatter(x_train, y_train, color='red')
plt.plot(x_sorted, result, color='blue')
plt.show()
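To illustrate why the order of the points matters, here is a minimal sketch on synthetic data (the arrays below are made-up stand-ins, not the asker's file). plt.plot connects points in the order they are given, so the x values and the predictions have to be put in the same sorted order. The sketch also checks one pitfall: np.sort with axis=0 sorts each column independently, so applying it to the polynomial feature matrix does not give the same rows as transforming the sorted x values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the scaled weight/MPG data (assumption, not the real file).
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 1))
y = 0.3 * x**2 - 0.8 * x + rng.normal(scale=0.1, size=(50, 1))

poly = PolynomialFeatures(degree=2)
model = LinearRegression().fit(poly.fit_transform(x), y)

# Sort the rows by x once, then predict from features built on the sorted x.
order = np.argsort(x[:, 0])
x_sorted = x[order]
y_curve = model.predict(poly.transform(x_sorted))

# Pitfall: sorting the feature matrix column by column reorders the x and
# x**2 columns separately, so a row no longer describes a single sample.
features_sorted_by_column = np.sort(poly.transform(x), axis=0)
features_from_sorted_x = poly.transform(x_sorted)
rows_match = np.allclose(features_sorted_by_column, features_from_sorted_x)
```

Passing x_sorted and y_curve to plt.plot should then draw a single smooth curve instead of a zigzag.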
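You need to sort the values before plotting.
Can you add the data?
I added the data, thanks for the quick reply.
I will post an answer.
I added dataframe.sort_values(by='Weight', inplace=True) after dataframe.dropna(inplace=True), and the graph still looks messy.
Please check my answer and let me know. Keep in mind that you are trying to plot x_train against result, but result is predicted from the polynomial features.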
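One likely reason sorting the DataFrame did not help is that train_test_split shuffles the rows by default, so x_train comes back unordered again. A way around the ordering question altogether is to predict on an evenly spaced grid instead of on x_train. The following is a sketch under assumptions: the data is synthetic, and wrapping the two steps with make_pipeline is a swapped-in convenience, not what the original code does.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for scaled weight (x) and MPG (y); not the real data.
rng = np.random.default_rng(1)
x = rng.normal(size=(80, 1))
y = 0.3 * x[:, 0] ** 2 - 0.8 * x[:, 0] + rng.normal(scale=0.1, size=80)

# The pipeline applies the polynomial expansion before every fit/predict call,
# so predictions are always made from features of the x values passed in.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)

# An evenly spaced, already ordered grid spanning the observed x range.
grid = np.linspace(x.min(), x.max(), 200).reshape(-1, 1)
curve = model.predict(grid)
```

Calling plt.scatter(x, y) followed by plt.plot(grid, curve) would then overlay the fitted curve on the points, regardless of the order of the training rows.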