Python linear regression

python pandas machine-learning

Program:

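import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

ds = pd.read_csv('Animals.csv')

# Column 1 is the feature, column 2 is the target.
# scikit-learn expects a 2-D feature array, so reshape before splitting;
# this way x_test has the right shape for predict() as well.
x = ds.iloc[:, 1].values.reshape(-1, 1)
y = ds.iloc[:, 2].values.reshape(-1, 1)

# Hold out 20% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

reg = LinearRegression()
reg.fit(x_train, y_train)

y_pred = reg.predict(x_test)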

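The predictions are far from perfect. Why? Is something wrong with the dataset, or could something else be the problem? I'm new to machine learning.

Thanks in advance.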

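It really depends on what you are trying to predict and on whether the features you have are good predictors of it. Even with plain linear regression, if your target variable can be explained by the features, you should get reasonable accuracy metrics.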
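One way to put a number on "reasonable accuracy" is to compute error metrics on the held-out rows. The sketch below is only an illustration: it uses plain NumPy and the y_test and y_pred values printed in the question (predictions rounded to three decimals). sklearn.metrics.mean_absolute_error and sklearn.metrics.r2_score compute the same two quantities.

```python
import numpy as np

# Values copied from the question's output (predictions rounded to 3 decimals).
y_test = np.array([119.5, 157.0, 5712.0, 56.0, 50.0, 680.0])
y_pred = np.array([433.345, 433.204, 418.679, 433.348, 407.496, 432.253])

residuals = y_test - y_pred

# Mean absolute error: average size of the miss, in the target's own units.
mae = np.mean(np.abs(residuals))

# R^2: 1 minus (residual sum of squares / total sum of squares).
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"MAE = {mae:.1f}")
print(f"R^2 = {r2:.3f}")
```

For these six points R^2 comes out negative, meaning the fitted line does no better on this test set than always predicting the mean of y_test. That points to the single feature explaining little of the target, at least in its raw form.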

Looking at your y_test, you should consider removing outliers, which may improve the model's accuracy.
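A minimal sketch of one common outlier rule, the 1.5 x IQR fence, assuming NumPy. The helper name iqr_mask and the k=1.5 threshold are illustrative choices, not something taken from the question. In practice the rule would be applied to the training rows; y_test is used here only because its values are visible in the question.

```python
import numpy as np

def iqr_mask(values, k=1.5):
    """Return a boolean mask that is True for values inside the IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - k * iqr) & (values <= q3 + k * iqr)

# Target values printed in the question
y = np.array([119.5, 157.0, 5712.0, 56.0, 50.0, 680.0])

mask = iqr_mask(y)
kept = y[mask]  # values that survive the filter
print(kept)
```

The same mask has to be applied to the feature array as well (x[mask], y[mask]) so the rows stay aligned. Whether dropping rows is appropriate depends on whether the extreme values are measurement errors or genuine observations.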

You may also want to try some more powerful regressors.

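It will never be perfect, but you can improve it by adding more training examples or by letting it train longer.

So the less data there is, the worse the predictions will be, right?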
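The effect of training-set size can be illustrated with a small simulation. This is a sketch on synthetic data, not on Animals.csv: the true line y = 3x + 10, the noise level, and the helper name mean_test_error are all assumptions made for the example. It fits a straight line with np.polyfit and averages the test error over many random datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_test_error(n_train, n_trials=200, n_test=100):
    """Average test MAE of a fitted line over repeated synthetic datasets."""
    errors = []
    for _ in range(n_trials):
        # Synthetic data (an assumption): y = 3x + 10 plus Gaussian noise.
        x = rng.uniform(0, 10, size=n_train + n_test)
        y = 3.0 * x + 10.0 + rng.normal(0, 5, size=x.size)
        x_tr, y_tr = x[:n_train], y[:n_train]
        x_te, y_te = x[n_train:], y[n_train:]
        # Least-squares straight-line fit on the training part only
        slope, intercept = np.polyfit(x_tr, y_tr, deg=1)
        errors.append(np.mean(np.abs(y_te - (slope * x_te + intercept))))
    return float(np.mean(errors))

small = mean_test_error(5)    # very few training rows
large = mean_test_error(500)  # many training rows
print(f"n=5: {small:.2f}   n=500: {large:.2f}")
```

On average the fit from 5 rows predicts worse than the fit from 500, but the error stops shrinking once it reaches the noise in the data, so more rows cannot rescue a feature that carries little information. Note also that scikit-learn's LinearRegression solves ordinary least squares directly, so "training longer" does not apply to it; that advice is relevant to iteratively trained models.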

y_pred = array([[433.34494686],
                [433.20384407],
                [418.6791427 ],
                [433.34789435],
                [407.49640802],
                [432.25311216]])

y_test = array([[ 119.5],
                [ 157. ],
                [5712. ],
                [  56. ],
                [  50. ],
                [ 680. ]])