Simple model fitting error with pandas: Found input variables with inconsistent numbers of samples

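I know this question exists in various forms, but after searching online for hours over several days, I still haven't found anything that solves my problem.

Here is my notebook: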

import numpy as np
import pandas as pd

X = pd.read_csv('../input/web-traffic-time-series-forecasting/train_1.csv.zip')
X = X.drop('Page', axis=1)
X.fillna(0, inplace=True, axis=0)

X_sliced = X.iloc[:, 0:367]
y_sliced = X.iloc[:, 367:-1]

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

linreg = LinearRegression()

X_sliced.drop(X_sliced.iloc[:, 182:367], inplace=True, axis=1) #Here, I make sure that my X_sliced has the same shape as y_sliced

X_sliced.shape
(145063, 182)

y_sliced.shape
(145063, 182)

X_train, y_train, X_test, y_test = train_test_split(X_sliced, y_sliced)
linreg.fit(X_train, y_train)
ValueError: Found input variables with inconsistent numbers of samples: [108797, 36266]

Why am I getting this error when the DataFrames have exactly the same shape?


Link to the original assignment on Kaggle:

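You have assigned the outputs of train_test_split in the wrong order. The function returns both splits of the first array, then both splits of the second, so in your code y_train actually holds the test portion of X (36266 rows). That is why fit sees 108797 samples against 36266. It should be: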

X_train, X_test, y_train, y_test = train_test_split(X_sliced, y_sliced) # x, x, y, y not x, y, x, y
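
To see the effect of the ordering on something small, here is a minimal sketch that uses synthetic NumPy arrays rather than the Kaggle file. The names X_demo, y_demo and mismatch_error, and the 100-row size, are made up for illustration. It first reproduces the mismatch by fitting on a train part against a test part, then fits with the correctly unpacked splits.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (assumption): 100 rows, 5 features, 3 targets.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 5))
y_demo = rng.random((100, 3))

# train_test_split returns the splits grouped per input array:
# X_train, X_test, y_train, y_test.
X_train, X_test, y_train, y_test = train_test_split(X_demo, y_demo, random_state=0)

# With the default test_size of 0.25, 100 rows become 75 train and 25 test rows.
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

# Reproduce the mistake: the swapped unpacking effectively passed X_test as the target.
mismatch_error = None
try:
    LinearRegression().fit(X_train, X_test)
except ValueError as err:
    mismatch_error = str(err)
    print(mismatch_error)

# Cheap guard before fitting: train features and targets must have equal row counts.
assert X_train.shape[0] == y_train.shape[0]

linreg = LinearRegression()
linreg.fit(X_train, y_train)
```

Checking the row counts right after the split, as the assert does here, catches a swapped unpacking before the model ever sees the data.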