Python IndexError: index 4 is out of bounds for axis 1 with size 4
I'm learning machine learning online. In the multiple regression model, when I write the following code:
# multiple linear regression
import pandas as pd
import numpy as np
dataset = pd.read_csv("50_Startups.csv")
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
labelencoder_x = LabelEncoder()
x[:, 3] = labelencoder_x.fit_transform(x[:, 3])
ct = ColumnTransformer(
[('one_hot_encoder', OneHotEncoder(categories="auto"), [3])],
remainder="passthrough"
)
# avoiding the dummy variable trap
x = x[:, 1:]
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
# fitting multiple linear regression to the training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train, y_train)
# predicting the test set results
y_pred = regressor.predict(x_test)
import statsmodels.formula.api as sm
x = np.append(arr = np.ones((50, 1)).astype(int), values = x, axis = 1)
x_opt = x[:, [0, 1, 2, 3, 4, 5]]
regressor_ols = sm.OLS(endog = y, exog = x_opt).fit()
regressor_ols.summary()
I get the following error:
Traceback (most recent call last):
File "/home/ashutosh/Machine Learning A-Z Template Folder/Part 2 - Regression/Section 5 - Multiple Linear Regression/P14-Multiple-Linear-Regression/Multiple_Linear_Regression/mlr.py", line 35, in <module>
x_opt = x[:, [0, 1, 2, 3, 4, 5]]
IndexError: index 4 is out of bounds for axis 1 with size 4
I checked several answers, but none of them describe the same problem as mine.
What can I do?
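You can download the dataset from here:
Your code is not reproducible. It results in NameError: name 'x' is not defined, because of values=x in append.
@G. Anderson Hopefully this time it helps you.
Print x before the line that doesn't work. Clearly there are only 4 elements here. Maybe you parsed it incorrectly from the CSV?
I get the same error.
What is x.shape just above the problem line? I'm guessing you only have 4 columns, so there is no column with index 4 or 5.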
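The shape check suggested in the comments can be sketched as follows. This is a minimal illustration, not the asker's actual data: the arrays `numeric` and `state` are hypothetical stand-ins, and the assumption that the categorical column has 3 distinct values is mine. In the posted code, `ct` is constructed but `ct.fit_transform(x)` is never called, so one plausible reading is that the categorical column is never expanded into dummy columns. The sketch counts columns in both cases, using a hand-rolled one-hot step in plain NumPy in place of the scikit-learn transformer.

```python
import numpy as np

n_rows = 50
wanted = [0, 1, 2, 3, 4, 5]  # the column indices used in the question

# Hypothetical stand-in data: 3 numeric columns and 1 categorical column
# (assumed to have 3 distinct values).
numeric = np.arange(n_rows * 3, dtype=float).reshape(n_rows, 3)
state = np.array(["A", "B", "C"])[np.arange(n_rows) % 3]

# Case 1: the categorical column is only label-encoded, never one-hot encoded.
x = np.column_stack([numeric, np.arange(n_rows) % 3])
print(x.shape)  # (50, 4)
x = x[:, 1:]    # drops one column, leaving 3
x = np.append(arr=np.ones((n_rows, 1)), values=x, axis=1)  # adds the intercept
print(x.shape)  # (50, 4)

try:
    x[:, wanted]
    raised = False
except IndexError as exc:
    raised = True
    message = str(exc)
    print(message)

# Case 2: the categorical column is expanded into one dummy column per category
# (a manual one-hot step standing in for the encoder).
dummies = (state[:, None] == np.unique(state)[None, :]).astype(float)
x_fixed = np.hstack([dummies, numeric])
print(x_fixed.shape)  # (50, 6)
x_fixed = x_fixed[:, 1:]  # drop one dummy column to avoid the dummy variable trap
x_fixed = np.append(arr=np.ones((n_rows, 1)), values=x_fixed, axis=1)
print(x_fixed.shape)  # (50, 6)

x_opt = x_fixed[:, wanted]  # indices 0..5 are now all valid
print(x_opt.shape)  # (50, 6)
```

If printing x.shape in the real script shows 4 columns right before the failing line, the same arithmetic applies there: either the encoding step has to actually be applied to x, or the index list has to be limited to the columns that exist.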