Python ValueError: Number of features of the model must match the input. Model n_features is 11 and input n_features is 2


python numpy machine-learning jupyter-notebook


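When I run the code below in a Jupyter notebook, I get a ValueError:

ValueError: Number of features of the model must match the input. Model n_features is 11 and input n_features is 2

How can I fix this?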

# Visualising the Training set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))

I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-42-bc13e66e79fe> in <module>
      4 X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
      5                      np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
----> 6 plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
      7              alpha = 0.75, cmap = ListedColormap(('red', 'green')))
      8 plt.xlim(X1.min(), X1.max())

~\anaconda3\lib\site-packages\sklearn\ensemble\_forest.py in predict(self, X)
    627             The predicted classes.
    628         """
--> 629         proba = self.predict_proba(X)
    630 
    631         if self.n_outputs_ == 1:

~\anaconda3\lib\site-packages\sklearn\ensemble\_forest.py in predict_proba(self, X)
    671         check_is_fitted(self)
    672         # Check data
--> 673         X = self._validate_X_predict(X)
    674 
    675         # Assign chunk of trees to jobs

~\anaconda3\lib\site-packages\sklearn\ensemble\_forest.py in _validate_X_predict(self, X)
    419         check_is_fitted(self)
    420 
--> 421         return self.estimators_[0]._validate_X_predict(X, check_input=True)
    422 
    423     @property

~\anaconda3\lib\site-packages\sklearn\tree\_classes.py in _validate_X_predict(self, X, check_input)
    394         n_features = X.shape[1]
    395         if self.n_features_ != n_features:
--> 396             raise ValueError("Number of features of the model must "
    397                              "match the input. Model n_features is %s and "
    398                              "input n_features is %s "

ValueError: Number of features of the model must match the input. Model n_features is 11 and input n_features is 2


Full model code:

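I fixed the code according to how I understand the problem, adding a few extra lines. The main issue is that you pass only columns 1 and 2 to the predictor, but the classifier was trained on 11 columns (1-11) and expects all of them. Columns 3-11 therefore have to be filled in somehow. At the very least, you can fill them with zeros.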
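The zero-filling option mentioned above could look roughly like the sketch below. The helper name `pad_grid_with_zeros` and the coarse demo grid are assumptions made for illustration; they are not part of the original code.

```python
import numpy as np

def pad_grid_with_zeros(X1, X2, n_features):
    """Build an (n_points, n_features) array from a 2-D mesh grid.

    Columns 0 and 1 come from the grid; all remaining columns are zeros.
    (Hypothetical helper, not part of the original code.)
    """
    grid = np.zeros((X1.size, n_features))
    grid[:, 0] = X1.ravel()
    grid[:, 1] = X2.ravel()
    return grid

# Small demo grid (a coarse step keeps the array tiny)
X1, X2 = np.meshgrid(np.arange(-1.0, 1.0, 0.5), np.arange(-1.0, 1.0, 0.5))
grid = pad_grid_with_zeros(X1, X2, n_features=11)
print(grid.shape)  # (16, 11)
```

With the variables from the question, `classifier.predict(pad_grid_with_zeros(X1, X2, 11)).reshape(X1.shape)` could then be passed to `plt.contourf`. Because the features were standardized with `StandardScaler`, a zero in a filled column corresponds to that column's mean in the training set.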

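In my solution I sort the data by column 1. Then, when building the mesh grid, I approximate the columns 3-11 needed for prediction by looking up the row whose column-1 value is closest to each X1 value in the grid. In other words, I try to find the best approximation of columns 3-11 given only column 1, rather than filling columns 3-11 with zeros (which would also work).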
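A toy illustration of that lookup, using made-up numbers rather than the real dataset: `np.searchsorted` returns, for each query, the index of the first sorted row whose column-0 value is greater than or equal to the query, and `np.minimum` clips indices that fall past the last row. Note that this picks a neighbouring row, not strictly the nearest one.

```python
import numpy as np

# Toy data: 5 rows, 4 columns, already sorted by column 0 (made-up values)
sorted_rows = np.array([
    [0.0, 10.0, 100.0, 1000.0],
    [1.0, 11.0, 101.0, 1001.0],
    [2.0, 12.0, 102.0, 1002.0],
    [3.0, 13.0, 103.0, 1003.0],
    [4.0, 14.0, 104.0, 1004.0],
])

# Values of the first feature taken from a (hypothetical) mesh grid
queries = np.array([-0.5, 1.0, 2.5, 9.0])

# searchsorted gives the insertion index; clip it so it stays a valid row index
idx = np.minimum(sorted_rows.shape[0] - 1,
                 np.searchsorted(sorted_rows[:, 0], queries))

# Columns 2 onward of the matched rows serve as filler features
filler = sorted_rows[idx, 2:]
print(idx)  # [0 1 3 4]
```

The `filler` array has one row per query, so it can be stacked next to the two grid columns to form a full-width input for the classifier.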

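I also commented out the line from sklearn.cross_validation import train_test_split and replaced it with from sklearn.model_selection import train_test_split. The first form belongs to old versions of sklearn; the submodule was renamed, so only the second form works in newer versions. Pick whichever variant matches your installed version.
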
# Random Forest Classification

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('finalplacementdata3.csv')
X = dataset.iloc[:, range(1, 12)].values
y = dataset.iloc[:, 12].values

siX = np.lexsort((X[:, 1], X[:, 0]))
sX, sy = X[siX], y[siX]

# Splitting the dataset into the Training set and Test set
#from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Fitting Random Forest Classification to the Training set
from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(n_estimators = 10, criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

# Visualising the Training set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
                     
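# The grid is in scaled units, so look up rows in the scaled, sorted data
ssX = sc.transform(sX)
riX = np.minimum(ssX.shape[0] - 1, np.searchsorted(ssX[:, 0], X1.ravel()))
rX = ssX[riX]  # index the sorted array, not the unsorted X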

plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()] + list(rX[:, 2:].T)).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Random Forest Classification (Training set)')
plt.xlabel('Quants')
plt.ylabel('CGPA')
plt.legend()
plt.show()

# Visualising the Test set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))

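# The grid is in scaled units, so look up rows in the scaled, sorted data
ssX = sc.transform(sX)
riX = np.minimum(ssX.shape[0] - 1, np.searchsorted(ssX[:, 0], X1.ravel()))
rX = ssX[riX]  # index the sorted array, not the unsorted X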

plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()] + list(rX[:, 2:].T)).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Random Forest Classification (Test set)')
plt.xlabel('Quants')
plt.ylabel('CGPA')
plt.legend()
plt.show()

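Your model (classifier) was trained with 11 numbers in each X input, but you are feeding it only 2. That is, your prediction array np.array([X1.ravel(), X2.ravel()]).T has only two columns when it should have 11. If you post your model's code, we can look into the problem. Alternatively, you can build 11 columns from the same 11 features used above, i.e. X1, X2, X3 ... X11, preferably as a single array.

@Arty Please check the full model code from here -->

Yes, in your code you train the model to predict column 12 from columns 1-11. So in the last part of the code, where you visualize and predict (and where the exception is raised), you supply only the two columns X1 and X2, but you need to supply all 11.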
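The mismatch described in these comments can be caught before calling `predict` with a simple shape check. The sketch below is only an illustration: `check_feature_count` is a hypothetical helper, and the expected count is passed in by hand rather than read from a fitted model.

```python
import numpy as np

def check_feature_count(X, n_expected):
    """Raise ValueError if X does not have n_expected columns.

    (Hypothetical helper for illustration; not a scikit-learn function.)
    """
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {X.ndim}-D")
    if X.shape[1] != n_expected:
        raise ValueError(
            f"model expects {n_expected} features, input has {X.shape[1]}"
        )
    return X

# Same construction as in the question, on a tiny grid
X1, X2 = np.meshgrid(np.arange(0.0, 1.0, 0.5), np.arange(0.0, 1.0, 0.5))
grid = np.array([X1.ravel(), X2.ravel()]).T
print(grid.shape)  # (4, 2)

try:
    check_feature_count(grid, n_expected=11)
except ValueError as exc:
    message = str(exc)
    print(message)
```

With a fitted estimator, the expected count would come from the model itself: `classifier.n_features_in_` in recent scikit-learn releases, or `n_features_` in older ones such as the version shown in the traceback.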