How to integrate SMOTE resampling and feature selection into RFECV in Python

python python-3.x

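I am working on a dataset of shape (41188, 58) to build a binary classifier. The data is highly imbalanced. Initially, I intended to do feature selection with RFECV. This is the code I used, which is borrowed from:

I got the following result:

Then I changed the code to cv=StratifiedKFold(2) and min_features_to_select=20, and this time I got:

In none of the above cases was any resampling performed. Resampling should be applied only to the training data, and since I am using cross-validation here, each training fold should be resampled (e.g. with SMOTE) as well. I would like to know how to integrate resampling and feature selection into RFECV.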

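import matplotlib.pyplot as plt
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# X, y: the feature matrix and binary labels of the dataset described above
# (loading is not shown in the question).

# Create the RFE object and compute a cross-validated score.
svc = SVC(kernel="linear")

# The "accuracy" scoring is proportional to the number of correct classifications
min_features_to_select = 1  # Minimum number of features to consider
rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(5),
              scoring='accuracy',
              min_features_to_select=min_features_to_select)
rfecv.fit(X, y)

print("Optimal number of features : %d" % rfecv.n_features_)

# Plot number of features VS. cross-validation scores.
# scikit-learn >= 1.0 exposes the mean CV score per step in cv_results_;
# the older grid_scores_ attribute was removed in 1.2.
mean_scores = rfecv.cv_results_["mean_test_score"]
plt.figure()
plt.plot(range(min_features_to_select,
               len(mean_scores) + min_features_to_select), mean_scores)
plt.show()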