ValueError in recursive feature elimination with an rbf-kernel SVM in Python scikit-learn


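I am trying to use the recursive feature elimination (RFE) function in scikit-learn, but I keep getting the error

ValueError: coef_ is only available when using a linear kernel

I am trying to perform feature selection for a support vector classifier (SVC) with an rbf kernel. The example from the website executes fine: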
print(__doc__)

from sklearn.svm import SVC
from sklearn.cross_validation import StratifiedKFold
from sklearn.feature_selection import RFECV
from sklearn.datasets import make_classification
from sklearn.metrics import zero_one_loss

# Build a classification task using 3 informative features
X, y = make_classification(n_samples=1000, n_features=25, n_informative=3,
                           n_redundant=2, n_repeated=0, n_classes=8,
                           n_clusters_per_class=1, random_state=0)

# Create the RFE object and compute a cross-validated score.
svc = SVC(kernel="linear")
rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(y, 2),
              scoring='accuracy')
rfecv.fit(X, y)

print("Optimal number of features : %d" % rfecv.n_features_)

# Plot number of features VS. cross-validation scores
import pylab as pl
pl.figure()
pl.xlabel("Number of features selected")
pl.ylabel("Cross validation score (nb of misclassifications)")
pl.plot(range(1, len(rfecv.grid_scores_) + 1), rfecv.grid_scores_)
pl.show()

However, simply changing the kernel type from linear to rbf, as shown here, produces the error:
print(__doc__)

from sklearn.svm import SVC
from sklearn.cross_validation import StratifiedKFold
from sklearn.feature_selection import RFECV
from sklearn.datasets import make_classification
from sklearn.metrics import zero_one_loss

# Build a classification task using 3 informative features
X, y = make_classification(n_samples=1000, n_features=25, n_informative=3,
                           n_redundant=2, n_repeated=0, n_classes=8,
                           n_clusters_per_class=1, random_state=0)

# Create the RFE object and compute a cross-validated score.
svc = SVC(kernel="rbf")
rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(y, 2),
              scoring='accuracy')
rfecv.fit(X, y)

print("Optimal number of features : %d" % rfecv.n_features_)

# Plot number of features VS. cross-validation scores
import pylab as pl
pl.figure()
pl.xlabel("Number of features selected")
pl.ylabel("Cross validation score (nb of misclassifications)")
pl.plot(range(1, len(rfecv.grid_scores_) + 1), rfecv.grid_scores_)
pl.show()

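This looks like it could be a bug, but if anyone can spot what I am doing wrong, that would be great. Also, I am running Python 2.7.6 with scikit-learn version 0.14.1.
Thanks for your help.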
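This seems to be the expected outcome. RFECV requires the estimator to have a coef_ attribute that signifies feature importance:
estimator : object
A supervised learning estimator with a fit method that updates a coef_ attribute that holds the fitted parameters. Important features must correspond to high absolute values in the coef_ array.
According to the documentation, changing the kernel to RBF means the SVC is no longer linear, and the coef_ attribute becomes unavailable:
coef_ : array, shape = [n_class-1, n_features]
Weights assigned to the features (coefficients in the primal problem). This is only available in the case of linear kernel.
SVC raises this error when RFECV tries to access coef_ while the kernel is not linear.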
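
The behavior described above can be checked directly. The following is a minimal sketch, not code from the original answer: the helper name has_coef and the small dataset are made up for illustration, and the imports assume a reasonably recent scikit-learn rather than the 0.14.1 release from the question. The question reports a ValueError; newer releases are understood to raise AttributeError instead, so the sketch catches both.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small illustrative dataset (sizes chosen arbitrarily for this sketch).
X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           n_redundant=2, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1, random_state=0)


def has_coef(kernel):
    """Fit an SVC with the given kernel and report whether coef_ is readable."""
    clf = SVC(kernel=kernel).fit(X, y)
    try:
        clf.coef_
        return True
    except (AttributeError, ValueError):
        # ValueError is what the question reports on 0.14.1;
        # AttributeError is assumed for more recent releases.
        return False


print("linear kernel exposes coef_:", has_coef("linear"))
print("rbf kernel exposes coef_:", has_coef("rbf"))

# With a linear kernel, features can be scored by absolute coefficient size,
# which is the kind of signal RFE relies on (its exact criterion may differ).
linear_svc = SVC(kernel="linear").fit(X, y)
importance = np.abs(linear_svc.coef_).sum(axis=0)
weakest = int(np.argmin(importance))
print("feature with the smallest absolute weight:", weakest)
```

The last three lines only illustrate why a coef_ array is needed: without per-feature weights there is nothing for the elimination step to rank.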
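Duplicate.
Thanks for the quick response. I don't know how I missed that in the documentation; it was right in front of me.
Hi, any quick suggestions on how to get the weights for an rbf kernel or any custom kernel?
@YS-L if you know the answer to this question, please let me know: . Thanks a lot :)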
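
On the question in the comments about feature weights for an rbf or custom kernel: a non-linear SVC has no per-feature weights in the input space, so one option sometimes used instead is permutation importance, which measures how much a score drops when a single feature is shuffled. This is a hedged sketch and not something proposed in the thread. It assumes scikit-learn 0.22 or later (where sklearn.inspection.permutation_importance exists), so it would not run on the 0.14.1 release from the question, and the dataset and variable names are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Illustrative data: 3 informative features out of 10.
X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the rbf-kernel classifier; no coef_ is needed for what follows.
rbf_svc = SVC(kernel="rbf").fit(X_train, y_train)

# Shuffle each feature in turn on held-out data and record the accuracy drop.
result = permutation_importance(rbf_svc, X_test, y_test, scoring="accuracy",
                                n_repeats=10, random_state=0)

# Rank features from most to least important by mean accuracy drop.
ranking = result.importances_mean.argsort()[::-1]
for idx in ranking:
    print("feature %d: %.3f" % (idx, result.importances_mean[idx]))
```

These scores are not coefficients and cannot be passed to RFECV as coef_; they only give a model-agnostic ranking that could guide manual feature selection.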