Python sklearn.feature_selection and RFECV


The dataset has 914 rows × 191 columns, for example:

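import pandas as pd
from sklearn import linear_model as lm
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import StratifiedKFold
from sklearn.feature_selection import RFECV, SelectPercentile

a = pd.read_csv('NCAA_2003-2016_with_diff.csv')

logreg = lm.LogisticRegression()

# scoring='?' is the open question: which metric should go here?
rfecv = RFECV(estimator=logreg, cv=10, scoring='?')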
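This means there are other "x" columns as well, and I am trying to select the most effective variables for predicting the result.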


How do I write the for loop to achieve this?

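To clarify, is "x" your set of features? Do you want to know how to do feature selection?
Where does "df" come from? You need to describe the data and what you are trying to do in more detail.
x holds the features and y is the response variable. I want to select a few features out of the 100+ in my dataset, based on a regression model; the metric could be "mean squared error" or "F-score". Is that clearer now?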
x = df[['diff_dist','team1_log5','tpp','orp','tempo','efg','ftr','blk']]
y = df[['result']]
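
A minimal sketch of how RFECV could be applied here, not a definitive answer. RFECV runs the elimination loop internally, so no explicit for loop should be needed. The NCAA CSV is not available, so the data below is a synthetic stand-in from make_classification (an assumption), and scoring="f1" is just one possible choice for a binary response. For a regression estimator, "neg_mean_squared_error" would be the corresponding scorer name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the real data (assumption): 300 rows, 20 features,
# of which only a handful are informative.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=2, random_state=0
)

logreg = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# RFECV drops one feature per step (step=1) and uses cross-validation
# to pick the number of features with the best mean score.
rfecv = RFECV(estimator=logreg, step=1, cv=cv, scoring="f1")
rfecv.fit(X, y)

selected = np.flatnonzero(rfecv.support_)  # indices of the kept features
print("number of features selected:", rfecv.n_features_)
print("selected column indices:", selected)
```

With a pandas DataFrame of features, the boolean mask rfecv.support_ can index the column labels directly (for example X.columns[rfecv.support_]) to recover the selected feature names. Passing y as a 1-D Series such as df['result'] rather than df[['result']] avoids a shape warning when fitting.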