Scikit-learn: RFE computation for recursive feature elimination


I have a dataframe called dataset_con_enc.

I am trying to apply recursive feature elimination for feature selection, like this:

# Load libraries
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn import datasets, linear_model
import warnings

# Suppress an annoying but harmless warning
warnings.filterwarnings(action="ignore", module="scipy", message="^internal gelsd")

# Calculating RFE for non-discretised dataset, and graphing the Importance for each feature, per dataset
selector1 = RFECV(LogisticRegression(), step=1, cv=5, n_jobs=-1)
selector1 = selector1.fit(dataset_con_enc.drop('target', axis=1).values, dataset_con_enc['target'].values)
But the last line of code raises an error:

ImportError                               Traceback (most recent call last)
<ipython-input-509-5e50f1655a89> in <module>()
      3 # Calculating RFE for non-discretised dataset, and graphing the Importance for each feature, per dataset
      4 selector1 = RFECV(LogisticRegression(), step=1, cv=5, n_jobs=-1)
----> 5 selector1 = selector1.fit(dataset_con_enc.drop('target', axis=1).values, dataset_con_enc['target'].values)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_selection\rfe.py in fit(self, X, y)
    434         scores = parallel(
    435             func(rfe, self.estimator, X, y, train, test, scorer)
--> 436             for train, test in cv.split(X, y))
    437 
    438         scores = np.sum(scores, axis=0)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self, iterable)
    747         self._aborting = False
    748         if not self._managed_backend:
--> 749             n_jobs = self._initialize_backend()
    750         else:
    751             n_jobs = self._effective_n_jobs()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in _initialize_backend(self)
    545         try:
    546             n_jobs = self._backend.configure(n_jobs=self.n_jobs, parallel=self,
--> 547                                              **self._backend_args)
    548             if self.timeout is not None and not self._backend.supports_timeout:
    549                 warnings.warn(

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py in configure(self, n_jobs, parallel, **backend_args)
    303         if already_forked:
    304             raise ImportError(
--> 305                 '[joblib] Attempting to do parallel computing '
    306                 'without protecting your import on a system that does '
    307                 'not support forking. To use parallel-computing in a '

ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
Can you help me fix this problem?
Thanks.

The obvious workaround is to set n_jobs=1 (which disables parallel computing), but I am not sure that is the solution you are looking for.
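For reference, a minimal sketch of that workaround, assuming the same dataset_con_enc dataframe with a 'target' column:

from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

X = dataset_con_enc.drop('target', axis=1).values
y = dataset_con_enc['target'].values

# n_jobs=1 keeps everything in a single process, so joblib never
# starts worker processes and the ImportError above cannot occur
selector1 = RFECV(LogisticRegression(), step=1, cv=5, n_jobs=1)
selector1 = selector1.fit(X, y)

print(selector1.n_features_)  # number of features kept
print(selector1.support_)     # boolean mask of the selected columns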

How are you running the code above? I can run it with my dataset from the terminal and from Spyder, in both Python 2 and Python 3, without any errors. Try updating your libraries to the latest versions. Thank you, that works.
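If you do want to keep n_jobs=-1, the traceback itself points to the other fix: when the code runs as a script on a system that does not support forking (for example Windows), protect the fitting code with an if __name__ == '__main__' guard so joblib can start its worker processes safely. A minimal sketch under that assumption, using a hypothetical dataset_con_enc.csv as a stand-in for however the dataframe is actually built:

import pandas as pd
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

def main():
    # Hypothetical loading step; replace with however dataset_con_enc is created
    dataset_con_enc = pd.read_csv('dataset_con_enc.csv')
    X = dataset_con_enc.drop('target', axis=1).values
    y = dataset_con_enc['target'].values

    selector1 = RFECV(LogisticRegression(), step=1, cv=5, n_jobs=-1)
    selector1 = selector1.fit(X, y)
    print(selector1.n_features_)

if __name__ == '__main__':
    # Without this guard, the worker processes joblib spawns would re-run
    # the module-level code on import and raise the ImportError shown above
    main()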