Python: How to parallelize with Jupyter and sklearn?

I am trying to parallelize scikit-learn's GridSearchCV. It runs in a Jupyter(Hub) notebook. After some research, I found the following code:

from sklearn.externals.joblib import Parallel, parallel_backend, register_parallel_backend
from ipyparallel import Client
from ipyparallel.joblib import IPythonParallelBackend

c = Client(profile='myprofile')
print(c.ids)
bview = c.load_balanced_view()

register_parallel_backend('ipyparallel', lambda : IPythonParallelBackend(view=bview))

grid = GridSearchCV(pipeline, cv=3, n_jobs=4, param_grid=param_grid)

with parallel_backend('ipyparallel'):
    grid.fit(X_train, Y_train)
Note that I have set the n_jobs parameter to 4, which is the number of CPU cores on the machine (this is what nproc returns).
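The value from nproc can be cross-checked from inside the notebook kernel, since that is the process that would spawn the workers. The sketch below uses only the standard library. The helper name usable_cpu_count is made up for this illustration, and os.sched_getaffinity is only available on some platforms (Linux, for example), hence the fallback.

```python
import os


def usable_cpu_count():
    """Return the number of CPUs this process may run on (hypothetical helper)."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity masks, which a container or hub may restrict.
        return len(os.sched_getaffinity(0))
    # Fallback: total logical CPUs, or 1 if that cannot be determined.
    return os.cpu_count() or 1


n_cpus = usable_cpu_count()
print("CPUs usable by this kernel:", n_cpus)
```

If this prints a smaller number than expected, n_jobs=4 would oversubscribe the kernel rather than speed it up.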

But it does not seem to work. I get ImportError: cannot import name 'register_parallel_backend', even though I installed joblib with conda install joblib and also tried pip install -U joblib.
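sklearn.externals.joblib is a copy of joblib bundled with scikit-learn, so installing or upgrading the standalone joblib package does not change what that import path provides. A minimal diagnostic sketch is to probe which module actually exposes register_parallel_backend in the running kernel. The helper name and the list of candidate modules are choices made for this illustration, not part of any library.

```python
import importlib

# Candidate import paths to probe (an assumption for this sketch).
CANDIDATES = ("joblib", "sklearn.externals.joblib")


def find_register_parallel_backend(candidates=CANDIDATES):
    """Return (module_name, function) for the first module exposing the name."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is not installed or no longer exists in this version.
            continue
        func = getattr(module, "register_parallel_backend", None)
        if func is not None:
            return name, func
    return None, None


module_name, register = find_register_parallel_backend()
print("register_parallel_backend found in:", module_name)
```

If the name is found under joblib but not under sklearn.externals.joblib, importing Parallel, parallel_backend and register_parallel_backend from joblib directly may be worth trying. Whether GridSearchCV then honors the registered backend depends on the installed scikit-learn version.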

So what is the best way to parallelize GridSearchCV in this environment?

Update:

Without ipyparallel, just setting the n_jobs parameter:

grid = GridSearchCV(pipeline, cv=3, n_jobs=4, param_grid=param_grid)
grid.fit(X_train, Y_train)
results in the following warning message:

/opt/conda/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py:540: UserWarning:

Multiprocessing-backed parallel loops cannot be nested, setting n_jobs=1

It seems that it ends up executing sequentially rather than in parallel.
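One situation in which joblib is known to fall back to n_jobs=1 with this message, at least in releases from that period, is when the calling process is itself a daemonic multiprocessing worker, since such processes cannot start children. Whether that applies to this notebook kernel is an assumption. A quick standard-library check that can be run in a cell:

```python
import multiprocessing
import threading

# Inspect the process and thread that would call grid.fit().
proc = multiprocessing.current_process()
in_daemon_process = bool(proc.daemon)
in_main_thread = threading.current_thread() is threading.main_thread()

print("process name:", proc.name)
print("daemonic process:", in_daemon_process)
print("main thread:", in_main_thread)
```

If the kernel reports a daemonic process, the nesting comes from how the kernel was launched rather than from the GridSearchCV call itself.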

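I think n_jobs=-1 will use all CPU cores in parallel.

@AlexanderYau: Just setting the parameter throws the error message; I have updated the post.

How many CPU cores does your machine have?

@AlexanderYau It has exactly 4 CPU cores.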