Python PicklingError: Could not pickle the task to send it to the workers
I am working on an NLP Kaggle project in which I use RandomizedSearchCV. I have defined a function named GO that runs RandomizedSearchCV with a KFold splitter, a scorer, and a parameter grid. My code is below; when I call GO, it raises an error:
from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import KFold, RandomizedSearchCV

kf = KFold(n_splits=5, random_state=0, shuffle=True)
acc = lambda y, y_pred: accuracy_score(y, y_pred)
scorer = make_scorer(acc, greater_is_better=True)

def GO(model, grid, n_iter=100):
    search = RandomizedSearchCV(model, grid, n_iter, scorer, n_jobs=-1, cv=kf, random_state=0, verbose=True)
    return search.fit(X_train, y_train)
This is the error I get:
PicklingError Traceback (most recent call last)
<ipython-input-131-310dea03e0ad> in <module>
3
4 for pipe, grid in zip(pipes, grids):
----> 5 fitted_models.append(GO(pipe, grid))
<ipython-input-129-98eb26241ea1> in GO(model, grid, n_iter)
1 def GO(model, grid, n_iter=100):
2 search = RandomizedSearchCV(model, grid, n_iter, scorer, n_jobs=-1, cv=kf, random_state=0, verbose=True)
----> 3 return search.fit(X_train, y_train)
~\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
720 return results_container[0]
721
--> 722 self._run_search(evaluate_candidates)
723
724 results = results_container[0]
~\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in _run_search(self, evaluate_candidates)
1513 evaluate_candidates(ParameterSampler(
1514 self.param_distributions, self.n_iter,
-> 1515 random_state=self.random_state))
~\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in evaluate_candidates(candidate_params)
709 for parameters, (train, test)
710 in product(candidate_params,
--> 711 cv.split(X, y, groups)))
712
713 all_candidate_params.extend(candidate_params)
~\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self, iterable)
928
929 with self._backend.retrieval_context():
--> 930 self.retrieve()
931 # Make sure that we get a last message telling us we are done
932 elapsed_time = time.time() - self._start_time
~\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in retrieve(self)
831 try:
832 if getattr(self._backend, 'supports_timeout', False):
--> 833 self._output.extend(job.get(timeout=self.timeout))
834 else:
835 self._output.extend(job.get())
~\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py in wrap_future_result(future, timeout)
519 AsyncResults.get from multiprocessing."""
520 try:
--> 521 return future.result(timeout=timeout)
522 except LokyTimeoutError:
523 raise TimeoutError()
~\Anaconda3\lib\concurrent\futures\_base.py in result(self, timeout)
430 raise CancelledError()
431 elif self._state == FINISHED:
--> 432 return self.__get_result()
433 else:
434 raise TimeoutError()
~\Anaconda3\lib\concurrent\futures\_base.py in __get_result(self)
382 def __get_result(self):
383 if self._exception:
--> 384 raise self._exception
385 else:
386 return self._result
PicklingError: Could not pickle the task to send it to the workers.
I tried to fix it but could not. Can anyone here help me?
What is in model?
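To narrow down which object cannot be serialized, one option is to try pickling each argument separately before handing them to the parallel search. The following is a minimal diagnostic sketch, not a confirmed fix: find_unpicklable, named_accuracy and anonymous_accuracy are hypothetical names introduced here, and the check uses the standard pickle module, which may be stricter than the serializer joblib uses for its worker processes, so a failure here is a hint rather than proof.

```python
import pickle


def find_unpicklable(**objects):
    """Return {name: error message} for every object that pickle.dumps rejects."""
    failures = {}
    for name, obj in objects.items():
        try:
            pickle.dumps(obj)
        except Exception as exc:  # e.g. PicklingError, AttributeError, TypeError
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


def named_accuracy(y_true, y_pred):
    """Hypothetical module-level stand-in for a scoring function."""
    matches = sum(1 for a, b in zip(y_true, y_pred) if a == b)
    return matches / len(y_true)


# A lambda has no importable name, so the standard pickle module rejects it.
anonymous_accuracy = lambda y, y_pred: named_accuracy(y, y_pred)

report = find_unpicklable(
    named_accuracy=named_accuracy,
    anonymous_accuracy=anonymous_accuracy,
    grid={"C": [0.1, 1.0, 10.0]},
)
for name, message in report.items():
    print(f"{name} -> {message}")
```

In the notebook from the question, the same helper could be called as find_unpicklable(model=pipe, grid=grid, scorer=scorer, kf=kf) inside the loop over pipes and grids. If only the scorer is reported, passing accuracy_score to make_scorer directly instead of wrapping it in a lambda is worth trying; if the pipeline itself is reported, its individual steps are the next thing to check. Running once with n_jobs=1 also avoids sending tasks to worker processes, which helps confirm whether serialization is the problem.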