Can Python RandomizedSearchCV run on multiple CPUs?
I am trying to use RandomizedSearchCV to learn the hyperparameters of a logistic regression. The code is as follows:
random_searcher = RandomizedSearchCV(
    estimator = MyLogRegClassifier(),
    param_distributions = {'penalty': ['l2', 'l1'],
                           'class_weight': [None, 'auto'],
                           'C': logspace(-20, 20, 10000),
                           'intercept_scaling': logspace(-20, 20, 10000)},
    cv = 4,
    scoring = make_scorer(f1_score, average='samples'),
    n_iter = 100,
    n_jobs = -1,
    pre_dispatch = 10,
    refit = False
)
random_searcher.fit(X_tr, y_tr)
However, it does not use multiple CPUs; it always runs on a single CPU. When I switch to GridSearchCV, it spawns multiple processes and loads all CPUs:
grid_searcher = GridSearchCV(
    estimator = MyLogRegClassifier(),
    param_grid = {'penalty': ['l2', 'l1'],
                  'class_weight': [None, 'auto'],
                  'C': logspace(-20, 20, 10000),
                  'intercept_scaling': logspace(-20, 20, 10000)},
    cv = 4,
    scoring = make_scorer(f1_score, average='samples'),
    n_jobs = -1,
    pre_dispatch = 10,
    refit = False
)
grid_searcher.fit(X_tr, y_tr)
X_tr is a sparse matrix of shape (6700, 25640), and y_tr is a dense matrix of shape (6700, 83). MyLogRegClassifier is essentially OneVsRestClassifier(LogisticRegression()).
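Am I missing something?

Why do you wrap LogisticRegression in MyLogRegClassifier? You can use OneVsRestClassifier directly with the grid search. All you have to do is prefix the keys of the parameter dict with estimator__, for example estimator__penalty.

Because if I need to tweak the classifier in some way, it is better to change the code in one place. Thanks for the tip, though.

That is strange. Can you give a minimal example? It should definitely work. Which version of sklearn are you using? Also, do not use logspace for RandomizedSearchCV. Providing a list instead of a continuous distribution basically throws most of the benefit of RandomizedSearchCV out the window.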
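from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

class MyLogRegClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, C=1.0, penalty='l2',
                 intercept_scaling=1, class_weight=None, n_jobs=1):
        # Store every constructor argument under the same name so that
        # get_params/set_params (and therefore the search classes) can see it.
        self.C = C
        self.n_jobs = n_jobs
        self.penalty = penalty
        self.intercept_scaling = intercept_scaling
        self.class_weight = class_weight

    def fit(self, X, y):
        # Build the one-vs-rest model from the current hyperparameters.
        self.clf = OneVsRestClassifier(LogisticRegression(
            C=self.C, penalty=self.penalty,
            intercept_scaling=self.intercept_scaling,
            class_weight=self.class_weight),
            n_jobs=self.n_jobs)
        self.clf.fit(X, y)
        return self

    def predict(self, X):
        return self.clf.predict(X)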
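
As an alternative to the wrapper, the two suggestions from the comments (using OneVsRestClassifier directly with estimator__ keys, and sampling C from a continuous distribution) could look roughly like the sketch below. This is not the asker's code: the synthetic data from make_multilabel_classification stands in for X_tr and y_tr, run_search is a hypothetical helper name, the range passed to loguniform is an arbitrary choice, and scipy.stats.loguniform assumes a reasonably recent SciPy. Other keys such as estimator__penalty would be added to the dict in the same way. The sketch only shows how the search is set up; it does not by itself explain the single-CPU behaviour described in the question.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.multiclass import OneVsRestClassifier


def run_search(n_jobs=-1, n_iter=8, seed=0):
    """Randomized search over OneVsRestClassifier(LogisticRegression())."""
    # Small synthetic multilabel problem (assumption: stands in for X_tr, y_tr).
    X, y = make_multilabel_classification(
        n_samples=200, n_features=20, n_classes=5, random_state=seed)

    search = RandomizedSearchCV(
        estimator=OneVsRestClassifier(LogisticRegression(max_iter=1000)),
        param_distributions={
            # The "estimator__" prefix routes the value to the inner
            # LogisticRegression held by OneVsRestClassifier.
            "estimator__C": loguniform(1e-3, 1e2),  # continuous, not a list
            "estimator__class_weight": [None, "balanced"],
        },
        cv=4,
        scoring="f1_samples",  # same metric as f1_score(average='samples')
        n_iter=n_iter,
        n_jobs=n_jobs,         # -1 asks joblib to use all available cores
        refit=False,
        random_state=seed,
    )
    search.fit(X, y)
    return search


if __name__ == "__main__":
    result = run_search()
    print(result.best_params_, result.best_score_)
```

Calling run_search(n_jobs=1) and run_search(n_jobs=-1) on the same machine while watching CPU usage is one simple way to check whether the parallel backend is actually being used.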