Scikit-learn: XGBRegressor much slower than GradientBoostingRegressor
I am new to xgboost, and I am trying to learn how to use it by comparing it with the traditional gbm. However, I noticed that xgboost is much slower than gbm. For example:
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.ensemble import GradientBoostingRegressor
from xgboost import XGBRegressor
from sklearn.datasets import load_boston
import time
boston = load_boston()
X = boston.data
y = boston.target
kf = KFold(n_splits = 5)
cv_params = {'cv': kf, 'scoring': 'r2', 'n_jobs': 4, 'verbose': 1}
gbm = GradientBoostingRegressor()
xgb = XGBRegressor()
grid = {'n_estimators': [100, 300, 500], 'max_depth': [3, 5]}
timer = time.time()
gbm_cv = GridSearchCV(gbm, param_grid = grid, **cv_params).fit(X, y)
print('GBM time: ', time.time() - timer)
timer = time.time()
xgb_cv = GridSearchCV(xgb, param_grid = grid, **cv_params).fit(X, y)
print('XGB time: ', time.time() - timer)
On a MacBook Pro with 8 cores, the output is:
Fitting 5 folds for each of 6 candidates, totalling 30 fits
[Parallel(n_jobs=4)]: Done 30 out of 30 | elapsed: 1.9s finished
GBM time: 2.262791872024536
Fitting 5 folds for each of 6 candidates, totalling 30 fits
[Parallel(n_jobs=4)]: Done 30 out of 30 | elapsed: 16.4s finished
XGB time: 17.902266025543213
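I thought xgboost should be much faster, so I must be doing something wrong. Can someone help point out what I am doing wrong?

Here is the output when run on my machine, without setting the n_jobs parameter in cv_params: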
Fitting 5 folds for each of 6 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 4.1s finished
('GBM time: ', 4.248916864395142)
Fitting 5 folds for each of 6 candidates, totalling 30 fits
('XGB time: ', 2.934467077255249)
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 2.9s finished
When n_jobs is set to 4, GBM finishes in 2.5 s, but XGB takes a very long time.
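So this may be a problem with n_jobs! Perhaps the XGBoost library is not well configured to run with n_jobs under GridSearchCV.

This is what I get when running your code: GBM time: 2.1901206970214844, XGB time: 2.563245534896856. I believe it may be nthreads in the XGBoost library, rather than n_jobs, that causes this issue in at least some versions of XGBoost. I have definitely run into it before. Yes, I just found this: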