Python: plotting XGBoost evaluation metrics when tuning with Bayesian optimization
I am using this code to tune and train XGBoost with Bayesian optimization. I want to plot the log loss against epochs, but I have not found a way to do it. Here is my XGBoost code:
import xgboost as xgb

def bayes_fun(parameters):
    '''
    Function that sets parameters and performs cross-validation for Bayesian optimisation
    '''
    parameters = parameters[0]  # take the single row of values passed in by the optimiser
    # setting classifier parameters
    reg = xgb.XGBClassifier(objective = 'multi:softmax',           # multi-class classification objective
                            num_class = 3,                         # number of classes
                            eval_metric = 'mlogloss',              # multi-class log loss
                            base_score = 1,                        # specific to scoring
                            learning_rate = parameters[8],         # learning rate
                            random_state = 1234,                   # set seed
                            max_depth = int(parameters[0]),        # maximum depth trees can grow
                            min_child_weight = parameters[1],      # amount of weight for a tree to produce a child
                            gamma = parameters[2],                 # controls minimum loss reduction
                            reg_alpha = parameters[3],             # L1 regularisation term on weights
                            reg_lambda = parameters[4],            # L2 regularisation term on weights
                            subsample = parameters[5],             # ratio of the training instances
                            colsample_bytree = parameters[6],      # ratio of features used when constructing a tree
                            colsample_bylevel = parameters[7],     # ratio of features used at each level of each tree
                            max_delta_step = parameters[9])        # maximum delta step allowed for each tree's weight estimate
    reg_param = reg.get_xgb_params()  # getting parameters from classifier
    # cross validation (DTrain and folds_list are defined elsewhere)
    bst = xgb.cv(params = reg_param,             # setting parameters
                 dtrain = DTrain,                # training data
                 folds = folds_list,             # list of folds to use
                 num_boost_round = 9999,         # maximum number of boosting rounds
                 early_stopping_rounds = 30,     # early stopping (reduce overfitting)
                 verbose_eval = False)           # don't show output text
    score = -100 * bst['test-mlogloss-mean'].iloc[-1]  # if you are using negative log-likelihood use minus
    return score
# assumption: the keyword arguments used below match GPyOpt's BayesianOptimization
from GPyOpt.methods import BayesianOptimization

# set parameter ranges (the order must match the indices used in bayes_fun)
domains = [{'name': 'max_depth', 'type': 'discrete', 'domain': (2, 10)},
           {'name': 'minchild', 'type': 'continuous', 'domain': (5, 500)},
           {'name': 'gamma', 'type': 'continuous', 'domain': (0, 1)},
           {'name': 'alpha', 'type': 'continuous', 'domain': (0, 10)},
           {'name': 'lmbd', 'type': 'continuous', 'domain': (0, 50)},
           {'name': 'subsample', 'type': 'continuous', 'domain': (0.50, 0.99)},
           {'name': 'colsample_bytree', 'type': 'continuous', 'domain': (0.40, 0.70)},
           {'name': 'colsample_bylevel', 'type': 'continuous', 'domain': (0.40, 0.70)},
           {'name': 'learning_rate', 'type': 'continuous', 'domain': (0, 1)},
           {'name': 'max_delta_step', 'type': 'continuous', 'domain': (0, 1)}]
# Bayesian optimisation (check package notes to adjust parameters to suit your data)
optimizer = BayesianOptimization(f=bayes_fun, domain=domains, model_type='GP',
                                 acquisition_type='LCB', initial_design_type='latin',
                                 initial_design_numdata=5, exact_feval=True, maximize=False)
# run the optimisation for up to 200 iterations
optimizer.run_optimization(max_iter=200, verbosity=True)
Below is a snippet I found that gets the log loss at each epoch for plotting, but it only works through a plain XGBClassifier() call:
import matplotlib.pyplot as pyplot

# fit while recording both metrics on every dataset in eval_s
model.fit(X, data_nobands2.categoric, eval_metric=['merror', 'mlogloss'], eval_set=eval_s)
results = model.evals_result()
epochs = len(results['validation_0']['merror'])
x_axis = range(0, epochs)
# plot log loss (validation_0 and validation_1 follow the order of the sets in eval_s)
fig, ax = pyplot.subplots()
ax.plot(x_axis, results['validation_0']['mlogloss'], label='Train')
ax.plot(x_axis, results['validation_1']['mlogloss'], label='Test')
ax.legend()
pyplot.ylabel('Log Loss')
pyplot.title('XGBoost Log Loss')
pyplot.show()
I think part of that last snippet could be reused with the Bayesian optimization, but I don't know how. Has anyone done this, or something similar?
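
One possible direction, offered as a rough sketch rather than a tested answer: the table returned by xgb.cv already holds one row per boosting round, with a 'train-mlogloss-mean' column next to the 'test-mlogloss-mean' column used above, so the per-round curves could be saved each time the objective function runs and plotted afterwards. The helper names below (cv_histories, record_history, best_history, plot_history) are hypothetical and are not part of XGBoost or the optimiser library.

```python
# Hypothetical helpers for keeping the per-round log loss of every xgb.cv call.
cv_histories = []  # one entry per call of the objective function


def record_history(cv_result,
                   train_col="train-mlogloss-mean",
                   test_col="test-mlogloss-mean"):
    """Store the per-round train/test log loss of one CV run.

    cv_result is anything indexable by column name, such as the DataFrame
    returned by xgb.cv or a plain dict of lists.
    Returns the test log loss of the last recorded round.
    """
    history = {
        "train": list(cv_result[train_col]),
        "test": list(cv_result[test_col]),
    }
    cv_histories.append(history)
    return history["test"][-1]


def best_history(histories):
    """Return the recorded run with the lowest final test log loss."""
    return min(histories, key=lambda h: h["test"][-1])


def plot_history(history, title="XGBoost Log Loss"):
    """Plot train and test log loss against the boosting round."""
    import matplotlib.pyplot as plt  # imported here so the helpers above work without it

    rounds = range(len(history["train"]))
    fig, ax = plt.subplots()
    ax.plot(rounds, history["train"], label="Train")
    ax.plot(rounds, history["test"], label="Test")
    ax.set_xlabel("Boosting round")
    ax.set_ylabel("Log Loss")
    ax.set_title(title)
    ax.legend()
    return fig
```

Under these assumptions, bayes_fun would call record_history(bst) right after xgb.cv returns, and once run_optimization has finished, plot_history(best_history(cv_histories)) followed by matplotlib's show() would draw the curves of the best run found.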