Python: LightGBM hyperparameter tuning ValueError

python python-3.x scikit-learn

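I have written the following code to perform RandomizedSearchCV on a LightGBM classifier model, but I get the following error:

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

Code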

import lightgbm as lgb
fit_params={"early_stopping_rounds":30, 
            "eval_metric" : 'f1', 
            "eval_set" : [(X_val,y_val)],
            'eval_names': ['valid'],
            'verbose': 100,
            # 'categorical_feature': 'auto'
            }

from scipy.stats import randint as sp_randint
from scipy.stats import uniform as sp_uniform
param_test ={'num_leaves': sp_randint(6, 50), 
             'min_child_samples': sp_randint(100, 500), 
             'min_child_weight': [1e-5, 1e-3, 1e-2, 1e-1, 1, 1e1, 1e2, 1e3, 1e4],
             'subsample': sp_uniform(loc=0.2, scale=0.8), 
             'colsample_bytree': sp_uniform(loc=0.4, scale=0.6),
             'reg_alpha': [0, 1e-1, 1, 2, 5, 7, 10, 50, 100],
             'reg_lambda': [0, 1e-1, 1, 5, 10, 20, 50, 100]}

n_HP_points_to_test = 100

from sklearn.model_selection import RandomizedSearchCV
# n_estimators is set to a "large value". The actual number of trees built depends on early stopping; 5000 defines only the absolute maximum
clf = lgb.LGBMClassifier(max_depth=-1, 
                         random_state=42, 
                         silent=True, 
                         metric='f1', 
                         n_jobs=4, 
                         n_estimators=5000,
                         )

gs = RandomizedSearchCV(
    estimator=clf, param_distributions=param_test, 
    n_iter=n_HP_points_to_test,
    scoring='f1',
    cv=3,
    refit=True,
    random_state=41,
    verbose=True)

gs.fit(X_trn, y_trn, **fit_params)
print('Best score reached: {} with params: {} '.format(gs.best_score_, gs.best_params_))

Solutions tried
I tried to implement the solutions given in the following links, but none of them worked. How can I solve this problem?

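The last message in the third link (February 2020) suggests that this error is raised when the metric is not recognized, and 'f1' is indeed not one of LightGBM's built-in metrics. You can either use one of the built-in metrics (and still use F1 as the selection criterion for the hyperparameter search), or create a custom metric (see the note at the end).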

F1 is not a built-in metric in LightGBM. You can easily add a custom evaluation metric:

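import numpy as np
from sklearn.metrics import f1_score

def lightgbm_eval_metric_f1(preds, dtrain):
    # Custom metric for lgb.train(feval=...): returns (name, value, is_higher_better)
    target = dtrain.get_label()
    weight = dtrain.get_weight()
    unique_targets = np.unique(target)
    if len(unique_targets) > 2:
        if preds.ndim == 1:
            # flattened multiclass scores are grouped by class, hence order="F"
            preds = np.reshape(preds, (-1, len(unique_targets)), order="F")
        labels = np.argmax(preds, axis=1)
        average = "macro"  # assumption: the averaging mode is not stated in the original
    else:
        labels = (preds > 0.5).astype(int)  # binary: threshold the probabilities
        average = "binary"
    return "f1", f1_score(target, labels, sample_weight=weight, average=average), True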
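The function above follows the native-API signature (preds, dtrain). The question uses the scikit-learn wrapper, whose fit method accepts a callable eval_metric with the signature (y_true, y_pred) instead. A minimal sketch of such a callable for a binary task is shown below; the name lgbm_sklearn_f1 and the 0.5 threshold are assumptions made for illustration, not part of LightGBM.

```python
import numpy as np
from sklearn.metrics import f1_score

def lgbm_sklearn_f1(y_true, y_pred):
    """Custom eval_metric for LGBMClassifier.fit on a binary task.

    Returns the (name, value, is_higher_better) tuple LightGBM expects.
    """
    # For a binary task y_pred holds positive-class probabilities,
    # so threshold them (0.5 is an assumed cut-off) to get class labels.
    labels = (np.asarray(y_pred) > 0.5).astype(int)
    return "f1", f1_score(y_true, labels), True

# Quick check on hand-made values: 2 true positives, 0 false positives, 1 false negative.
name, value, higher_is_better = lgbm_sklearn_f1(
    np.array([1, 0, 1, 1]), np.array([0.9, 0.2, 0.4, 0.7])
)
print(name, value, higher_is_better)
```

In the question's setup this callable would replace the string 'f1' in fit_params["eval_metric"], and metric='f1' would be dropped from the LGBMClassifier constructor.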

As for the tuning itself, I would rather use LightGBM's native Python API (lightgbm.train) together with the Optuna framework, which works very well.

Optuna framework:

But the easiest way to tune LightGBM with Optuna is to use MLJAR AutoML (it has a built-in f1 metric).


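from supervised.automl import AutoML  # provided by the mljar-supervised package

automl = AutoML(
    mode="Optuna",
    algorithms=["LightGBM"],
    optuna_time_budget=600,  # tune for 10 minutes
    eval_metric="f1"
)
automl.fit(X, y)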

MLJAR AutoML framework:

If you want to check the details of the LightGBM + Optuna optimization in MLJAR, here is the code