Python: server error water.exceptions.H2OIllegalArgumentException when implementing grid search with H2O

Tags: python, h2o, grid-search, gbm


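I am new to H2O. I am trying to run H2OGridSearch with GBM to find the best hyperparameters, following the instructions in a tutorial. It worked fine when I tried regression, but now that I am trying classification it gives me an error.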

Here is my code:

import h2o
from h2o.automl import H2OAutoML
h2o.init(nthreads = -1, max_mem_size = 8)
data = h2o.import_file("train.csv")
y = "target"
data[y] = data[y].asfactor()
X = data.columns
X.remove(y)
from h2o.grid.grid_search import H2OGridSearch
from h2o.estimators import H2OGradientBoostingEstimator

# hyper_params = {'max_depth' : range(1,30,2)}
hyper_params = {'max_depth' : [4,6,8,12,16,20]} ##faster for larger datasets

#Build initial GBM Model
gbm_grid = H2OGradientBoostingEstimator(
        ## more trees is better if the learning rate is small enough 
        ## here, use "more than enough" trees - we have early stopping
        ntrees=10000,
        ## smaller learning rate is better
        ## since we have learn_rate_annealing, we can afford to start with a
        ## bigger learning rate
        learn_rate=0.05,
        ## learning rate annealing: learning_rate shrinks by 1% after every tree 
        ## (use 1.00 to disable, but then lower the learning_rate)
        learn_rate_annealing = 0.99,
        ## sample 80% of rows per tree
        sample_rate = 0.8,
        ## sample 80% of columns per split
        col_sample_rate = 0.8,
        ## fix a random number generator seed for reproducibility
        seed = 1234,
        ## score every 10 trees to make early stopping reproducible
        ## (it depends on the scoring interval)
        score_tree_interval = 10, 
        ## early stopping once the validation AUC doesn't improve by at least 0.01% for
        ## 5 consecutive scoring events
        stopping_rounds = 5,
        stopping_metric = "AUC",
        stopping_tolerance = 1e-4)

#Build grid search with previously made GBM and hyper parameters
grid = H2OGridSearch(gbm_grid,hyper_params,
                         grid_id = 'depth_grid',
                         search_criteria = {'strategy': "Cartesian"})


#Train grid search
grid.train(x=X, 
           y=y,
           training_frame = data)
Here is the error:

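H2OResponseError                          Traceback (most recent call last)
<ipython-input-14-cc0918796da0> in <module>()
     38 grid.train(x=X, 
     39            y=y,
---> 40            training_frame = data)

~\Anaconda3\lib\site-packages\h2o\grid\grid_search.py in train(self, x, y, training_frame, offset_column, fold_column, weights_column, validation_frame, **params)
    207         x = list(xset)
    208         parms["x"] = x
--> 209         self.build_model(parms)
    210 
    211 

~\Anaconda3\lib\site-packages\h2o\grid\grid_search.py in build_model(self, algo_params)
    225             y = y if y in training_frame.names else training_frame.names[y]
    226             self.model._estimator_type = "classifier" if training_frame.types[y] == "enum" else "regressor"
--> 227         self._model_build(x, y, training_frame, validation_frame, algo_params)
    228 
    229 

~\Anaconda3\lib\site-packages\h2o\grid\grid_search.py in _model_build(self, x, y, tframe, vframe, kwargs)
    247         rest_ver = kwargs.pop("_rest_version") if "_rest_version" in kwargs else None
    248 
--> 249         grid = H2OJob(h2o.api("POST /99/Grid/%s" % algo, data=kwargs), job_type=(algo + " Grid Build"))
    250 
    251         if self._future:

~\Anaconda3\lib\site-packages\h2o\h2o.py in api(endpoint, data, json, filename, save_to)
    101     # type checks are performed in H2OConnection class
    102     _check_connection()
--> 103     return h2oconn.request(endpoint, data=data, json=json, filename=filename, save_to=save_to)
    104 
    105 

~\Anaconda3\lib\site-packages\h2o\backend\connection.py in request(self, endpoint, data, json, filename, save_to)
    405                                     auth=self._auth, verify=self._verify_ssl_cert, proxies=self._proxies)
    406             self._log_end_transaction(start_time, resp)
--> 407             return self._process_response(resp, save_to)
    408 
    409         except (requests.exceptions.ConnectionError, requests.exceptions.HTTPError) as e:

~\Anaconda3\lib\site-packages\h2o\backend\connection.py in _process_response(response, save_to)
    741         # Client errors (400 = "Bad Request", 404 = "Not Found", 412 = "Precondition Failed")
    742         if status_code in {400, 404, 412} and isinstance(data, (H2OErrorV3, H2OModelBuilderErrorV3)):
--> 743             raise H2OResponseError(data)
    744 
    745         # Server errors (notably 500 = "Server Error")

H2OResponseError: Server error water.exceptions.H2OIllegalArgumentException:
  Error: Illegal argument: training_frame of function: grid: Cannot append new models to a grid with different training input

  Request: POST /99/Grid/gbm
    data: {'search_criteria': "{'strategy': 'Cartesian'}", 'hyper_parameters': "{'max_depth': [4, 6, 8, 12, 16, 20]}", 'ntrees': '10000', 'learn_rate': '0.05', 'learn_rate_annealing': '0.99', 'sample_rate': '0.8', 'col_sample_rate': '0.8', 'seed': '1234', 'score_tree_interval': '10', 'stopping_rounds': '5', 'stopping_metric': 'AUC', 'stopping_tolerance': '0.0001', 'training_frame': 'py_1_sid_af08', 'response_column': 'target', 'grid_id': 'depth_grid'}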

Can anyone help me with this?

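It is working now! I guess there is a bug in H2O. Before running the grid for classification, I had already run the grid three times. When I restarted the Anaconda server, the code above started working.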


Thanks
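
The restart most likely helped because a freshly started cluster holds no objects from earlier runs, including any grid that was trained before. A minimal sketch of getting the same clean slate without restarting is below. It relies on h2o.remove_all(), which deletes every frame and model on the connected cluster, so the data has to be imported again afterwards. The function name reset_cluster_state and its h2o_module parameter are made up for this illustration (the parameter only exists so the function can be tried with a stand-in object); they are not part of H2O.

```python
def reset_cluster_state(h2o_module=None):
    """Delete all objects on the connected H2O cluster.

    Hypothetical helper: `h2o_module` defaults to the real `h2o` package
    and is only a parameter so a stand-in can be passed for a dry run.
    """
    if h2o_module is None:
        # Imported lazily so this file can be loaded without h2o installed.
        import h2o as h2o_module

    # remove_all() wipes every key on the cluster: frames, models and grids.
    # Anything needed afterwards (e.g. train.csv) must be imported again.
    h2o_module.remove_all()
```

After calling reset_cluster_state(), re-run the script from h2o.import_file("train.csv") onwards, since the old frame no longer exists on the cluster.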

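When you changed from regression to classification, did you also change the dataset? If so, you also need to change the grid_id parameter in H2OGridSearch(). When you train a first grid with grid_id='depth_grid' and then run the grid function again with the same grid_id, it appends the new models to that existing grid. Since all models in a grid must use the same training set, that would explain the error.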
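
One way to avoid reusing an old grid by accident is to generate a new grid_id for every run. The sketch below is only an illustration of that idea: unique_grid_id and build_grid are hypothetical helper names, and the random suffix is an assumption of mine, not something H2O requires. The only H2O call used is the H2OGridSearch constructor with the same arguments as in the question.

```python
import uuid


def unique_grid_id(prefix="depth_grid"):
    """Return a grid id with a random 8-character suffix (hypothetical helper)."""
    # The suffix makes a clash with a grid left over from an earlier run
    # on the same cluster very unlikely.
    return "{}_{}".format(prefix, uuid.uuid4().hex[:8])


def build_grid(model, hyper_params, prefix="depth_grid"):
    """Create an H2OGridSearch under a fresh grid_id (requires the h2o package)."""
    # Imported here so unique_grid_id() can be used without h2o installed.
    from h2o.grid.grid_search import H2OGridSearch

    return H2OGridSearch(
        model,
        hyper_params,
        grid_id=unique_grid_id(prefix),
        search_criteria={"strategy": "Cartesian"},
    )
```

With the names from the question, usage would look like grid = build_grid(gbm_grid, hyper_params) followed by grid.train(x=X, y=y, training_frame=data). Each run then gets its own grid, so models trained on different data never end up in the same one.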