Example from "Randomized Search" in the "Machine Learning with Python and H2O" manual does not work

Tags: python, scikit-learn, h2o, valueerror

I'm a bit confused, because I can't get the last example (page 36) of the manual to work.

The code is as follows:

import h2o
h2o.init()

from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.transforms.preprocessing import H2OScaler
from h2o.cross_validation import H2OKFold
from h2o.model.regression import h2o_r2_score

from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import make_scorer  # public import path; sklearn.metrics.scorer is deprecated as of 0.22


h2o.__PROGRESS_BAR__=False
h2o.no_progress()

iris_data_path = "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris.csv"  # load demonstration data
iris_df = h2o.import_file(path=iris_data_path)

params = {"standardize__center":    [True, False],
          "standardize__scale":     [True, False],
          "gbm__ntrees":            [10,20],
          "gbm__max_depth":         [1,2,3],
          "gbm__learn_rate":        [0.1,0.2]}

custom_cv = H2OKFold(iris_df, n_folds=5, seed=42)

pipeline = Pipeline([("standardize", H2OScaler()),
                     ("gbm", H2OGradientBoostingEstimator(distribution="gaussian"))])

random_search = RandomizedSearchCV(pipeline, params, n_iter=5, scoring=make_scorer(h2o_r2_score),
                                               cv=custom_cv, random_state=42, n_jobs=1)

random_search.fit(iris_df[1:], iris_df[0])
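
For context, here is a minimal sketch of what the search above is asked to do with `params` and `n_iter=5`: the five lists span 2 x 2 x 2 x 3 x 2 = 48 combinations, and five distinct ones are drawn. The helper `sample_candidates` is a hypothetical name, and the standard-library `random` module is an assumption standing in for sklearn's own sampler, so the actual draws will differ from what `RandomizedSearchCV` picks with `random_state=42`.

```python
import itertools
import random

# Same search space as in the script above.
params = {
    "standardize__center": [True, False],
    "standardize__scale": [True, False],
    "gbm__ntrees": [10, 20],
    "gbm__max_depth": [1, 2, 3],
    "gbm__learn_rate": [0.1, 0.2],
}


def sample_candidates(param_lists, n_iter, seed):
    """Hypothetical helper: build the full grid, then draw n_iter distinct combinations."""
    names = sorted(param_lists)
    value_lists = [param_lists[name] for name in names]
    grid = [dict(zip(names, values)) for values in itertools.product(*value_lists)]
    rng = random.Random(seed)
    # Sampling without replacement, so no combination is evaluated twice.
    return grid, rng.sample(grid, n_iter)


grid, candidates = sample_candidates(params, n_iter=5, seed=42)
print(len(grid), "combinations in total")
for candidate in candidates:
    print(candidate)
```

With `cv=custom_cv` using 5 folds, each of the five candidates would be fitted once per fold, so the error below shows up before the first of those fits.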
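Running it raises ValueError: No valid specification of the columns. Only a scalar, list or slice of all integers or all strings, or boolean mask is allowed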

Full terminal output:

Traceback (most recent call last):

  File "untitled-Copy1.py", line 34, in <module>
    random_search.fit(iris_df[1:], iris_df[0])
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 710, in fit
    self._run_search(evaluate_candidates)
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 1484, in _run_search
    random_state=self.random_state))
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 689, in evaluate_candidates
    cv.split(X, y, groups)))
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/joblib/parallel.py", line 1004, in __call__
    if self.dispatch_one_batch(iterator):
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/joblib/parallel.py", line 835, in dispatch_one_batch
    self._dispatch(tasks)
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/joblib/parallel.py", line 754, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 209, in apply_async
    result = ImmediateResult(func)
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 590, in __init__
    self.results = batch()
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/joblib/parallel.py", line 256, in __call__
    for func, args, kwargs in self.items]
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/joblib/parallel.py", line 256, in <listcomp>
    for func, args, kwargs in self.items]
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 508, in _fit_and_score
    X_train, y_train = _safe_split(estimator, X, y, train)
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/sklearn/utils/metaestimators.py", line 201, in _safe_split
    X_subset = _safe_indexing(X, indices)
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/sklearn/utils/__init__.py", line 390, in _safe_indexing
    indices_dtype = _determine_key_type(indices)
  File "/department/jupyter-dev/anaconda3/envs/python36/lib/python3.6/site-packages/sklearn/utils/__init__.py", line 288, in _determine_key_type
    raise ValueError(err_msg)
ValueError: No valid specification of the columns. Only a scalar, list or slice of all integers or all strings, or boolean mask is allowed
Closing connection _sid_b8c1 at exit
H2O session _sid_b8c1 closed.
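
One reading of the traceback: the exception is raised in `_determine_key_type(indices)`, that is, while sklearn inspects the train indices coming out of `cv.split(X, y, groups)`, before any model is fitted. The sketch below is a simplified, hypothetical stand-in for such a check (`classify_index_key` and `OpaqueFoldMask` are made-up names, not sklearn or h2o code). It only illustrates how an index object that is not a scalar, list, slice, or boolean mask ends up in this error. Whether `H2OKFold` actually yields such objects is an assumption that the traceback alone does not confirm.

```python
import numbers

ERR_MSG = (
    "No valid specification of the columns. Only a scalar, list or slice "
    "of all integers or all strings, or boolean mask is allowed"
)


def classify_index_key(key):
    """Simplified, hypothetical key-type check (not the sklearn implementation)."""
    # bool is tested before int because bool is a subclass of int in Python.
    if isinstance(key, bool):
        return "bool"
    if isinstance(key, numbers.Integral):
        return "int"
    if isinstance(key, str):
        return "str"
    if isinstance(key, slice):
        bounds = [b for b in (key.start, key.stop) if b is not None]
        kinds = {classify_index_key(b) for b in bounds}
        if len(kinds) <= 1:
            # Simplification: an unbounded slice is treated as an integer slice.
            return kinds.pop() if kinds else "int"
        raise ValueError(ERR_MSG)
    if isinstance(key, (list, tuple)):
        kinds = {classify_index_key(item) for item in key}
        if len(kinds) == 1:
            return kinds.pop()
    # Anything else (mixed lists, empty lists, unknown objects) is rejected.
    raise ValueError(ERR_MSG)


class OpaqueFoldMask:
    """Hypothetical stand-in for a fold mask that is not a list, slice, or array."""

    def __init__(self, flags):
        self.flags = flags


print(classify_index_key([0, 2, 4]))
print(classify_index_key([True, False, True]))
try:
    classify_index_key(OpaqueFoldMask([True, False, True]))
except ValueError as exc:
    print("rejected:", exc)
```

If that reading is right, printing `type(train)` for the pairs yielded by `custom_cv.split(...)` would show what kind of object sklearn is being handed.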
I am using Python 3.6.10 with sklearn 0.22.1 and h2o 3.28.0.3.

What am I doing wrong? Any help is appreciated.

Have a nice day :)

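Nobody? (I know it's not great to comment on my own question, but I would really appreciate a comment.) By the way, what I am trying to do is apply ordinal logistic regression with an L1 penalty. I know the mord package exists, but it only implements the L2 penalty. The H2O library does have this. I would rather not switch to R, because all my parallelization code is in Python (which I use for GBM). I am also looking for an implementation of ordinal SVM regression in Python. Thanks for your help! :)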