Python: RandomizedSearchCV raises an AttributeError
Tags: python, numpy, pandas, machine-learning, scikit-learn

After calling fit() on a RandomizedSearchCV:

    tfidf = TfidfVectorizer(strip_accents=None, lowercase=False, preprocessor=None)
    # A list of dicts, as the "'list' object" in the error message indicates.
    param_grid = [{'vect__ngram_range': [(1, 1)],
                   'vect__stop_words': [stop, None],
                   'vect__tokenizer': [tokenizer, tokenizer_porter],
                   'clf__penalty': ['l1', 'l2'],
                   'clf__C': [1.0, 10.0, 100.0]}]
    lr_tfidf = Pipeline([('vect', tfidf), ('clf', LogisticRegression(random_state=0))])
    gs_lr_tfidf = RandomizedSearchCV(lr_tfidf, param_grid, cv=5, n_jobs=1)
    gs_lr_tfidf.fit(X_train, y_train)

I get the following error:

    Traceback (most recent call last):
      File "G:/pythonprojectraschka/ch08/ch08-2.py", line 95, in <module>
        gs_lr_tfidf.fit(X_train, y_train)
      File "C:\Anaconda3\lib\site-packages\sklearn\grid_search.py", line 996, in fit
        return self._fit(X, y, sampled_params)
      File "C:\Anaconda3\lib\site-packages\sklearn\grid_search.py", line 553, in _fit
        for parameters in parameter_iterable
      File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 800, in __call__
        while self.dispatch_one_batch(iterator):
      File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 653, in dispatch_one_batch
        tasks = BatchedCalls(itertools.islice(iterator, batch_size))
      File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 68, in __init__
        self.items = list(iterator_slice)
      File "C:\Anaconda3\lib\site-packages\sklearn\grid_search.py", line 549, in <genexpr>
        delayed(_fit_and_score)(clone(base_estimator), X, y, self.scorer_,
      File "C:\Anaconda3\lib\site-packages\sklearn\grid_search.py", line 223, in __iter__
        for v in self.param_distributions.values()])
    AttributeError: 'list' object has no attribute 'values'

What could be the cause? X_train (text) and y_train (binary) are, I believe, proper numpy arrays.

Full code with the dataset:

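Here you are using RandomizedSearchCV rather than GridSearchCV. It appears to treat each parameter as a distribution and tries to sample from it: the traceback shows it calling .values() on param_distributions, which fails because a list was passed where a dict is expected.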
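
The failure can be reproduced without scikit-learn. The following is only a minimal sketch: `sample_one` is a hypothetical stand-in written for illustration, not scikit-learn's sampler. It shows why code that calls `.values()` on its parameter argument works for a dict and breaks for a list wrapping the same dict.

```python
import random


def sample_one(param_distributions, rng):
    """Hypothetical stand-in for a randomized-search sampler (not scikit-learn code)."""
    # Same kind of access as the failing line in the traceback:
    # .values() exists on dicts, not on lists.
    for candidates in param_distributions.values():
        if len(candidates) == 0:
            raise ValueError("every parameter needs at least one candidate")
    # Draw one candidate per parameter name.
    return {name: rng.choice(candidates)
            for name, candidates in param_distributions.items()}


rng = random.Random(0)
as_dict = {"vect__ngram_range": [(1, 1)], "clf__C": [1.0, 10.0, 100.0]}
as_list = [as_dict]  # the shape a grid search accepts: a list of dicts

sample = sample_one(as_dict, rng)  # works: one value per parameter

try:
    sample_one(as_list, rng)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(sample)
print(error_message)
```

The second call fails for the same reason as the question's code: the list itself has no `.values()` method, even though the dict inside it does.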

So if you can afford an exhaustive search over all parameters with GridSearchCV, that is your solution.
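
A minimal sketch of both options follows, under several assumptions that are not in the question: it imports from `sklearn.model_selection` (the current location of these classes, rather than the `sklearn.grid_search` module seen in the traceback), uses a tiny made-up corpus, and leaves out the custom tokenizers, stop words, and penalty settings to stay self-contained. GridSearchCV takes the list of dicts as is, while RandomizedSearchCV is given a single dict. To my understanding, recent scikit-learn releases also accept a list of dicts for `param_distributions`, so the original error may be specific to older versions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.pipeline import Pipeline

# Made-up toy data, only here so the sketch runs on its own.
X_train = ["good movie", "great film", "loved it",
           "really good", "great fun", "nice and good",
           "bad movie", "awful film", "hated it",
           "really bad", "awful mess", "boring and bad"]
y_train = [1] * 6 + [0] * 6

pipe = Pipeline([("vect", TfidfVectorizer(lowercase=False)),
                 ("clf", LogisticRegression(random_state=0))])

# Option 1: exhaustive search. param_grid may be a list of dicts.
param_grid = [
    {"vect__ngram_range": [(1, 1)], "clf__C": [1.0, 10.0, 100.0]},
    {"vect__ngram_range": [(1, 1)], "vect__use_idf": [False],
     "clf__C": [1.0, 10.0, 100.0]},
]
grid = GridSearchCV(pipe, param_grid, cv=2, n_jobs=1)
grid.fit(X_train, y_train)

# Option 2: randomized search. Pass one dict mapping names to candidates.
param_distributions = {"vect__ngram_range": [(1, 1), (1, 2)],
                       "clf__C": [1.0, 10.0, 100.0]}
rand = RandomizedSearchCV(pipe, param_distributions, n_iter=3, cv=2,
                          n_jobs=1, random_state=0)
rand.fit(X_train, y_train)

print(grid.best_params_)
print(rand.best_params_)
```

The grid search evaluates every combination in both dicts, whereas the randomized search only evaluates `n_iter` sampled combinations, so the choice mainly depends on how large the search space is.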


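Where does this "values" come from? Can you explain? Does the error persist if you remove the n_jobs parameter?

The n_jobs parameter does not change anything. X_train and y_train are converted to numpy arrays using .values. I can provide the full code if necessary (sorry, I am new to Python).

Can you try

    print(type(X_train))
    print(type(y_train))

So that is not the source of the problem. I guess this may be a bug related to parallel processing (n_jobs). I am on Windows too, but in my case n_jobs=1 (so it does not cause any problems). Try to provide the full code and a toy example so that people can easily copy, paste, and test your code.

I just ran your code with GridSearchCV and it works for me.

Thank you very much for your help :) You were clearly right.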
lr_tfidf.fit(X_train, y_train)