Python sklearn: "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()" error

Tags: python, numpy, scikit-learn


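I am trying out some code I found online on a training dataset, but I cannot resolve the error named in the title.

When I first ran the code, I got the following error: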

ValueError  Traceback (most recent call last)
----> 2 knn_cv.fit(X_train, y_train)
<ipython-input-21-fb975450c609> in fit(self, X, y)
214         X = normalize(X, norm='l1', copy=False)
215 
--> 216         cv = check_cv(self.cv, X, y)
/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_split.py in check_cv(cv, y, classifier)
1980 
1981     if isinstance(cv, numbers.Integral):
-> 1982         if (classifier and (y is not None) and
1983                 (type_of_target(y) in ('binary', 'multiclass'))):
1984             return StratifiedKFold(cv)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
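
The message itself is raised by NumPy rather than by scikit-learn: `if` and `and` need a single True or False, and an array with several elements cannot supply one. A minimal sketch that reproduces it, assuming nothing beyond NumPy (the array `labels` is made up for illustration and is not the real y_train):

```python
import numpy as np

labels = np.array([3, 17, 14])  # illustrative stand-in for y_train

try:
    # `and` calls bool() on its left operand; a multi-element array refuses
    if labels and True:
        pass
except ValueError as exc:
    message = str(exc)

print(message)

# Reducing the array to one boolean explicitly is unambiguous
has_nonzero = bool(labels.any())  # True if at least one element is non-zero
all_nonzero = bool(labels.all())  # True if every element is non-zero
```

Note that `.any()` and `.all()` only make sense when a single yes/no answer is really what the code needs; they are not a way to pass labels to an estimator.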
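If I use y_train.all() instead (the call shown further down), it says TypeError: 'KFold' object is not iterable

Another question suggested changing the array to a list, but np.array(y_train).tolist() gives me: TypeError: len() of unsized object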
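
One possible reading of these follow-up errors, assuming y_train is a 1-D NumPy array as the question states: `y_train.all()` does not return the labels; it collapses them into a single boolean, and a 0-dimensional value has no length. A small sketch with made-up labels (`y_demo` is hypothetical):

```python
import numpy as np

y_demo = np.array([3, 17, 14, 5])   # made-up labels, not the real y_train

collapsed = y_demo.all()            # one NumPy boolean, not an array of labels
scalar_array = np.array(collapsed)  # wrapping it gives a 0-dimensional array

try:
    len(scalar_array)
except TypeError as exc:
    len_error = str(exc)

print(scalar_array.ndim, len_error)
```

So passing `y_train.all()` to `fit` silences the original ValueError only by throwing the labels away.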

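I also updated sklearn, but that does not fix the error. I hope someone can explain what is wrong or, if possible, how I should modify the code. I am still fairly new to this part of the code.

Training samples created using GoogleNews-vectors-negative300.bin.gz

y_train = array([3, 17, 14, 14, 5, 13, …, 0, 1, 17, 16, 2])

y_train.shape=100

X_train =

I have attached the main part of the code (the fit method shown further down); if you need the whole thing for reference, I have provided the link below

According to the author, I should get this: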

[Parallel(n_jobs=3)]: Done  12 tasks      | elapsed:   30.8s

[Parallel(n_jobs=3)]: Done  34 out of  34 | elapsed:  2.0min finished

[Parallel(n_jobs=3)]: Done  12 tasks      | elapsed:   25.7s

[Parallel(n_jobs=3)]: Done  33 out of  33 | elapsed:  2.9min finished

[Parallel(n_jobs=3)]: Done  12 tasks      | elapsed:   53.3s

[Parallel(n_jobs=3)]: Done  33 out of  33 | elapsed:  2.0min finished

WordMoversKNNCV(W_embed=memmap([[ 0.04283, -0.01124, ..., -0.05679, -0.00763],
       [ 0.02884, -0.05923, ..., -0.04744,  0.06698],
   ...,
       [ 0.08428, -0.15534, ..., -0.01413,  0.04561],
       [-0.02052,  0.08666, ...,  0.03659,  0.10445]]),
    cv=3, n_jobs=3, n_neighbors_try=range(1, 20), scoring=None,
    verbose=5)
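You are using check_cv incorrectly. According to the documentation (signature reproduced further down):

So it expects y and an estimator as input, but you are passing X and y, which is wrong. Change the following lines: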

cv = check_cv(self.cv, X, y)
knn = KNeighborsClassifier(metric='precomputed', algorithm='brute')
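to:

knn = KNeighborsClassifier(metric='precomputed', algorithm='brute')
cv = check_cv(self.cv, y, knn)

Note the order of the lines.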
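
A hedged sketch of a corrected call, using toy data in place of the question's X_train and y_train. Two things here are assumptions, not part of the answer above. First, `classifier` is passed by keyword, since recent scikit-learn releases accept it only as a keyword argument. Second, the folds are obtained with `cv.split(...)`, because `sklearn.model_selection.check_cv` returns a splitter object rather than a ready-made sequence of folds, which is a likely source of the "'KFold' object is not iterable" error quoted in the question.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, check_cv

# Toy data standing in for X_train / y_train (values are made up)
rng = np.random.RandomState(0)
X_toy = rng.rand(12, 4)
y_toy = np.array([0, 1] * 6)

# y goes in the second slot; classifier=True asks for stratified folds
cv = check_cv(3, y_toy, classifier=True)

# The splitter has to be asked for its folds explicitly
folds = list(cv.split(X_toy, y_toy))
for train_ix, test_ix in folds:
    print(len(train_ix), len(test_ix))
```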

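Figure out which term in the if statement produces the boolean array. Don't guess. Test.

@hpaulj It looks to me as if y_train is causing the boolean, which confuses me because y_train is just a 1d array. The usual trigger for this error is an and statement, and it can normally be fixed by replacing the and, but in this case it seems to sit inside the check_cv function and I don't know how to fix it.

Can you provide an example that still produces the error? That way the question and answer will be useful to others who come across it. This question may help:

Show a sample of your y_train. What is its shape? What does it contain?

Thanks for the help! But I am a bit confused, because check_cv now only receives the y dataset. Does that mean I need to merge the X_train and y_train datasets? The subsequent code requires X[train_ix] and y[train_ix].

@bonedino check_cv is only used to check the type of the cross-validation iterator and whether it is compatible with y. It does not affect any other part of the code.
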
# fit method of WordMoversKNNCV as posted in the question
def fit(self, X, y):
    if self.n_neighbors_try is None:
        n_neighbors_try = range(1, 6)
    else:
        n_neighbors_try = self.n_neighbors_try

    X = check_array(X, accept_sparse='csr', copy=True)
    X = normalize(X, norm='l1', copy=False)

    cv = check_cv(self.cv, X, y)
    knn = KNeighborsClassifier(metric='precomputed', algorithm='brute')
    scorer = check_scoring(knn, scoring=self.scoring)

    scores = []
    for train_ix, test_ix in cv:
        dist = self._pairwise_wmd(X[test_ix], X[train_ix])
        knn.fit(X[train_ix], y[train_ix])
        scores.append([
            scorer(knn.set_params(n_neighbors=k), dist, y[test_ix])
            for k in n_neighbors_try
        ])
    scores = np.array(scores)
    self.cv_scores_ = scores

    best_k_ix = np.argmax(np.mean(scores, axis=0))
    best_k = n_neighbors_try[best_k_ix]
    self.n_neighbors = self.n_neighbors_ = best_k

    return super(WordMoversKNNCV, self).fit(X, y)
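
For readers on a current scikit-learn, the cross-validation loop in this method might be adapted roughly as follows. This is a sketch under several assumptions, not the original author's code. Plain Euclidean distances from `pairwise_distances` stand in for `_pairwise_wmd`, and the data are random. The folds come from `cv.split(X, y)`. The precomputed k-NN is fitted on a square train-by-train distance matrix, which is the input `metric='precomputed'` expects.

```python
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.model_selection import check_cv
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.RandomState(0)
X = rng.rand(30, 5)             # made-up features
y = np.array([0, 1, 2] * 10)    # made-up labels, 10 per class
n_neighbors_try = range(1, 6)

cv = check_cv(3, y, classifier=True)
knn = KNeighborsClassifier(metric='precomputed', algorithm='brute')

scores = []
for train_ix, test_ix in cv.split(X, y):
    # Stand-in for the word mover's distance: Euclidean distances
    dist_train = pairwise_distances(X[train_ix], X[train_ix])
    dist_test = pairwise_distances(X[test_ix], X[train_ix])
    knn.fit(dist_train, y[train_ix])
    scores.append([
        knn.set_params(n_neighbors=k).score(dist_test, y[test_ix])
        for k in n_neighbors_try
    ])

scores = np.array(scores)       # shape: (n_folds, len(n_neighbors_try))
best_k = n_neighbors_try[int(np.argmax(scores.mean(axis=0)))]
print(scores.shape, best_k)
```

X and y stay separate throughout; the index arrays produced by the splitter are what tie the two together.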

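# Call from the question; y_train.all() is the attempted workaround
# (the traceback above came from knn_cv.fit(X_train, y_train))
knn_cv = WordMoversKNNCV(cv=3, n_neighbors_try=range(1, 20),
                         W_embed=W_common, verbose=5, n_jobs=3)
knn_cv.fit(X_train, y_train.all())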
check_cv(cv='warn', y=None, classifier=False)

cv : int, cross-validation generator or an iterable, optional

y : array-like, optional
    The target variable for supervised learning problems.

classifier : boolean, optional, default False
    Whether the task is a classification task, in which case
    stratified KFold will be used.
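
To see why the original call fails, it may help to trace how three positional arguments bind to this signature. The function below is a hypothetical stand-in that only mirrors the parameter names from the documentation excerpt and the shape of the test shown in the traceback; it is not scikit-learn's implementation, and `X_demo` / `y_demo` are made-up data.

```python
import numpy as np

def check_cv_standin(cv="warn", y=None, classifier=False):
    """Hypothetical stand-in with the same parameter names as check_cv."""
    # Same shape of test as the line shown in the traceback
    if classifier and (y is not None):
        return "stratified"
    return "plain"

X_demo = np.zeros((4, 2))          # made-up feature matrix
y_demo = np.array([3, 17, 14, 5])  # made-up labels

# Mirrors check_cv(self.cv, X, y): X lands in `y`, y lands in `classifier`
try:
    check_cv_standin(3, X_demo, y_demo)
except ValueError as exc:
    bound_wrong = str(exc)

# Mirrors a corrected call: labels in `y`, a truthy flag in `classifier`
bound_right = check_cv_standin(3, y_demo, True)
print(bound_wrong)
print(bound_right)
```

The label array ends up in the `classifier` slot, so `classifier and ...` asks NumPy for the truth value of the whole array, which is where the ValueError comes from.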
# Corrected lines from the answer (note the order)
knn = KNeighborsClassifier(metric='precomputed', algorithm='brute')
cv = check_cv(self.cv, y, knn)