Python ValueError: blocks[0,:] has incompatible row dimensions

python machine-learning scikit-learn

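I am trying to extract some text features (word_count, char_count, ...) and TF-IDF from a Twitter dataset for sentiment analysis. I combine them with sklearn's FeatureUnion and feed them to a classifier in a pipeline.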

I get the following error: ValueError: blocks[0,:] has incompatible row dimensions. Got blocks[0,8].shape[0] == 7920, expected 1. Here is the code:

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import FeatureUnion, Pipeline

# WordCalculator, CharCalculator, ..., Preprocessor, ItemSelector and UrlRemover
# are custom transformers defined elsewhere (their code is not shown here).
features_union = FeatureUnion(transformer_list = [('word_count', WordCalculator()),
                                                  ('char_count', CharCalculator()),
                                                  ('avg_word_len', AvdWordLengthCalculater()),
                                                  ('stop_words_count', StopWordsCalculater()),
                                                  ('spl_char_count', SplCharCalculater()),
                                                  ('hash_tag_count', HashTagCalculator()),
                                                  ('num_count',NumericsCalculator()),
                                                  ('cap_letter_count',CapsCalculator()),
                                                  ('tfidf_feature',Pipeline([('preprocessor', Preprocessor()),
                                                                             ('selector', ItemSelector('tweet')),
                                                                             ('count', CountVectorizer()),
                                                                             ('tfidf', TfidfTransformer())]))])
pipeline = Pipeline([('noise_remover', UrlRemover()),
                     ('features', features_union),
                     ('model', MultinomialNB())])
pipeline.fit(train, train['label'])

Here is the full error log:

ValueError                                Traceback (most recent call last)
<ipython-input-33-bb532fc90bb0> in <module>
     14                      ('features', features_union),
     15                      ('model', MultinomialNB())])
---> 16 pipeline.fit(train, train['label'])

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in fit(self, X, y, **fit_params)
    348             This estimator
    349         """
--> 350         Xt, fit_params = self._fit(X, y, **fit_params)
    351         with _print_elapsed_time('Pipeline',
    352                                  self._log_message(len(self.steps) - 1)):

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params)
    313                 message_clsname='Pipeline',
    314                 message=self._log_message(step_idx),
--> 315                 **fit_params_steps[name])
    316             # Replace the transformer of the step with the fitted
    317             # transformer. This is necessary when loading the transformer

~/opt/anaconda3/lib/python3.7/site-packages/joblib/memory.py in __call__(self, *args, **kwargs)
    353 
    354     def __call__(self, *args, **kwargs):
--> 355         return self.func(*args, **kwargs)
    356 
    357     def call_and_shelve(self, *args, **kwargs):

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
    726     with _print_elapsed_time(message_clsname, message):
    727         if hasattr(transformer, 'fit_transform'):
--> 728             res = transformer.fit_transform(X, y, **fit_params)
    729         else:
    730             res = transformer.fit(X, y, **fit_params).transform(X)

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in fit_transform(self, X, y, **fit_params)
    943 
    944         if any(sparse.issparse(f) for f in Xs):
--> 945             Xs = sparse.hstack(Xs).tocsr()
    946         else:
    947             Xs = np.hstack(Xs)

~/opt/anaconda3/lib/python3.7/site-packages/scipy/sparse/construct.py in hstack(blocks, format, dtype)
    463 
    464     """
--> 465     return bmat([blocks], format=format, dtype=dtype)
    466 
    467 

~/opt/anaconda3/lib/python3.7/site-packages/scipy/sparse/construct.py in bmat(blocks, format, dtype)
    584                                                     exp=brow_lengths[i],
    585                                                     got=A.shape[0]))
--> 586                     raise ValueError(msg)
    587 
    588                 if bcol_lengths[j] == 0:

ValueError: blocks[0,:] has incompatible row dimensions. Got blocks[0,8].shape[0] == 7920, expected 1.
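
A note on reading this message: `blocks[0,8]` is the ninth entry of `transformer_list` (counting from 0), which is the `tfidf_feature` pipeline, and it has 7920 rows, one per tweet. The "expected 1" comes from the earlier blocks, which SciPy apparently treated as having a single row. One plausible cause, assuming the custom calculators (whose code is not shown) return a 1-D array or Series of length 7920, is that a 1-D input is read as one row when it is converted to a sparse matrix. A minimal sketch of that shape behaviour, using a made-up sample count of 5:

```python
import numpy as np
from scipy import sparse

n_samples = 5  # illustrative only; the dataset in the question has 7920 rows

# A 1-D result such as a transformer might return: shape (5,)
flat = np.arange(n_samples)

# Converting the 1-D array to a sparse matrix yields ONE row with 5 columns
as_block = sparse.coo_matrix(flat)

# Reshaping to a column first yields 5 rows with 1 column
column = sparse.coo_matrix(flat.reshape(-1, 1))

print(as_block.shape, column.shape)
```

If that is what happens here, the eight count features would each present one row while the TF-IDF block presents 7920, which matches the numbers in the message.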

Dataset shape: (7920, 3)

Any immediate help would be greatly appreciated.

0   1   0   #fingerprint #Pregnancy Test https://google.com...
1   2   0   Finally a transparant silicon case ^^ Thanks t...
2   3   0   We love this! Would you go? #talk #makememorie...
3   4   0   I'm wired I know I'm George I was made that wa...
4   5   1   What amazing service! Apple won't even talk to...
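
If the custom transformers do return 1-D results (an assumption, since their code is not shown in the question), one possible fix is to have each `transform` return a column of shape `(n_samples, 1)`. The sketch below is not the original code: `WordCountSketch` is a hypothetical stand-in for `WordCalculator`, the tweets are toy data invented for illustration, and `FunctionTransformer` with `TfidfVectorizer` replaces the original `ItemSelector`, `CountVectorizer` and `TfidfTransformer` steps to keep it short.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer


class WordCountSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for WordCalculator: one word count per tweet."""

    def fit(self, X, y=None):
        return self  # nothing to learn from the data

    def transform(self, X):
        counts = X["tweet"].str.split().str.len().to_numpy()
        # Return a 2-D column (n_samples, 1) rather than a 1-D array (n_samples,)
        return counts.reshape(-1, 1)


# Toy data, made up for this sketch
train = pd.DataFrame({
    "tweet": ["love this phone", "worst service ever",
              "so happy today", "never again"],
    "label": [0, 1, 0, 1],
})

features_union = FeatureUnion([
    ("word_count", WordCountSketch()),
    ("tfidf_feature", Pipeline([
        # Select the text column, then vectorize it
        ("selector", FunctionTransformer(lambda df: df["tweet"])),
        ("tfidf", TfidfVectorizer()),
    ])),
])

features = features_union.fit_transform(train)
print(features.shape)  # rows should equal len(train)
```

With every block reporting the same number of rows, `FeatureUnion` can stack the dense count column next to the sparse TF-IDF matrix. Checking the shape returned by each custom transformer individually would confirm whether this is the actual cause.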