Python ValueError: blocks[0,:] has incompatible row dimensions
I am trying to extract some text features (word_count, char_count, ...) together with TF-IDF from a Twitter dataset for sentiment analysis. I combine them with sklearn's FeatureUnion and feed the result to a classifier in a pipeline.

I get the following error: ValueError: blocks[0,:] has incompatible row dimensions. Got blocks[0,8].shape[0] == 7920, expected 1. Here is the code:
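```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import FeatureUnion, Pipeline

# WordCalculator ... CapsCalculator, Preprocessor, ItemSelector and UrlRemover
# are custom transformers; their definitions are not shown in the question.
features_union = FeatureUnion(transformer_list=[
    ('word_count', WordCalculator()),
    ('char_count', CharCalculator()),
    ('avg_word_len', AvdWordLengthCalculater()),
    ('stop_words_count', StopWordsCalculater()),
    ('spl_char_count', SplCharCalculater()),
    ('hash_tag_count', HashTagCalculator()),
    ('num_count', NumericsCalculator()),
    ('cap_letter_count', CapsCalculator()),
    ('tfidf_feature', Pipeline([
        ('preprocessor', Preprocessor()),
        ('selector', ItemSelector('tweet')),
        ('count', CountVectorizer()),
        ('tfidf', TfidfTransformer()),
    ])),
])

pipeline = Pipeline([
    ('noise_remover', UrlRemover()),
    ('features', features_union),
    ('model', MultinomialNB()),
])

pipeline.fit(train, train['label'])
```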
Here is the full error log:
ValueError Traceback (most recent call last)
<ipython-input-33-bb532fc90bb0> in <module>
14 ('features', features_union),
15 ('model', MultinomialNB())])
---> 16 pipeline.fit(train, train['label'])
~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in fit(self, X, y, **fit_params)
348 This estimator
349 """
--> 350 Xt, fit_params = self._fit(X, y, **fit_params)
351 with _print_elapsed_time('Pipeline',
352 self._log_message(len(self.steps) - 1)):
~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params)
313 message_clsname='Pipeline',
314 message=self._log_message(step_idx),
--> 315 **fit_params_steps[name])
316 # Replace the transformer of the step with the fitted
317 # transformer. This is necessary when loading the transformer
~/opt/anaconda3/lib/python3.7/site-packages/joblib/memory.py in __call__(self, *args, **kwargs)
353
354 def __call__(self, *args, **kwargs):
--> 355 return self.func(*args, **kwargs)
356
357 def call_and_shelve(self, *args, **kwargs):
~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
726 with _print_elapsed_time(message_clsname, message):
727 if hasattr(transformer, 'fit_transform'):
--> 728 res = transformer.fit_transform(X, y, **fit_params)
729 else:
730 res = transformer.fit(X, y, **fit_params).transform(X)
~/opt/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in fit_transform(self, X, y, **fit_params)
943
944 if any(sparse.issparse(f) for f in Xs):
--> 945 Xs = sparse.hstack(Xs).tocsr()
946 else:
947 Xs = np.hstack(Xs)
~/opt/anaconda3/lib/python3.7/site-packages/scipy/sparse/construct.py in hstack(blocks, format, dtype)
463
464 """
--> 465 return bmat([blocks], format=format, dtype=dtype)
466
467
~/opt/anaconda3/lib/python3.7/site-packages/scipy/sparse/construct.py in bmat(blocks, format, dtype)
584 exp=brow_lengths[i],
585 got=A.shape[0]))
--> 586 raise ValueError(msg)
587
588 if bcol_lengths[j] == 0:
ValueError: blocks[0,:] has incompatible row dimensions. Got blocks[0,8].shape[0] == 7920, expected 1.
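Reading the message: blocks 0 through 7 (the eight hand-written count features) appear to have been taken as having 1 row, while block 8 (the TF-IDF pipeline) has 7920 rows. A likely cause, assuming the custom transformers return a 1-D array, list, or pandas Series, is that a 1-D result gets promoted to a single row of length 7920 instead of a column with 7920 rows. The sketch below uses made-up numbers and only NumPy and SciPy to show the shape difference; `word_count` and `tfidf_block` are stand-ins, not the question's actual data.

```python
import numpy as np
from scipy import sparse

n_samples = 4

# Stand-in for a hand-written feature (e.g. a word count per tweet).
# A plain list or a pandas Series of numbers converts to the same 1-D shape.
word_count = np.array([3, 5, 2, 7])
print(word_count.shape)                  # (4,)

# Promoting a 1-D array to 2-D yields ONE ROW of length n_samples.
print(np.atleast_2d(word_count).shape)   # (1, 4)

# Reshaping to a column gives one row per sample, which is the layout
# every block needs before the blocks can be stacked side by side.
word_count_col = word_count.reshape(-1, 1)
print(word_count_col.shape)              # (4, 1)

# Stand-in for the sparse TF-IDF block: n_samples rows, 6 vocabulary terms.
tfidf_block = sparse.csr_matrix(np.eye(n_samples, 6))

# With matching row counts the horizontal stack succeeds.
combined = sparse.hstack([sparse.csr_matrix(word_count_col), tfidf_block]).tocsr()
print(combined.shape)                    # (4, 7)
```

If that assumption holds, the thing to check is the return value of each custom `transform` method: it should have shape (n_samples, 1), not (n_samples,).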
Dataset shape: (7920, 3)
Any prompt help would be greatly appreciated.
0 1 0 #fingerprint #Pregnancy Test https://google.com...
1 2 0 Finally a transparant silicon case ^^ Thanks t...
2 3 0 We love this! Would you go? #talk #makememorie...
3 4 0 I'm wired I know I'm George I was made that wa...
4 5 1 What amazing service! Apple won't even talk to...
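For reference, here is a minimal sketch of a count feature that returns a 2-D column and therefore stacks cleanly with a TF-IDF block inside a FeatureUnion. `WordCountColumn` and `TextColumn` are hypothetical names written for this illustration; they are not the `WordCalculator` or `ItemSelector` from the question, whose code is not shown. The toy input is a plain dict with a `tweet` key, and the same indexing should also work on a DataFrame with a `tweet` column.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline


class WordCountColumn(TransformerMixin, BaseEstimator):
    """Hypothetical word-count feature that returns shape (n_samples, 1)."""

    def __init__(self, column="tweet"):
        self.column = column

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn

    def transform(self, X):
        counts = [len(str(text).split()) for text in X[self.column]]
        # reshape(-1, 1) turns the 1-D counts into a column: one row per tweet.
        return np.asarray(counts, dtype=float).reshape(-1, 1)


class TextColumn(TransformerMixin, BaseEstimator):
    """Hypothetical selector that passes the raw text on to the vectorizer."""

    def __init__(self, column="tweet"):
        self.column = column

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return list(X[self.column])


# Toy data standing in for the real training set.
data = {"tweet": ["I love this phone", "worst service ever", "ok"]}

union = FeatureUnion([
    ("word_count", WordCountColumn("tweet")),
    ("tfidf", Pipeline([
        ("select", TextColumn("tweet")),
        ("tfidf", TfidfVectorizer()),
    ])),
])

features = union.fit_transform(data)
print(features.shape[0])  # 3 rows, one per tweet
```

The only part that matters for the error is the `reshape(-1, 1)` in `transform`; applying the same change to each of the eight count transformers is one way to try resolving the mismatch.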