Python sentiment analysis: joining lists

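I am doing sentiment analysis with scikit-learn in Python, and I am now using nltk to lemmatize the words in order to speed up processing.

For example, after the nltk processing I get the following array: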

array([ ['Really', 'a', 'terrible', 'course', u'lecture', u'be', 'so', 'boring', 'i', u'contemplate', 'suicide', 'on', 'numerous', u'occasion', 'and', 'the', 'tutes', u'go', 'for', 'two', u'hour', 'and', u'be', 'completely'], ['Management', 'accounting', u'require', 'sufficient', 'practice', 'to', 'get', 'a', 'hang', 'of', 'Made', 'easier', 'with', 'a', 'great', 'lecturer']], dtype=object)
But scikit-learn requires the array to look like this:

array([ 'Really a terrible course  lectures were so boring i contemplated suicide on numerous occasions and the tutes went for two hours and were completely ', 'Management accounting requires sufficient practice to get a hang of  Made easier with a great lecturer '],dtype=object)
So what is the best way to convert this array into the right form? I tried one approach, but the results were strange.

You would do:

second_array = [' '.join(each) for each in first_array]
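To make the join step concrete, here is a minimal, self-contained sketch. The sample data is a shortened version of the arrays in the question, and the loop variable name tokens is just an illustrative choice:

```python
# Token lists as they might come out of an NLTK lemmatization step
# (shortened sample based on the question; values are illustrative).
first_array = [
    ["Really", "a", "terrible", "course"],
    ["Made", "easier", "with", "a", "great", "lecturer"],
]

# Join each token list into one space-separated string per document.
second_array = [" ".join(tokens) for tokens in first_array]

print(second_array)
```

The result is a plain list of strings, one per document, which is the shape scikit-learn's text vectorizers expect as input.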
Alternatively, you can tell CountVectorizer (from sklearn.feature_extraction.text) to use your tokens as they are:

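from sklearn.feature_extraction.text import CountVectorizer

# Identity preprocessor and tokenizer: each document is already a list of tokens.
vect = CountVectorizer(preprocessor=lambda x: x, tokenizer=lambda x: x)
X = vect.fit_transform(first_array)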
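As a rough end-to-end sketch of the pre-tokenized route, the snippet below runs the vectorizer on a shortened sample. It assumes scikit-learn is installed; the sample data and the names identity and vocab are illustrative, not part of the original answer:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Shortened, illustrative sample of already-tokenized documents.
first_array = [
    ["Really", "a", "terrible", "course"],
    ["Made", "easier", "with", "a", "great", "lecturer"],
]

def identity(doc):
    # Hypothetical helper: hand the token list through unchanged.
    return doc

# Identity preprocessor/tokenizer so the existing tokens are used as-is.
vect = CountVectorizer(preprocessor=identity, tokenizer=identity)
X = vect.fit_transform(first_array)

vocab = sorted(vect.vocabulary_)
print(X.shape)
print(vocab)
```

Because a custom preprocessor is supplied, the vectorizer's default lowercasing is skipped, so tokens such as "Really" keep their case, and single-character tokens like "a" are kept because the default token pattern is not used. Recent scikit-learn versions may emit a warning that token_pattern is ignored when a tokenizer is given.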