Scikit-learn: why does the n-gram range drop the Neutral label and its probability?

Why is '9%': 'Neutral' missing from the output?

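Can you post a summary of the y values (for example, the labels and the number of samples per label)?

'13%': 'Amazing', '15%': 'Bad', '57%': 'Good', '6%': 'Terrible': it is usually 9%, and not only in the results for this batch.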
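import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Unigram and bigram counts, limited to the 5000 most frequent features
vectorizer = CountVectorizer(stop_words="english", ngram_range=(1, 2), max_features=5000)
X = vectorizer.fit_transform(data['reviews_text'])
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
bag_of_words = pd.DataFrame(X.toarray(), columns=vectorizer.get_feature_names_out())
x = bag_of_words
y = data['reviews_sentiment']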

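from sklearn.naive_bayes import MultinomialNB
model = MultinomialNB()
trained_model = model.fit(X, y)
# x_test is not defined in this snippet; presumably a sample vectorized with the same vectorizer
probas = trained_model.predict_proba(x_test)[0]
# Truncate each class probability to a whole percentage
probabilities = [str(int(p * 100)) + '%' for p in probas]
labels = list(trained_model.classes_)

# Percentage strings are the dict keys here, labels the values
dict(zip(probabilities, labels))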

{'13%': 'Amazing', '15%': 'Bad', '57%': 'Good', '6%': 'Terrible'}
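
One likely cause, offered as a sketch rather than a confirmed diagnosis: dict(zip(probabilities, labels)) uses the percentage strings as keys, and dict keys must be unique. If two classes truncate to the same percentage, the later pair overwrites the earlier one. Since classes_ is sorted, 'Neutral' comes before 'Terrible', so a shared '6%' would leave only 'Terrible'. The probability values below are hypothetical, chosen only so that both classes truncate to 6%; keying the dict by label instead keeps all five classes.

```python
# Hypothetical class probabilities; the real values come from predict_proba.
labels = ['Amazing', 'Bad', 'Good', 'Neutral', 'Terrible']
probas = [0.135, 0.155, 0.575, 0.068, 0.067]

probabilities = [str(int(p * 100)) + '%' for p in probas]

# Percentages as keys: 'Neutral' and 'Terrible' both map to '6%',
# so the later pair overwrites the earlier one.
by_percentage = dict(zip(probabilities, labels))
print(by_percentage)

# Labels as keys: class names are unique, so nothing is lost.
by_label = dict(zip(labels, probabilities))
print(by_label)
```

Under this assumption the ngram_range setting itself removes nothing. It only shifts the probabilities, which changes whether two of them happen to truncate to the same percentage.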