Is there a way to get the degree of positivity or negativity when using logistic regression for sentiment analysis?

Tags: machine-learning, deep-learning, logistic-regression, sentiment-analysis, python-3.7

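I have been using logistic regression for sentiment analysis, where the prediction is only ever 1 or 0, for positive or negative sentiment respectively.

My challenge is that I want to classify a given user input into four classes (very good, good, average, poor), but every prediction comes back as 1 or 0.

Below is my code so far.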

from sklearn.feature_extraction.text import CountVectorizer
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from sklearn.metrics import classification_report
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_files
from sklearn.model_selection import GridSearchCV
import numpy as np
import mglearn
import matplotlib.pyplot as plt
# import warnings filter
from warnings import simplefilter
# ignore all future warnings
#simplefilter(action='ignore', category=FutureWarning)

# Get the dataset from http://ai.stanford.edu/~amaas/data/sentiment/

reviews_train = load_files("aclImdb/train/")
text_train, y_train = reviews_train.data, reviews_train.target

print("")
print("Number of documents in train data: {}".format(len(text_train)))
print("")
print("Samples per class (train): {}".format(np.bincount(y_train)))
print("")

reviews_test = load_files("aclImdb/test/")
text_test, y_test = reviews_test.data, reviews_test.target

print("Number of documents in test data: {}".format(len(text_test)))
print("")
print("Samples per class (test): {}".format(np.bincount(y_test)))
print("")


vect = CountVectorizer(stop_words="english", analyzer='word',
                       ngram_range=(1, 1), max_df=1.0, min_df=1,
                       max_features=None)
X_train = vect.fit(text_train).transform(text_train)
X_test = vect.transform(text_test)

print("Vocabulary size: {}".format(len(vect.vocabulary_)))
print("")
print("X_train:\n{}".format(repr(X_train)))
print("X_test: \n{}".format(repr(X_test)))

feature_names = vect.get_feature_names()
print("Number of features: {}".format(len(feature_names)))
print("")

param_grid = {'C': [0.001, 0.01, 0.1, 1, 10]}
# Grid-search the regularization strength C with 5-fold cross-validation
grid = GridSearchCV(
    LogisticRegression(penalty='l1', dual=False, max_iter=110, solver='liblinear'),
    param_grid, cv=5)
grid.fit(X_train, y_train)

print("Best cross-validation score: {:.2f}".format(grid.best_score_))
print("Best parameters: ", grid.best_params_)
print("Best estimator: ", grid.best_estimator_)

lr = grid.best_estimator_
lr.predict(X_test)

print("Best Estimator Score: {:.2f}".format(lr.score(X_test, y_test)))
print("")

# Collect every sentence entered, to predict on all of them at the end
lst = []

# Number of sentences to read
print("")
n = int(input("Enter number of rounds : "))

# Read and classify one sentence per round
for i in range(0, n):
    temp = []
    ele = input("\n Please Enter a sentence to get a sentiment Evaluation.  \n\n")
    temp.append(ele)

    print("")
    print("Review prediction: {}".format(lr.predict(vect.transform(temp))))
    print("")
    lst.append(ele)  # adding the element

print(lst)
print("")
print("Overall prediction: {}".format(lr.predict(vect.transform(lst))))
print("")

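I would like to get graded values somewhere between 0 and 1 instead, like the ones you get from polarity_scores in VADER's SentimentIntensityAnalyzer.

Below is a code sample of what I want to achieve, using SentimentIntensityAnalyzer's polarity_scores.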

# Import the SentimentIntensityAnalyzer class
# from the vaderSentiment.vaderSentiment module.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer


# Print the sentiment scores of a sentence.
def sentiment_scores(sentence):

    # Create a SentimentIntensityAnalyzer object.
    sid_obj = SentimentIntensityAnalyzer()

    # polarity_scores returns a dictionary
    # with pos, neg, neu, and compound scores.
    sentiment_dict = sid_obj.polarity_scores(sentence)

    print("")
    print("\n Overall sentiment dictionary is : ", sentiment_dict, " \n")
    print("sentence was rated as: ", sentiment_dict['neg']*100, "% Negative \n")
    print("sentence was rated as: ", sentiment_dict['neu']*100, "% Neutral \n")
    print("sentence was rated as: ", sentiment_dict['pos']*100, "% Positive \n")

    print("Sentence Overall Rated As: ", end=" ")

    # Map the compound score to a rating.
    if sentiment_dict['compound'] >= 0.5:
        print("Excellent \n")
    elif sentiment_dict['compound'] > 0 and sentiment_dict['compound'] < 0.5:
        print("Very Good \n")
    elif sentiment_dict['compound'] == 0:
        print("Good \n")
    elif sentiment_dict['compound'] <= -0.5:
        print("Average \n")
    elif sentiment_dict['compound'] > -0.5 and sentiment_dict['compound'] < 0:
        print("Poor \n")


# Driver code
if __name__ == "__main__":

    while True:
        sentence = input("\n Please enter a sentence to get a sentiment evaluation. Enter exit to end program \n")

        if sentence == "exit":
            print("\n Program End...........\n")
            print("")
            break
        else:
            sentiment_scores(sentence)

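You have a few options.

1: Label your initial training data with multiple classes, according to how negative or positive each example is, instead of just 0 or 1, and perform multi-class classification.

2: As option 1 may not be possible, try the predict_proba(X), predict_log_proba(X), and decision_function(X) methods, and use their results to bin the output into 4 classes according to some hard-coded thresholds. I would recommend predict_proba, as those numbers are directly interpretable as probabilities, which is one of the main benefits of logistic regression over other methods. For example, assuming column 1 (not column 0) is the "positive" class: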

probs = lr.predict_proba(X_test)
labels = np.repeat("very_good", len(probs))
labels[probs[:, 1] <  0.75] = "good"
labels[probs[:, 1] < 0.5] = "average"
labels[probs[:, 1] < 0.25] = "poor"
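
For a single sentence, the same binning can be sketched without NumPy. This is only an illustration under some assumptions: the helper names `bucket` and `signed_score` are made up here, the cut points simply reuse the 0.25 / 0.5 / 0.75 thresholds above, and rescaling to [-1, 1] is just one simple way to get a signed value loosely comparable to VADER's compound score, not VADER's own formula.

```python
from bisect import bisect_right

# Cut points and labels mirror the hard-coded thresholds used above.
CUTS = [0.25, 0.5, 0.75]
NAMES = ["poor", "average", "good", "very_good"]


def bucket(p_positive):
    """Map P(positive) in [0, 1] to one of four labels (hypothetical helper)."""
    # bisect_right counts how many cut points are <= p_positive,
    # so a value exactly on a cut point falls into the higher class.
    return NAMES[bisect_right(CUTS, p_positive)]


def signed_score(p_positive):
    """Linearly rescale P(positive) from [0, 1] to [-1, 1] (an assumption, not VADER's formula)."""
    return 2.0 * p_positive - 1.0


for p in (0.1, 0.3, 0.6, 0.9):
    print(p, bucket(p), round(signed_score(p), 2))
```

With the fitted model from the question, `p` would come from something like `lr.predict_proba(vect.transform([sentence]))[0, 1]`.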
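
Option 1 could look roughly like the sketch below. The texts and their four labels are invented toy data, purely to show the shape of the code; with the aclImdb dataset the star rating is encoded in each file name (`id_rating.txt`), so one possible way to get multi-class labels would be to bin those ratings, but that step is not shown here.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical hand-labelled toy data: 0=poor, 1=average, 2=good, 3=very_good.
texts = [
    "terrible awful boring film", "worst movie ever, awful",
    "mediocre and forgettable plot", "average acting, mediocre story",
    "good fun, enjoyable film", "enjoyable and good acting",
    "brilliant masterpiece, loved it", "loved it, brilliant and superb",
]
labels = [0, 0, 1, 1, 2, 2, 3, 3]
names = ["poor", "average", "good", "very_good"]

# Bag-of-words features, as in the question.
vect = CountVectorizer()
X = vect.fit_transform(texts)

# With more than two classes in y, LogisticRegression fits a multi-class model.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)

new = vect.transform(["awful boring movie"])
pred = clf.predict(new)[0]         # one of 0..3
proba = clf.predict_proba(new)[0]  # one probability per class, summing to 1
print(names[pred], proba.round(2))
```

Here `predict` returns one of the four classes directly, and `predict_proba` returns one column per class in the order given by `clf.classes_`.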

Thank you very much, the predict_proba() method worked for me.

@MabutaBee If this answer helped you, consider upvoting it.