Python: using a lexicon to count the positive and negative words in a text

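I'm trying to figure out how to create a list of lists in which each sublist holds the number of positive words and the number of negative words in a given text. Below are the names of the positive and negative word files I'm working with, along with sample words from each file. There is also a sample of texts in the X_train variable, followed by what the output should look like.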

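positive_words.txt = # happy, great, amazing

negative_words.txt = # sad, bad, poor

X_train = ["the food was great and the service was amazing", "i am happy with my food", "my food tasted bad", "i am poor and cannot buy food so i am sad but at least i have chicken"]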

X_train_lexicon_features = ?

This is what the output of the variable above should look like:

print(X_train_lexicon_features)

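Output: [[2, 0], [1, 0], [0, 1], [0, 2]]

# Going by the example above, the first text in X_train should produce [2, 0], because it contains the two words "great" and "amazing", both of which are in the positive lexicon. Each sublist is [positive, negative].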

Below is a class that counts the number of positive and negative words:

class LexiconClassifier():
    def __init__(self):
        # Load the positive lexicon, one word per line, into a set for fast lookup
        self.positive_words = set()
        with open('positive-words.txt', encoding='utf-8') as iFile:
            for row in iFile:
                self.positive_words.add(row.strip())

        # Load the negative lexicon the same way (this file is read as iso-8859-1)
        self.negative_words = set()
        with open('negative-words.txt', encoding='iso-8859-1') as iFile:
            for row in iFile:
                self.negative_words.add(row.strip())
    
    def count_pos_words(self, sentence):
        num_pos_words = 0
        for word in sentence.lower().split():
            if word in self.positive_words:
                num_pos_words += 1
        return num_pos_words

    def count_neg_words(self, sentence):
        num_neg_words = 0
        for word in sentence.lower().split():
            if word in self.negative_words:
                num_neg_words += 1
        return num_neg_words

Below is the code I ran to get the number of positive words in each text:

myLC = LexiconClassifier()

X_train_lexicon_features = []

for i in X_train:
    X_train_lexicon_features.append(myLC.count_pos_words(i))

Output: [2, 1, 0, 0]

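What I'm not sure about is how to work the "count_neg_words" function into the code above so that it returns a list like this instead: [[2, 0], [1, 0], [0, 1], [0, 2]]

I'd appreciate any suggestions, and thank you in advance.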
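One possible way to get the [positive, negative] pairs is to build both counts for each text in a single pass over X_train. The following is only a minimal sketch under stated assumptions: the two in-memory sets stand in for the lexicon files and hold just the sample words from the question, and count_matches and lexicon_features are hypothetical helper names, not part of the class above.

```python
# Assumption: these sets stand in for the contents of the two lexicon files
# and contain only the sample words given in the question.
POSITIVE_WORDS = {"happy", "great", "amazing"}
NEGATIVE_WORDS = {"sad", "bad", "poor"}


def count_matches(sentence, lexicon):
    """Count the whitespace-separated tokens of `sentence` found in `lexicon`."""
    return sum(1 for word in sentence.lower().split() if word in lexicon)


def lexicon_features(texts):
    """Return one [positive_count, negative_count] pair per text."""
    return [
        [count_matches(text, POSITIVE_WORDS), count_matches(text, NEGATIVE_WORDS)]
        for text in texts
    ]


X_train = [
    "the food was great and the service was amazing",
    "i am happy with my food",
    "my food tasted bad",
    "i am poor and cannot buy food so i am sad but at least i have chicken",
]

X_train_lexicon_features = lexicon_features(X_train)
print(X_train_lexicon_features)  # [[2, 0], [1, 0], [0, 1], [0, 2]]
```

With the existing LexiconClassifier, the same idea would presumably amount to appending a two-element list inside the loop, i.e. [myLC.count_pos_words(i), myLC.count_neg_words(i)], instead of the positive count alone. Note that splitting on whitespace leaves punctuation attached to tokens, so a word followed directly by a comma would not match the lexicon.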