Python: How do I count the number of times each word occurs in sentences, per sentence score?

I have a user survey file:

Score    Comment
8        Rapid bureaucratic affairs. Reports for policy...
4        There needs to be communication or feed back f...
7        service is satisfactory
5        Good
5        There is no
10       My main reason for the product is competition ...
9        Because I have not received the results. And m...
5        no reason

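I want to determine which keywords correspond to higher scores and which correspond to lower scores.

My idea is to build a table of words (or a "word vector" dictionary) that holds, for each word, the scores associated with it and the number of times the word occurs in sentences with each score.

Something like the following: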

Word        Score   Count
Word1:      7       1
            4       2
Word2:      5       1
            9       1
            3       2
            2       1
Word3:      9       3
Word4:      8       1
            9       1
            4       2
...         ...     ...

Then, for each word, the average score is the mean of all the scores associated with that word.

To do this, I wrote the following code:

word_vec = {}
# col 1 is the word, col 2 is the score, col 3 is the number of times it occurs

for i in range(len(data)):
    sentence = data['SurveyResponse'][i].split(' ')
    for word in sentence:
        word_vec['word'] = word
        if word in word_vec:
            word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':(word_vec[word]['NumberOfTimes'] += 1)}
        else:
            word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':1}

But this code gives me the following error:

File "<ipython-input-144-14b3edc8cbd4>", line 9
    word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':(word_vec[word]['NumberOfTimes'] += 1)}
                                                                                                  ^
SyntaxError: invalid syntax

Can someone tell me the correct way to do this?

Try this code:

word_vec = {}
# Each word maps to its accumulated score and the number of times it occurs

for i in range(len(data)):
    sentence = data['SurveyResponse'][i].split(' ')
    for word in sentence:
        if word in word_vec:
            # Keep accumulating the total score for each word; this makes the average easy to compute later
            word_vec[word]['Score'] += data['SCORE'][i]
            word_vec[word]['NumberOfTimes'] += 1
        else:
            word_vec[word] = {'Score': data['SCORE'][i], 'NumberOfTimes': 1}

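To increase the value of 'NumberOfTimes', you can increment it directly with word_vec[word]['NumberOfTimes'] += 1. The line in the question fails because += is a statement in Python, not an expression, so it cannot appear inside a dictionary literal.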
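As a self-contained sketch of the same accumulation, the loop below uses a small hypothetical list of (score, comment) pairs in place of the data frame. The sample rows, the lower-casing, and the variable names rows, score, and comment are assumptions for illustration, not part of the original survey file.

```python
# Hypothetical sample rows standing in for the survey data: (score, comment).
rows = [
    (7, "service is satisfactory"),
    (5, "Good"),
    (9, "Good service"),
]

word_vec = {}
for score, comment in rows:
    # split() with no argument also discards repeated whitespace.
    for word in comment.lower().split():
        if word in word_vec:
            # += is a statement, so each update gets its own line.
            word_vec[word]['Score'] += score
            word_vec[word]['NumberOfTimes'] += 1
        else:
            word_vec[word] = {'Score': score, 'NumberOfTimes': 1}

print(word_vec['service'])  # {'Score': 16, 'NumberOfTimes': 2}
print(word_vec['good'])     # {'Score': 14, 'NumberOfTimes': 2}
```

Storing the running total rather than the latest score means the average for a word is simply Score divided by NumberOfTimes.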

You can use collections.Counter. It counts the number of occurrences of each word.

Here is an example:

from collections import Counter

c = Counter(["jsdf","ijoiuj","je","oui","je","non","oui","je"])

print(c)

Result:

Counter({'je': 3, 'oui': 2, 'ijoiuj': 1, 'jsdf': 1, 'non': 1})

You can extract the words from your document and put them in a list. Counter will then process that list to count the occurrences of each word.
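To get the per-word, per-score counts shown in the question's table, one option is to keep a Counter per word, keyed by score. The sketch below assumes a small hypothetical list of (score, comment) pairs and simple whitespace tokenization; neither comes from the original data, and the name score_counts is made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows standing in for the survey data: (score, comment).
rows = [
    (5, "Good"),
    (5, "no reason"),
    (9, "Good service"),
    (5, "good enough"),
]

# Maps each word to a Counter of {score: number of occurrences at that score}.
score_counts = defaultdict(Counter)
for score, comment in rows:
    for word in comment.lower().split():
        score_counts[word][score] += 1

# Print one "word score count" line per (word, score) pair.
for word, counts in score_counts.items():
    for score, count in counts.items():
        print(word, score, count)
```

Here score_counts['good'] holds a count of 2 for score 5 and a count of 1 for score 9, which matches the Word / Score / Count layout sketched in the question.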

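Thanks! One more question: how do I create and store the average scores in a new dictionary, word_vec_avg = {}? I tried for k, v in word_vec.items(): word_vec_avg[k] = v['Score'] / v['NumberOfTimes'], but it gives the error TypeError: string indices must be integers.

Judging from the error you shared, the variable v does not seem to be a dictionary for at least one entry. Can you print the values of v and share the output?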
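One possible cause, assuming the dictionary was built with the word_vec['word'] = word line from the question's loop: that line stores a plain string under the key 'word', so v['Score'] fails for that entry. The sketch below reproduces this with hypothetical values and computes the averages while skipping any entry that is not a dictionary.

```python
# Hypothetical contents: two proper entries plus the stray string entry
# that word_vec['word'] = word would leave behind.
word_vec = {
    'good': {'Score': 14, 'NumberOfTimes': 2},
    'service': {'Score': 16, 'NumberOfTimes': 2},
    'word': 'service',
}

# Indexing a string with a string key raises the TypeError from the comment.
try:
    word_vec['word']['Score']
except TypeError as exc:
    print('TypeError:', exc)

# Average score per word, ignoring any value that is not a dict.
word_vec_avg = {
    k: v['Score'] / v['NumberOfTimes']
    for k, v in word_vec.items()
    if isinstance(v, dict)
}
print(word_vec_avg)  # {'good': 7.0, 'service': 8.0}
```

Filtering with isinstance is only a workaround; dropping the stray assignment from the loop avoids the problem at its source.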