In Python, how do I count how many times a word is repeated in a column for a specific category?


I've been stuck on this problem for several days, so I'd really appreciate any help. I have a dataframe with the following columns:

 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----   
0   PhraseId    93636 non-null  int64   
1   SentenceId  93636 non-null  int64   
2   Phrase      93636 non-null  object  
3   Sentiment   93636 non-null  int64 
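Sentiment ranges from 0 to 4, which is essentially a rating from good to bad. I added two columns that might help: the number of words in each phrase, and each phrase split into a list of the words it contains.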
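The two helper columns described here could be derived roughly as follows. This is a minimal sketch, assuming the phrases are already tokenized with spaces between tokens (the counts shown later list ',' and "'s" as separate words, which suggests so). The two-row frame is made-up illustration data, not rows from the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real train_data frame.
train_data = pd.DataFrame(
    {
        "Phrase": ["a fine film .", "dull and slow"],
        "Sentiment": [3, 1],
    }
)

# Split each phrase on whitespace into a list of tokens.
train_data["SplitPhrase"] = train_data["Phrase"].str.split()

# .str.len() also works on lists, giving the number of tokens per phrase.
train_data["NumOfWords"] = train_data["SplitPhrase"].str.len()

print(train_data)
```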

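What I want to do is create 4 bar charts (one per sentiment), each showing the 15 most frequently repeated words in that sentiment. The x-axis would hold the top 15 words for that sentiment.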

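Below I've pasted the code I wrote, which counts how many times each word is repeated in each sentiment. This is probably what the bar charts need.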

Sample data:

       PhraseId  SentenceId  Phrase                Sentiment  SplitPhrase               NumOfWords
44723  75358     3866        Build some robots...  0          [Build, some, robots...]  52
To count how many times each word is repeated in each sentiment:

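from collections import Counter

# Build one Counter of word frequencies per sentiment value.
counters = {}
for sentiment in train_data['Sentiment'].unique():
    counters[sentiment] = Counter()
    # Boolean mask selecting the rows that have this sentiment.
    indices = (train_data['Sentiment'] == sentiment)
    for phrase in train_data['SplitPhrase'][indices]:
        # Each phrase is a list of words; update() adds one count per word.
        counters[sentiment].update(phrase)

print(counters)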
Sample output:

{2: Counter({'the': 28041, ',': 25046, 'a': 19962, 'of': 19376, 'and': 19052, 'to': 13470, '.': 10505, "'s": 10290, 'in': 8108, 'is': 8012, 'that': 7276, 'it': 6176, 'as': 5027, 'with': 4474, 'for': 4362, 'its': 4159, 'film': 3933......}),
 3: Counter({'the': 28041, ',': 25046, 'a': 19962,.....
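
One possible way to get from these counters to the bar charts is sketched below. It is not taken from an answer in the thread: `top_words` and `plot_top_words` are hypothetical helper names, `demo_counters` is made-up data, and the plotting part assumes matplotlib is installed. `Counter.most_common(n)` returns the n highest counts in descending order, which gives the x-axis labels and bar heights directly.

```python
from collections import Counter


def top_words(counters, n=15):
    """Return {sentiment: (words, counts)} holding the n most frequent words."""
    top = {}
    for sentiment, counter in sorted(counters.items()):
        pairs = counter.most_common(n)  # [(word, count), ...], highest first
        words = [word for word, _ in pairs]
        counts = [count for _, count in pairs]
        top[sentiment] = (words, counts)
    return top


def plot_top_words(counters, n=15):
    """Draw one bar chart per sentiment and return the figure."""
    # Imported here so top_words stays usable without matplotlib.
    import matplotlib.pyplot as plt

    top = top_words(counters, n)
    fig, axes = plt.subplots(
        len(top), 1, figsize=(10, 4 * len(top)), squeeze=False
    )
    for ax, (sentiment, (words, counts)) in zip(axes[:, 0], top.items()):
        ax.bar(words, counts)  # words become categorical x-axis labels
        ax.set_title(f"Sentiment {sentiment}: top {n} words")
        ax.set_ylabel("Count")
        ax.tick_params(axis="x", labelrotation=45)
    fig.tight_layout()
    return fig


# Made-up counters standing in for the real ones built from train_data.
demo_counters = {
    0: Counter({"bad": 3, "the": 5, "film": 2}),
    4: Counter({"great": 4, "the": 6, "film": 1}),
}
demo_top = top_words(demo_counters, n=2)
print(demo_top)
```

With the real data, calling `plot_top_words(counters)` and then `plt.show()` (or `fig.savefig(...)`) would render the charts. As the sample output suggests, punctuation and very common words such as 'the' will dominate unless they are filtered out of each Counter first.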

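Your explanation makes sense; however, please include sample data, not just the output of df.info(). Please see this link on how to ask a good pandas question:

OK, thanks, I've attached an image of the sample data.

No images! Please read the link I shared :)

I've edited it once more; I hope it's better now. I also changed my question slightly, because I found a way to count how many times each word is repeated in each sentiment. I now need to create a bar chart based on that.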