In Python, how do I count how many times a word is repeated in a column for a specific category?

python pandas dataframe

I've been stuck on this problem for several days now, and I'd really appreciate any help. I have a DataFrame with these columns:

 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----   
0   PhraseId    93636 non-null  int64   
1   SentenceId  93636 non-null  int64   
2   Phrase      93636 non-null  object  
3   Sentiment   93636 non-null  int64

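Sentiment ranges from 0 to 4, which is basically a rating from good to bad. I added two columns that might help: the number of words in each phrase (NumOfWords), and each phrase split into a list of the words it contains (SplitPhrase).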
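For readers following along, here is a minimal sketch of how two such helper columns could be built. The question does not show how they were actually created, so this assumes simple whitespace tokenization; the small `train_data` frame and its phrases are invented stand-ins for the real data.

```python
import pandas as pd

# Hypothetical stand-in for the asker's train_data; the phrases are invented.
train_data = pd.DataFrame({
    "Phrase": ["a good film", "bad , bad plot"],
    "Sentiment": [3, 1],
})

# Assumption: tokens are separated by whitespace.
# str.split() with no arguments turns each phrase into a list of tokens.
train_data["SplitPhrase"] = train_data["Phrase"].str.split()

# str.len() also works on list-valued cells, giving the token count per row.
train_data["NumOfWords"] = train_data["SplitPhrase"].str.len()

print(train_data)
```

Whitespace splitting keeps punctuation such as `,` as its own token only if the text is already space-separated, which appears to be the case in the counts shown in the question.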

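What I want to do is create 4 bar charts (one for each sentiment), each showing the 15 most frequently repeated words in that sentiment. The x-axis would show those top 15 words.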

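Below I've pasted code I wrote that counts how many times each word is repeated within each sentiment. This is probably what the bar charts need.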

Sample data:

       PhraseId  SentenceId  Phrase                Sentiment  SplitPhrase               NumOfWords
44723  75358     3866        Build some robots...  0          [Build, some, robots...]  52

To count how many times each word is repeated within each sentiment:

from collections import Counter

# One Counter per sentiment value, keyed by sentiment
counters = {}
for sentiment in train_data['Sentiment'].unique():
    counters[sentiment] = Counter()
    # Boolean mask selecting the rows with this sentiment
    mask = train_data['Sentiment'] == sentiment
    # Each SplitPhrase entry is a list of words; add them to the tally
    for words in train_data.loc[mask, 'SplitPhrase']:
        counters[sentiment].update(words)

print(counters)

Sample output:

{2: Counter({'the': 28041, ',': 25046, 'a': 19962, 'of': 19376, 'and': 19052, 'to': 13470, '.': 10505, "'s": 10290, 'in': 8108, 'is': 8012, 'that': 7276, 'it': 6176, 'as': 5027, 'with': 4474, 'for': 4362, 'its': 4159, 'film': 3933......}),
 3: Counter({'the': 28041, ',': 25046, 'a': 19962,.....
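As an aside, the same per-sentiment tally can probably be obtained without an explicit loop. The sketch below is one possible pandas-only variant, not the asker's method; it assumes `SplitPhrase` holds lists of words, and the tiny `train_data` frame is invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for train_data; the rows are invented.
train_data = pd.DataFrame({
    "Sentiment": [0, 0, 1],
    "SplitPhrase": [["bad", "film"], ["bad", "plot"], ["good", "film"]],
})

# explode() gives every word its own row, keeping the Sentiment of its phrase.
# value_counts() on the grouped column then counts words within each sentiment,
# sorted from most to least frequent inside each group.
word_counts = (
    train_data.explode("SplitPhrase")
    .groupby("Sentiment")["SplitPhrase"]
    .value_counts()
)

# Keep at most the 15 most frequent words of each sentiment.
top_words = word_counts.groupby(level="Sentiment").head(15)

print(top_words)
```

The result is a Series indexed by (Sentiment, word), so `top_words.loc[0]` gives the counts for sentiment 0 alone.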

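Your explanation makes sense; however, please include sample data, not just the output of df.info(). Please see this link on how to ask a good pandas question:

OK, thanks, I've attached an image of the sample data.

No images! Please read the link I shared :)

I've edited it once more; I hope it's better now. I also changed my question slightly, because I found a way to count how many times each word is repeated within each sentiment. I now need to create a bar chart based on that.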
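One way the bar charts could be drawn from a `counters` dictionary like the one in the question is sketched below. It is only a sketch under assumptions: it uses matplotlib (which the question does not mention), and the small `counters` dictionary here is invented so the example runs on its own. `Counter.most_common(n)` returns the `n` highest counts in descending order, which matches the "top 15 words" requirement.

```python
from collections import Counter

import matplotlib.pyplot as plt

# Hypothetical counters in the same shape as the question's output;
# the words and counts are invented.
counters = {
    0: Counter({"bad": 5, "film": 3, "plot": 2}),
    1: Counter({"good": 4, "film": 3, "fun": 1}),
}

TOP_N = 15  # number of words to show per sentiment

sentiments = sorted(counters)

# One subplot per sentiment, stacked vertically.
# squeeze=False keeps axes two-dimensional even with a single sentiment.
fig, axes = plt.subplots(
    len(sentiments), 1, figsize=(10, 4 * len(sentiments)), squeeze=False
)

for ax, sentiment in zip(axes[:, 0], sentiments):
    # most_common returns (word, count) pairs, highest count first.
    top = counters[sentiment].most_common(TOP_N)
    words = [word for word, _ in top]
    counts = [count for _, count in top]

    ax.bar(words, counts)
    ax.set_title(f"Sentiment {sentiment}: top {len(words)} words")
    ax.set_xlabel("Word")
    ax.set_ylabel("Count")
    ax.tick_params(axis="x", rotation=45)

fig.tight_layout()
# plt.show()  # uncomment when running as a script to display the figure
```

With the real data, the invented `counters` would be replaced by the dictionary built in the question. Since tokens such as `the` and `,` dominate every sentiment there, filtering punctuation and very common words before plotting may make the charts more informative.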