Python: Count the occurrences of words in a string column in pandas


I have a DataFrame like the one below. Each entry in the column words contains one or more words separated by ;
import pandas as pd
import numpy as np
dfm = pd.DataFrame({'id': np.arange(5), 'words': ['apple;pear;orange', 'apple', 'pear;grape', 'orange', 'orange;pear']})

I need to count the number of occurrences of each word. This is the output I need:
    word    count
0   apple   2
1   pear    3
2   orange  3
3   grape   1

Does anyone know how I can do this? Thanks.

You can split the words, explode() the resulting lists, and call value_counts() on the result, for example:
In []:
dfm.words.str.split(';').explode().value_counts()

Out[]:
orange    3
pear      3
apple     2
grape     1
Name: words, dtype: int64
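
value_counts() returns a Series, while the question asks for a table with word and count columns. A minimal sketch of one way to convert it, assuming the column names word and count from the requested output (the variable names counts and result are only illustrative):

```python
import numpy as np
import pandas as pd

dfm = pd.DataFrame({'id': np.arange(5),
                    'words': ['apple;pear;orange', 'apple', 'pear;grape', 'orange', 'orange;pear']})

# Count every word after splitting on ';' and exploding the lists into rows.
counts = dfm['words'].str.split(';').explode().value_counts()

# Name the index 'word' and turn the Series into a two-column DataFrame.
# Passing name='count' sets the name of the values column explicitly.
result = counts.rename_axis('word').reset_index(name='count')
print(result)
```

The rows come back sorted by count in descending order, so the row order may differ from the table in the question.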

Alternatively, you can use groupby(), which does not sort by count and returns a DataFrame in the shape you are looking for:
In []:
words = dfm.words.str.split(';').explode()
words.groupby(words).count().to_frame('count').reset_index()

Out[]:
    words  count
0   apple      2
1   grape      1
2  orange      3
3    pear      3
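
The groupby() result above is ordered alphabetically and its first column is named words, whereas the table in the question lists the words in order of first appearance under a column named word. If that exact layout matters, a sketch along these lines may help. It assumes that passing sort=False to groupby() keeps the groups in first-seen order, and the name result is illustrative:

```python
import numpy as np
import pandas as pd

dfm = pd.DataFrame({'id': np.arange(5),
                    'words': ['apple;pear;orange', 'apple', 'pear;grape', 'orange', 'orange;pear']})

# One row per word; drop the repeated original index so it is unique again.
words = dfm['words'].str.split(';').explode().reset_index(drop=True)

result = (
    words.groupby(words, sort=False)   # sort=False keeps groups in order of first appearance
         .size()                       # number of rows in each group
         .rename_axis('word')          # name the index so it becomes the 'word' column
         .reset_index(name='count')    # values column is named 'count'
)
print(result)
```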

What have you tried so far to solve this? Can't it simply be broken down into a few separate steps?
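
For reference, the separate steps might look like the sketch below. It is the same split / explode / count chain from the answer, written out with one intermediate variable per step (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

dfm = pd.DataFrame({'id': np.arange(5),
                    'words': ['apple;pear;orange', 'apple', 'pear;grape', 'orange', 'orange;pear']})

# Step 1: split each string on ';' to get a list of words per row.
word_lists = dfm['words'].str.split(';')

# Step 2: give every list element its own row (the original row index is repeated).
exploded = word_lists.explode()

# Step 3: count how often each distinct word occurs.
counts = exploded.value_counts()

print(word_lists)
print(exploded)
print(counts)
```

Printing each intermediate result makes it easier to see where a count comes from if the data contains stray whitespace or empty strings.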