How to calculate word frequency from a DataFrame - Python

I have created a pandas DataFrame from a dictionary. The DataFrame looks like:

      URL         TITLE
0   /xxxx.xx   Hi this is word count
1   /xxxx.xx   Hi this is Stack Overflow
2   /xxxx.xx   Stack Overflow Questions

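I would like to add a new column to this table that lists how often the phrase "Stack Overflow" occurs in each title. For example, it would look like: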

      URL         TITLE                          COUNT
0   /xxxx.xx   Hi this is word count               0
1   /xxxx.xx   Hi this is Stack Overflow           1
2   /xxxx.xx   Stack Overflow Questions            1

The count() function does not seem to work on dictionaries, only on strings. Is there a simple way to do this?

Assuming this is actually a pandas DataFrame, you can do:
import pandas as pd

table = {   'URL': ['/xxxx.xx', '/xxxx.xx', '/xxxx.xx'], 
            'TITLE': ['Hi this is word count', 'Hi this is Stack Overflow', 'Stack Overflow Questions']}

df = pd.DataFrame(table)
df['COUNT'] = df.TITLE.str.count('Stack Overflow')
print(df)

This yields (the column order may vary with your pandas and Python version):
                       TITLE       URL  COUNT
0      Hi this is word count  /xxxx.xx      0
1  Hi this is Stack Overflow  /xxxx.xx      1
2   Stack Overflow Questions  /xxxx.xx      1
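
One detail worth knowing: str.count() treats its argument as a regular expression, so a plain 'Stack Overflow' pattern would also match inside a longer string such as "Notstack Overflow". The following is a minimal sketch of a stricter match, assuming you want whole-word, case-insensitive matching. The sample titles and the names phrase and pattern are made up for illustration and are not part of the original question.

```python
import re
import pandas as pd

# Hypothetical sample data chosen to show the edge cases
df = pd.DataFrame({
    'TITLE': ['Hi this is word count',
              'Notstack Overflow',
              'stack overflow and Stack Overflow'],
})

phrase = 'Stack Overflow'
# \b requires a word boundary on each side of the phrase;
# re.escape protects against regex metacharacters in the phrase
pattern = r'\b' + re.escape(phrase) + r'\b'

# flags is passed through to the re module; IGNORECASE is an assumption here
df['COUNT'] = df['TITLE'].str.count(pattern, flags=re.IGNORECASE)
print(df)
```

Whether "stack overflow" in lower case should count is a judgment call; drop the flags argument to keep the match case-sensitive.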

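The str.count() method used above is good at counting the occurrences of a single value, such as "Stack Overflow", in each row.
For frequency analysis of multiple values, consider using the … and … methods.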
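
The method names did not survive in the text above, so the following is only one possible approach and not necessarily what the answer had in mind. It is a minimal sketch that assumes a simple whitespace split is good enough to separate words, and it uses collections.Counter from the standard library. The names word_counts and top_words are illustrative.

```python
from collections import Counter
import pandas as pd

df = pd.DataFrame({
    'TITLE': ['Hi this is word count',
              'Hi this is Stack Overflow',
              'Stack Overflow Questions'],
})

# Split each title on whitespace and tally every word across all rows
word_counts = Counter(
    word for title in df['TITLE'] for word in title.split()
)

# The three most frequent words with their counts
top_words = word_counts.most_common(3)
print(top_words)
```

A pandas-only variant of the same idea would be df['TITLE'].str.split().explode().value_counts(), which returns the tallies as a Series instead of a Counter.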
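Can you show the dictionary and the code that creates the table, and say whether it is a pandas DataFrame?
"Stack Overflow" is not one word but two. What should happen if a string contains "Overflow Stack" or "Notstack Overflow"?
@Jan Yes, this is a dictionary that I converted into a DataFrame. I will try to clarify the question further.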