
Python: word frequency per DataFrame row

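I am trying to figure out how to get the most frequent words in each DataFrame row, say the top 10. I have some code that gives me the most frequent words across the whole DataFrame, but now I need something more granular.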

import pandas as pd
import numpy as np
df1 = pd.read_csv('C:/temp/comments.csv',encoding='latin-1',names=['client','comments'])
df1.head(3)

Now I can get the most frequent words across all of df1:

y = pd.Series(' '.join(df1['comments']).lower().split()).value_counts()[:10]

How do I get this information for each row of the DataFrame?

Depending on whether you want a DataFrame, a Series of dictionaries, or a list of dictionaries, there are a few different ways to do this.

from collections import Counter

# DataFrame of word counts: one row per comment, one column per distinct word
# (pd.Series(...).value_counts() is used because the top-level pd.value_counts is deprecated in recent pandas)
res = df1['comments'].str.split().apply(lambda words: pd.Series(words).value_counts())

# Series of Counter objects (dict-like word counts), one entry per row
res = df1['comments'].str.split().apply(Counter)

# list of Counter objects, one item per row
res = [Counter(x) for x in df1['comments'].str.split()]
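
The snippets above return full word counts per row. To keep only the top 10 words per row, as the question asks, one option is Counter.most_common. The following is a minimal sketch rather than a definitive solution: the sample data, the top_words helper, and the top_words column name are all hypothetical, and lowercasing is assumed to be wanted because the question's one-liner does it.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data standing in for comments.csv
df1 = pd.DataFrame({
    "client": ["a", "b"],
    "comments": [
        "good service good price",
        "Slow slow delivery but good support",
    ],
})

TOP_N = 10  # number of words to keep per row


def top_words(text, n=TOP_N):
    """Return up to n (word, count) pairs for the most common lowercase words."""
    if not isinstance(text, str):  # treat missing values (NaN/None) as empty
        return []
    return Counter(text.lower().split()).most_common(n)


# New column (hypothetical name) holding a list of (word, count) pairs per row
df1["top_words"] = df1["comments"].apply(top_words)
print(df1[["client", "top_words"]])
```

Splitting on whitespace leaves punctuation attached to words, so a real dataset would probably need some extra cleaning before counting.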