Python 3.x: Extract and count hashtags from a DataFrame


python-3.x pandas


I have a DataFrame containing some tweets, like this:

import pandas as pd

tweets = pd.Series(['This is a tweet example #help #thankyou',
                    'Second tweet example #help',
                    'Third tweet example #help #stackoverflow'])

tweets_df = pd.DataFrame({'Tweets': tweets})

Then I put the hashtags into another column of the DataFrame:

import re
tweets_df['hashtags'] = tweets_df['Tweets'].apply(lambda twt: re.findall(r"#(\w+)", twt))

Now I want to count them and put the result into another DataFrame. I tried the following, but it did not work:

tweets_df['hashtags'].str.split(expand=True).stack().value_counts()

The result should look something like this:

#help           3
#thankyou       1
#stackoverflow  1

Let's use extractall and value_counts:
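A minimal sketch of that approach, assuming the `tweets_df` from the question. The exact regex pattern here is an assumption, chosen so that the `#` stays part of each counted value:

```python
import pandas as pd

tweets_df = pd.DataFrame({'Tweets': [
    'This is a tweet example #help #thankyou',
    'Second tweet example #help',
    'Third tweet example #help #stackoverflow',
]})

# extractall returns one row per regex match; column 0 holds the first capture group.
# The pattern r'(#\w+)' is an assumption: a '#' followed by one or more word characters.
hashtag_counts = tweets_df['Tweets'].str.extractall(r'(#\w+)')[0].value_counts()
print(hashtag_counts)
```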

Output:

#help             3
#stackoverflow    1
#thankyou         1
Name: 0, dtype: int64

You can use Counter:
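One way this could look, as a sketch rather than the answerer's exact code: it assumes the `tweets` Series from the question and uses `collections.Counter` from the standard library. The variable names `hashtag_counter` and `hashtag_counts` are illustrative.

```python
import re
from collections import Counter

import pandas as pd

tweets = pd.Series(['This is a tweet example #help #thankyou',
                    'Second tweet example #help',
                    'Third tweet example #help #stackoverflow'])

# Flatten every hashtag found in every tweet into a single Counter.
hashtag_counter = Counter(tag for twt in tweets for tag in re.findall(r'#\w+', twt))

# Optionally turn the Counter into a Series sorted by count (highest first).
hashtag_counts = pd.Series(hashtag_counter).sort_values(ascending=False)
print(hashtag_counts)
```

`hashtag_counter.most_common()` gives the same tallies as a plain list of `(hashtag, count)` tuples if no pandas object is needed.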

You don't need to turn the tweets into a DataFrame. Just run the extraction directly on the Series:

tweets.str.extractall(r'(#\w+)')[0].value_counts()

#help             3
#stackoverflow    1
#thankyou         1
Name: 0, dtype: int64
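The question also asks for the counts in a separate DataFrame. A hedged sketch of one way to get there from a list-valued `hashtags` column, assuming pandas 0.25 or later for `Series.explode`. The column names `hashtag` and `count` and the variable `counts_df` are illustrative, and the regex keeps the `#` (unlike the capture group in the question) so the values match the requested output:

```python
import re

import pandas as pd

tweets_df = pd.DataFrame({'Tweets': [
    'This is a tweet example #help #thankyou',
    'Second tweet example #help',
    'Third tweet example #help #stackoverflow',
]})

# One list of hashtags per tweet, with the leading '#' kept.
tweets_df['hashtags'] = tweets_df['Tweets'].apply(lambda twt: re.findall(r'#\w+', twt))

counts_df = (
    tweets_df['hashtags']
    .explode()                   # one row per hashtag instead of one list per tweet
    .value_counts()              # tally identical hashtags
    .rename_axis('hashtag')      # name the index so it becomes a labelled column
    .reset_index(name='count')   # move the index into a column, yielding a DataFrame
)
print(counts_df)
```

Calling `.str.split` on the `hashtags` column does not help because its elements are already lists rather than strings; `explode` is the operation that unpacks them.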
