Python: counting the values of 2 categories in a DataFrame into a pivot table


python pandas


This is a reference I found that does something similar, but not exactly what I need.

What I have:
A DataFrame in the following format:

    Tweets                                                                                                                 Classified  FreqWord
    calm director day science meetings nasal talk cutting edge remote sensing research drought veg fluorescence calm love  Positive    drought
    love thought drought                                                                                                   Positive    drought
    reign mother kerr funny none tried make come back drought                                                              Positive    drought
    wonder could help thai market b post reuters drought devastates south europe crops                                     Negative    drought
    wonder could help thai market b post reuters drought devastates south europe crops                                     Negative    crops
    wonder could help thai market b post reuters drought devastates south europe crops                                     Negative    crops
    wonder could help thai market b post reuters drought devastates south europe crops                                     Negative    business
    every child safe drinking water thank uk aid providing suppo ensure children rights drought                            Positive    drought
    every child safe drinking water thank uk aid providing suppo ensure children rights drought                            Positive    water

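What I need:
A DataFrame shaped like a pivot table, with Classified as the index, FreqWord as the columns, and, as values, the number of tweets with each classification for that frequent word. In short, something like the following: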

Classified  drought crops   business    water
Positive        4       0          0        1
Negative        1       2          1        0

Also note:

In this dataset I have many more FreqWord and Classified values than the ones shown here.

You can do this:

pd.crosstab(df.Classified, df.FreqWord)

Output:

FreqWord    business  crops  drought  water
Classified                                 
Negative           1      2        1      0
Positive           0      0        4      1
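
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes only the two relevant columns matter, so the Tweets column is left out, and the way the frame is built is my own reconstruction of the sample rows in the question rather than the asker's actual data.

```python
import pandas as pd

# Rebuild the Classified / FreqWord columns from the sample rows in the question.
# The Tweets column is omitted (assumption: it plays no role in the counting).
df = pd.DataFrame({
    "Classified": ["Positive", "Positive", "Positive", "Negative", "Negative",
                   "Negative", "Negative", "Positive", "Positive"],
    "FreqWord": ["drought", "drought", "drought", "drought", "crops",
                 "crops", "business", "drought", "water"],
})

# crosstab counts how often each (Classified, FreqWord) pair occurs;
# pairs that never occur show up as 0.
counts = pd.crosstab(df["Classified"], df["FreqWord"])
print(counts)
```

Both the rows and the columns come back sorted alphabetically, which is why Negative is listed before Positive.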

Or use get_dummies:

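# Build indicator columns for FreqWord, then sum them per Classified value
# (groupby replaces .sum(level=0), removed in pandas 2.0; sort=False keeps first-seen order)
df_out = pd.get_dummies(df[['Classified','FreqWord']], columns=['FreqWord'])\
           .set_index('Classified').groupby(level=0, sort=False).sum()
df_out.columns = df_out.columns.str.split('_').str[1]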

Output:

            business  crops  drought  water
Classified                                 
Positive           0      0        4      1
Negative           1      2        1      0

And, if you wish, you can reset_index:

df_out.reset_index()

  Classified  business  crops  drought  water
0   Positive         0      0        4      1
1   Negative         1      2        1      0
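
A further way to get the same table, sketched here as an alternative that is not part of the answer above, is to group on both columns, take the group sizes, and unstack the FreqWord level into columns. The sample frame below is again my own reconstruction of the question's rows, with the Tweets column left out.

```python
import pandas as pd

# Reconstructed sample data (assumption: only these two columns are needed).
df = pd.DataFrame({
    "Classified": ["Positive", "Positive", "Positive", "Negative", "Negative",
                   "Negative", "Negative", "Positive", "Positive"],
    "FreqWord": ["drought", "drought", "drought", "drought", "crops",
                 "crops", "business", "drought", "water"],
})

# size() counts the rows in each (Classified, FreqWord) group;
# unstack() moves FreqWord into the columns, and fill_value=0
# fills the combinations that never occur.
table = (
    df.groupby(["Classified", "FreqWord"])
      .size()
      .unstack(fill_value=0)
)
print(table)
```

This is handy when the counting is one step in a longer groupby chain; for a plain two-column count, crosstab is shorter.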

Awesome work @Scott! That is so simple. I was almost pulling my hair out over this!
What a comprehensive answer.
@马祖: Thank you.