Python: aggregating category counts per user (pandas)

python pandas

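How can I use pandas to create a frequency count of each category for each user? I want this so that I can pivot the result into a utility matrix.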

|   | **author** | **category** |
| 0 | A | movies |
| 1 | B | games |
| 2 | C | pics |
| 4 | A | movies |
| 5 | C | movies |
| 6 | B | games |




Desired output:

| **author** | **category** | **category count** |
| A | movies | 2 |
| B | games | 2 |
| C | movies | 1 |
| C | pics | 1 |

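You can use groupby on the columns author and category, together with size, to get the number of rows in each group. The output is a Series with a MultiIndex: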

print (df.groupby(['author','category']).size())
author  category
A       movies      2
B       games       2
C       movies      1
        pics        1
dtype: int64
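
For anyone who wants to reproduce the output above, here is a minimal sketch that rebuilds the question's table as a DataFrame. The index values (0, 1, 2, 4, 5, 6) are copied from the question; the variable name `counts` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question (note that the index skips 3).
df = pd.DataFrame(
    {
        "author": ["A", "B", "C", "A", "C", "B"],
        "category": ["movies", "games", "pics", "movies", "movies", "games"],
    },
    index=[0, 1, 2, 4, 5, 6],
)

# size() counts the rows in each (author, category) group.
counts = df.groupby(["author", "category"]).size()

print(type(counts).__name__)        # Series
print(counts.index.names)           # the two grouping keys become index levels
print(counts.loc[("A", "movies")])  # look up one group by its key pair
```

Because the result is indexed by both keys, a single count can be looked up with a tuple, as in the last line.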

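Then add reset_index to turn the MultiIndex levels into columns, and pass name= to set the name of the values column. The output is a DataFrame: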
# store the result under a new name so that df still holds the original rows
counts = df.groupby(['author','category']).size().reset_index(name='category count')
print(counts)
  author category  category count
0      A   movies               2
1      B    games               2
2      C   movies               1
3      C     pics               1
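
On why size is used here rather than count: size returns the number of rows in each group, while count returns the number of non-null values in each remaining column. The sketch below illustrates the difference on made-up data. The `rating` column and its missing value are assumptions added only for the demonstration and are not part of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's two columns plus a 'rating' column with one NaN.
df_demo = pd.DataFrame(
    {
        "author": ["A", "A", "B"],
        "category": ["movies", "movies", "games"],
        "rating": [5.0, np.nan, 3.0],
    }
)

grouped = df_demo.groupby(["author", "category"])

# size(): one number per group, rows with NaN included; the result is a Series.
group_sizes = grouped.size()
print(group_sizes)

# count(): non-null values per remaining column; the result is a DataFrame.
group_counts = grouped.count()
print(group_counts)
```

In this example the (A, movies) group has two rows but only one non-null rating, so the two methods disagree for that group.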

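If you need the utility matrix itself, there are several solutions:
# unstack moves the category level into columns; fill_value=0 fills absent pairs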
df1 = df.groupby(['author','category']).size().unstack(fill_value=0)
print (df1)
category  games  movies  pics
author                       
A             0       2     0
B             2       0     0
C             0       1     1
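
To see what fill_value=0 contributes, the following sketch compares unstack with and without it. It assumes the sample data from the question; the names `sizes`, `wide_nan` and `wide_zero` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "author": ["A", "B", "C", "A", "C", "B"],
        "category": ["movies", "games", "pics", "movies", "movies", "games"],
    },
    index=[0, 1, 2, 4, 5, 6],
)

sizes = df.groupby(["author", "category"]).size()

# Without fill_value, author/category pairs that never occur become NaN,
# which also turns the columns into floats.
wide_nan = sizes.unstack()

# With fill_value=0, absent pairs become 0 and the counts stay integers.
wide_zero = sizes.unstack(fill_value=0)

print(wide_nan)
print(wide_zero)
```

For a utility matrix the zero-filled version is usually the more convenient one, since every cell is then a plain count.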



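Comments:
Awesome, thanks for a working solution. You even went out of your way to show me the code for the utility matrix. If you don't mind, could you explain why using size / reset_index does the trick?
Sure, give me a moment. I'll try to add some explanation, which may help as well. If anything is unclear, I'll try to explain.
Thanks! size() isn't described in the documentation, so I was confused, but it makes sense. I do think it is an odd name for the method, though.
Yes, there is also a count function, but it behaves a bit differently. See the last edit; I added links for a better explanation.

Edit: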
df1 = pd.crosstab(df['author'],df['category'])
print (df1)
category  games  movies  pics
author                       
A             0       2     0
B             2       0     0
C             0       1     1

df1 = df.pivot_table(index='author',columns='category', aggfunc='size', fill_value=0)
print (df1)
category  games  movies  pics
author                       
A             0       2     0
B             2       0     0
C             0       1     1
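
As a quick sanity check, this sketch builds the matrix with groupby/unstack, crosstab and pivot_table and confirms that the three results agree. It assumes the question's sample data; the variable names are illustrative, and only labels and values are compared, not dtypes.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "author": ["A", "B", "C", "A", "C", "B"],
        "category": ["movies", "games", "pics", "movies", "movies", "games"],
    },
    index=[0, 1, 2, 4, 5, 6],
)

via_unstack = df.groupby(["author", "category"]).size().unstack(fill_value=0)
via_crosstab = pd.crosstab(df["author"], df["category"])
via_pivot = df.pivot_table(
    index="author", columns="category", aggfunc="size", fill_value=0
)

matrices = {"unstack": via_unstack, "crosstab": via_crosstab, "pivot_table": via_pivot}
for name, matrix in matrices.items():
    # Compare row labels, column labels and cell values against the first result.
    same = (
        list(matrix.index) == list(via_unstack.index)
        and list(matrix.columns) == list(via_unstack.columns)
        and matrix.to_numpy().tolist() == via_unstack.to_numpy().tolist()
    )
    print(name, same)
```

Which form to prefer is mostly a matter of taste: crosstab is the shortest to write, while the groupby version makes the intermediate counts available if they are needed as well.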