Python: get frequency counts of two columns based on another column in a DataFrame
I currently have the following:
Business Name  Violation  Business License #
Place 1        Crime 1    111
Place 1        Crime 2    222
Place 2        Crime 3    333
Place 3        Crime 4    444
Place 3        Crime 5    444
I am trying to get the following:
Business Name  Violations  Business License #'s
Place 1        2           2
Place 2        1           1
Place 3        2           1
Essentially, I just need counts for two different columns, grouped by business name. This is the code I have so far, which I know is wrong:
df.groupby(['Business Name','Business License #']).size()
Any help would be greatly appreciated.

Use DataFrameGroupBy.nunique, which counts the distinct values of each selected column within each group:
df.groupby('Business Name')[['Violation','Business License #']].nunique()
               Violation  Business License #
Business Name
Place 1                2                   2
Place 2                1                   1
Place 3                2                   1
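For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample rows in the question (the integer license numbers are an assumption about the dtype). It also shows why the original attempt does not give the desired result: grouping by both columns and calling size() counts rows per (name, license) pair rather than per business.

```python
import pandas as pd

# Sample data copied from the question; the integer dtype of the
# license column is an assumption.
df = pd.DataFrame({
    "Business Name": ["Place 1", "Place 1", "Place 2", "Place 3", "Place 3"],
    "Violation": ["Crime 1", "Crime 2", "Crime 3", "Crime 4", "Crime 5"],
    "Business License #": [111, 222, 333, 444, 444],
})

# Original attempt: one row per (Business Name, Business License #) pair,
# so Place 1 appears twice instead of being summarised in one row.
pairs = df.groupby(["Business Name", "Business License #"]).size()

# Suggested fix: group by the name only, then count the distinct values
# of each of the two columns within every group.
counts = df.groupby("Business Name")[["Violation", "Business License #"]].nunique()

print(pairs)
print(counts)
```

Note that nunique counts distinct values, so a violation that occurs twice for the same business would only be counted once.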
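Chris is right, nunique will do the job, but if you want Business Name as a regular column rather than as the index, reset the index afterwards: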
df.groupby('Business Name')[['Violation', 'Business License #']].nunique().reset_index()
  Business Name  Violation  Business License #
0       Place 1          2                   2
1       Place 2          1                   1
2       Place 3          2                   1
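If you also want the output columns named as in the question, one possible variant is named aggregation with agg. This is only a sketch under a few assumptions: the sample data is rebuilt by hand, "Violations" is taken to mean the number of rows per business (count) rather than the number of distinct violations, and the name summary is just illustrative.

```python
import pandas as pd

# Sample data copied from the question (integer licenses are an assumption).
df = pd.DataFrame({
    "Business Name": ["Place 1", "Place 1", "Place 2", "Place 3", "Place 3"],
    "Violation": ["Crime 1", "Crime 2", "Crime 3", "Crime 4", "Crime 5"],
    "Business License #": [111, 222, 333, 444, 444],
})

# Named aggregation: each output column maps to (input column, function).
# The dict is unpacked with ** because the target names are not valid
# Python identifiers.
summary = (
    df.groupby("Business Name")
      .agg(**{
          "Violations": ("Violation", "count"),                        # rows per business
          "Business License #'s": ("Business License #", "nunique"),   # distinct licenses
      })
      .reset_index()  # turn Business Name back into a regular column
)

print(summary)
```

Swap "count" for "nunique" on the first column if repeated violations should only be counted once.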