Python 3.x: pandas groupby and conditional ratio
I want to compute a ratio of counts based on a condition, and I am struggling to get the correct ratio from a pandas DataFrame.
The data looks like this:
JOB_ROLE                COMMENTS           ACTIVITY_TYPE  COUNTS
Director-Level          Meeting Requested  EmailSend         490
Manager-Level           Meeting Requested  EmailSend         305
Non-Managerial          Meeting Requested  EmailSend         272
Top Executive; C-Level  Meeting Requested  EmailSend         226
VP-Level                Meeting Requested  EmailSend         185
Director-Level          Meeting Requested  FormSubmit        131
Manager-Level           Meeting Requested  FormSubmit         74
Top Executive; C-Level  Meeting Requested  FormSubmit         61
VP-Level                Meeting Requested  FormSubmit         53
Non-Managerial          Meeting Requested  FormSubmit         52
Other                   Meeting Requested  EmailSend          20
Other                   Meeting Requested  FormSubmit          2
My attempt is as follows:
ratios = mr_jr.groupby('JOB_ROLE').apply(lambda x: x[x['ACTIVITY_TYPE']=='FormSubmit'].COUNTS / x[x['ACTIVITY_TYPE']=='EmailSend'].COUNTS)
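What is the correct way to apply a condition to each group and then do arithmetic on the result?
Thanks in advance.
Edited
Expected output: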
This looks like a job for a pivot table:
piv = df.pivot(index='JOB_ROLE', columns='ACTIVITY_TYPE', values='COUNTS')
Output:
In [119]: piv.FormSubmit / piv.EmailSend
Out[119]:
JOB_ROLE
Director-Level 0.267347
Manager-Level 0.242623
Non-Managerial 0.191176
Other 0.100000
Top Executive; C-Level 0.269912
VP-Level 0.286486
dtype: float64
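For anyone who wants to reproduce the result above, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand; the COMMENTS column is left out because it is constant and not needed for the ratio, and the variable names `rows` and `ratios` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (COMMENTS omitted, it is constant).
rows = [
    ("Director-Level", "EmailSend", 490),
    ("Manager-Level", "EmailSend", 305),
    ("Non-Managerial", "EmailSend", 272),
    ("Top Executive; C-Level", "EmailSend", 226),
    ("VP-Level", "EmailSend", 185),
    ("Director-Level", "FormSubmit", 131),
    ("Manager-Level", "FormSubmit", 74),
    ("Top Executive; C-Level", "FormSubmit", 61),
    ("VP-Level", "FormSubmit", 53),
    ("Non-Managerial", "FormSubmit", 52),
    ("Other", "EmailSend", 20),
    ("Other", "FormSubmit", 2),
]
df = pd.DataFrame(rows, columns=["JOB_ROLE", "ACTIVITY_TYPE", "COUNTS"])

# One row per JOB_ROLE, one column per ACTIVITY_TYPE.
piv = df.pivot(index="JOB_ROLE", columns="ACTIVITY_TYPE", values="COUNTS")

# Column-wise division gives the FormSubmit / EmailSend ratio per role.
ratios = piv["FormSubmit"] / piv["EmailSend"]
print(ratios)
```

Note that `pivot` raises an error if any (JOB_ROLE, ACTIVITY_TYPE) pair occurs more than once, so this only works on data that has already been aggregated.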
Without pivot:
df.set_index('JOB_ROLE', drop=True, inplace=True)
emails = df[df.ACTIVITY_TYPE == 'EmailSend']
forms = df[df.ACTIVITY_TYPE == 'FormSubmit']
print(forms.COUNTS / emails.COUNTS)
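What is your desired output?
@ScottBoston: Sorry, edited!
Thank you very much, that is a cool trick. Is there a way to do it without pivot, or with groupby and apply?
Though if each (JOB_ROLE, ACTIVITY_TYPE) pair appears only once, groupby may not be needed.
Actually, the sample data I posted is the result of a groupby, which is why you see only one row per pair. Thanks for your help.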
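On the groupby question from the comments: the attempt in the question probably returns NaN because the two filtered Series inside the lambda carry different row labels, so the division has nothing to align on. Below is a rough sketch of two ways around that on un-aggregated data. The `raw` frame is made up for illustration (its rows are chosen to add up to the question's totals for Other and VP-Level), and `form_to_email` is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical un-aggregated data: several rows per (JOB_ROLE, ACTIVITY_TYPE).
raw = pd.DataFrame({
    "JOB_ROLE": ["Other", "Other", "Other", "VP-Level", "VP-Level", "VP-Level"],
    "ACTIVITY_TYPE": ["EmailSend", "EmailSend", "FormSubmit",
                      "EmailSend", "FormSubmit", "FormSubmit"],
    "COUNTS": [12, 8, 2, 185, 50, 3],
})

# Option 1: aggregate per pair, then unstack so each activity is a column.
totals = (
    raw.groupby(["JOB_ROLE", "ACTIVITY_TYPE"])["COUNTS"]
    .sum()
    .unstack("ACTIVITY_TYPE")
)
ratios = totals["FormSubmit"] / totals["EmailSend"]

# Option 2: groupby + apply, reducing each side to a scalar with sum()
# so that no index alignment is involved in the division.
def form_to_email(group):
    forms = group.loc[group["ACTIVITY_TYPE"] == "FormSubmit", "COUNTS"].sum()
    emails = group.loc[group["ACTIVITY_TYPE"] == "EmailSend", "COUNTS"].sum()
    return forms / emails

ratios_apply = (
    raw.groupby("JOB_ROLE")[["ACTIVITY_TYPE", "COUNTS"]].apply(form_to_email)
)

print(ratios)
print(ratios_apply)
```

Both options should give the same ratios. The first is usually the simpler choice; the second stays closest to the original attempt.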