Python: using group by to calculate percentages from a for loop
If I have the loop below, which gives the ratio for each of the status types listed, how would I modify the code to see the same data grouped by professor?
leads = ['Passed','Failed']
max_status = None
max_percent = None
for lead in leads:
    df_overall = df[(df['Status'] == lead) & (df['size'] == '20-34')]
    num_overall = len(df_overall)
    lead_df = df[(df['size'] == '20-34')]
    num_total = len(lead_df)
    percentage_overall = num_overall / num_total
    if max_status is None:
        print(lead, percentage_overall)
This gives me the following output:
Passed .65
Failed .35
I would like to edit the code so that it groups by professor, since my DataFrame also has a professor column.
Expected output:
Mr. Johnson Passed .35
Mr. Johnson Failed .65
Ms. Jones Passed .90
Ms. Jones Failed .10
Mr. Boe Passed .80
Mr. Boe Failed .20
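Thanks.

You can group by professor and then apply your math to each group. There are plenty of examples in online tutorials. Where are you stuck? Please post your coding attempt.
Can you share some of the contents of df?
I think you can do this with a simpler groupby.

I believe you need: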
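leads = ['Passed','Failed']
lead_df = df[(df['size'] == '20-34')]
# filter by the list of statuses (leads, not lead)
df_overall = lead_df[lead_df['Status'].isin(leads)]
# count rows per (professor, Status) pair
num_overall1 = df_overall.groupby(['professor','Status']).size()
# count all rows per professor; this is the denominator
num_total1 = lead_df.groupby('professor').size()
# divide, aligning the counts on the professor index level
out = num_overall1.div(num_total1, level='professor').reset_index(name='per')
print(out)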
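
As one of the comments suggests, a simpler groupby may be enough here. The following is a minimal sketch, not the answerer's code: it assumes the column names from the question (professor, Status, size) and uses a small made-up DataFrame purely for illustration. It relies on value_counts(normalize=True), which returns the share of each Status within each professor group directly, so no separate denominator is needed.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "professor": ["Mr. Johnson"] * 4 + ["Ms. Jones"] * 2 + ["Mr. Boe"] * 2,
    "Status": ["Passed", "Failed", "Failed", "Failed",
               "Passed", "Passed",
               "Passed", "Failed"],
    "size": ["20-34"] * 8,
})

# Keep only the size bucket used in the question.
lead_df = df[df["size"] == "20-34"]

# Share of each Status within each professor group.
out = (
    lead_df.groupby("professor")["Status"]
    .value_counts(normalize=True)
    .rename("per")   # name the value column before flattening the index
    .reset_index()
)
print(out)
```

Each professor's rows in the per column sum to 1. A Status that never occurs for a professor simply has no row, so if explicit zeros are needed, one option is to unstack with fill_value=0 before flattening.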