Python: group by with percentages to weight the values of a non-numeric column
Tags: python, python-3.x, pandas


My df looks like this:

 day   | mealtype
Monday   Snack
Monday   Snack
Monday   Dinner
Tuesday  Breakfast
Monday   Dinner
Tuesday  Dinner
Sunday   Snack
Sunday   Dinner
Sunday   Lunch
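I want to calculate, for each day, the percentage of that day's meals that each meal type accounts for.

Below is some earlier code of mine that does a similar calculation, but it computes the sum of an amount column within each group: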

    cols = ['day', 'mealtype']
    cols2 = ['day']

    (df.groupby(cols).amount.apply(lambda x: x.sum()) /
     df.groupby(cols2).amount.apply(lambda x: x.sum()))
Here I don't have an amount column; I only want to compute how often each meal type occurs per day.

Expected output:

Monday Snack .5
Monday Dinner .5
Tuesday Breakfast .5
Tuesday Dinner .5
Sunday  Snack .33
Sunday  Lunch .33
Sunday  Dinner .33
Thanks

Use groupby() with value_counts(normalize=True), and name the output column %:

 df.groupby('day')['mealtype'].value_counts(normalize=True).to_frame('%').reset_index().round(1)


    day    mealtype    %
0   Monday     Dinner  0.5
1   Monday      Snack  0.5
2   Sunday     Dinner  0.3
3   Sunday      Lunch  0.3
4   Sunday      Snack  0.3
5  Tuesday  Breakfast  0.5
6  Tuesday     Dinner  0.5
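
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt by hand from the table in the question, and the column name `share` is only an illustrative choice (the answer above uses `%`). Rounding to two decimals rather than one gives values closer to the .33 shown in the expected output.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "day": ["Monday", "Monday", "Monday", "Tuesday", "Monday",
            "Tuesday", "Sunday", "Sunday", "Sunday"],
    "mealtype": ["Snack", "Snack", "Dinner", "Breakfast", "Dinner",
                 "Dinner", "Snack", "Dinner", "Lunch"],
})

# Fraction of each day's rows that belongs to each meal type.
# "share" is a hypothetical column name chosen for this sketch.
shares = (
    df.groupby("day")["mealtype"]
      .value_counts(normalize=True)
      .to_frame("share")
      .reset_index()
)

# round() only affects the numeric column; the string columns are left alone.
print(shares.round(2))
```

Since `normalize=True` divides each count by the total for its group, no separate denominator is needed.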

Here is one possible approach using groupby:

df = df.groupby('day')['mealtype'].value_counts().div(df.groupby('day')['mealtype'].count())
df = df.to_frame('percent').reset_index()
print(df)
Output:

       day   mealtype   percent
0   Monday     Dinner  0.500000
1   Monday      Snack  0.500000
2   Sunday     Dinner  0.333333
3   Sunday      Lunch  0.333333
4   Sunday      Snack  0.333333
5  Tuesday  Breakfast  0.500000
6  Tuesday     Dinner  0.500000
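
To make the division step easier to follow, the sketch below splits it into a numerator and a denominator. It assumes the sample data from the question; the names `counts`, `totals`, and `result` are illustrative, and `level="day"` is passed only to make explicit which index level the two Series are matched on.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "day": ["Monday", "Monday", "Monday", "Tuesday", "Monday",
            "Tuesday", "Sunday", "Sunday", "Sunday"],
    "mealtype": ["Snack", "Snack", "Dinner", "Breakfast", "Dinner",
                 "Dinner", "Snack", "Dinner", "Lunch"],
})

# Numerator: occurrences of each (day, mealtype) pair (two-level index).
counts = df.groupby("day")["mealtype"].value_counts()

# Denominator: total number of meals recorded per day (one-level index).
totals = df.groupby("day")["mealtype"].count()

# Divide each pair's count by the total for its day, matching on "day".
result = counts.div(totals, level="day").to_frame("percent").reset_index()
print(result)
```

Unlike the snippet above, this version writes to a new variable instead of reassigning `df`, so the original data stays available for further checks.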

I added the hand-written sample output for reference only. Monday contains 2 dinners and 2 snacks, so each of the two accounts for one half.