Python pandas: calculate the mean excluding the row's own value


python pandas


I want to calculate the mean by group, leaving out the value of the row itself.

import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)

I know how to get the mean by group:

df.groupby('col1').agg({'col2': 'mean'})

Out[247]: 
      col2
col1      
a     0.75
b     3.00

But what I want is the mean by group that leaves out the row's own value. For example, for the first row:

df.query('col1 == "a"')[1:4].mean(numeric_only=True)

The slice selects the other three rows of group a:

  col1  col2
1    a     4
3    a    -5
5    a     4

and their mean is:

Out[251]: 
col2    1.0
dtype: float64

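Edit: the expected output is a DataFrame in the same format as df above, with an additional column mean_excl_own that holds the mean of all other members of the group, excluding the row's own value.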
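To make the expected output concrete, here is a minimal brute-force sketch of the definition. The helper name mean_of_others is made up for illustration and is not part of pandas or of the original post. It is slow on large frames and is only meant as a reference to check faster solutions against.

```python
import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)


def mean_of_others(row_label, frame):
    """Hypothetical helper: mean of col2 over the rows that share this
    row's col1 value, with the row itself dropped."""
    same_group = frame[frame['col1'] == frame.loc[row_label, 'col1']]
    others = same_group.drop(index=row_label)
    # An empty selection (row alone in its group) yields NaN.
    return others['col2'].mean()


expected = df.copy()
expected['mean_excl_own'] = [mean_of_others(i, df) for i in df.index]
print(expected)
```

For the sample data, the new column works out to 1.0, -1/3, 3.0, 8/3, 3.0, -1/3, in the original row order.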

You can group by col1 and take the mean. Then subtract the given row's value from the mean:

df['col2'] = df.groupby('col1').col2.transform('mean').sub(df.col2)

Thanks for all the input. I ended up using the approach that @VnC linked to.

This is how I solved it:

import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)

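# Per-group mean and row count of col2
group_summary = df.groupby('col1', as_index=False)['col2'].agg(['mean', 'count'])
df = pd.merge(df, group_summary, on='col1')

# Group sum (count * mean) minus the row's own value is the sum of the other rows
df['other_sum'] = df['count'] * df['mean'] - df['col2']
# Dividing by the number of other rows gives their mean
df['result'] = df['other_sum'] / (df['count'] - 1)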

Look at the final result:

df['result']

which prints:

Out: 
0    1.000000
1   -0.333333
2    2.666667
3   -0.333333
4    3.000000
5    3.000000
Name: result, dtype: float64

Edit: I previously had some trouble with the column names, but I fixed it with the help of an answer.
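The same arithmetic, (group sum - own value) / (group size - 1), can probably also be written without a merge by using groupby(...).transform. This is a sketch of an alternative rather than the method used in this answer. The column name mean_excl_own is taken from the question, and group_sum and group_count are illustrative variable names.

```python
import pandas as pd

d = {'col1': ["a", "a", "b", "a", "b", "a"], 'col2': [0, 4, 3, -5, 3, 4]}
df = pd.DataFrame(data=d)

grouped = df.groupby('col1')['col2']
group_sum = grouped.transform('sum')      # group total, repeated on every row
group_count = grouped.transform('count')  # group size, repeated on every row

# Leave-one-out mean; a group with a single row gives 0 / 0, which is NaN.
df['mean_excl_own'] = (group_sum - df['col2']) / (group_count - 1)
print(df)
```

Because transform returns values aligned with the original index, the results stay in the original row order. The rows of a merged frame may come back in a different order depending on the pandas version, so compare the two by value and group rather than by index position.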

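df.groupby('col1').mean()

Check here.

I think OP wants the aggregated mean of the group rather than this.

But he wants to subtract the value from the given row, which only makes sense if the row itself is included in the solution, right?

They want to calculate the mean of the whole group while ignoring the first row of that group. At least that's how I understand it; I may of course be wrong.

Hmm, actually I think you're right. I'll delete my answer.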