Python: computing the mode of one column over rows that share the same value in another column


I have the following df:

id            score
222.0         0.0           
222.0         0.0           
222.0         1.0           
222.0         0.0           
222.0         1.0           
222.0         1.0           
222.0         1.0           
222.0         0.0           
222.0         1.0           
222.0        -1.0           
416.0         0.0           
416.0         0.0           
416.0         2.0           
416.0         0.0           
416.0         1.0           
416.0         0.0           
416.0         1.0           
416.0         1.0           
416.0         0.0           
416.0         0.0           
895.0         1.0           
895.0         0.0           
895.0         0.0           
895.0         0.0           
895.0         0.0           
895.0         0.0           
895.0         0.0           
895.0         0.0           
895.0         0.0           
895.0         0.0

I want to compute the mode of the score column for each distinct value of id, like this:

id            score
222.0         1.0           
416.0         0.0           
895.0         0.0

I tried it like this:

df['score'] = df.mode()['score']

But I got the following result:

id            score
222.0         0.0           
222.0         NaN           
222.0         NaN           
222.0         NaN           
222.0         NaN           
222.0         NaN           
222.0         NaN           
222.0         NaN           
222.0         NaN           
222.0         NaN           
416.0         NaN           
416.0         NaN           
416.0         NaN          
416.0         NaN           
416.0         NaN           
416.0         NaN           
416.0         NaN           
416.0         NaN           
416.0         NaN           
416.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN           
895.0         NaN

What is going wrong here?

Group the scores by id and apply mode to each group:

>>> df.score.groupby(df['id']).apply(lambda g: g.mode()).reset_index()[['id', 'score']]
      id    score
0   222.0   1.0
1   416.0   0.0
2   895.0   0.0
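
As for why the original assignment produced NaN, here is a minimal sketch on a small stand-in frame (the values are illustrative, not the question's data). DataFrame.mode() works on whole columns and returns the modes on a fresh 0, 1, ... index. Column assignment then aligns on index labels, so only rows whose labels appear in that result receive a value.

```python
import pandas as pd

# Small stand-in for the question's frame (hypothetical values).
df = pd.DataFrame({
    "id": [222.0, 222.0, 222.0, 416.0, 416.0, 416.0],
    "score": [1.0, 1.0, 0.0, 0.0, 0.0, 2.0],
})

# DataFrame.mode() ignores id entirely: it computes the modes of each
# whole column and returns them indexed 0, 1, ...
whole_column_mode = df.mode()["score"]
print(whole_column_mode)

# Assigning that short Series back aligns on index labels, so every row
# whose label has no value in it ends up as NaN.
broken = df.copy()
broken["score"] = whole_column_mode
print(broken)
```

Only the row labelled 0 keeps a value here, which mirrors the output shown in the question.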

You can also use value_counts:

In [79]: df.groupby('id').agg({'score': lambda x: x.value_counts().index[0]}).reset_index()
Out[79]:
      id  score
0  222.0    1.0
1  416.0    0.0
2  895.0    0.0

Alternatively, use mode from scipy.stats.mstats:

In [80]: from scipy.stats.mstats import mode

In [81]: df.groupby('id').agg({'score': lambda x: mode(x)[0]}).reset_index()
Out[81]:
      id  score
0  222.0    1.0
1  416.0    0.0
2  895.0    0.0

Thanks, I wasn't doing the groupby.

df.score.groupby(df['id']).agg(lambda g: g.mode()).reset_index()

Would that work? Very nice.
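
One caveat on the agg variant: Series.mode() returns a Series, and it holds several values when a group has a tie, so it may not reduce to a single scalar per group. A minimal sketch follows, assuming you want the smallest tied value and that every group has at least one non-NaN score. The data below is hypothetical and chosen to include a tie.

```python
import pandas as pd

# Hypothetical data: id 1.0 has a tie between 0.0 and 1.0,
# id 2.0 has a single mode (2.0).
df = pd.DataFrame({
    "id": [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0],
    "score": [1.0, 0.0, 1.0, 0.0, 2.0, 2.0, 0.0],
})

# Series.mode() returns all tied values in ascending order, so .iloc[0]
# picks the smallest one and always yields exactly one scalar per group.
# Assumption: no group is entirely NaN (mode() would then be empty).
result = (
    df.groupby("id")["score"]
      .agg(lambda s: s.mode().iloc[0])
      .reset_index()
)
print(result)
```

Picking the first entry is only one possible tie-breaking rule. If ties matter for your data, you may prefer to keep all modes per group instead.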