How to replace values with the group mean in a Python DataFrame using multiple grouping columns
I want to replace DataFrame values with a mean computed over multiple grouping columns. The following snapshot shows the DataFrame:
Current Loan Amount DateTime Day Month Year
0 611314 1-Jan-92 1 Jan 92
1 266662 2-Jan-92 2 Jan 92
2 153494 3-Jan-92 3 Jan 92
3 176242 4-Jan-92 4 Jan 92
4 321992 5-Jan-92 5 Jan 92
5 202928 6-Jan-92 6 Jan 92
6 621786 7-Jan-92 7 Jan 92
7 266794 8-Jan-92 8 Jan 92
8 202466 9-Jan-92 9 Jan 92
9 266288 10-Jan-92 10 Jan 92
10 121110 11-Jan-92 11 Jan 92
11 258104 12-Jan-92 12 Jan 92
12 161722 13-Jan-92 13 Jan 92
13 753016 14-Jan-92 14 Jan 92
14 444664 15-Jan-92 15 Jan 92
15 172282 16-Jan-92 16 Jan 92
16 275440 17-Jan-92 17 Jan 92
17 218834 18-Jan-92 18 Jan 92
18 0 19-Jan-92 19 Jan 92
19 0 20-Jan-92 20 Jan 92
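I need to replace the 0 values with the mean of Current Loan Amount for the same year and month.
I tried the approach below. It computes the mean, but it does not modify the original DataFrame, and the result is missing the remaining columns.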
data = data_loan.groupby(['Year', 'Month'])
def replace(group):
    mask = (group == 0)
    group[mask] = group[~mask].mean()
    return group
new_data = data.transform(replace)
This replaces each 0 with the mean of its group:
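import numpy as np
# Mark zeros as missing so they are excluded from the group mean
data_loan['Current Loan Amount'] = data_loan['Current Loan Amount'].replace(0, np.nan)
# Fill each missing value with the mean of its (Month, Year) group
data_loan['Current Loan Amount'] = data_loan.groupby(['Month', 'Year'])['Current Loan Amount'].transform(lambda x: x.fillna(x.mean()))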
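To check the behaviour end to end, here is a minimal self-contained sketch of the same idea. The sample values are made up for illustration; only the column names follow the table in the question. It selects the single column before grouping and assigns the result back, so the other columns stay in place and the original DataFrame is updated. That is probably where the attempt in the question went wrong, since calling transform on the whole grouped frame returns a new object without the grouping columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names mirror the question's table.
data_loan = pd.DataFrame({
    "Current Loan Amount": [100, 0, 300, 0, 50, 150],
    "Month": ["Jan", "Jan", "Jan", "Feb", "Feb", "Feb"],
    "Year": [92, 92, 92, 92, 92, 92],
})

col = "Current Loan Amount"

# Treat zeros as missing so they do not pull the group mean down.
amounts = data_loan[col].replace(0, np.nan)

# Mean of the non-missing values per (Year, Month) group,
# broadcast back to the shape of the original column.
group_means = amounts.groupby([data_loan["Year"], data_loan["Month"]]).transform("mean")

# Fill the gaps and write the result back into the same column.
data_loan[col] = amounts.fillna(group_means)

print(data_loan)
```

With this sample, the Jan 92 zero becomes 200.0 (the mean of 100 and 300) and the Feb 92 zero becomes 100.0 (the mean of 50 and 150). Note that the column ends up as float, because NaN is used as the intermediate missing-value marker.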