Python 3.x: Compute an average based on a specific condition in another column
Tags: python-3.x, pandas, dataframe
I have a DataFrame df and want to compute a column named Avg_N_More_365.
Explanation:
    if df['Age_days'] > 365, df['Avg_N_More_365'] = df['N_More_365']/(df['Age_days']-365)
    else df['Avg_N_More_365'] = 0
Expected output:
ID Age_days N_30 N_More_365 Group Avg_N_More_365
1 565 60 1000 Good 5
2 385 2 180 Normal 9
3 10 4 0 Normal 0
4 100 0 100 Normal 0
5 965 0 1200 Good 2
6 1165 0 3200 Good 4
7 865 10 4000 Normal 8
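The question does not show how df was created, so the following is only a sketch for reproducing the problem. It assumes the input columns hold exactly the values listed in the expected output, and that they are plain integers and strings.

```python
import pandas as pd

# Input columns copied from the expected-output table in the question.
# How the original df was built is not shown, so this construction is an assumption.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Age_days": [565, 385, 10, 100, 965, 1165, 865],
    "N_30": [60, 2, 4, 0, 0, 0, 10],
    "N_More_365": [1000, 180, 0, 100, 1200, 3200, 4000],
    "Group": ["Good", "Normal", "Normal", "Normal", "Good", "Good", "Normal"],
})

print(df)
```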
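First, create a function:
    def func(val):
        # val is a single row of df (passed in by apply with axis=1)
        if val['Age_days'] > 365:
            return val['N_More_365'] / (val['Age_days'] - 365)
        else:
            return 0
Then use the apply() method and chain the astype() method:
    # astype(int) truncates any fractional part of the result
    df['Avg_N_More_365'] = df.apply(func, axis=1).astype(int)
Now, if you print df, you get the expected output: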
ID Age_days N_30 N_More_365 Group Avg_N_More_365
0 1 565 60 1000 Good 5
1 2 385 2 180 Normal 9
2 3 10 4 0 Normal 0
3 4 100 0 100 Normal 0
4 5 965 0 1200 Good 2
5 6 1165 0 3200 Good 4
6 7 865 10 4000 Normal 8
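For better performance, use a vectorized approach with numpy.where:
    import numpy as np

    # Boolean mask of the rows where the condition holds
    mask = df['Age_days'] > 365
    df['Avg_N_More_365'] = np.where(mask, df['N_More_365']/(df['Age_days']-365), 0)
    print(df)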
ID Age_days N_30 N_More_365 Group Avg_N_More_365
0 1 565 60 1000 Good 5.0
1 2 385 2 180 Normal 9.0
2 3 10 4 0 Normal 0.0
3 4 100 0 100 Normal 0.0
4 5 965 0 1200 Good 2.0
5 6 1165 0 3200 Good 4.0
6 7 865 10 4000 Normal 8.0
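One detail worth knowing: np.where evaluates both branches for every row, so the division is also computed for rows where Age_days <= 365 and those results are simply discarded. The sketch below, which assumes only the two relevant columns from the question's sample data, checks that the row-wise apply() version and the vectorized version agree once both are cast to int.

```python
import numpy as np
import pandas as pd

# Only the two columns the formula needs, with values taken from the question.
df = pd.DataFrame({
    "Age_days": [565, 385, 10, 100, 965, 1165, 865],
    "N_More_365": [1000, 180, 0, 100, 1200, 3200, 4000],
})

def func(row):
    # Row-wise version from the first answer.
    if row["Age_days"] > 365:
        return row["N_More_365"] / (row["Age_days"] - 365)
    return 0

row_wise = df.apply(func, axis=1).astype(int)

# Vectorized version: the quotient is computed for all rows,
# then np.where keeps it only where the mask is True.
mask = df["Age_days"] > 365
vectorized = np.where(mask, df["N_More_365"] / (df["Age_days"] - 365), 0).astype(int)

print(row_wise.tolist())
print(vectorized.tolist())
```

Both lists should match the Avg_N_More_365 column in the expected output.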
This should help:
    # Assigns only the rows where Age_days > 365; all other rows are left as NaN.
    # The divisor follows the formula in the question (Age_days - 365).
    df.loc[df['Age_days'] > 365, 'Avg_N_More_365'] = df['N_More_365'] / (df['Age_days'] - 365)
The new column is added to the existing DataFrame.
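Because a .loc assignment fills only the selected rows, the remaining rows need an explicit default to match the expected output. A minimal sketch, assuming the same two columns as in the question's sample data, is to initialise the column with 0 first and then overwrite the qualifying rows:

```python
import pandas as pd

# Only the two columns the formula needs, with values taken from the question.
df = pd.DataFrame({
    "Age_days": [565, 385, 10, 100, 965, 1165, 865],
    "N_More_365": [1000, 180, 0, 100, 1200, 3200, 4000],
})

# Default for rows that do not satisfy the condition.
df["Avg_N_More_365"] = 0.0

# Divide only the qualifying rows, so Age_days - 365 is always positive here.
mask = df["Age_days"] > 365
df.loc[mask, "Avg_N_More_365"] = (
    df.loc[mask, "N_More_365"] / (df.loc[mask, "Age_days"] - 365)
)

print(df)
```

Unlike np.where, this variant never computes the quotient for rows with Age_days <= 365.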