Python 3.x: Calculate an average based on a specific condition in another column


I have a dataframe df (its columns appear in the expected output below).

I want to compute a column named Avg_N_More_365.

Explanation:

if df['Age_days'] > 365, df['Avg_N_More_365'] = df['N_More_365'] / (df['Age_days'] - 365)
else df['Avg_N_More_365'] = 0

Expected output:

    ID     Age_days    N_30     N_More_365  Group      Avg_N_More_365
    1      565         60       1000        Good       5
    2      385         2        180         Normal     9
    3      10          4        0           Normal     0
    4      100         0        100         Normal     0
    5      965         0        1200        Good       2
    6      1165        0        3200        Good       4
    7      865         10       4000        Normal     8

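First, create a function that is applied to each row:

def func(val):
    # val is a single row of df; divide only when Age_days exceeds 365
    if val['Age_days'] > 365:
        return val['N_More_365'] / (val['Age_days'] - 365)
    else:
        return 0

Then call the apply() method with axis=1 and chain the astype() method to convert the result to integers (note that astype(int) truncates any fractional part):

df['Avg_N_More_365'] = df.apply(func, axis=1).astype(int)

Now, if you print df, you get the expected output: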

    ID  Age_days    N_30    N_More_365  Group   Avg_N_More_365
0   1   565           60    1000        Good         5
1   2   385           2     180         Normal       9
2   3   10            4     0           Normal       0
3   4   100           0     100         Normal       0
4   5   965           0     1200        Good         2
5   6   1165          0     3200        Good         4
6   7   865           10    4000        Normal       8
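
As a rough end-to-end illustration of the row-wise approach above, the sketch below builds a tiny frame and compares the float result with the truncated integer result. The first two rows are taken from the question; the third row (Age_days 368, N_More_365 10) and the function name avg_after_first_year are hypothetical, added only to show what astype(int) does to a non-integer average.

```python
import pandas as pd

# Rows 1 and 2 come from the question; row 3 is a hypothetical extra row.
df = pd.DataFrame({
    "Age_days": [565, 10, 368],
    "N_More_365": [1000, 0, 10],
})

def avg_after_first_year(row):
    """Return N_More_365 per day past day 365, or 0.0 for younger rows."""
    days_past = row["Age_days"] - 365
    if days_past > 0:
        return row["N_More_365"] / days_past
    return 0.0

as_float = df.apply(avg_after_first_year, axis=1)  # keeps fractions
as_int = as_float.astype(int)                      # truncates toward zero

df["Avg_N_More_365"] = as_int
print(df)
```

If fractional averages matter, it is probably better to keep the float column (optionally rounded) instead of casting to int.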

For better performance, use a vectorized solution with numpy.where:

import numpy as np

mask = df['Age_days'] > 365
df['Avg_N_More_365'] = np.where(mask, df['N_More_365'] / (df['Age_days'] - 365), 0)
print(df)
   ID  Age_days  N_30  N_More_365   Group  Avg_N_More_365
0   1       565    60        1000    Good             5.0
1   2       385     2         180  Normal             9.0
2   3        10     4           0  Normal             0.0
3   4       100     0         100  Normal             0.0
4   5       965     0        1200    Good             2.0
5   6      1165     0        3200    Good             4.0
6   7       865    10        4000  Normal             8.0
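
The vectorized version can be tried on its own with the sketch below. The sample values are reconstructed from the expected-output table in the question (an assumption, since the original input frame is not shown), and the final astype(int) is an optional addition to match the integer column in the expected output.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the expected-output table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Age_days": [565, 385, 10, 100, 965, 1165, 865],
    "N_More_365": [1000, 180, 0, 100, 1200, 3200, 4000],
})

days_past = df["Age_days"] - 365
mask = days_past > 0

# np.where computes the ratio for every row and then keeps it only
# where mask is True; all other rows get 0.
ratio = df["N_More_365"] / days_past
df["Avg_N_More_365"] = np.where(mask, ratio, 0).astype(int)
print(df)
```

Because the ratio is evaluated for all rows before the mask is applied, rows with Age_days at or below 365 still go through the division; their values are simply discarded.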

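This will work:

df.loc[df['Age_days'] > 365, 'Avg_N_More_365'] = df['N_More_365'] / (df['Age_days'] - 365)

The new column is added to the existing dataframe. Rows that do not meet the condition are left as NaN, so fill them with 0 if needed.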
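
A minimal sketch of the .loc approach, assuming sample data reconstructed from the expected-output table in the question. When .loc creates a new column through a boolean mask, rows outside the mask hold NaN, so this sketch fills them with 0 afterwards; the variable n_missing is only there to make that intermediate state visible.

```python
import pandas as pd

# Sample data reconstructed from the expected-output table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Age_days": [565, 385, 10, 100, 965, 1165, 865],
    "N_More_365": [1000, 180, 0, 100, 1200, 3200, 4000],
})

mask = df["Age_days"] > 365

# Assign only the masked rows; the right-hand side is aligned on the index.
df.loc[mask, "Avg_N_More_365"] = df["N_More_365"] / (df["Age_days"] - 365)

# Rows with Age_days <= 365 were never assigned and are NaN at this point.
n_missing = int(df["Avg_N_More_365"].isna().sum())

# Replace the unassigned rows with 0, as the question requires.
df["Avg_N_More_365"] = df["Avg_N_More_365"].fillna(0)
print(df)
```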