Python: How to overwrite a column with the mean under certain conditions

python pandas

I have the following DataFrame in pandas:

 ID      Quantity       Rate       Product
 1       10             70         MS
 2       10             70         MS
 3       100            70         MS
 4       10             100        MS
 5       700            65         HS
 6       1100           65         HS
 7       700            100        HS

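For MS, I want to cap the values in Quantity and Rate using the mean: if Quantity is greater than 100 or Rate is greater than 99, the value should be replaced with the mean. For HS, if Quantity is greater than 1000 or Rate is greater than 99, the value should be replaced with the mean.

I am using the following approach: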
 mean_MS = df['Quantity'][(df['Product'] == 'MS') and (df['Quantity'] < 100)].mean()
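
A note on the attempt above: combining two boolean Series with Python's `and` raises a `ValueError` ("The truth value of a Series is ambiguous"), because `and` tries to reduce each Series to a single boolean. A minimal sketch of the element-wise form, using `&` with each comparison in its own parentheses. The frame is rebuilt from the table in the question, and the name `mask` is only illustrative:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Quantity": [10, 10, 100, 10, 700, 1100, 700],
    "Rate": [70, 70, 70, 100, 65, 65, 100],
    "Product": ["MS", "MS", "MS", "MS", "HS", "HS", "HS"],
})

# Element-wise AND of two boolean Series; `&` instead of `and`
mask = (df["Product"] == "MS") & (df["Quantity"] < 100)

# Mean of Quantity over the MS rows that are below the threshold
mean_MS = df.loc[mask, "Quantity"].mean()
print(mean_MS)
```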

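One way to solve this:
# MS rows, and rows whose Quantity or Rate is out of range for MS
m1 = df['Product'] == 'MS'
m2 = (df['Quantity'] >= 100) | (df['Rate'] > 99)
# Overwrite with the mean of the in-range MS values
df.loc[m1 & m2, 'Quantity'] = df.loc[m1 & (df['Quantity'] < 100), 'Quantity'].mean()
df.loc[m1 & m2, 'Rate'] = df.loc[m1 & (df['Rate'] < 99), 'Rate'].mean()

# Same logic for HS, with a Quantity threshold of 1000
m3 = df['Product'] == 'HS'
m4 = (df['Quantity'] >= 1000) | (df['Rate'] > 99)
df.loc[m3 & m4, 'Quantity'] = df.loc[m3 & (df['Quantity'] < 1000), 'Quantity'].mean()
df.loc[m3 & m4, 'Rate'] = df.loc[m3 & (df['Rate'] < 99), 'Rate'].mean()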

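Explanation:
1. Split the problem into two sub-problems, one for MS and one for HS, since both follow the same logic but with different Quantity thresholds.
2. First, change values only for MS, which is flagged by m1. Then, wherever Quantity is greater than or equal to 100 or Rate is greater than 99, replace the value with the mean, computed over the MS rows after excluding the values that exceed the condition.
3. Repeat the same logic for Rate.
4. Repeat steps 2 and 3 for HS, with the Quantity threshold changed from 100 to 1000.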
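
The steps above can also be sketched as a single loop over the products, so the MS and HS logic is written once. This is only one possible arrangement, not the answer's own code: the names `quantity_limit` and `rate_limit` are hypothetical, and the cast to float is an added assumption so that a non-integer mean can be stored without a dtype conflict.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Quantity": [10, 10, 100, 10, 700, 1100, 700],
    "Rate": [70, 70, 70, 100, 65, 65, 100],
    "Product": ["MS", "MS", "MS", "MS", "HS", "HS", "HS"],
})

# Assumption: store the columns as float so any mean value fits
df = df.astype({"Quantity": float, "Rate": float})

# Hypothetical lookup: Quantity threshold per product; Rate threshold is shared
quantity_limit = {"MS": 100, "HS": 1000}
rate_limit = 99

for product, q_limit in quantity_limit.items():
    in_group = df["Product"] == product
    # Rows to overwrite: Quantity at or above its limit, or Rate above the limit
    out_of_range = (df["Quantity"] >= q_limit) | (df["Rate"] > rate_limit)

    # Means computed only from the in-range values of this product
    q_mean = df.loc[in_group & (df["Quantity"] < q_limit), "Quantity"].mean()
    r_mean = df.loc[in_group & (df["Rate"] < rate_limit), "Rate"].mean()

    df.loc[in_group & out_of_range, "Quantity"] = q_mean
    df.loc[in_group & out_of_range, "Rate"] = r_mean

print(df)
```

Both means are computed before either column is overwritten, so the replacement values never depend on rows that have already been changed.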
IIUC, you can also try the following, which uses the mode of each product group as the replacement value:
val1= df.loc[df.Product.eq('MS'),['Quantity','Rate']].mode().values 
#array([[10, 70]], dtype=int64)
val2= df.loc[df.Product.eq('HS'),['Quantity','Rate']].mode().values
#array([[700,  65]], dtype=int64)

df.loc[df.Product.eq('MS')&df.Quantity.ge(100)|df.Product.eq('MS')&df.Rate.gt(99),['Quantity','Rate']] = val1

df.loc[df.Product.eq('HS')&df.Quantity.ge(1000)|df.Product.eq('HS')&df.Rate.gt(99),['Quantity','Rate']] = val2
print(df)

   ID  Quantity  Rate Product
0   1        10    70      MS
1   2        10    70      MS
2   3        10    70      MS
3   4        10    70      MS
4   5       700    65      HS
5   6       700    65      HS
6   7       700    65      HS
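
Two details of the snippet above may be worth spelling out. The unparenthesized mask works because `&` binds tighter than `|` in Python, and `DataFrame.mode()` can return more than one row when values tie. The sketch below, for MS only, shows an equivalent factored mask and takes the first mode row explicitly; the names `is_ms`, `mask_factored` and `ms_mode` are illustrative, and picking the first row on ties is an assumption rather than part of the answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Quantity": [10, 10, 100, 10, 700, 1100, 700],
    "Rate": [70, 70, 70, 100, 65, 65, 100],
    "Product": ["MS", "MS", "MS", "MS", "HS", "HS", "HS"],
})

cols = ["Quantity", "Rate"]
is_ms = df["Product"].eq("MS")

# `&` binds tighter than `|`, so this parses as (A & B) | (A & C)
mask_original = is_ms & df["Quantity"].ge(100) | is_ms & df["Rate"].gt(99)
# Equivalent factored form: A & (B | C)
mask_factored = is_ms & (df["Quantity"].ge(100) | df["Rate"].gt(99))
same_mask = mask_original.equals(mask_factored)

# mode() may return several rows on ties; take the first row explicitly
ms_mode = df.loc[is_ms, cols].mode().iloc[0]

# Assign column by column to keep the broadcasting unambiguous
for col in cols:
    df.loc[mask_factored, col] = ms_mode[col]

print(df[is_ms])
```

The HS case would follow the same pattern with a Quantity threshold of 1000.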

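Comments:
Maybe this will help.
For MS, should Quantity be greater than or less than 100?
Yes, this is not correct.
@neil - did the solution I proposed work for you? If you are still stuck, feel free to say so.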

Output of the mean-based solution:
   ID  Quantity  Rate Product
0   1      10.0  70.0      MS
1   2      10.0  70.0      MS
2   3      10.0  70.0      MS
3   4      10.0  70.0      MS
4   5     700.0  65.0      HS
5   6     700.0  65.0      HS
6   7     700.0  65.0      HS
