Python: how to assign values to a df column based on a condition?
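I need to assign values to a column of df based on a condition.
If df.condition > 0, df.result = df.data1; if df.condition < 0, df.result = df.data2.

Create boolean masks for your conditions and use them to select the rows on both the left-hand and right-hand sides of the assignment.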
>>> df.head(15)
data a b data2
0 1.864896 81 30 0
1 -0.059083 81 93 0
2 -0.953324 89 1 0
3 0.367495 2 68 0
4 -1.537818 70 88 0
5 -1.118238 76 35 0
6 -0.017608 46 68 0
7 1.571796 12 95 0
8 0.683234 44 7 0
9 -1.320751 50 42 0
10 -0.463197 19 66 0
11 0.786541 44 32 0
12 -0.171833 28 26 0
13 1.668763 75 7 0
14 0.846662 42 56 0
>>> gt = df.data > 0
>>> lt = df.data < 0
>>> df.loc[gt, 'a'] = df.loc[gt, 'data2']
>>> df.loc[lt, 'b'] = df.loc[lt, 'data2']
>>> df.head(15)
data a b data2
0 1.864896 0 30 0
1 -0.059083 81 0 0
2 -0.953324 89 0 0
3 0.367495 0 68 0
4 -1.537818 70 0 0
5 -1.118238 76 0 0
6 -0.017608 46 0 0
7 1.571796 0 95 0
8 0.683234 0 7 0
9 -1.320751 50 0 0
10 -0.463197 19 0 0
11 0.786541 0 32 0
12 -0.171833 28 0 0
13 1.668763 0 7 0
14 0.846662 0 56 0
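The mask-based assignment can also be written as a self-contained script. The following is a minimal sketch, assuming a small hand-made frame that uses the column names from the question (condition, data1, data2, result); the values are made up for illustration and are not the data shown in the session.

```python
import pandas as pd

# Small hand-made frame; column names follow the question, values are illustrative.
df = pd.DataFrame({
    "condition": [1.5, -0.5, 0.3, -2.0],
    "data1": [10, 20, 30, 40],
    "data2": [1, 2, 3, 4],
})
df["result"] = 0

# One boolean mask per condition.
gt = df["condition"] > 0
lt = df["condition"] < 0

# Select the same rows on both sides so the values line up by index.
df.loc[gt, "result"] = df.loc[gt, "data1"]
df.loc[lt, "result"] = df.loc[lt, "data2"]

print(df["result"].tolist())
```

Rows where condition is exactly 0 match neither mask and keep the initial value of result.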
When using where, you have to invert the logic, because it only changes the values where the condition is not met.
>>> df.head(10)
data a b data2
0 1.046114 41 66 0
1 0.156532 65 46 0
2 -0.768515 56 36 0
3 0.640834 36 89 0
4 0.008113 39 26 0
5 -0.528028 63 49 0
6 -1.343293 87 94 0
7 1.076804 5 26 0
8 0.172443 9 57 0
9 -0.375729 84 47 0
>>> gt = df.data > 0
>>> lt = df.data < 0
>>> df.b.where(gt,df.data2,inplace=True)
>>> df.a.where(lt,df.data2,inplace=True)
>>> df.head(10)
data a b data2
0 1.046114 0 66 0
1 0.156532 0 46 0
2 -0.768515 56 0 0
3 0.640834 0 89 0
4 0.008113 0 26 0
5 -0.528028 63 0 0
6 -1.343293 87 0 0
7 1.076804 0 26 0
8 0.172443 0 57 0
9 -0.375729 84 0 0
>>>
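The inverted logic of where may be easier to see in a standalone script. This is a minimal sketch, assuming the question's column names and made-up values. It assigns the result back to the column instead of calling where with inplace=True through df.b, since on pandas versions that use Copy-on-Write a chained inplace call of that kind may leave df unchanged.

```python
import pandas as pd

# Small hand-made frame; column names follow the question, values are illustrative.
df = pd.DataFrame({
    "condition": [1.5, -0.5, 0.3, -2.0],
    "data1": [10, 20, 30, 40],
    "data2": [1, 2, 3, 4],
})

gt = df["condition"] > 0

# where() keeps values where the mask is True and replaces the rest,
# so start from data1 and fill in data2 wherever condition is NOT > 0.
df["result"] = df["data1"].where(gt, df["data2"])

print(df["result"].tolist())
```

In this form, rows where condition is exactly 0 fall into the replaced branch and receive data2.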
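It would be great if you could post the expected output for a sample input.

After digging into this further, I came up with one approach. I will also learn numpy.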
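import numpy as np
import pandas as pd

def main():
    condition = {"condition": np.random.randn(200)}
    df = pd.DataFrame(condition)
    df['data1'] = np.random.randint(1, 100, len(df))
    df['data2'] = np.random.randint(1, 100, len(df))
    df['result'] = 0
    gt = df.condition > 0
    lt = df.condition < 0
    # where() keeps values where the mask is True and replaces the rest.
    # Assigning back avoids relying on inplace=True through df.result.
    df['result'] = df['result'].where(gt, df['data2'])
    df['result'] = df['result'].where(lt, df['data1'])
    print(df.head(10))

main()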
condition data1 data2 result
0 -1.580927 63 23 23
1 -1.549005 94 20 20
2 2.153873 18 83 18
3 -0.115974 31 8 8
4 -0.726009 61 38 38
5 2.039930 96 63 96
6 -1.523605 94 96 96
7 -0.157509 8 4 4
8 -0.166163 11 21 21
9 -0.540077 14 64 64
import pandas as pd
import numpy as np

def main():
    condition = {"condition": np.random.randn(200)}
    df = pd.DataFrame(condition)
    df['data1'] = np.random.randint(1, 100, len(df))
    df['data2'] = np.random.randint(1, 100, len(df))
    # Take data1 where condition > 0, otherwise data2.
    df['result'] = np.where(df['condition'] > 0, df['data1'], df['data2'])
    print(df.head(10))

main()
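np.where handles a single two-way split, so a condition of exactly 0 ends up in the data2 branch. If the two conditions from the question should be treated separately, np.select is one possible option. The sketch below assumes made-up values and a default of 0 for rows that match neither condition; that default is an assumption, since the question does not say what should happen there.

```python
import numpy as np
import pandas as pd

# Small hand-made frame; column names follow the question, values are illustrative.
df = pd.DataFrame({
    "condition": [1.5, -0.5, 0.0, -2.0],
    "data1": [10, 20, 30, 40],
    "data2": [1, 2, 3, 4],
})

# Conditions are checked in order; the first match picks the choice.
conditions = [df["condition"] > 0, df["condition"] < 0]
choices = [df["data1"], df["data2"]]

# Rows matching neither condition (condition == 0) get the assumed default of 0.
df["result"] = np.select(conditions, choices, default=0)

print(df["result"].tolist())
```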