在python中调用Nan值并更改为数字
我有一个数据帧,比如说在python中调用Nan值并更改为数字,python,pandas,Python,Pandas,我有一个数据帧,比如说df,它看起来像这样: id property_type1 property_type pro 1 Condominium 2 2 2 Farm 14 14 3 House 7 7 4 Lots/Land
df
,它看起来像这样:
id property_type1 property_type pro
1 Condominium 2 2
2 Farm 14 14
3 House 7 7
4 Lots/Land 15 15
5 Mobile/Manufactured Home 13 13
6 Multi-Family 8 8
7 Townhouse 11 11
8 Single Family 10 10
9 Apt/Condo 1 1
10 Home 7 7
11 NaN 29 NaN
现在,我需要pro
列具有与property\u type
列相同的值,只要property\u type1
列具有NaN
值。应该是这样的:
id property_type1 property_type pro
1 Condominium 2 2
2 Farm 14 14
3 House 7 7
4 Lots/Land 15 15
5 Mobile/Manufactured Home 13 13
6 Multi-Family 8 8
7 Townhouse 11 11
8 Single Family 10 10
9 Apt/Condo 1 1
10 Home 7 7
11 NaN 29 29
也就是说,在第11行中,
property\u type 1
是NaN
,pro
列的值变为29,这是property\u type
的值。如何执行此操作?查找property\u type 1
列为NaN
的行,对于这些行:将property\u type
值分配给pro
列
df.ix[df.property_type1.isnull(), 'pro'] = df.ix[df.property_type1.isnull(), 'property_type']
ix
已弃用,请勿使用
选项1我会用
np.where
-
df = df.assign(pro=np.where(df.pro.isnull(), df.property_type, df.pro))
df
id property_type1 property_type pro
0 1 Condominium 2 2.0
1 2 Farm 14 14.0
2 3 House 7 7.0
3 4 Lots/Land 15 15.0
4 5 Mobile/Manufactured Home 13 13.0
5 6 Multi-Family 8 8.0
6 7 Townhouse 11 11.0
7 8 Single Family 10 10.0
8 9 Apt/Condo 1 1.0
9 10 Home 7 7.0
10 11 NaN 29 29.0
m = df.pro.isnull()
df.loc[m, 'pro'] = df.loc[m, 'property_type']
df
id property_type1 property_type pro
0 1 Condominium 2 2.0
1 2 Farm 14 14.0
2 3 House 7 7.0
3 4 Lots/Land 15 15.0
4 5 Mobile/Manufactured Home 13 13.0
5 6 Multi-Family 8 8.0
6 7 Townhouse 11 11.0
7 8 Single Family 10 10.0
8 9 Apt/Condo 1 1.0
9 10 Home 7 7.0
10 11 NaN 29 29.0
选项2
如果要执行就地分配,请使用
loc
-
df = df.assign(pro=np.where(df.pro.isnull(), df.property_type, df.pro))
df
id property_type1 property_type pro
0 1 Condominium 2 2.0
1 2 Farm 14 14.0
2 3 House 7 7.0
3 4 Lots/Land 15 15.0
4 5 Mobile/Manufactured Home 13 13.0
5 6 Multi-Family 8 8.0
6 7 Townhouse 11 11.0
7 8 Single Family 10 10.0
8 9 Apt/Condo 1 1.0
9 10 Home 7 7.0
10 11 NaN 29 29.0
m = df.pro.isnull()
df.loc[m, 'pro'] = df.loc[m, 'property_type']
df
id property_type1 property_type pro
0 1 Condominium 2 2.0
1 2 Farm 14 14.0
2 3 House 7 7.0
3 4 Lots/Land 15 15.0
4 5 Mobile/Manufactured Home 13 13.0
5 6 Multi-Family 8 8.0
6 7 Townhouse 11 11.0
7 8 Single Family 10 10.0
8 9 Apt/Condo 1 1.0
9 10 Home 7 7.0
10 11 NaN 29 29.0
只计算一次掩码,然后使用它多次索引,这应该比计算两次更有效。1。不要使用
ix
,它已被弃用。2.不要计算掩码两次,效率很低。没问题@cᴏʟᴅsᴘᴇᴇᴅ, 你是对的。我不是最新的ix
,双重输入掩码是我的错误。只要你明白,那太棒了。我不知道是谁投了反对票,但我已经投了一张赞成票。祝你好运