Python: replace the value in one column if another column is missing

Data

I have a DataFrame named data that looks like this:

Name              ID
JAMES             252
STEPHEN           578
JOY               nan
ROGELIO           473
FACS              nan
CLIFFORD          793
Goal

Whenever data['ID'] is missing, I want to replace the corresponding data['Name'] with the missing value NaN.

The result would be:

Name              ID
JAMES             252
STEPHEN           578
NaN               nan
ROGELIO           473
NaN               nan
CLIFFORD          793

I have searched online, but the similar answers are all about using fillna(), which is not what I want. Do you have any suggestions on how to do this?

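You can use .loc to select every row where df['ID'] is null and set df['Name'] to np.nan for those rows. Column labels are case-sensitive, so the label must match the existing 'Name' column exactly:

import numpy as np

# Use 'Name' exactly as it appears in the DataFrame; 'NAME' would add a new column
df.loc[df['ID'].isnull(), 'Name'] = np.nan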
How about this approach:

import pandas as pd
import numpy as np
a = {'Name':['JAMES','STEPHEN','JOY','ROGELIO','FACS','CLIFFORD'],'ID':[252,578,np.nan,473,np.nan,793]}
df = pd.DataFrame(a)

df.loc[df['ID'].isnull() , 'Name'] = np.nan
print(df)
Output:

       Name     ID
0     JAMES  252.0
1   STEPHEN  578.0
2       NaN    NaN
3   ROGELIO  473.0
4       NaN    NaN
5  CLIFFORD  793.0
If you want to drop the rows containing NaN values, add the following:

df = df.dropna(how='any')
print(df)
Output:

       Name     ID
0     JAMES  252.0
1   STEPHEN  578.0
3   ROGELIO  473.0
5  CLIFFORD  793.0
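
Note that how='any' looks at every column. As a minimal sketch, assuming a frame with an extra column (the "City" column and its values here are hypothetical, not from the question), dropna(subset=...) restricts the check to ID only:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "City" is an illustrative extra column with its own gap.
df = pd.DataFrame({
    "Name": ["JAMES", "JOY", "ROGELIO"],
    "ID": [252, np.nan, 473],
    "City": ["Austin", "Boston", None],
})

# how='any' drops every row that has at least one missing value in any column.
dropped_any = df.dropna(how="any")

# subset=['ID'] only considers the ID column when deciding which rows to drop.
dropped_by_id = df.dropna(subset=["ID"])

print(dropped_any)
print(dropped_by_id)
```

With how='any' the ROGELIO row is dropped as well because its City is missing; with subset=['ID'] it is kept.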
Edit: I switched to a different approach, and now it is correct.

This is a good fit for mask():

df.mask(df['ID'].isnull())
Output:

       Name     ID
0     JAMES  252.0
1   STEPHEN  578.0
2       NaN    NaN
3   ROGELIO  473.0
4       NaN    NaN
5  CLIFFORD  793.0
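
One caveat worth illustrating: calling mask() on the whole DataFrame blanks every column in the matching rows. That is harmless with only Name and ID, but probably not what you want with more columns. A small sketch, assuming a hypothetical extra "Team" column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["JAMES", "JOY", "ROGELIO"],
    "ID": [252, np.nan, 473],
    "Team": ["A", "B", "C"],  # hypothetical extra column, not in the question
})

missing_id = df["ID"].isnull()

# Masking the whole frame replaces every column in the matching rows with NaN.
whole = df.mask(missing_id)

# Masking only the Name column leaves Team untouched.
only_name = df.copy()
only_name["Name"] = only_name["Name"].mask(missing_id)

print(whole)
print(only_name)
```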

df['Name'].where(df['ID'].notnull())
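
Keep in mind that where() returns a new Series and does not modify df. A minimal sketch of assigning the result back, using a shortened version of the question's data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["JAMES", "JOY", "ROGELIO"],
    "ID": [252, np.nan, 473],
})

# where() keeps values where the condition is True and puts NaN elsewhere.
result = df["Name"].where(df["ID"].notnull())

# df itself is unchanged until the result is assigned back.
unchanged = df.loc[1, "Name"]  # still "JOY" at this point
df["Name"] = result

print(df)
```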
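Thank you so much!! This works like a charm! The dropna() function is also very useful!

Thank you very much!! This is exactly what I needed!

This doesn't work, because 'NAME' creates a new column filled with np.nan. The 'NAME' column already exists in the DataFrame, so why does it create a new column?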
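
On the last comment: column labels in pandas are case-sensitive, and the DataFrame in the question has a 'Name' column, not 'NAME'. When .loc is given a column label that does not exist, it enlarges the frame with a new column instead of raising an error. A minimal sketch of the difference (the variable names wrong and right are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["JAMES", "JOY", "ROGELIO"],
    "ID": [252, np.nan, 473],
})

wrong = df.copy()
# 'NAME' does not match 'Name', so .loc adds a new column and leaves Name alone.
wrong.loc[wrong["ID"].isnull(), "NAME"] = np.nan

right = df.copy()
# The label matches the existing column, so the values are replaced in place.
right.loc[right["ID"].isnull(), "Name"] = np.nan

print(list(wrong.columns))
print(list(right.columns))
```

Checking df.columns before assigning is a quick way to catch this kind of mismatch.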