Python: assign a value to a new column based on the values of other columns in a table (pandas)

Here is a subset of my DataFrame:

id  words  A   B   C   D  E  
1   new    1       1   
2   good   1  
3   star            1
4   never                  
5   final

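I want to define a new variable (called FF) as a new column and assign it the value 1 when all of the other variables (columns A through E) are null. The new DataFrame should look like this: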

id  words  A   B   C   D  E  FF
1   new    1       1   
2   good   1  
3   star            1
4   never                     1                
5   final                     1

How can I do this with Python and pandas? Thanks.

You can define a function and apply it to the DataFrame row by row:

def fill_if_nan(row):
    if row[['A', 'B', 'C', 'D', 'E']].isnull().all():
        return 1

    return None

df['FF'] = df.apply(fill_if_nan, axis=1)

Or, more elegantly, a NumPy-based solution:

import numpy as np

df['FF'] = np.where(df[['A', 'B', 'C', 'D', 'E']].isnull().all(axis=1), 1, np.nan)
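As a self-contained illustration of the vectorized approach, here is a minimal sketch. The sample data is reconstructed from the table in the question, assuming that the empty cells are real NaN values; the exact column in which each 1 sits is a guess from the table layout, and the names value_cols and all_missing are introduced here only for readability.

```python
import numpy as np
import pandas as pd

# Sample data approximating the table in the question.
# Assumption: empty cells are NaN, not blank strings.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "words": ["new", "good", "star", "never", "final"],
    "A": [1, 1, np.nan, np.nan, np.nan],
    "B": [np.nan] * 5,
    "C": [1, np.nan, 1, np.nan, np.nan],
    "D": [np.nan] * 5,
    "E": [np.nan] * 5,
})

value_cols = ["A", "B", "C", "D", "E"]

# Boolean Series: True for rows where every value column is missing.
all_missing = df[value_cols].isnull().all(axis=1)

# 1 where all value columns are missing, NaN otherwise.
df["FF"] = np.where(all_missing, 1, np.nan)

print(df[["id", "words", "FF"]])
```

With this data, only the rows for "never" and "final" have every value column missing, so only those two rows get FF set to 1.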

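Thanks a lot. The program does not recognize the null values: for some rows every variable is null, yet FF is not set to 1. I think I need to convert all the blanks to null values first. Do you have a solution?

If you want to replace whitespace with nan, you can use

df.replace(r'\s+', np.nan, regex=True)

I tried that, but it also replaced values in the words column with null wherever a cell contains several words separated by spaces. How can I apply it to everything except the "words" column?

@Mary You need to specify the columns explicitly, as in

df[['A', 'B', …]].replace(…)

and assign the result back to df.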
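The suggestion in the last comment could be sketched as follows. This is only an illustration on made-up data, and it assumes the blank cells are empty or whitespace-only strings. The anchored pattern ^\s*$ is a variation that does not appear in the thread: it matches only cells that are entirely blank, whereas \s+ matches any cell that contains whitespace anywhere.

```python
import numpy as np
import pandas as pd

# Hypothetical data: blanks stored as empty or whitespace-only strings.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "words": ["brand new", "good", "never"],
    "A": ["1", " ", ""],
    "B": [" ", "1", "  "],
})

value_cols = ["A", "B"]

# Replace blank cells with NaN in the selected columns only and
# assign the result back, leaving the "words" column untouched.
df[value_cols] = df[value_cols].replace(r"^\s*$", np.nan, regex=True)

# With real NaN values in place, the null check behaves as intended.
df["FF"] = np.where(df[value_cols].isnull().all(axis=1), 1, np.nan)

print(df)
```

Restricting the replacement to value_cols keeps multi-word entries such as "brand new" intact, which addresses the problem raised in the comments.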