Python: count the number of populated (non-NULL) fields in every row using a lambda function


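Thanks for looking at this question. I am using a lambda to build logic that runs through all the rows and counts the number of fields that hold a value other than NA, as you can see in the example given.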

Input : 

project_id   project_a   project_b   project_c   project_d   project_e
     1             Yes         Yes        Yes         NA       Yes
     2             Yes         Yes        Yes         NA       Yes
     3             NA          Yes        Yes         NA       Yes
     4             Yes         Yes        Yes         NA       Yes
     5             NA          Yes        Yes         NA       Yes

Desired Output :

project_id   project_a   project_b   project_c   project_d   project_e    field_populated
     1             Yes         Yes        Yes         NA       Yes              5
     2             Yes         Yes        Yes         NA       Yes              5
     3             NA          Yes        Yes         NA       Yes              3
     4             Yes         Yes        Yes         NA       Yes              5
     5             NA          Yes        Yes         NA       Yes              4

I tried the following code, but ran into some problems:

proj_table['field_populated'] = proj_table['project_id', 'project_a', 'project_b','project_c, 'project_d','project_e].apply(lambda x: x+1 if x != "NA" or np.nan else x) 

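Unless I am overcomplicating it, you can use `count`, which counts the non-null values of the DataFrame, to populate the new column, performing the operation along the rows (`axis=1`).

`filter(like='project')` restricts the count to the "project" columns, in case your actual `df` has more columns.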

df['field_populated'] = df.filter(like='project').count(axis=1)
which prints:

df

   project_id project_a project_b  ... project_d  project_e field_populated
0           1       Yes       Yes  ...       NaN        Yes               5
1           2       Yes       Yes  ...       NaN        Yes               5
2           3       NaN       Yes  ...       NaN        Yes               4
3           4       Yes       Yes  ...       NaN        Yes               5
4           5       NaN       Yes  ...       NaN        Yes               4
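
As a self-contained sketch of the approach above, the snippet below rebuilds the question's input, assuming the NA cells are real missing values (`np.nan`) rather than the string "NA". It also shows that `filter(like='project')` matches `project_id`, which is why the counts come out as 5 and 4. The `projects_only` column and its regex are hypothetical additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the question's input; NA cells are assumed to be real missing values.
df = pd.DataFrame({
    "project_id": [1, 2, 3, 4, 5],
    "project_a": ["Yes", "Yes", np.nan, "Yes", np.nan],
    "project_b": ["Yes"] * 5,
    "project_c": ["Yes"] * 5,
    "project_d": [np.nan] * 5,
    "project_e": ["Yes"] * 5,
})

# like="project" is a substring match, so project_id is counted as well.
df["field_populated"] = df.filter(like="project").count(axis=1)

# Hypothetical variant: count only project_a ... project_e, leaving the id out.
df["projects_only"] = df.filter(regex=r"^project_[a-z]$").count(axis=1)

print(df[["project_id", "field_populated", "projects_only"]])
```

If the id should not count as a populated field, the second form may be closer to what is wanted; the desired output in the question, however, includes it.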

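As sophocles said, no lambda function is needed here. However, if you want to apply a lambda function for learning purposes, you can write a simple lambda that counts how many values in each row are not "NA". Also, if you want to make it more "lambdish", you can check whether a value is "NA" with another function.

Here is a small example: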

import pandas as pd

not_NA = lambda column: 1 if column != "NA" else 0

count_NA = lambda row: sum(row.apply(not_NA))

df = pd.DataFrame([["1","2","3","4","5"], ["1","2","NA","4","5"], ["1","2","3","4","NA"], ["NA","2","NA","4","5"]], columns=list(["col1","col2","col3","col4","col5"]))

df['field_populated'] = df.apply(count_NA, axis=1)
which prints:

df

  col1 col2 col3 col4 col5  field_populated
0    1    2    3    4    5                5
1    1    2   NA    4    5                4
2    1    2    3    4   NA                4
3   NA    2   NA    4    5                3
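
For comparison, here is a minimal sketch of the same count without `apply`, assuming (as in the example above) that the missing marker is the literal string "NA" and not a real NaN. It compares every cell with "NA" at once and sums the matches per row.

```python
import pandas as pd

df = pd.DataFrame(
    [["1", "2", "3", "4", "5"],
     ["1", "2", "NA", "4", "5"],
     ["1", "2", "3", "4", "NA"],
     ["NA", "2", "NA", "4", "5"]],
    columns=["col1", "col2", "col3", "col4", "col5"],
)

# df.ne("NA") is True wherever a cell differs from "NA"; summing along
# axis=1 counts those True values for each row.
df["field_populated"] = df.ne("NA").sum(axis=1)

print(df)
```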

Note that in the lambda function `not_NA` you can also add other values to discard if needed.
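
One way to sketch that extension is shown below. The `DISCARD` set and the names `is_populated` and `count_populated` are hypothetical, chosen only for illustration; the check also treats real missing values (`None`/NaN) as unpopulated, which the original lambda does not.

```python
import pandas as pd

# Hypothetical set of placeholder strings to treat as "not populated".
DISCARD = {"NA", "", "N/A"}

# 0 for real missing values or any discarded placeholder, 1 otherwise.
is_populated = lambda value: 0 if pd.isna(value) or value in DISCARD else 1

# Apply the check to every cell of a row and add up the results.
count_populated = lambda row: sum(row.apply(is_populated))

df = pd.DataFrame(
    [["1", "2", "3", "4", "5"],
     ["1", "2", "NA", "4", "5"],
     ["1", "", "3", "4", None],
     ["NA", "2", "N/A", "4", "5"]],
    columns=["col1", "col2", "col3", "col4", "col5"],
)

df["field_populated"] = df.apply(count_populated, axis=1)

print(df)
```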

I guess you made a typo: the value for project_id=3 should be 4?

Yes, it should, @ShubhamSharma. Thanks.