Python: Count the number of populated fields (excluding NULL) in every row using a lambda function
Thanks for looking at this question. I am using a lambda to build logic that runs through all rows and counts the number of fields that hold a value other than NA, as you can see in the example below.
Input :
project_id project_a project_b project_c project_d project_e
1 Yes Yes Yes NA Yes
2 Yes Yes Yes NA Yes
3 NA Yes Yes NA Yes
4 Yes Yes Yes NA Yes
5 NA Yes Yes NA Yes
Desired Output :
project_id project_a project_b project_c project_d project_e field_populated
1 Yes Yes Yes NA Yes 5
2 Yes Yes Yes NA Yes 5
3 NA Yes Yes NA Yes 3
4 Yes Yes Yes NA Yes 5
5 NA Yes Yes NA Yes 4
I tried the following code, but ran into some problems:
proj_table['field_populated'] = proj_table['project_id', 'project_a', 'project_b','project_c, 'project_d','project_e].apply(lambda x: x+1 if x != "NA" or np.nan else x)
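You may be overcomplicating this. You can use count, which counts the non-null values of a DataFrame, to populate the new column, performing the operation along the rows (axis=1).
filter(like='project') restricts the count to the "project" columns, in case your actual df has more columns. Note that this pattern also matches project_id, so the ID column is included in the count.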
df['field_populated'] = df.filter(like='project').count(axis=1)
Which prints:
df
project_id project_a project_b ... project_d project_e field_populated
0 1 Yes Yes ... NaN Yes 5
1 2 Yes Yes ... NaN Yes 5
2 3 NaN Yes ... NaN Yes 4
3 4 Yes Yes ... NaN Yes 5
4 5 NaN Yes ... NaN Yes 4
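One caveat worth illustrating: count only skips real missing values (NaN/None), so if "NA" is stored as a literal string it would be counted as populated. The sketch below is a minimal, self-contained version of the approach above, assuming the question's data holds the string "NA"; the sample DataFrame and the intermediate name cleaned are illustrative and not from the original answer.

```python
import pandas as pd

# Hypothetical sample mirroring the question's input; "NA" is a literal string here
df = pd.DataFrame({
    "project_id": [1, 2, 3, 4, 5],
    "project_a": ["Yes", "Yes", "NA", "Yes", "NA"],
    "project_b": ["Yes"] * 5,
    "project_c": ["Yes"] * 5,
    "project_d": ["NA"] * 5,
    "project_e": ["Yes"] * 5,
})

# Replace the literal "NA" strings with real missing values so count() skips them
cleaned = df.mask(df.eq("NA"))

# Count non-null cells per row across every column whose name contains "project"
df["field_populated"] = cleaned.filter(like="project").count(axis=1)

print(df["field_populated"].tolist())  # [5, 5, 4, 5, 4]
```

If the data comes from a file, pd.read_csv already parses "NA" as missing by default, in which case the masking step should not be needed.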
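As sophocles said, a lambda function is not needed here. However, if you want to apply a lambda for learning purposes, you can write a simple one that counts how many values in each row are not "NA". And if you want to make it even more "lambdish", you can use a second function to check whether a value is "NA". Here is a small example: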
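import pandas as pd
# Returns 1 when a single cell value is not the string "NA", otherwise 0
not_NA = lambda value: 1 if value != "NA" else 0
# Despite its name, this sums the not-NA flags of one row (i.e. the populated cells)
count_NA = lambda row: sum(row.apply(not_NA))
df = pd.DataFrame([["1","2","3","4","5"], ["1","2","NA","4","5"], ["1","2","3","4","NA"], ["NA","2","NA","4","5"]], columns=["col1","col2","col3","col4","col5"])
# axis=1 passes each row (as a Series) to count_NA
df['field_populated'] = df.apply(count_NA, axis=1)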
Which prints:
df
col1 col2 col3 col4 col5 field_populated
0 1 2 3 4 5 5
1 1 2 NA 4 5 4
2 1 2 3 4 NA 4
3 NA 2 NA 4 5 3
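The same row-wise lambda pattern can be widened to treat several placeholder values as empty. The following is only a sketch under assumptions: the DISCARD list and the names is_populated and count_populated are hypothetical and chosen for illustration, and the sample data is made up to exercise each placeholder.

```python
import pandas as pd

# Assumed placeholder strings that should not count as populated
DISCARD = ["NA", "", "None"]

df = pd.DataFrame(
    [["1", "2", "3", "4", "5"],
     ["1", "2", "NA", "4", "5"],
     ["1", "2", "3", "4", ""],
     ["NA", "2", "None", "4", "5"]],
    columns=["col1", "col2", "col3", "col4", "col5"],
)

# 1 if the cell holds a real value, 0 if it is missing or a placeholder
is_populated = lambda value: 0 if pd.isna(value) or value in DISCARD else 1
# Sum the flags across one row
count_populated = lambda row: sum(row.apply(is_populated))

df["field_populated"] = df.apply(count_populated, axis=1)

print(df["field_populated"].tolist())  # [5, 4, 4, 3]
```

For larger frames, a vectorized expression such as (df.notna() & ~df.isin(DISCARD)).sum(axis=1), evaluated before the new column is added, should give the same counts without calling a Python function per cell.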
Note that in the lambda function "not_NA" you can also add other values to discard if needed.
I guess you made a typo: the value for project_id=3 should be 4?
Yes, I did, @ShubhamSharma. Thanks.