Python: filling a new DataFrame column with values from another column based on multiple conditions
Suppose I have a DataFrame like the following:
df.head()
col1 col2 col3 start end gs
chr1 HAS GEN 11869 14409 DDX
chr1 HAS TRANS 11869 14409 NaN
chr1 HAS EX 11869 12227 NaN
chr1 HAS GEN 12613 12721 FXBZ
chr1 HAS EX 13221 14409 NaN
chr1 HAS EX 12010 12057 NaN
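Now I need to add a new column based on two conditions, taking its values from an existing column.
For example, the conditions are:
- If col3 equals GEN or EX, add a new column col7 whose values come from column gs.
- When col3 equals GEN, gs always holds a value; it is never NaN.
The desired output, where each EX row takes the gs value of the GEN row before it and every other row gets "no":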
col1 col2 col3 start end gs col7
chr1 HAS GEN 11869 14409 DDX DDX
chr1 HAS EX 11869 12227 NaN DDX
chr1 HAS TRANS 11869 14409 NaN no
chr1 HAS GEN 12613 12721 FXBZ FXBZ
chr1 HAS EX 13221 14409 NaN FXBZ
chr1 HAS EX 12010 12057 NaN FXBZ
I tried using a lambda:
df.apply(
lambda row: row['gs'] if (row['col3'] =="EX" and row['gs'] !=NaN) else "no",
axis=1)
However, I could not get the values from the gs column into the new column. It sets NaN values, which is not what I want.
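Any suggestions would be appreciated.

I believe you can use numpy.where with a condition built by isin, and forward-fill the missing values in column gs: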
import numpy as np
df['col7'] = np.where(df['col3'].isin(['GEN','EX']), df['gs'].ffill(), 'no')
print (df)
col1 col2 col3 start end gs col7
0 chr1 HAS GEN 11869 14409 DDX DDX
1 chr1 HAS EX 11869 14409 NaN DDX
2 chr1 HAS TRANS 11869 12227 NaN no
3 chr1 HAS GEN 12613 12721 FXBZ FXBZ
4 chr1 HAS EX 13221 14409 NaN FXBZ
5 chr1 HAS EX 12010 12057 NaN FXBZ
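For anyone who wants to reproduce the result, here is a minimal self-contained sketch. The sample data is copied from the printed output above, and the intermediate names `filled_gs` and `keep` are only illustrative. It assumes the rows are ordered so that every EX row comes after the GEN row it belongs to, since forward-filling depends on row order.

```python
import numpy as np
import pandas as pd

# Sample data copied from the printed output above; row order matters for ffill.
df = pd.DataFrame({
    "col1": ["chr1"] * 6,
    "col2": ["HAS"] * 6,
    "col3": ["GEN", "EX", "TRANS", "GEN", "EX", "EX"],
    "start": [11869, 11869, 11869, 12613, 13221, 12010],
    "end": [14409, 14409, 12227, 12721, 14409, 12057],
    "gs": ["DDX", np.nan, np.nan, "FXBZ", np.nan, np.nan],
})

# Carry the last non-missing gs value forward into the following rows.
filled_gs = df["gs"].ffill()

# True where col3 is one of the values that should receive a gs value.
keep = df["col3"].isin(["GEN", "EX"])

# Take the forward-filled value where keep is True, otherwise the literal "no".
df["col7"] = np.where(keep, filled_gs, "no")

print(df["col7"].tolist())
```

If the rows are not already in that order, they would need to be sorted or grouped first; that step is not covered here.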
Details:
print (df['gs'].ffill())
0 DDX
1 DDX
2 DDX
3 FXBZ
4 FXBZ
5 FXBZ
Name: gs, dtype: object
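As for the lambda attempt in the question, two things likely get in the way. Comparing against NaN with `!=` is always true, so it cannot detect missing values, and a row-wise function only sees the current row, so it cannot borrow the gs value from an earlier GEN row. The sketch below illustrates both points; `pick_gs` is a hypothetical helper written for this example, not part of pandas.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN compares unequal to everything, including itself,
# so this check is True even though the value is missing.
print(value != np.nan)

# pd.notna is the reliable test for a non-missing value.
print(pd.notna(value))

def pick_gs(row):
    """Hypothetical row-wise helper: return gs for GEN/EX rows that have one."""
    if row["col3"] in ("GEN", "EX") and pd.notna(row["gs"]):
        return row["gs"]
    return "no"

df = pd.DataFrame({
    "col3": ["GEN", "EX", "TRANS"],
    "gs": ["DDX", np.nan, np.nan],
})

# The EX row still ends up as "no": its own gs is missing and the
# function has no access to the previous row. Hence the ffill above.
row_wise = df.apply(pick_gs, axis=1)
print(row_wise.tolist())
```

This is why forward-filling gs before applying the condition is the simpler route here.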