Pandas: filtering by specific multiple conditions in columns


I read the following DataFrame from a CSV file:

   gene name  annotation  ng DNA
 0  HRAS       G12S        3.00
 1  PIK3CA     R88L        3.00
 2  BRAF       E474A       3.00
 3  EGFR       E734Q       3.00
 4  EGFR       V769        3.00
 5  BRAF       LQ599PE     4.00
 6  BRAF       KT587NA     4.00
 7  HRAS       G12S        17.70

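I want to filter on multiple conditions across two columns: for example, keep the rows matching "BRAF" + "E474A" or "HRAS" + "G12S", so that the following df is created: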

   gene name  annotation  ng DNA
 0  HRAS       G12S        3.00
 2  BRAF       E474A       3.00
 7  HRAS       G12S        17.70

Any ideas for an elegant solution?

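Use boolean indexing and join all the masks into one with np.logical_or.reduce. For a more dynamic solution, supply a list of tuples with the filter values and build the masks in a list comprehension: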

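import numpy as np
# Each tuple is a (gene name, annotation) pair to keep.
tup = [('BRAF', 'E474A'), ('HRAS', 'G12S')]
# Build one boolean mask per pair, then OR them all together.
df = df[np.logical_or.reduce([(df['gene name'] == a) & (df['annotation'] == b) for a, b in tup])]
print(df)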
  gene name annotation  ng DNA
0      HRAS       G12S     3.0
2      BRAF      E474A     3.0
7      HRAS       G12S    17.7
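
For reference, the explicit form of the same filter, with one named mask per condition, can be sketched as follows. The DataFrame is rebuilt inline from the sample data in the question so that the snippet runs on its own; the names m1, m2 and filtered are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (normally loaded with pd.read_csv).
df = pd.DataFrame({
    'gene name': ['HRAS', 'PIK3CA', 'BRAF', 'EGFR', 'EGFR', 'BRAF', 'BRAF', 'HRAS'],
    'annotation': ['G12S', 'R88L', 'E474A', 'E734Q', 'V769', 'LQ599PE', 'KT587NA', 'G12S'],
    'ng DNA': [3.0, 3.0, 3.0, 3.0, 3.0, 4.0, 4.0, 17.7],
})

# One boolean mask per (gene name, annotation) pair.
m1 = (df['gene name'] == 'BRAF') & (df['annotation'] == 'E474A')
m2 = (df['gene name'] == 'HRAS') & (df['annotation'] == 'G12S')

# OR the masks together element-wise, then index with the combined mask.
filtered = df[np.logical_or.reduce([m1, m2])]
print(filtered)
```

Assigning the result to a new name keeps the original df intact, which is handy when several different filters are tried on the same data.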

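Do you know how to merge two columns into one? I want to merge the "gene name" column with "annotation".

You can use

df['merged'] = df['gene name'] + '-' + df['annotation']

I already tried that, without success. It gave me the following warning: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

OK, that is a different problem. You need copy(), like

df1 = df[np.logical_or.reduce([m1, m2])].copy()

where m1 and m2 are the boolean masks for the two conditions. If you modify values in df1 later, you will find that the modifications do not propagate back to the original data (df), and that pandas issues a warning.

@Bella - glad I could help!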
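
Putting the comment thread together, a minimal sketch of the filter followed by the column merge might look like this. It assumes a small inline stand-in for the CSV data; df1 and the 'merged' column name follow the comments above, and mask is an illustrative name. Calling .copy() makes df1 an independent DataFrame, which should avoid the SettingWithCopyWarning when the new column is added.

```python
import numpy as np
import pandas as pd

# Small stand-in for the data from the question (assumption: a subset is enough here).
df = pd.DataFrame({
    'gene name': ['HRAS', 'PIK3CA', 'BRAF', 'HRAS'],
    'annotation': ['G12S', 'R88L', 'E474A', 'G12S'],
    'ng DNA': [3.0, 3.0, 3.0, 17.7],
})

# Combine one mask per (gene name, annotation) pair with a logical OR.
tup = [('BRAF', 'E474A'), ('HRAS', 'G12S')]
mask = np.logical_or.reduce(
    [(df['gene name'] == a) & (df['annotation'] == b) for a, b in tup]
)

# .copy() detaches the filtered rows from df before they are modified.
df1 = df[mask].copy()
df1['merged'] = df1['gene name'] + '-' + df1['annotation']
print(df1)
```

The original df is left without the merged column; only the copy is changed.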
