Python pandas: filter by multiple "contains" conditions, applied to a whole column rather than a single cell

python pandas dataframe

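I have a bunch of DataFrames, and I want to find the ones that contain both of two words I specify. For example, I want to find all DataFrames that contain the words hello and world. A and B qualify; C does not.

I tried:

df[(df[column].str.contains('hello')) & (df[column].str.contains('world'))]

which only picks up B, and

df[(df[column].str.contains('hello')) | (df[column].str.contains('world'))]

which picks up all three.

I need something that selects only A and B.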

If you need a single boolean that is True when 'hello' is found anywhere in a column and 'world' is also found anywhere in that column:

df.Data.str.contains('hello').any() & df.Data.str.contains('world').any()
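To tie this back to the question, here is a minimal sketch that applies the column-level check to several DataFrames at once. The frames A, B and C are hypothetical stand-ins (the question's original data was not preserved), and `column_has_all` is an illustrative helper name, not a pandas function.

```python
import pandas as pd

# Hypothetical stand-ins for the question's A, B and C.
frames = {
    "A": pd.DataFrame({"Data": ["hello there", "big world"]}),  # words in different rows
    "B": pd.DataFrame({"Data": ["hello world", "other"]}),      # both words in one cell
    "C": pd.DataFrame({"Data": ["hello there", "nothing"]}),    # only one of the words
}


def column_has_all(df, column, words):
    """Return True if every word occurs somewhere in df[column]."""
    col = df[column].astype(str)
    # regex=False treats each word as a literal substring.
    return all(col.str.contains(w, regex=False).any() for w in words)


matching = [name for name, df in frames.items()
            if column_has_all(df, "Data", ["hello", "world"])]
print(matching)  # ['A', 'B']
```

The row-wise mask from the question only keeps rows where a single cell contains both words, which is why it found B alone. Reducing each condition with `.any()` first asks the question per column instead.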

If you have a list of words and need to check the entire DataFrame, try:

import numpy as np

lst = ['hello', 'world']
np.logical_and.reduce([any(word in x for x in df.values.ravel()) for word in lst])
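The `word in x` test above only works when every cell is a string; a numeric cell would raise a `TypeError`. A possible variant, assuming it is acceptable to compare against the string form of each cell, casts the frame with `.astype(str)` first. The helper name `frame_has_all` and the frame `df_mixed` are made up for illustration.

```python
import pandas as pd


def frame_has_all(df, words):
    """Return True if every word is a substring of at least one cell in df."""
    # Cast every cell to str so numeric values do not raise on `in`.
    # Note: missing values become the text 'nan' after the cast.
    cells = df.astype(str).to_numpy().ravel()
    return all(any(word in cell for cell in cells) for word in words)


# Hypothetical mixed-type frame.
df_mixed = pd.DataFrame({
    "Name": ["Mike", "Fred"],
    "Data": ["hello", "world"],
    "Count": [1, 2],
})

print(frame_has_all(df_mixed, ["hello", "world"]))          # True
print(frame_has_all(df_mixed, ["hello", "world", "bear"]))  # False
```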

The sample data used is printed in the listing further below.

If hello and world are standalone strings in your data, df.eq() should do the job without str.contains. It is not a string method, so it works on the whole DataFrame.

(((df == 'hello').any()) & ((df == 'world').any())).any()

True
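Two details of the expression above are easy to miss: it requires both words to appear in the same column, and equality is an exact match, not a substring test. The following sketch illustrates both points on small hypothetical frames (`df_split` and the one-cell frame are invented for this example).

```python
import pandas as pd

# Hypothetical frame: 'hello' and 'world' sit in different columns.
df_split = pd.DataFrame({"Data": ["hello", "foo"], "Data2": ["bar", "world"]})

# Both words must appear in the same column (the expression above).
same_column = ((df_split == "hello").any() & (df_split == "world").any()).any()

# Each word may appear in any column of the frame.
anywhere = df_split.eq("hello").any().any() and df_split.eq("world").any().any()

# Exact matching ignores substrings: 'hello world' is not equal to 'hello'.
exact = pd.DataFrame({"Data": ["hello world"]}).eq("hello").any().any()

print(bool(same_column), bool(anywhere), bool(exact))  # False True False
```

If the words can be embedded in longer text, a substring approach such as str.contains is the better fit.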

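I notice you need to look at this one column at a time. Is there a df.str.contains('xxx').any() kind of function that applies to the whole df? Right now I'm looping through the columns.

@jason Please see my update. Note that you need to make sure everything is a string beforehand, which can be done with .astype.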
print(df)
   Name   Data   Data2
0  Mike  hello  orange
1  Mike  world  banana
2  Mike  hello  banana
3  Fred  world  apples
4  Fred  hello   mango
5   Ted  world    pear

lst = ['apple', 'hello', 'world']
np.logical_and.reduce([any(word in x for x in df.values.ravel()) for word in lst])
# True ('apple' matches as a substring of 'apples')

lst = ['apple', 'hello', 'world', 'bear']
np.logical_and.reduce([any(word in x for x in df.values.ravel()) for word in lst])
# False

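import re

# df.sum().sum() concatenates all cells into one long string
# (assumes every column holds strings)
bool(re.search(r'^(?=.*hello)(?=.*world)', df.sum().sum()))
Out[461]: True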
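The pattern above relies on two lookaheads, each of which scans forward from the start of the string, so the order of the words does not matter. A small sketch on plain strings (the sample strings are made up) shows the behavior, including the fact that `.` does not cross newlines unless `re.DOTALL` is passed.

```python
import re

# Each (?=.*word) lookahead asserts that the word occurs somewhere after the
# start of the string without consuming characters, so the two conditions
# are checked independently of each other.
both = re.compile(r'^(?=.*hello)(?=.*world)')

print(bool(both.search("helloworldorange")))  # True
print(bool(both.search("worldbananahello")))  # True, order does not matter
print(bool(both.search("hellobanana")))       # False, no 'world'

# '.' does not match a newline by default, so multi-line text needs re.DOTALL.
multiline = "hello\nworld"
print(bool(both.search(multiline)))                                            # False
print(bool(re.search(r'^(?=.*hello)(?=.*world)', multiline, re.DOTALL)))       # True
```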
