Python: How do I keep only the "cells" of a DataFrame that contain specific text?

python pandas dataframe

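I'd like to know whether it's possible to keep only the "cells" of a DataFrame that contain specific text. For example, given the following DataFrame: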

import pandas as pd
import numpy as np


df = pd.DataFrame(np.array([['12hello2', '12hey2', 'hello', '12hey2', '1hello'], ['12hey2', '12hey2', 'hello', '1hello', '1hello'], ['12hey2', '12hey2', 'hello', '1hello', '1hello']]),
                   columns=['a', 'b', 'c','d','e'])

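How can I remove everything except the "cells" that contain the string "hello"? I know how to do this for a specific column or a specific row, but not for both at once, so that I'm left only with the entries whose string contains "hello".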

The simplest way I can think of is to use apply to filter column by column, and then mask with where:

df.where(df.apply(lambda x: x.str.contains('hello')))

Output:

          a    b      c       d       e
0  12hello2  NaN  hello     NaN  1hello
1       NaN  NaN  hello  1hello  1hello
2       NaN  NaN  hello  1hello  1hello
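
To make the two steps explicit, here is a minimal sketch that builds the boolean mask first and then applies it. The names `mask` and `result` and the `regex=False` and `na=False` arguments are additions for illustration, not part of the original answer. `na=False` assumes missing values should count as non-matches.

```python
import pandas as pd

df = pd.DataFrame(
    [['12hello2', '12hey2', 'hello', '12hey2', '1hello'],
     ['12hey2', '12hey2', 'hello', '1hello', '1hello'],
     ['12hey2', '12hey2', 'hello', '1hello', '1hello']],
    columns=['a', 'b', 'c', 'd', 'e'])

# Step 1: build a boolean mask, one column at a time.
# regex=False treats 'hello' as a literal substring;
# na=False (an assumption) counts missing values as non-matches.
mask = df.apply(lambda col: col.str.contains('hello', regex=False, na=False))

# Step 2: keep the cells where the mask is True, set the rest to NaN.
result = df.where(mask)
print(result)
```

Keeping the mask in its own variable also makes it easy to inspect, for example `mask.sum()` gives the number of matching cells per column.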

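Alternatively, use replace with a regular expression that matches every string that does not contain "hello" and turns it into NaN: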

df.replace({"^(.(?<!hello))*?$":np.nan},regex=True)
          a   b      c       d       e
0  12hello2 NaN  hello     NaN  1hello
1       NaN NaN  hello  1hello  1hello
2       NaN NaN  hello  1hello  1hello
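
The pattern relies on `(?<!hello)`, a negative lookbehind: after each character is consumed, it checks that the text read so far does not end in "hello", so the whole pattern can only match strings in which "hello" never appears. The sketch below checks this with Python's `re` module on the values from the DataFrame. The negative-lookahead pattern is an equivalent alternative added here for comparison, not something taken from the original answer.

```python
import re

# Pattern from the answer: every consumed character must not complete "hello".
lookbehind = re.compile(r"^(.(?<!hello))*?$")

# Alternative (an assumption, not from the thread): reject the string up front
# if "hello" occurs anywhere in it.
lookahead = re.compile(r"^(?!.*hello).*$")

samples = ['12hello2', '12hey2', 'hello', '1hello']
for s in samples:
    # True means "does NOT contain hello", i.e. the cell would become NaN.
    print(s, bool(lookbehind.search(s)), bool(lookahead.search(s)))
```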

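From the comments: df[df.applymap(lambda x: 'hello' in x)] is easier ;)

In this case, yes. In general, if the function is vectorized, I think apply is a bit faster than applymap.

Another option is stack/unstack:
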
df[df.stack().str.contains('hello').unstack()]

          a    b      c       d       e
0  12hello2  NaN  hello     NaN  1hello
1       NaN  NaN  hello  1hello  1hello
2       NaN  NaN  hello  1hello  1hello
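
As a rough check that the masking approaches in this thread agree, the sketch below builds the same boolean mask three ways and compares them. The variable names are illustrative. The fallback from `DataFrame.map` to `applymap` reflects that `applymap` is deprecated as of pandas 2.1 in favor of `DataFrame.map`. No timing claims are made here; which variant is fastest depends on the data and the pandas version.

```python
import pandas as pd

df = pd.DataFrame(
    [['12hello2', '12hey2', 'hello', '12hey2', '1hello'],
     ['12hey2', '12hey2', 'hello', '1hello', '1hello'],
     ['12hey2', '12hey2', 'hello', '1hello', '1hello']],
    columns=['a', 'b', 'c', 'd', 'e'])

# 1. Column-wise: one vectorized string operation per column.
mask_apply = df.apply(lambda col: col.str.contains('hello'))

# 2. Cell-wise: a plain Python function called once per cell.
#    Use DataFrame.map where available (pandas >= 2.1), else applymap.
cellwise = df.map if hasattr(pd.DataFrame, 'map') else df.applymap
mask_cell = cellwise(lambda x: 'hello' in x)

# 3. Reshape to one long Series, test it, then reshape back.
mask_stack = df.stack().str.contains('hello').unstack()

# All three masks select the same cells.
same = (
    (mask_apply.to_numpy() == mask_cell.to_numpy()).all()
    and (mask_apply.to_numpy() == mask_stack.to_numpy()).all()
)
print(same)
print(df[mask_stack])
```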
