Python: filter rows based on what the values start with

python python-3.x pandas

I have a DataFrame that looks like this:

import pandas as pd

df4 = pd.DataFrame({'Q':['chair', 'desk', '-----monitor', 'chair'], 'R':['red', '-- use blue or dark blue', 'yellow', 'purple'], 'S': ['-- is english spoken?', 'german', 'spanish', 'english']})


              Q                         R                      S
0         chair                       red  -- is english spoken?
1          desk  -- use blue or dark blue                 german
2  -----monitor                    yellow                spanish
3         chair                    purple                english

The result I want is:

              Q                       R                                S
3         chair                  purple                          english

If any column holds a value that starts with two or more "-" characters, I want to filter out the entire row.

I found a thread about filtering numeric values, but is there a way to filter on special characters, ideally with a regular expression?

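Edit #1:

I only want to drop a row when "-" appears two or more times at the start of a value. If the dashes appear in the middle of some text, that is fine. Suppose my DataFrame looks like this: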

df4 = pd.DataFrame({'Q':['chair', 'desk', '-----monitor', 'chair'], 'R':['red', 'blue or dark blue', 'yellow', 'purple'], 'S': ['-- is english spoken?', 'ger--man', 'spanish', 'english']})


              Q                  R                      S
0         chair                red  -- is english spoken?
1          desk  blue or dark blue               ger--man
2  -----monitor             yellow                spanish
3         chair             purple                english

This is what I would want returned:

              Q                       R                                S
1          desk       blue or dark blue                           ger--man
3         chair                  purple                          english

Edit #2:

I tried this:

df4[~df4.Q.str.startswith('--')]

But this only works for one column, not for all columns.

Use applymap with in and any:

df4[~df4.applymap(lambda x : '--' in x).any(axis=1)]
Out[287]: 
       Q       R        S
3  chair  purple  english

Update: to exclude only values that begin with '--':

df4[~df4.applymap(lambda x : str.startswith(x,'--')).any(axis=1)]
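
Since the question asks about regular expressions, the same filter can also be sketched with a pattern applied column by column. This is only an illustration, not part of the answer above: it assumes every cell can be treated as a string, it uses the frame from Edit #1, and the names pattern, dash_mask and filtered are made up for the example. It also avoids applymap, which recent pandas releases deprecate in favour of DataFrame.map.

```python
import pandas as pd

# Frame from Edit #1: 'ger--man' has dashes only in the middle of the text.
df4 = pd.DataFrame({
    'Q': ['chair', 'desk', '-----monitor', 'chair'],
    'R': ['red', 'blue or dark blue', 'yellow', 'purple'],
    'S': ['-- is english spoken?', 'ger--man', 'spanish', 'english'],
})

# Two or more dashes. Series.str.match anchors the pattern at the start
# of each string, so dashes later in the text do not count.
pattern = r'-{2,}'

# Build a boolean frame, one column at a time (astype(str) is a guard
# for non-string cells; it is an assumption, not needed for this data).
dash_mask = df4.apply(lambda col: col.astype(str).str.match(pattern))

# Keep only the rows in which no column matched.
filtered = df4[~dash_mask.any(axis=1)]
print(filtered)
```

With this frame, rows 0 and 2 are dropped and rows 1 and 3 are kept, because 'ger--man' does not begin with dashes.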

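A more involved operation may be needed: if any column holds a value that starts with two or more "-" characters, I want to filter out the entire row.
I am currently testing Wen's solution. This may be a topic for another thread, but I don't fully understand this part: "lambda x: '-' in x". None of my columns is named x? Also, this filters out a row if there is at least one "-", but I need two or more. I was playing around with what you gave me; could you change the code above to: df4[~df4.applymap(lambda x: '--' in x).any(axis=1)]
Actually, that's a good idea.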
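
On the question about x in the comment: x is not a column name. It is the parameter of the lambda, and applymap calls the lambda once per cell, passing that cell's value as x. The sketch below is a rough illustration on plain strings taken from the Edit #1 frame; the function names and the cells list are made up for the example. It shows why the "in" test and the startswith test disagree on a value such as 'ger--man'.

```python
def contains_dashes(x):
    # True if '--' occurs anywhere in the string.
    return '--' in x


def starts_with_dashes(x):
    # True only if the string begins with '--'.
    return x.startswith('--')


# Sample cell values; each one plays the role of x in turn.
cells = ['chair', '-----monitor', 'ger--man', '-- is english spoken?']

results = {
    value: (contains_dashes(value), starts_with_dashes(value))
    for value in cells
}

for value, (anywhere, at_start) in results.items():
    print(repr(value), 'contains:', anywhere, 'starts with:', at_start)
```

Only 'ger--man' gives different answers for the two tests, which is why the startswith version keeps row 1 while the "in" version drops it.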