Python: performing OR using a for loop


python string pandas


I have a DataFrame like the following:

Script  Reco    Rating  Suggestion  Mood
Rel     Buy     Sell    BuyL        Sell
ITC     Sell    Sell    Sell        Sell
INFO    Sell    BuyN    Sell        Sell
TCS     Sell    Sell    Sell        Sell

I want to get the rows in which any of the columns "Reco", "Rating", "Suggestion", or "Mood" contains the string "Buy".

I can do this with the following code:

df[(df['Reco'].str.contains('Buy', regex=True) | df['Rating'].str.contains('Buy', regex=True) | df['Suggestion'].str.contains('Buy', regex=True) | df['Mood'].str.contains('Buy', regex=True))]

The problem is that I have to type the name of every column except "Script". To avoid that, I tried the following:

cols_to_include = df.columns[df.columns != 'Script']
df[(df[i].str.contains('Buy') for i in cols_to_include)]

This does not work, because

(df['Reco'].str.contains('Buy', regex=True) | df['Rating'].str.contains('Buy', regex=True) | df['Suggestion'].str.contains('Buy', regex=True) | df['Mood'].str.contains('Buy', regex=True))

0     True
1    False
2     True
3    False
dtype: bool

returns a single boolean Series, whereas

[df[i].str.contains('Buy') for i in cols_to_include]

[0     True
 1    False
 2    False
 3    False
 Name: Reco, dtype: bool, 0    False
 1    False
 2     True
 3    False
 Name: Rating, dtype: bool, 0     True
 1    False
 2    False
 3    False
 Name: Suggestion, dtype: bool, 0    False
 1    False
 2    False
 3    False
 Name: Mood, dtype: bool]

returns a list with one boolean Series per column. How can I make

[df[i].str.contains('Buy') for i in cols_to_include]

return a value like the following?

0     True
1    False
2     True
3    False
dtype: bool

PS: I know it can be done as shown below, but I am looking for a solution that uses a for loop.

cols_to_include = df.columns[df.columns != 'Script']
a = df[cols_to_include].astype(str).sum(axis=1)
df[a.str.contains('Buy', regex=True)]
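
For readers who do want an explicit loop, the per-column masks can be folded into one row-wise mask by OR-ing them together one at a time. The following is a minimal sketch, assuming the sample DataFrame from the question (rebuilt here so the snippet runs on its own). The names mask, mask_reduce, and result are illustrative and do not come from the original post.

```python
from functools import reduce
import operator

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Script": ["Rel", "ITC", "INFO", "TCS"],
    "Reco": ["Buy", "Sell", "Sell", "Sell"],
    "Rating": ["Sell", "Sell", "BuyN", "Sell"],
    "Suggestion": ["BuyL", "Sell", "Sell", "Sell"],
    "Mood": ["Sell", "Sell", "Sell", "Sell"],
})

cols_to_include = df.columns[df.columns != "Script"]

# Start from an all-False mask and OR in one column's result per iteration.
mask = pd.Series(False, index=df.index)
for col in cols_to_include:
    mask |= df[col].str.contains("Buy")

# The same fold written with functools.reduce instead of a loop statement.
mask_reduce = reduce(
    operator.or_,
    (df[col].str.contains("Buy") for col in cols_to_include),
)

result = df[mask]
print(result)
```

The original attempt failed because a generator (or list) of Series is not itself a boolean mask; the masks have to be combined with | before they are used for indexing.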

It may be easier to apply the string-contains check elementwise and then aggregate the result with .any. So:

df[cols_to_include].applymap(lambda x: 'Buy' in x).any(axis=1)
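
A self-contained sketch of this elementwise idea follows, assuming the sample data from the question and assuming every cell is a string (the test 'Buy' in x would raise on NaN). Because applymap is deprecated in recent pandas releases, the sketch swaps in Series.map applied column by column, which does the same elementwise check. The names contains_buy, row_mask, and filtered are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Script": ["Rel", "ITC", "INFO", "TCS"],
    "Reco": ["Buy", "Sell", "Sell", "Sell"],
    "Rating": ["Sell", "Sell", "BuyN", "Sell"],
    "Suggestion": ["BuyL", "Sell", "Sell", "Sell"],
    "Mood": ["Sell", "Sell", "Sell", "Sell"],
})

cols_to_include = df.columns[df.columns != "Script"]

# Series.map runs the substring test on each cell of one column;
# DataFrame.apply repeats that for every selected column.
contains_buy = df[cols_to_include].apply(
    lambda col: col.map(lambda x: "Buy" in x)
)

# any(axis=1) collapses each row to True if at least one cell matched.
row_mask = contains_buy.any(axis=1)
filtered = df[row_mask]
print(filtered)
```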

You can filter out "Script" and then use apply to check for the desired string:

df.loc[df[[e for e in df.columns if e != 'Script']].apply(lambda x: x.str.contains('Buy')).any(axis=1)]

  Script  Reco Rating Suggestion  Mood
0    Rel   Buy   Sell       BuyL  Sell
2   INFO  Sell   BuyN       Sell  Sell

You can use stack and any:

m = df.drop(columns='Script').stack().str.contains('Buy').groupby(level=0).any()

Out[1021]:
0     True
1    False
2     True
3    False
dtype: bool

Next, use it to slice the DataFrame as needed:

df[m]

Out[1022]:
  Script  Reco Rating Suggestion  Mood
0    Rel   Buy   Sell       BuyL  Sell
2   INFO  Sell   BuyN       Sell  Sell
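
To make the stack-based approach easier to follow, here is a self-contained sketch on the sample data from the question. It aggregates with groupby(level=0).any(), on the assumption that a recent pandas is in use, where any(level=0) and the positional axis argument to drop are no longer accepted. The names stacked and m are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Script": ["Rel", "ITC", "INFO", "TCS"],
    "Reco": ["Buy", "Sell", "Sell", "Sell"],
    "Rating": ["Sell", "Sell", "BuyN", "Sell"],
    "Suggestion": ["BuyL", "Sell", "Sell", "Sell"],
    "Mood": ["Sell", "Sell", "Sell", "Sell"],
})

# stack() turns the remaining columns into one long Series whose index
# has two levels: the original row label and the column name.
stacked = df.drop(columns="Script").stack()

# Check every cell, then group by the original row label (level 0)
# and ask whether any cell in that row matched.
m = stacked.str.contains("Buy").groupby(level=0).any()

print(df[m])
```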

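It should be

@Barmar: or does not work in pandas indexing.

This doesn't work. It reports whether each of the columns ['Reco', 'Rating', 'Suggestion', 'Mood'] contains a Buy value. What I want is a row-wise check.

Sorry, I missed the axis argument. Try it now.

Thanks. There is also another way (without using a for loop).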