Python DataFrame: column selection


I have the following input DataFrame:

PRECISE 1   RE=130  VAL=5   LENGHT=8    TYPE=DEL    AF=0.0005
PRECISE 8   RE=30   VAL=8   LENGHT=8    TYPE=INS    AF=0.05
PRECISE 3   RE=13   VAL=85  LENGHT=8    TYPE=INV    AF=0.08
PRECISE 7   RE=10   VAL=18  LENGHT=8    TYPE=DEL    AF=0.001


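I want to select the columns whose pandas.Series contains one of the following values: ('RE=', 'AF='). I cannot select by column name, because the names may vary depending on the version of the tool used to generate the file. The tags, however, stay the same across versions of the tool.

Expected output: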

RE=130  AF=0.0005
RE=30   AF=0.05
RE=13   AF=0.08
RE=10   AF=0.001

I tried the following code:

RE_cols = [col for col in df_b.columns if df_b[col].str.contains('RE=')]

But I could not get past the following error message:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Any help?

#import pandas
import pandas as pd

Setup

Suppose you have a DataFrame:

data = {'data1': ['A', 'B', 'Cz', 'D'], 'data2': ['az', 'za', 'c', 'd']}
df = pd.DataFrame.from_dict(data, orient='index',
                       columns=['col1', 'col2', 'col3', 'col4'])

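It looks like this:

      col1 col2 col3 col4
data1    A    B   Cz    D
data2   az   za    c    d

Solution

For example, if you want to select the columns that contain the letter 'z', you can do:

some_string_the_column_needs_to_contain_to_be_selected = 'z'
# Keep a column if at least one of its values contains the string.
filtered_df = df[[col for col in df.columns if df[col].str.contains(some_string_the_column_needs_to_contain_to_be_selected).any()]]

Your filtered_df will be:

      col1 col2 col3
data1    A    B   Cz
data2   az   za    c

as expected.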
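The same idea can be applied to the data from the question. The following is only a sketch under some assumptions: the frame is rebuilt by hand with default integer column labels (the real labels depend on the tool version), every cell is treated as a string, and a column is kept when any of its values starts with one of the tags. Matching on the start of the value rather than anywhere inside it is my choice here, not something the question requires.

```python
import re

import pandas as pd

# Rebuild the question's data. The integer column labels (0..6) are an
# assumption standing in for whatever names the generating tool produces.
rows = [
    ["PRECISE", "1", "RE=130", "VAL=5", "LENGHT=8", "TYPE=DEL", "AF=0.0005"],
    ["PRECISE", "8", "RE=30", "VAL=8", "LENGHT=8", "TYPE=INS", "AF=0.05"],
    ["PRECISE", "3", "RE=13", "VAL=85", "LENGHT=8", "TYPE=INV", "AF=0.08"],
    ["PRECISE", "7", "RE=10", "VAL=18", "LENGHT=8", "TYPE=DEL", "AF=0.001"],
]
df_b = pd.DataFrame(rows)

tags = ("RE=", "AF=")
# One regular expression that matches any of the tags.
pattern = "|".join(re.escape(tag) for tag in tags)

# str.match anchors the pattern at the start of each value; .any() reduces
# the resulting boolean Series to a single True/False per column.
selected = [
    col for col in df_b.columns
    if df_b[col].astype(str).str.match(pattern).any()
]
result = df_b[selected]
print(result)
```

Because the selection looks at the values and not at the labels, it should keep working if a different tool version renames or reorders the columns, as long as the tags themselves stay the same.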


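Evaluate

df_b[col].str.contains('RE=')

on its own. You will get a Series of booleans, not a single True or False, so the if clause in your list comprehension cannot decide whether to keep the column. That is your error. You can use

.any()

or

.all()

to reduce it to a scalar.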
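To see the difference, here is a small sketch on a made-up three-row column (the values are illustrative, not taken from your file). It triggers the same ValueError on purpose and then shows what the two reductions return.

```python
import pandas as pd

# Hypothetical column: two values carry the tag, one does not.
col = pd.Series(["RE=130", "RE=30", "VAL=5"])
mask = col.str.contains("RE=")  # boolean Series, one entry per row

error_message = None
try:
    if mask:  # a Series has no single truth value, so this raises
        pass
except ValueError as err:
    error_message = str(err)

has_any = bool(mask.any())  # True: at least one row contains the tag
has_all = bool(mask.all())  # False: the last row does not contain it
```

Use .any() when one matching row is enough to keep the column, and .all() when every row has to match.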