Python: How do I select rows that have the same value across columns in pandas?

python pandas

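I have a df with 9 columns. Each column holds 0 or 1, where 1 marks the row as an outlier according to one of 9 different algorithms. I want to select the true outliers, and the query below does work: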

true_outliers= outliers[ 
        (outliers['isolation_forest_300000']==1) & 
        (outliers['knn_1000']==1) &
        (outliers['knn_10000']==1)&
        (outliers['abod_neighbors_5_1000']==1)&
        (outliers['abod_neighbors_5_10000']==1)&
        (outliers['abod_neighbors_10_1000']==1)&
        (outliers['hbos_1000']==1)&
        (outliers['hbos_10000']==1)&
        (outliers['hbos_100000']==1)]

But how can I refactor it into something like this:

for col in outliers.columns.tolist():
     s= outliers[outliers[col] == 1]

I want it to go through the loop and keep only the rows that are 1 in every column.

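If you want to select the rows that are 1 in every column, it is better to use a mask.

Use eq and all to create the mask and slice a sample df with it: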

df[df.eq(1).all(axis=1)]

Out[267]:
   isolation_forest_300000  knn_1000  knn_10000  abod_neighbors_5_1000  \
0                        1         1          1                      1
3                        1         1          1                      1

   abod_neighbors_5_10000  abod_neighbors_10_1000  hbos_1000  hbos_10000  \
0                       1                       1          1           1
3                       1                       1          1           1

   hbos_100000
0            1
3            1
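
The sample df behind the output above is not shown, so here is a minimal sketch of the same eq/all mask on made-up data. The column names come from the question; the row values are hypothetical and chosen only so that rows 0 and 3 are all 1s.

```python
import pandas as pd

# Column names taken from the question.
cols = [
    "isolation_forest_300000", "knn_1000", "knn_10000",
    "abod_neighbors_5_1000", "abod_neighbors_5_10000",
    "abod_neighbors_10_1000", "hbos_1000", "hbos_10000", "hbos_100000",
]

# Hypothetical rows, not the original sample data.
rows = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
]
df = pd.DataFrame(rows, columns=cols)

# eq(1) gives a boolean frame; all(axis=1) is True only where every
# column in that row is True.
mask = df.eq(1).all(axis=1)
true_outliers = df[mask]

print(true_outliers.index.tolist())
```

Row 4 is dropped even though eight of the nine algorithms flag it, because the mask requires agreement from every column.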

I think this can help you:

import functools
import operator
import pandas as pd

data = [[0, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0, 0, 0, 0, 0]]

df = pd.DataFrame(
    data, columns=[str(i) for i in range(9)]
)

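# AND together one boolean Series per column, so a row survives
# only if it equals 1 in every column.
condition = functools.reduce(
    operator.and_,
    (df[col] == 1 for col in df.columns)
)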

print(df[condition])
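
The reduce call is close to the loop the question asked for. As a sketch, the same thing can be written as an explicit loop that accumulates a mask instead of overwriting the result on each pass, which is why the loop in the question only keeps the filter for the last column. The three-column frame and its column names are hypothetical.

```python
import functools
import operator

import pandas as pd

# Hypothetical small frame; the column names are placeholders.
df = pd.DataFrame(
    [[0, 1, 0],
     [1, 1, 1],
     [1, 0, 1],
     [1, 1, 1]],
    columns=["a", "b", "c"],
)

# Start from an all-True mask and narrow it one column at a time.
mask = pd.Series(True, index=df.index)
for col in df.columns:
    mask &= df[col] == 1

# The reduce form from the answer builds the same mask.
reduced = functools.reduce(
    operator.and_,
    (df[col] == 1 for col in df.columns),
)

selected = df[mask]
print(selected)
```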

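Please provide sample data and the expected output.

@AkshayNevrekar, thank you. My work machine has no network access, so I am posting from my phone; sorry for the inconvenience.

Try adding .loc, as follows:

true_outliers = outliers.loc[your condition]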
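
A minimal sketch of the .loc suggestion, assuming the condition is a boolean Series aligned with the frame's index. The three-column frame is a hypothetical stand-in for the nine-column one. With a plain boolean Series, outliers[condition] and outliers.loc[condition] return the same rows; .loc additionally lets you pick columns in the same step.

```python
import pandas as pd

# Hypothetical three-column stand-in for the nine-column frame.
outliers = pd.DataFrame({
    "knn_1000": [1, 0, 1, 1],
    "hbos_1000": [1, 1, 0, 1],
    "abod_neighbors_5_1000": [1, 1, 1, 1],
})

# Boolean Series: True where every column equals 1.
condition = outliers.eq(1).all(axis=1)

# Row selection with .loc; same rows as outliers[condition].
true_outliers = outliers.loc[condition]

# .loc can also restrict the columns at the same time.
knn_only = outliers.loc[condition, ["knn_1000"]]

print(true_outliers)
```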
