Python: filtering out zeros in np.percentile

python pandas numpy

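I am trying to compute deciles of the scores in a column of a DataFrame. I use the following code:
np.percentile(df['score'], np.arange(0, 100, 10))

My problem is that the scores contain a lot of zeros. How can I filter out these 0 values and compute deciles on the remaining values only?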
Filter them out with boolean indexing:
df.loc[df['score']!=0, 'score']

or index with a callable, and pass the result to the percentile function:
np.percentile(df['score'][lambda x: x!=0], np.arange(0,100,10))
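As a minimal sketch of the two filters above, the following uses a small made-up `score` column (the values are an assumption chosen for illustration, not data from the question) and checks that both forms select the same rows before passing them to `np.percentile`:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: several zeros mixed with non-zero scores.
df = pd.DataFrame({"score": [0, 0, 5, 1, 0, 3, 2, 4, 0, 6]})

# Form 1: boolean mask through .loc
nonzero_loc = df.loc[df["score"] != 0, "score"]

# Form 2: callable indexer, evaluated against the Series itself
nonzero_callable = df["score"][lambda x: x != 0]

# Percentiles at 0, 10, ..., 90 computed on the non-zero scores only
deciles = np.percentile(nonzero_loc, np.arange(0, 100, 10))
print(deciles)
```

Note that `np.arange(0, 100, 10)` stops at 90; use `np.arange(0, 101, 10)` if the maximum (the 100th percentile) is also wanted.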

You can simply mask the zeros and drop them from the column with boolean indexing, or do both in one step:
np.percentile(df['score'][df['score'] != 0], np.arange(0,100,10))

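Consider the dataframe df:

import numpy as np
import pandas as pd

# Random scores, with roughly 20% of the rows replaced by 0
df = pd.DataFrame(
    dict(score=np.random.rand(20))
).where(
    np.random.choice([True, False], (20, 1), p=(.8, .2)),
    0
)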

       score
0   0.380777
1   0.559356
2   0.103099
3   0.800843
4   0.262055
5   0.389330
6   0.477872
7   0.393937
8   0.189949
9   0.571908
10  0.133402
11  0.033404
12  0.650236
13  0.593495
14  0.000000
15  0.013058
16  0.334851
17  0.000000
18  0.999757
19  0.000000

Use pd.qcut to assign each non-zero score to a decile:
pd.qcut(df.loc[df.score != 0, 'score'], 10, labels=range(10))

0     4
1     6
2     1
3     9
4     3
5     4
6     6
7     5
8     2
9     7
10    1
11    0
12    8
13    8
15    0
16    3
18    9
Name: score, dtype: category
Categories (10, int64): [0 < 1 < 2 < 3 ... 6 < 7 < 8 < 9]
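To relate the `pd.qcut` labels back to `np.percentile`, here is a hedged sketch that also asks `pd.qcut` for its bin edges via `retbins=True`. The seed, the sample size of 50, and the variable names are assumptions made for reproducibility; they are not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical reproducible data: random scores with some forced to zero.
rng = np.random.default_rng(0)
scores = rng.random(50)
scores[rng.random(50) < 0.2] = 0.0
df = pd.DataFrame({"score": scores})

# Keep only the non-zero scores; the original index is preserved.
nonzero = df.loc[df["score"] != 0, "score"]

# labels=False returns integer bin codes 0..9; retbins=True also returns the edges.
labels, edges = pd.qcut(nonzero, 10, labels=False, retbins=True)

# The edges should line up with the 0th, 10th, ..., 100th percentiles.
expected_edges = np.percentile(nonzero, np.arange(0, 101, 10))
print(np.allclose(edges, expected_edges))
```

Because the filtered Series keeps its index, the decile labels can be assigned back to the frame (for example with `df.loc[nonzero.index, 'decile'] = labels`), which leaves the zero rows without a decile.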

@MSeifert Since they added it recently, I have always assumed it would be efficient, but I never actually tested it (I have been using it when I have long dataframe names). Let me look around. :) You are right. It operates on the whole column, so it runs as fast as masking.
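The performance remark in the comment can be checked with a rough timing sketch like the one below. The data size, seed, repetition count, and function names are assumptions, and the measured numbers will vary by machine and pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 10,000 scores, roughly 20% set to zero.
rng = np.random.default_rng(0)
scores = rng.random(10_000)
scores[rng.random(10_000) < 0.2] = 0.0
df = pd.DataFrame({"score": scores})


def with_mask():
    # Explicit boolean mask
    return df["score"][df["score"] != 0]


def with_callable():
    # Callable indexer; pandas calls it with the Series and uses the result as the key
    return df["score"][lambda x: x != 0]


mask_seconds = timeit.timeit(with_mask, number=50)
callable_seconds = timeit.timeit(with_callable, number=50)
print(f"mask: {mask_seconds:.4f}s  callable: {callable_seconds:.4f}s")
```

Both functions build the same boolean mask over the whole column, so any difference should come down to the small overhead of the extra function call.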