How do I drop columns in Pandas that contain non-zero values in fewer than 1% of the rows?


I have the following dataset:

          Col1  Col2  Col3  Col4  Col5  Col6  Col7  Col8  Col9 Col10  ...  Col991  Col992  Col993  Col994  Col995  Col996  Col997  Col998  Col999 Col1000
rows
Row1         0     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0       0       0       0       0
Row2         0     0     0     0     0    23     0     0     0     0  ...       0       0       0       0       7       0       0       0       0       0
Row3        97     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0       0       0       0       0
Row4         0     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0       0       0       0       0
Row5         0     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0       0       0       0       0
...        ...   ...   ...   ...   ...   ...   ...   ...   ...   ...  ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...
Row496     182     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0     116       0       0       0
Row497       0     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0       0       0       0       0
Row498       0     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0       0       0       0       0
Row499       0     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0       0       0       0       0
Row500       0     0     0     0     0     0     0     0     0     0  ...       0       0       0       0       0       0     125       0       0       0
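I am trying to drop the columns whose total number of non-zero entries is less than 1% of the number of rows.

I can compute the percentage of non-zero entries per column: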

(df[df > 0.0].count()/df.shape[0])*100

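How can I use this to get a df that contains only the columns with non-zero values in more than 1% of the rows? Also, how would I change the code to drop the rows whose non-zero entries make up less than 1% of the columns?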

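Use mean on a boolean mask to compute the fraction of non-zero entries per column, then select the matching columns with loc:

df.loc[:, df.ne(0).mean() >= 0.01]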
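
As a minimal sketch of this column filter on made-up data (the frame, its labels, and its values below are assumptions for illustration, not the asker's dataset), the per-column fraction of non-zero entries can be taken as the mean of a boolean mask and passed to loc as a column selector:

```python
import pandas as pd

# Hypothetical data: 200 rows, 4 columns, all zeros to start with.
df = pd.DataFrame(0, index=[f"Row{i}" for i in range(1, 201)],
                  columns=["Col1", "Col2", "Col3", "Col4"])
df.loc["Row3", "Col1"] = 97                                   # 1 of 200 rows  -> 0.5%
df.loc[["Row2", "Row7", "Row9"], "Col2"] = 23                 # 3 of 200 rows  -> 1.5%
df.loc[["Row1", "Row4", "Row5", "Row6", "Row8"], "Col4"] = 7  # 5 of 200 rows  -> 2.5%

# True where a cell is non-zero; the column mean of that mask is the
# fraction of non-zero rows in each column.
nonzero_frac = df.ne(0).mean()

# Keep only the columns whose non-zero fraction is at least 1%.
kept = df.loc[:, nonzero_frac >= 0.01]
print(list(kept.columns))
```

Col1 (0.5%) and the all-zero Col3 fall below the threshold, so only Col2 and Col4 remain. Note that ne(0) also counts negative values as non-zero, whereas a df > 0 mask would not.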

You can use loc to select specific rows or columns and get a new df. Basically, you can do this:

df.loc[rows, cols]  # accepts boolean lists/arrays
So a df with the sparse columns removed can be obtained as follows:

col_condition = df[df > 0].count() / df.shape[0] >= .01
df_ = df.loc[:, col_condition]  # boolean column selection needs .loc
If you need to switch between columns and rows, just use

df.T
So the same approach works for rows whose number of non-zero entries is less than 1% of the number of columns:

row_condition = df.T[df.T > 0].count() / df.shape[1] >= .01
df_ = df[row_condition]
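
The row case can also be sketched without transposing, by aggregating across columns with axis=1. This is an alternative to the df.T approach, shown on made-up data (the frame and its values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data: 3 rows, 200 columns, all zeros to start with.
df = pd.DataFrame(0, index=["Row1", "Row2", "Row3"],
                  columns=[f"Col{i}" for i in range(1, 201)])
df.loc["Row2", ["Col6", "Col9", "Col50"]] = 23   # 3 of 200 columns -> 1.5%
df.loc["Row3", "Col1"] = 97                      # 1 of 200 columns -> 0.5%

# axis=1 averages across the columns, giving one fraction per row.
row_frac = df.ne(0).mean(axis=1)

# A boolean Series indexed by row labels selects rows through .loc.
kept_rows = df.loc[row_frac >= 0.01]
print(list(kept_rows.index))
```

Here only Row2 survives: Row1 has no non-zero entries and Row3 has them in just 0.5% of the columns.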
In a more concise form:

df_ = df.loc[:, df.gt(0).mean() >= .01]  # keep columns
df_ = df[df.T.gt(0).mean() >= .01]  # keep rows
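
If both filters are needed repeatedly, the two one-liners could be wrapped in a small helper. The sketch below is only one possible arrangement: the name drop_sparse and its min_frac and axis parameters are hypothetical and not part of pandas, and it treats any value other than 0 as non-zero.

```python
import pandas as pd

def drop_sparse(df, min_frac=0.01, axis="columns"):
    """Keep the columns (or rows) whose fraction of non-zero entries is >= min_frac."""
    nonzero = df.ne(0)  # boolean mask, True where the cell is non-zero
    if axis == "columns":
        return df.loc[:, nonzero.mean(axis=0) >= min_frac]
    if axis == "rows":
        return df.loc[nonzero.mean(axis=1) >= min_frac]
    raise ValueError("axis must be 'columns' or 'rows'")

# Tiny made-up frame; a high threshold is used so the effect is visible.
demo = pd.DataFrame({"a": [0, 0, 0, 5], "b": [0, 0, 0, 0], "c": [1, 2, 0, 3]})

cols_kept = drop_sparse(demo, min_frac=0.5)               # non-zero fractions: a 0.25, b 0.0, c 0.75
rows_kept = drop_sparse(demo, min_frac=0.5, axis="rows")  # only the last row is 2/3 non-zero
print(list(cols_kept.columns), list(rows_kept.index))
```

With the asker's data, drop_sparse(df) would apply the 1% column rule and drop_sparse(df, axis="rows") the 1% row rule.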