Python: How do I delete rows that belong to small groups from a DataFrame?

Tags: python, pandas, dataframe, delete-row

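I have a DataFrame. For every combination of values in its first two columns, I want to delete the rows of any combination that occurs in fewer than 100 rows.

For example, 5 rows contain "A" in the first column and "B" in the second; I want to remove all of these rows from the DataFrame. Another 110 rows contain "C" and "D" in the first and second columns respectively; I do not want to remove those, because 110 is not below the threshold of 100.

What is the most elegant and fastest way to do this?

This is my current solution: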

gr = df.groupby(['L_ID', 'P_ID'])
for group in gr.groups:
    df_tmp = gr.get_group(group)
    n_vals = len(df_tmp)
    if n_vals < min_n:
        df = df[(df['L_ID'] != group[0]) | (df['P_ID'] != group[1])]
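You can use the groupby `filter` method.

Update: this method does not seem to work when the DataFrame has more columns, at least with the old pandas installation shown in the traceback below: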

>>> df1 = pd.DataFrame({'a':list('AAABB'), 'b':list('BBBAA'), 'c':range(5), 'd':range(5)})
>>> df1
   a  b  c  d
0  A  B  0  0
1  A  B  1  1
2  A  B  2  2
3  B  A  3  3
4  B  A  4  4
>>> df1.groupby(['a','b']).filter(lambda x: len(x) > 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 2094, in filter
    if res:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
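As a workaround, you can apply `len` to a single column inside `filter`, or build a boolean mask with `transform`: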

>>> df1.groupby(['a','b']).filter(lambda x: len(x['c']) > 2)
   a  b  c  d
0  A  B  0  0
1  A  B  1  1
2  A  B  2  2
>>> df1[df1.groupby(['a','b'])['c'].transform(lambda x: len(x) > 2).astype(bool)]
   a  b  c  d
0  A  B  0  0
1  A  B  1  1
2  A  B  2  2
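
Applied to the column names from the question, the `transform` idea above might look like the sketch below. `L_ID`, `P_ID` and `min_n` come from the question; the helper name `drop_small_groups`, the `value` column and the sample data are made up for illustration. It uses `transform('size')` instead of a lambda, which is assumed to be available in your pandas version (it is in current releases).

```python
import pandas as pd


def drop_small_groups(df, keys, min_n):
    """Keep only rows whose key combination occurs at least min_n times.

    Hypothetical helper, not part of pandas.
    """
    # transform('size') broadcasts each group's row count back onto its rows,
    # giving a Series aligned with df's index.
    group_sizes = df.groupby(keys)[keys[0]].transform('size')
    return df[group_sizes >= min_n]


# Illustrative data only: the (A, B) group has 3 rows, the (B, A) group has 2.
df = pd.DataFrame({
    'L_ID': list('AAABB'),
    'P_ID': list('BBBAA'),
    'value': range(5),
})

result = drop_small_groups(df, ['L_ID', 'P_ID'], min_n=3)
print(result)
```

Because the mask is computed once for the whole frame, this avoids re-filtering `df` inside a Python loop for every small group, as the loop in the question does.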