Python Pandas: how to extract DataFrame rows that match filter1 or filter2


I have a pandas DataFrame that looks like this, for example:

label          Y88_N          diff       div      fold  
0       25273.626713  17348.581851  2.016404  2.016404  
1       29139.510491  -4208.868050  0.604304 -0.604304  
2       34388.439717 -30147.834699  0.458903 -0.458903  
3       69704.254089 -32976.152490  0.116894 -0.116894  
4      193717.440783 -71359.494098  0.286045 -0.286045  
5       28996.634708  10934.944533  2.031293  2.031293  
6       45021.782930    680.437629  1.056383  1.056383  
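But with thousands of rows. I want to get a new DataFrame containing only the rows whose value in the 'fold' column is greater than 2 or less than -0.6. So in the end the DataFrame should look like this: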

label          Y88_N          diff       div      fold  
0       25273.626713  17348.581851  2.016404  2.016404  
1       29139.510491  -4208.868050  0.604304 -0.604304  
5       28996.634708  10934.944533  2.031293  2.031293
I have tried different approaches, such as:

def ranged(start, end, step):
    x = start
    while x < end:
        yield x
        x += step
df2 = df[~df['fold'].isin(ranged(-0.6, 2, 0.000001))]
and:

df2 = df[(df['fold'] >= 2) & (df['fold'] ...

In your second example, you just need to use | (or) instead of & (and):

df2 = df[(df['fold'] >= 2) | (df['fold'] <= -0.6)]

df2
Out[6]: 
   label         Y88_N          diff       div      fold
0      0  25273.626713  17348.581851  2.016404  2.016404
1      1  29139.510491  -4208.868050  0.604304 -0.604304
5      5  28996.634708  10934.944533  2.031293  2.031293
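
To make the difference between the two operators concrete, here is a minimal sketch that rebuilds only the `label` and `fold` columns from the question (the other columns are omitted for brevity, which is an assumption of this example, not part of the original data) and applies both filters. Note that the parentheses around each comparison are required, because in Python `&` and `|` bind more tightly than `>=` and `<=`.

```python
import pandas as pd

# Sample data copied from the question; only 'label' and 'fold' are kept here.
df = pd.DataFrame({
    "label": [0, 1, 2, 3, 4, 5, 6],
    "fold": [2.016404, -0.604304, -0.458903, -0.116894,
             -0.286045, 2.031293, 1.056383],
})

# AND: a value cannot be >= 2 and <= -0.6 at the same time, so nothing matches.
both = df[(df["fold"] >= 2) & (df["fold"] <= -0.6)]

# OR: keep every row that satisfies at least one of the two conditions.
either = df[(df["fold"] >= 2) | (df["fold"] <= -0.6)]

print(len(both))               # 0
print(either.index.tolist())   # [0, 1, 5]
```

The `&` version does not raise an error; it simply returns an empty DataFrame, which is why the mistake can be easy to miss.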

You can do:

In [276]: df[(df['fold'] >= 2) | (df['fold'] <= -0.6)]
Out[276]:
   label         Y88_N          diff       div      fold
0      0  25273.626713  17348.581851  2.016404  2.016404
1      1  29139.510491  -4208.868050  0.604304 -0.604304
5      5  28996.634708  10934.944533  2.031293  2.031293
Also, pd.eval() works well with expressions containing large arrays:

In [278]: df[pd.eval('df.fold >=2 | df.fold <=-0.6')]
Out[278]:
   label         Y88_N          diff       div      fold
0      0  25273.626713  17348.581851  2.016404  2.016404
1      1  29139.510491  -4208.868050  0.604304 -0.604304
5      5  28996.634708  10934.944533  2.031293  2.031293
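
A closely related option, not mentioned in the answers above, is `DataFrame.query`, which evaluates the same kind of string expression against the frame's columns. The sketch below uses only the `fold` column from the question and checks that `query` selects the same rows as the boolean mask.

```python
import pandas as pd

# Only the 'fold' column from the question is needed for this comparison.
df = pd.DataFrame({
    "fold": [2.016404, -0.604304, -0.458903, -0.116894,
             -0.286045, 2.031293, 1.056383],
})

# Plain boolean-mask filter, as in the answers above.
mask_result = df[(df["fold"] >= 2) | (df["fold"] <= -0.6)]

# Same filter written as a query string; column names are used directly.
query_result = df.query("fold >= 2 or fold <= -0.6")

print(query_result.equals(mask_result))  # True
```

Whether the string form reads better than the mask is largely a matter of taste; it tends to help when a filter combines many conditions.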
This looks good, but what are the pros and cons of each approach (run speed, memory, etc.)?

Thank you very much for your answer, it's really complete. I don't know why, but I had tried something like df[(df['fold'] >= 2) | (df['fold'] ...

Just a technical point: df2 = df[(df['fold'] >= 2) & (df['fold'] ...
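
On the speed question raised in the comments, the thread gives no measurements, so the simplest way to find out is to time both approaches on data of a realistic size. The following is a rough sketch only: the frame `big`, its size, and the helper names `with_mask` and `with_eval` are made up for illustration, and the timings will depend on the machine, the pandas version, and whether the optional `numexpr` package is installed.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 random 'fold' values between -3 and 3.
rng = np.random.default_rng(0)
big = pd.DataFrame({"fold": rng.uniform(-3, 3, size=100_000)})


def with_mask(df):
    """Filter with an ordinary boolean mask."""
    return df[(df["fold"] >= 2) | (df["fold"] <= -0.6)]


def with_eval(df):
    """Filter with pd.eval; 'df' is resolved from this function's local scope."""
    return df[pd.eval("df.fold >= 2 | df.fold <= -0.6")]


# Both approaches should select exactly the same rows.
assert with_mask(big).equals(with_eval(big))

# Time 20 calls of each and report the average per call.
for fn in (with_mask, with_eval):
    seconds = timeit.timeit(lambda: fn(big), number=20)
    print(f"{fn.__name__}: {seconds / 20 * 1e3:.2f} ms per call")
```

No expected timings are shown because they vary from one setup to another; the point of the sketch is the comparison harness, not a particular result.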